厉柏伸, 李领治, 孙涌, 朱艳琴. Intranet Defense Algorithm Based on Pseudo Boosting Decision Tree [J]. Computer Science (计算机科学), 2018, 45(4): 157-162
Intranet Defense Algorithm Based on Pseudo Boosting Decision Tree
Received: 2017-01-09  Revised: 2017-03-11
DOI:10.11896/j.issn.1002-137X.2018.04.026
Keywords: Pseudo boosting decision tree, Distributed cluster, Intranet defense
Funding: This work was supported by the National Natural Science Foundation of China (61373164,1)
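Author  Affiliation  E-mail
厉柏伸  School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006  zgzjlbs@163.com
李领治  School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006  sdlilingzhi@163.com
孙涌  School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006
朱艳琴  School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006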
Abstract:
      Drawing on the idea behind the TF-IDF algorithm, this paper proposes feature frequency (Eigen Frequency), forest frequency (Forest Frequency), and the pseudo boosting decision tree (PBDT). Together they address a weakness of the gradient boosting decision tree (GBDT): as the number of iterations grows, misclassified data becomes marginalized. In PBDT, every decision tree is built on a data set obtained by bootstrapping the original data set, so the data does not have to be resampled at each iteration. Intranet defense experiments were conducted on a distributed cluster. The results show that, on training sets of a certain scale, PBDT achieves better prediction accuracy.
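The sampling step mentioned in the abstract can be illustrated with a minimal sketch. It only shows how bootstrap data sets might be drawn once, up front, from an original data set. It is not the paper's PBDT: the feature frequency and forest frequency computations are not specified in this record and are left out. The function name `bootstrap_datasets`, the toy data, and the `seed` parameter are assumptions made for illustration.

```python
import random


def bootstrap_datasets(data, n_trees, seed=0):
    """Draw n_trees bootstrap samples from data, all before any tree is trained.

    Each sample has the same size as data and is drawn with replacement.
    (Hypothetical helper; not taken from the paper.)
    """
    rng = random.Random(seed)
    n = len(data)
    return [[data[rng.randrange(n)] for _ in range(n)] for _ in range(n_trees)]


# Hypothetical toy data: (feature vector, label) pairs.
data = [((0.1, 1.0), 0), ((0.4, 0.2), 1), ((0.9, 0.7), 1), ((0.3, 0.5), 0)]

# One bootstrap data set per tree; no further sampling happens per iteration.
samples = bootstrap_datasets(data, n_trees=3)
for i, sample in enumerate(samples):
    print(f"tree {i}: {len(sample)} rows")
```

Because every sample is fixed before training starts, each tree could in principle be fitted independently, which is one plausible reason such a scheme suits a distributed cluster. That reading is an inference from the abstract, not a statement from the paper.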