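System ID: U0026-1703201410380600
Thesis Title (Chinese): 利用回歸分析萃取殭屍網路特徵之研究
Thesis Title (English): Using Logistic Regression for Effective Feature Extraction in Botnet Detection
Institution: National Cheng Kung University
Department (Chinese): 電腦與通信工程研究所
Department (English): Institute of Computer & Communication
Academic Year: 102 (ROC calendar)
Semester: 2
Year of Publication: 103 (ROC calendar; 2014)
Author (Chinese): 林俊良
Author (English): Chun-Liang Lin
Student ID: q36004303
Degree: Master's
Language: English
Number of Pages: 36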
Oral Defense Committee: Committee Member - 曾黎明
Advisor - 謝錫堃
Co-advisor - 張志標
Committee Member - 陳中和
Committee Member - 李昇暾
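Keywords (Chinese): 雲端運算 (cloud computing), 資訊安全 (information security), 偵測演算法 (detection algorithm)
Keywords (English): Cloud computing, Network security, Detection algorithm
Subject Classification: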
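Abstract (Chinese): Advances in network technology have also given rise to malicious botnet attacks, which have evolved into many forms. Among them, P2P botnets exhibit the most complex network behavior, which makes their communication even harder to detect with methods based on network behavior analysis. To characterize how bots communicate with one another, most studies list a few common network behavior features, but these features are not necessarily suitable for every kind of botnet behavior.
Therefore, we propose an analysis method based on logistic regression that automatically checks whether the selected features are suitable for the target botnet. After this analysis, a distributed co-clustering algorithm groups communication traffic that shares the same features so that the resulting behavior can be observed. The result is a design that automatically selects suitable network behavior features for each P2P botnet and uses them for detection.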
Abstract (English): As network technology has developed, botnets have also become more robust and resilient, and different types of botnet have emerged along the way. Among these, the P2P botnet has the most complicated structure and is hard to detect through network behavior analysis at a single site. Most such methods define a set of network behavior features for detecting botnet communication, but these features might not fit every botnet.
Therefore, we propose a botnet detection method that uses a logistic regression model to automatically choose the features suitable for each botnet. After the statistical analysis, we apply a distributed co-clustering algorithm in MapReduce to the selected features to gather traffic with the same behavior and extract the malicious IPs. Combining these two models lets us automatically choose the features that correspond to each botnet in order to detect P2P botnets.
Table of Contents
Chapter 1: Introduction
Chapter 2: Background
2.1 Botnet Detection
2.1.1 Botnet Life Cycle
2.1.2 Botnet Structure
2.2 Logistic Regression
2.2.1 Linear Regression
2.2.2 Dichotomous Outcome Variables
2.2.3 The Logistic Regression Model
2.3 MapReduce Programming Model and Apache Hadoop
Chapter 3: Related Work
3.1 Distributed Co-clustering Algorithm
3.2 Detecting P2P Botnets through Network Behavior Analysis and Machine Learning
Chapter 4: Detection Method
4.1 Filter
4.2 Feature Extraction
4.3 Statistical Analysis
4.3.1 Logistic Regression
4.3.2 Maximum Likelihood Estimation (MLE)
4.4 Distributed Co-clustering
4.5 Classifier
Chapter 5: Evaluation
5.1 Environment Setup
5.2 Traffic Log Collection
5.3 Evaluation Results
Chapter 6: Conclusion and Discussion
6.1 Conclusion
6.2 Discussion
References
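Full-Text Access Rights
  • On-campus browsing/printing of the electronic full text is authorized; available from 2014-03-21.
  • Off-campus browsing/printing of the electronic full text is authorized; available from 2014-03-21.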

