Evaluating the effectiveness of Discriminator network in GAN architecture for phishing URL classification

Pham Thi Thanh Thuy; Ta Viet Cuong

doi:10.54939/1859-1043.j.mst.86.2023.110-119

Authors

Pham Thi Thanh Thuy, Faculty of Information Security, People's Security Academy
Ta Viet Cuong (Corresponding author), University of Engineering and Technology, Vietnam National University, Hanoi

DOI:

https://doi.org/10.54939/1859-1043.j.mst.86.2023.110-119

Keywords:

Malicious URL detection; GAN; Discriminator-based classification.

Abstract

Phishing attacks that use illegitimate URLs are among the most common information security challenges for both individuals and organizations. Recently, machine learning approaches have been widely applied to detect malicious URLs. Classifiers such as SVM, Random Forest, and LSTM are built on standard datasets to predict whether a URL sample is malicious or benign. Some recent studies focus on using GANs to enrich the malicious URL samples used to train deep learning classifiers. In this paper, we explore the ability to train a GAN architecture with two adversarial networks, a Discriminator and a Generator. URL samples produced by the Generator network are refined through the Discriminator network, whose feedback is returned to the Generator. This helps the Generator produce URL samples that increasingly resemble real ones. In the process, the Discriminator network also learns the features that distinguish benign from malicious URL samples. To evaluate the effectiveness of this training, experiments are conducted on test datasets that are entirely new relative to the training dataset. The experimental results are promising, with a classification accuracy of about 97% for both malicious and benign URLs.
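
The adversarial feedback and the per-class evaluation described in the abstract can be illustrated with a minimal sketch. It uses the standard GAN objective (with the common non-saturating generator loss), which is an assumption: the abstract does not state the exact loss used in the paper. The function names, the 0.5 threshold, and the convention that a Discriminator score near 1 means "malicious" are likewise hypothetical and chosen only for illustration.

```python
import math
from typing import Dict, Sequence

EPS = 1e-12  # guards against log(0)


def discriminator_loss(d_real: Sequence[float], d_fake: Sequence[float]) -> float:
    """Standard GAN discriminator loss: -mean(log D(x)) - mean(log(1 - D(G(z)))).

    d_real / d_fake are Discriminator outputs in [0, 1] for real and
    generated samples (assumed convention, not taken from the paper).
    """
    real_term = sum(-math.log(p + EPS) for p in d_real) / len(d_real)
    fake_term = sum(-math.log(1.0 - p + EPS) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake: Sequence[float]) -> float:
    """Non-saturating generator loss: -mean(log D(G(z))).

    The loss falls as the Discriminator scores generated samples closer
    to 1, which is the feedback signal passed back to the Generator.
    """
    return sum(-math.log(p + EPS) for p in d_fake) / len(d_fake)


def per_class_accuracy(
    scores: Sequence[float], labels: Sequence[int], threshold: float = 0.5
) -> Dict[int, float]:
    """Accuracy computed separately for benign (0) and malicious (1) URLs.

    Assumes each score is the Discriminator's estimate that a URL is
    malicious; scores at or above the threshold are predicted as 1.
    """
    correct = {0: 0, 1: 0}
    total = {0: 0, 1: 0}
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        total[label] += 1
        correct[label] += int(predicted == label)
    return {c: correct[c] / total[c] for c in total if total[c] > 0}


if __name__ == "__main__":
    # Toy scores only; these are not results from the paper.
    print(discriminator_loss([0.9, 0.8], [0.1, 0.3]))
    print(generator_loss([0.1, 0.3]))
    print(per_class_accuracy([0.9, 0.2, 0.7, 0.4], [1, 0, 0, 1]))
```

Reporting accuracy per class rather than as one pooled figure matches how the abstract states its result, and it keeps a large benign majority from hiding errors on malicious URLs.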

ISSN: 1859-1043
