ORIGINAL RESEARCH article

Front. Artif. Intell.

Sec. Machine Learning and Artificial Intelligence

Volume 8 - 2025 | doi: 10.3389/frai.2025.1593944

This article is part of the Research Topic: Applications of Graph Neural Networks (GNNs)

PACKETCLIP: Multi-Modal Embedding of Network Traffic and Language for Cybersecurity Reasoning

Provisionally accepted
  • 1 University of California, Irvine, Irvine, United States
  • 2 United States Military Academy West Point, West Point, New York, United States

The final, formatted version of the article will be published soon.

Traffic classification is vital for cybersecurity, yet encrypted traffic poses significant challenges. We introduce PACKETCLIP, a multi-modal framework that combines packet data with natural language semantics through contrastive pretraining and hierarchical Graph Neural Network (GNN) reasoning. PACKETCLIP integrates semantic reasoning with efficient classification, enabling robust detection of anomalies in encrypted network flows. By aligning textual descriptions with packet behaviors, PACKETCLIP offers enhanced interpretability, scalability, and practical applicability across diverse security scenarios. With a 95% mean AUC, an 11.6% improvement over baselines, and a 92% reduction in intrusion detection training parameters, it is well suited for real-time anomaly detection. By bridging advanced machine-learning techniques and practical cybersecurity needs, PACKETCLIP provides a foundation for scalable, efficient, and interpretable solutions to encrypted traffic classification and network intrusion detection in resource-constrained environments.
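
To make "aligning textual descriptions with packet behaviors" concrete, the following is a minimal sketch of a generic CLIP-style symmetric contrastive loss. It is an illustration only: the abstract does not specify PACKETCLIP's actual loss, encoders, or hyperparameters, so the function name `clip_style_loss`, the temperature of 0.07, and the toy two-dimensional embeddings are all assumptions rather than the authors' implementation.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax over a list of scores."""
    m = max(row)
    log_sum = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_sum for x in row]


def clip_style_loss(packet_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired embeddings.

    Row i of packet_embs is assumed to be paired with row i of text_embs.
    The temperature value is an illustrative assumption.
    """
    p = [l2_normalize(v) for v in packet_embs]
    t = [l2_normalize(v) for v in text_embs]
    n = len(p)
    # Cosine similarity between every packet and every text, scaled by temperature.
    logits = [
        [sum(a * b for a, b in zip(p[i], t[j])) / temperature for j in range(n)]
        for i in range(n)
    ]
    # Packet-to-text direction: each packet should pick out its own text.
    loss_p2t = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Text-to-packet direction: the same computation over the columns.
    columns = [[logits[i][j] for i in range(n)] for j in range(n)]
    loss_t2p = -sum(log_softmax(columns[j])[j] for j in range(n)) / n
    return 0.5 * (loss_p2t + loss_t2p)


# Toy embeddings (hypothetical values, not data from the article).
packets = [[1.0, 0.0], [0.0, 1.0]]
matched_texts = [[0.9, 0.1], [0.1, 0.9]]
mismatched_texts = [[0.1, 0.9], [0.9, 0.1]]

loss_matched = clip_style_loss(packets, matched_texts)
loss_mismatched = clip_style_loss(packets, mismatched_texts)
```

In this toy setup the loss is lower when each packet embedding lies close to its paired text embedding than when the pairs are swapped, which is the behavior contrastive pretraining relies on to pull matching packet and text representations together.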

Keywords: contrastive pretraining, graph neural network, machine learning, multimodal, reasoning

Received: 14 Mar 2025; Accepted: 07 Jul 2025.

Copyright: © 2025 Masukawa, Yun, Jeong, Huang, Ni, Bryant, Bastian and Imani. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Ryozo Masukawa, University of California, Irvine, Irvine, United States

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.