the spatula

I had a brief chat on IRC recently about the current state and future of Stable Diffusion, and it got us thinking about whether models like this could be used to analyze network traffic. Convolutional neural networks (CNNs), which are normally used for images, can also be used to classify sound. All you need is a representation of the sound as an image, such as a spectrogram.
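To make the sound-to-image step concrete, here is a minimal sketch of turning a signal into a grayscale spectrogram using only NumPy. The function names, the frame length, the hop size, and the test tone are all my own assumptions for illustration; a real pipeline would more likely use a dedicated audio library and mel scaling.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram, shape (freq_bins, frames).

    frame_len and hop are illustrative defaults, not values from any paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One FFT per frame; keep only the magnitude of each frequency bin.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Compress the dynamic range and put frequency on the vertical axis.
    return np.log1p(magnitude).T

def to_grayscale_image(spec):
    """Scale a spectrogram to 0..255 so it can be treated as an 8-bit image."""
    span = spec.max() - spec.min()
    scaled = (spec - spec.min()) / span if span > 0 else np.zeros_like(spec)
    return (scaled * 255).astype(np.uint8)

# Hypothetical input: one second of a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)
image = to_grayscale_image(spectrogram(tone))
```

The resulting `image` is a plain 2D `uint8` array, which is the kind of input an image classifier expects once it is resized or repeated across colour channels.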

Image source

You could label these images with what they represent and feed them to the network as training data. This is heavily simplified, but it is essentially what was done in [this paper](https://arxiv.org/pdf/2007.11154.pdf), where the authors converted sound samples to spectrograms and then fed them into a fine-tuned ImageNet model. One could do something similar for network traffic patterns, as was done in this research (I can't find the full text). I also came across this, where Chinese researchers from Hefei and Beijing used CNNs to classify both malware and end-to-end encrypted traffic. I also found an associated repo, which I took a quick look through. It uses this dataset, which is composed of packets in pcap format, the kind you can capture with Wireshark.

Image source

You can take these packets and map them byte by byte to grayscale pixels, padding out any missing data to keep the image size consistent, but each image will only capture a single static moment of the traffic. Using recurrent neural networks (RNNs) would be much better, as this paper suggests, since they are built to handle sequential information.
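As a rough sketch of the byte-to-pixel mapping described above: each byte becomes one grayscale pixel, long payloads are truncated, and short ones are zero-padded. The 28x28 image size, the function names, and the dummy packets are my own assumptions, not taken from the linked repo, and reading real pcap files would need a separate parsing step that is left out here.

```python
import numpy as np

def bytes_to_image(payload: bytes, side: int = 28) -> np.ndarray:
    """Map raw packet bytes to a side x side grayscale image.

    side=28 is an assumed size chosen for illustration.
    """
    size = side * side
    # Truncate anything beyond the image capacity.
    buf = np.frombuffer(payload[:size], dtype=np.uint8)
    # Zero-pad short payloads so every image has the same shape.
    padded = np.zeros(size, dtype=np.uint8)
    padded[: len(buf)] = buf
    return padded.reshape(side, side)

def flow_to_sequence(packets, side: int = 28) -> np.ndarray:
    """Stack per-packet images into shape (n_packets, side, side).

    Keeping the packet order is what a sequential model such as an RNN
    would consume, instead of a single static image.
    """
    return np.stack([bytes_to_image(p, side) for p in packets])

# Hypothetical stand-ins for captured packets.
packet = bytes(range(64))
image = bytes_to_image(packet)
sequence = flow_to_sequence([packet, b"\xff" * 1000, b""])
```

A single `image` is what a CNN would see, while `sequence` keeps the packets in order along the first axis for a model that handles time steps.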

This is something I'd like to look into more when I have the time. If you're interested in more, I'd suggest reading that last paper in full, as it seems to be the most recent and up-to-date overview of the subject.