PerlMonks  

Re: Finding patterns in packet data?

by lhoward (Vicar)
on Aug 04, 2000 at 22:26 UTC ( [id://26247] : note )


in reply to Finding patterns in packet data?

For your original question, I'd try something like the following (it is pretty rough, but its heart is in the right place):
#!/usr/bin/perl -w
use strict;

my @packets = ('abcdef', '123456', 'abc123');

for my $size (2 .. 4) {
    print "size=$size\n";
    my %substrs;    # substring => number of packets that contain it
    foreach my $packet (@packets) {
        # Collect each distinct substring of this packet once.
        my %data;
        for (0 .. (length($packet) - $size)) {
            $data{ substr($packet, $_, $size) } = 1;
        }
        $substrs{$_}++ foreach keys %data;
    }
    # Print up to six of the most common substrings; truncating the list
    # avoids undef warnings when fewer than six exist.
    my @top = sort { $substrs{$b} <=> $substrs{$a} } keys %substrs;
    $#top = 5 if $#top > 5;
    print "$_ $substrs{$_}\n" foreach @top;
}
It isn't really efficient, but it will tell you which substrings of a particular length are most common across packets. Answering your actual question, "most common, longest substrings", is harder since you're trying to optimize 2 criteria at the same time. Which is better: a 5-character string that happens 20 times or a 20-character string that happens 5 times?
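One possible way to trade the two criteria off against each other is to fold them into a single score. The sketch below assumes a score of length times packet count, which is only one arbitrary choice among many; the names `substring_counts` and `rank_by_score` are made up for illustration. Substrings seen in only one packet are dropped, and ties are broken alphabetically.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# For every substring length in [$min, $max], count how many packets
# contain each substring (a packet counts at most once per substring).
sub substring_counts {
    my ($packets, $min, $max) = @_;
    my %count;
    for my $packet (@$packets) {
        my %seen;
        for my $size ($min .. $max) {
            for my $pos (0 .. length($packet) - $size) {
                $seen{ substr($packet, $pos, $size) } = 1;
            }
        }
        $count{$_}++ for keys %seen;
    }
    return \%count;
}

# Assumed scoring rule: length * packet count, highest first.
# Substrings that occur in only one packet are ignored.
sub rank_by_score {
    my ($count) = @_;
    return sort {
        length($b) * $count->{$b} <=> length($a) * $count->{$a}
            or $a cmp $b
    } grep { $count->{$_} > 1 } keys %$count;
}

my @packets = ('abcdef', '123456', 'abc123');
my $count   = substring_counts(\@packets, 2, 4);
for my $s (rank_by_score($count)) {
    printf "%-6s count=%d score=%d\n",
        $s, $count->{$s}, length($s) * $count->{$s};
}
```

A different weighting (for example, squaring the length) would favour long, rare strings instead; which is right depends on what you expect the protocol to look like.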

However, in thinking about your problem in general I'd do an analysis something like this:

  1. Do a statistical analysis of the raw data to determine if it is encrypted, and if it is encrypted well. If it is encrypted well, it will be statistically indistinguishable from random noise. If it is encrypted poorly, it will be somewhat distinguishable from random noise. I used to have a good reference to some algorithms for performing this kind of analysis but can't find them right now.
  2. Compare (by hand) the same transaction done several times from several different hosts. Can you pick anything out?
  3. Since you said these are UDP packets, can you "replay" them from a different host to cause the same event?
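As a rough starting point for step 1, one simple measure is the Shannon entropy of the byte distribution. This is a minimal sketch, not the reference mentioned above: the function name `byte_entropy` and both sample payloads are placeholders. Well-encrypted data should come out close to 8 bits per byte, while plain text or weakly obscured data usually scores noticeably lower. The estimate is only meaningful over a reasonably large sample, so concatenate many packet bodies rather than testing one short packet.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Shannon entropy of the byte distribution, in bits per byte (0 to 8).
sub byte_entropy {
    my ($data) = @_;
    my $n = length $data;
    return 0 unless $n;
    my %freq;
    $freq{$_}++ for unpack 'C*', $data;    # count each byte value
    my $h = 0;
    for my $c (values %freq) {
        my $p = $c / $n;
        $h -= $p * log($p) / log(2);
    }
    return $h;
}

# Placeholder payloads; in practice, use the captured packet bodies.
my $text   = 'login user=guest action=status ' x 40;
my $random = join '', map { chr(int(rand(256))) } 1 .. 4096;

printf "text-like payload:   %.2f bits/byte\n", byte_entropy($text);
printf "random-like payload: %.2f bits/byte\n", byte_entropy($random);
```

High entropy alone does not prove strong encryption (compressed data looks similar), but low entropy is a good sign that the data is not well encrypted.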

Replies are listed 'Best First'.
(Guildenstern) RE: Re: Finding patterns in packet data?
by Guildenstern (Deacon) on Aug 04, 2000 at 22:47 UTC
    Thanks for the start. I can always modify it for larger strings and see what happens. As far as your analysis steps:
    1. According to the vendor, it's encrypted. No word on how strong they believe it is. I performed actions that caused known data to be sent, so it should be fairly simple to tell how strong the encryption is. The problem here is that I have no clue how to go about it.
    2. The software is set up such that we can only use the two boxes it's installed on. I have run the same actions several different times, however, and there do appear to be similarities in the packet data.
    3. Did not think of that, but it sounds like an interesting test. Will investigate how to do it.
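
For the replay test in item 3, a minimal sketch using the core IO::Socket::INET module might look like the following. The function name `replay_packet` and the hex payload are placeholders; the real payload would be copied byte for byte from a capture, and the host and port are taken from the command line.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::INET;

# Send one captured payload, unchanged, to the target over UDP.
# Returns the number of bytes sent.
sub replay_packet {
    my ($host, $port, $payload) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port,
        Proto    => 'udp',
    ) or die "cannot create UDP socket: $@";
    my $sent = $sock->send($payload);
    defined $sent or die "send failed: $!";
    close $sock;
    return $sent;
}

# Placeholder payload: replace the hex string with bytes from a capture.
my $payload = pack 'H*', '0102030405060708';

# Usage: replay.pl HOST PORT
if (@ARGV == 2) {
    my ($host, $port) = @ARGV;
    printf "sent %d bytes to %s:%s\n",
        replay_packet($host, $port, $payload), $host, $port;
}
```

If the receiver reacts to the replayed packet the same way as to the original, the payload probably carries no per-session or per-host state, which says something about how the encryption is keyed.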