http://qs321.pair.com?node_id=285385


in reply to Re: Re: File Comparison
in thread File Comparison

Sounds like an interesting problem, but I still can't quite picture the data and the result you want. These questions might clear things up for me:

Are the lines guaranteed to be unique?

Does the order of the lines matter as it does in diff?

If I see a line "XYZ" in file 1 and "XYZ" in file 2, and "XYZ" in file 3, are these the same line no matter where they show up in the respective files?

How big are the files? Would it be feasible to load them all into memory at the same time?

Is it ok to sort the files before doing the comparison or does your output need to be in a specific order?

Pretend letters are lines. What should be the output if the following are the contents of the three files?

file 1: A B C D E G
file 2: B A D E G H
file 3: A B D E G I
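To make the question concrete, here is a minimal sketch of one possible reading, assuming lines are compared as an unordered set per file (so position and duplicates are ignored). The function name `presence_by_line` and the file labels are made up for illustration. If order matters, as it does in diff, this is not the right approach.

```python
from collections import defaultdict

def presence_by_line(files):
    """Map each distinct line to the sorted list of file names containing it."""
    seen = defaultdict(set)
    for name, lines in files.items():
        for line in lines:
            seen[line].add(name)
    return {line: sorted(names) for line, names in seen.items()}

# The three sample files from above, one letter per "line".
files = {
    "file1": list("ABCDEG"),
    "file2": list("BADEGH"),
    "file3": list("ABDEGI"),
}

report = presence_by_line(files)

# Report only the lines that are missing from at least one file.
for line in sorted(report):
    if len(report[line]) < len(files):
        print(line, "only in", ", ".join(report[line]))
```

Under that assumption, A, B, D, E and G count as common to all three files, and only C, H and I are reported.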

-- Eric Hammond

Re: Re: Re: Re: File Comparison
by sunadmn (Curate) on Aug 21, 2003 at 17:44 UTC
    OK, I will give you a sample of the files. They are logs from the output of the namedxfer daemon within BIND 9.2.2. What I am seeing is a large lack of transfers to a single server in Atlanta, GA, and I cannot get my network Nazis here at work to do any more digging, as they say their switch is fine. What I want to do is take the log and split it into three files, one for each "Master Server" (ns1, ns2, and ns3.mycompany.com). I then want to compare the three files to find out when and where I have degradation on my network, so I can go back to the NetEng group with hard evidence that there is a network issue.
    Here is a small excerpt from a parsed file for a single server.
    Aug 06 15:00:36.747 xfer-out: info: client 68.168.192.17#50840: transfer of '112.23.67.in-addr.arpa/IN': AXFR started
    Aug 06 16:00:36.326 xfer-out: info: client 68.168.192.17#50963: transfer of '129.23.67.in-addr.arpa/IN': AXFR started
    Aug 06 16:00:36.829 xfer-out: info: client 68.168.192.17#50964: transfer of '131.23.67.in-addr.arpa/IN': AXFR started
    Aug 06 16:00:36.840 xfer-out: info: client 68.168.192.17#50965: transfer of '130.23.67.in-addr.arpa/IN': AXFR started
    Aug 06 16:00:37.327 xfer-out: info: client 68.168.192.17#50966: transfer of '128.23.67.in-addr.arpa/IN': AXFR started
    Aug 06 16:06:09.468 xfer-out: info: client 68.168.192.17#50978: transfer of '78.168.68.in-addr.arpa/IN': AXFR-style IXFR started
    Aug 06 16:12:06.719 xfer-out: info: client 68.168.192.17#50989: transfer of 'colememorial.com/IN': AXFR-style IXFR started
    Aug 06 16:15:44.581 xfer-out: info: client 68.168.192.17#50999: transfer of 'charlescolehospital.com/IN': AXFR-style IXFR started
    Aug 06 16:20:25.301 xfer-out: info: client 68.168.192.17#51010: transfer of 'coudersporthospital.com/IN': AXFR-style IXFR started
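Lines in this format can be broken into fields with a regular expression and then counted per hour, which gives numbers that are easier to compare between servers than raw log lines. The following is only a sketch: the pattern is inferred from the excerpt above rather than from any BIND specification, and the names `LINE_RE`, `parse_line` and `count_by_hour` are hypothetical.

```python
import re
from collections import Counter

# Pattern inferred from the sample lines above; other message types will not match.
LINE_RE = re.compile(
    r"^(?P<month>\w{3}) (?P<day>\d{2}) "
    r"(?P<hour>\d{2}):(?P<minute>\d{2}):(?P<second>\d{2}\.\d{3}) "
    r"xfer-out: info: client (?P<client>[\d.]+)#(?P<port>\d+): "
    r"transfer of '(?P<zone>[^']+)/IN': (?P<kind>.+) started$"
)

def parse_line(line):
    """Return a dict of fields for a 'transfer started' line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None

def count_by_hour(lines):
    """Count started transfers per (month, day, hour, kind)."""
    counts = Counter()
    for line in lines:
        fields = parse_line(line)
        if fields:
            key = (fields["month"], fields["day"], fields["hour"], fields["kind"])
            counts[key] += 1
    return counts

# Three lines copied from the excerpt above.
sample = [
    "Aug 06 15:00:36.747 xfer-out: info: client 68.168.192.17#50840: transfer of '112.23.67.in-addr.arpa/IN': AXFR started",
    "Aug 06 16:00:36.326 xfer-out: info: client 68.168.192.17#50963: transfer of '129.23.67.in-addr.arpa/IN': AXFR started",
    "Aug 06 16:06:09.468 xfer-out: info: client 68.168.192.17#50978: transfer of '78.168.68.in-addr.arpa/IN': AXFR-style IXFR started",
]

counts = count_by_hour(sample)
for key in sorted(counts):
    print(key, counts[key])
```

Running the same count over each server's file and comparing the results hour by hour would show where one server starts fewer transfers than the others, though not why.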
      Given that data set, I do not see how you are going to make a case for a network issue. The most I could see that data set implying is that there may be a difference in the number of IXFR/AXFR transfers started on two different servers; where that difference comes from is not stated by the data. Could one of the servers be overloaded and not accepting or initiating transfers? Could named be compiled differently on one of the servers, or the config file be different? Could the kernel on the server be a different revision or patch level, or built with different compile options? It may be a better option to look at network data instead of application data to pinpoint network issues, IMHO.

      -Waswas
        I have had my Sun on-site guys look at the box and they can find nothing wrong, and the servers are the same from head to toe, including all applications. They were all installed from a Flash archive (JumpStart), and the BIND install is a package I built and installed on all servers. I have some network data (MRTG reports from the switch), but it is not really showing me much, only that I have some spikes during the day where a large amount of traffic is forced. With that said, I think I will have to prove to them that there is an issue, but they always push the blame onto my shoulders. Real pain if you ask me, but it's a job LOL.