Beefy Boxes and Bandwidth Generously Provided by pair Networks
more useful options
 
PerlMonks  

processing timelog

by sandy105 (Scribe)
on Aug 24, 2014 at 05:51 UTC ( [id://1098445]=perlquestion: print w/replies, xml ) Need Help??

sandy105 has asked for the wisdom of the Perl Monks concerning the following question:

i have a timestamp log file and i need to perform some performance benchmarking. however i am not able to find a way ,it's a programming problem ;really sorry for the long question

here is the log file

DATE time,unique ID ,Message

2014-05-29 10:22:21,880,165ab6a8-e736-11e3-8748-8d365226be24,TP Servic +e Request 2014-05-29 10:22:21,962,165ab6a8-e736-11e3-8748-8d365226be24,ProcessNa +me: TC 2014-05-29 10:22:21,965,165ab6a8-e736-11e3-8748-8d365226be24,P5 Starte +d 2014-05-29 10:22:21,965,165ab6a8-e736-11e3-8748-8d365226be24,ProcessNa +me: SL 2014-05-29 10:22:21,968,165ab6a8-e736-11e3-8748-8d365226be24,Process M +essage :PreProcess 2014-05-29 10:22:21,969,165ab6a8-e736-11e3-8748-8d365226be24,Input Mes +sage to P5 2014-05-29 10:22:22,159,165ab6a8-e736-11e3-8748-8d365226be24,ProcessNa +me: PP 2014-05-29 10:22:22,170,165ab6a8-e736-11e3-8748-8d365226be24,Process M +essage :ValidationAndCompliance 2014-05-29 10:22:22,174,165ab6a8-e736-11e3-8748-8d365226be24,Input Mes +sage to P5 2014-05-29 10:22:58,488,165ab6a8-e736-11e3-8748-8d365226be24,ProcessNa +me: VC 2014-05-29 10:22:58,493,165ab6a8-e736-11e3-8748-8d365226be24,Process M +essage :TranslateAndLoad 2014-05-29 10:22:58,493,165ab6a8-e736-11e3-8748-8d365226be24,Input Mes +sage to P5 2014-05-29 10:23:08,301,165ab6a8-e736-11e3-8748-8d365226be24,ProcessNa +me: TL 2014-05-29 10:23:08,306,165ab6a8-e736-11e3-8748-8d365226be24,P5 Move r +equest 2014-05-29 07:12:15,966,1770ebca-e722-11e3-b793-c6903cc19f13,P5 Starte +d 2014-05-29 07:12:15,966,1770ebca-e722-11e3-b793-c6903cc19f13,ProcessNa +me: SL 2014-05-29 07:12:16,644,1770ebca-e722-11e3-b793-c6903cc19f13,Process M +essage :PreProcess 2014-05-29 07:12:16,644,1770ebca-e722-11e3-b793-c6903cc19f13,Input Mes +sage to P5 2014-05-29 07:12:16,923,1770ebca-e722-11e3-b793-c6903cc19f13,ProcessNa +me: PP 2014-05-29 07:12:17,730,1770ebca-e722-11e3-b793-c6903cc19f13,Process M +essage :TranslateAndLoad 2014-05-29 07:12:17,731,1770ebca-e722-11e3-b793-c6903cc19f13,Input Mes +sage to P5 2014-05-29 07:14:03,187,1770ebca-e722-11e3-b793-c6903cc19f13,ProcessNa +me: TL 2014-05-29 07:14:04,048,1770ebca-e722-11e3-b793-c6903cc19f13,P5 Move r +equest

here's is what i want to do ,i need to find

1. Total time : btw P5 move request and TP service request; however sometimes some messages might not be present like u can see for second unique id above(it does'nt have TP service request)

2. Process time :for each process like TC,SL,PP,VC,TL ; i need to find time till i encounter something like P5 Started || input message to p5 || p5 move request. eg like for the first id 165ab6a8-e736-11e3-8748-8d365226be24

TC -> p5 started+ SL -> input+PP -> input + VC -> input+ TAL -> p5 move request

i've written some code ;but i cant seem to find a way to solve the 2nd process time summation problem

use warnings; #read the file (PLEASE PROVIDE INPUT FILE PATH) open(hanr,"D:/Process_debug.txt")or die"error $!\n"; #digesting the lines @lines = <hanr>; open (hanw,">D:/process_time.txt") or die("error:cannot create $! \n") +; for $value2 (@lines) { my @lines2 = split(/,/,$value2); my $datetime = $lines2[0]; my $ms = $lines2[1]; my $inid = $lines2[2]; my $message = $lines2[3]; sub gettime($); if($message eq "TP Service Request"){ $date = substr $datetime,11; @timearray = split (/:/,$date); $hours = $timearray[0]; $minutes = $timearray[1]; $seconds = $timearray[2]; $hours = $hours*60*60*1000; $minutes = $minutes*60*1000; $seconds = $seconds*1000; $totaltimeinms = $hours+$minutes+$seconds+$ms; $time1 = $totaltimeinms; } if($message eq "P5 Move request" ){ $date = substr $datetime,11; @timearray = split (/:/,$date); $hours = $timearray[0]; $minutes = $timearray[1]; $seconds = $timearray[2]; $hours = $hours*60*60*1000; $minutes = $minutes*60*1000; $seconds = $seconds*1000; $totaltimeinms2 = $hours+$minutes+$seconds+$ms; $time2 = $totaltimeinms2; } print timefile "DATA : $time1 && $time2\n"; if ($time1 && $time2){ $finaltime = $time2-$time1; print timefile "$inid took : $finaltime \n"; last; } } close hanw; close hanr;

output file could be like unique id :total time = 100 ms and process time = 30 ms;

Replies are listed 'Best First'.
Re: processing timelog
by hippo (Bishop) on Aug 24, 2014 at 09:01 UTC

    Your code doesn't look too problematic in general. However you are preforming string equality tests like this:

    if($message eq "TP Service Request"){

    but $message is the last field in the line from the input file and isn't being chomped so your string test will likely fail unless you take into account the EOL character(s). If you can correct that (probably best to use chomp to do so) you may have more success.

      An additional potential problem is that the lines have a trailing space that should also probably be removed for the equality test to work.
Re: processing timelog
by Laurent_R (Canon) on Aug 24, 2014 at 09:01 UTC
    Hi sandy105,

    you might want to try this:

    use strict; use warnings; my ($tot1, $tot2); while (<DATA>) { chomp; s/\s+$//g; next unless /TP Service Request$/ or /P5 Move request$/; my ($datetime, $ms, $inid, $message) = split /,/, $_; my $date = substr $datetime,11; my ($hours, $min, $sec) = map { $_ * 1000} split /:/, $date; $hours *= 60 * 60; $min *= 60; my $total_time = $hours + $min + $sec + $ms; $tot1 = $total_time if $message eq "TP Service Request"; $tot2 = $total_time if $message eq "P5 Move request"; next unless defined $tot2; print "Duration is: ", $tot2 - $tot1, " ms.\n"; last; } __DATA__ 2014-05-29 10:22:21,880,165ab6a8-e736-11e3-8748-8d365226be24,TP Servic +e Request 2014-05-29 10:22:21,962,165ab6a8-e736-11e3-8748-8d365226be24,ProcessNa +me: TC ... (rest of data omitted from this post for brevity)
    Executing it gives the following:
    $ perl log_files.pl Duration is: 46426 ms.
    There are some issues left. For example, this does not work correctly if the date changes between the two dates/times. Please note that I have also tried to make the code more "DRY" (Don't Repeat Yourself).

      the total time is almost figured out ; i cant seem to think of a way to get process time without writing inefficient & repetitive code :(

        the total time is almost figured out ; i cant seem to think of a way to get process time without writing inefficient & repetitive code :(

        Congratulations, you are about to start programming, start naming subroutines, subroutines that take arguments and return values, and die on critical errors (like can't open file), subroutines with meaningful names that make sense in the context of your application, subroutines whose names are part of your application vocabulary/glossary

        perlintro#Writing subroutines and http://learn.perl.org/books/beginning-perl/ Chapter 8: Subroutines

        Consider using a database perhaps ?
        #!perl use strict; use warnings; use DBI; #create db my $dbfile = 'database.sqlite'; unlink($dbfile); my $dbh = DBI->connect('dbi:SQLite:dbname='.$dbfile , undef , undef , {RaiseError =>1, AutoCommit =>0}) or die $DBI::errstr; $dbh->do('CREATE TABLE DATA (F1 datetime,F2 INTEGER,F3,F4)'); $dbh->do('CREATE TABLE STARTED (F1 datetime,F2,F3)'); $dbh->commit; # load data my $sth = $dbh->prepare('INSERT INTO DATA VALUES (?,?,?,?)'); while (<DATA>){ chomp; my @f = split ','; if (@f==4){ $sth->execute(@f); } else { print "Input Line $. Skipped : $_\n"; } } $dbh->commit; # select started records $dbh->do("INSERT INTO STARTED SELECT F1,F2,F3 FROM DATA WHERE F4 LIKE '%started%'"); $dbh->commit; # calculate duration from start my $ar = $dbh->selectall_arrayref(" SELECT D.F1,D.F2,D.F3,D.F4, (1000*strftime('%s',D.F1))+D.F2 - (1000*strftime('%s',S.F1))-S.F2 FROM DATA AS D LEFT JOIN STARTED AS S ON D.F3 = S.F3"); for (@$ar){ printf "@$_\n"; } $dbh->disconnect; __DATA__
        poj
      thanks this did help me,and i learned a better approach .this is certainly very "dry" :)
Re: processing timelog
by CountZero (Bishop) on Aug 24, 2014 at 16:30 UTC
    I suggest you first parse your timelog and put the data into a data structure that you can then more easily traverse.
    use Modern::Perl qw/2014/; use DateTime; use DateTime::Format::Flexible; use DateTime::Duration; my %data; while (<DATA>) { chomp; my ( $datetime, $msecs, $id, $description ) = split /,/; $msecs *=1000000; $datetime .= ":$msecs"; my $dt = DateTime::Format::Flexible->parse_datetime($datetime); $data{$id}->{$description} = $dt; } for my $id ( keys %data ) { print "Results for $id: "; if ( exists $data{$id}{'P5 Move request'} and exists $data{$id}{'TP Service Request'} ) { my $duration = $data{$id}{'P5 Move request'} ->subtract_datetime_absolute( $data{$id}{'TP Service Request +'} ); say "$data{$id}{'P5 Move request'} -> $data{$id}{'TP Service Request'} = " +, $duration->seconds + $duration->nanoseconds/1000000000, ' se +conds'; } else { say 'Missing data'; } } __DATA__ 2014-05-29 10:22:21,880,165ab6a8-e736-11e3-8748-8d365226be24,TP Servic +e Request 2014-05-29 10:22:21,962,165ab6a8-e736-11e3-8748-8d365226be24,ProcessNa +me: TC 2014-05-29 10:22:21,965,165ab6a8-e736-11e3-8748-8d365226be24,P5 Starte +d (...) 2014-05-29 07:14:04,048,1770ebca-e722-11e3-b793-c6903cc19f13,P5 Move r +equest
    Output:
    Results for 1770ebca-e722-11e3-b793-c6903cc19f13: Missing data Results for 165ab6a8-e736-11e3-8748-8d365226be24: 2014-05-29T10:23:08 +-> 2014-05-29T10:22:21 = 46.426 seconds
    As you see, calculating the time between a "P5 Move request" and "TP Service Request" gets really easy: a few look-ups in the data structure is all that is needed. As the datetime is a DateTime-object, calculating the difference (i.e. duration) between two datetimes requires no complicated calculations.

    Two more comments:

    1. If for the same ID you can have multiple processes with the same name, this data-structure is too simple as a later process will overwrite the data of a previous one.
    2. Always use a DateTime (or similar) object to represent a datetime. Time and date calculations are fraught with exceptions and special rules. Not all hours are 60 minutes and not all minutes are 60 seconds. The DateTime module takes care of all of that.

    Update: Amended code to take into account the millisecond field in the timelog.

    CountZero

    A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

    My blog: Imperial Deltronics
Re: processing timelog
by sandy105 (Scribe) on Aug 26, 2014 at 10:38 UTC

    i wrote it all right ,took some time .thanks for your help guys .specially laurent_R ,will post the code shortly

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://1098445]
Approved by vinoth.ree
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others pondering the Monastery: (3)
As of 2024-04-25 07:02 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found