PerlMonks  

Matching multiple digits

by skoney (Novice)
on Feb 03, 2008 at 13:44 UTC

skoney has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to convert the timestamps in a .csv file to conventional dates to be used in a spreadsheet. Some of the timestamps are 9 digits and some are 10. Towards the end of the file there are multiple occurrences of the timestamps but it's not picking all of them up. Here is my code:
    #!/usr/bin/perl -w
    use strict;
    my $source = "libsource.txt";
    my $dest = "library2.txt";
    open(SOURCE, "$source")||die;
    open(DEST, ">$dest")||die;
    my @contents=<SOURCE>;
    my $day;
    foreach my $stamp (@contents) {
        $stamp=~/\d{9,10}/g;
        my ($sec, $min, $hour, $mday, $mon, $year_off)=localtime($&);
        $mon=$mon+1;
        if ($mday<10) {
            $day="0".$mday;
        } else {
            ($day=$mday);
        }
        if ($mon<10) {
            $mon="0".$mon;
        }
        my $year=$year_off+1900;
        my @filedate=($mon, $day, $year);
        my $conversion=join('/',@filedate);
        $stamp =~ s|$&|$conversion|;
    }
    print DEST @contents;
    close (SOURCE);
    close (DEST);
Here is a piece of the original .csv source file:
    8897,17,0,99,8888,,0,0,1005,36405,ABCD,0,,1194116569,1000824,ABCD,Live Presentation CAM2,Jenifer Smith ,60:00,Ch1 & 2,0,,0,0,0,3,0,,0,Reel #3,NDF,TOD,,,,,,,,,,0,,0,0,0.000,0.000,,,,,0,0,0,1197399969,1197399769,0,0,0,0,SUPERVISOR,SUPERVISOR,0,,,,,,,,,,,,0,,,,,,,,,,,0,0,0.000,0.000,,,,,,,,
    8898,17,0,99,8889,,0,0,1014,36406,ABCD,0,,1173207769,1000824,ABCD,Segments,Marcus,60:00,Ch1 & 2,0,,0,0,0,14,0,,0,Reel #1,DF,01:00:00:00,,,,,,,,,,0,,0,0,0.000,0.000,,,,,0,0,0,1197400130,1197399769,0,0,0,0,SUPERVISOR,SUPERVISOR,0,,,,,,,,,,,,0,,,,,,,,,,,0,0,0.000,0.000,,,,,,,,
    8899,17,0,99,8890,,0,0,1023,36407,ABCD,0,,1193806800,1000665,,Buyout,,19:00,Ch1 & 2,0,,0,0,0,2,0,,0,Reel 1,NDF,1:00:00:00,,,,,,,,,,0,,910138,0,0.000,0.000,,,,,0,0,0,1197400324,0,0,0,0,0,,SUPERVISOR,0,,,,,,,,,,,,0,,,,,,,,,,,0,0,0.000,0.000,,,,,,,,
    8900,17,0,99,8891,,0,0,1005,36408,ABCD,0,,1172413184,1000797,ppp,Skiing,Mountain,01:00:00,Ch1 & 2,0,4070,0,0,0,25,0,,0,Reel #4,DF,00:00:00,,,,,,,,,,0,,910138,0,0.000,0.000,,,,,0,0,0,1197897083,1169216384,0,0,0,0,SUPERVISOR,SUPERVISOR,0,,,,,,,,,,,,0,,,,,,,,,,,0,0,0.000,0.000,,,,,,,,
    8901,17,0,99,8892,,0,0,1005,36409,ABCD,0,,1198248398,1000707,ttt,National Meeting,ppp,1:00:00,Ch1 & 2,0,4071,0,0,0,25,0,,0,Reel #1,NDF,1:00:00,,,,,,,,,,0,,1000705,0,0.000,0.000,,,,,0,0,0,1198502505,1093009598,0,0,0,0,SUPERVISOR,SUPERVISOR,0,,,,,,,,,,,,0,,,,,,,,,,,0,0,0.000,0.000,,,,,,,,
    8902,17,0,99,8893,,0,0,1005,36410,ABCD,0,,1198248398,1000707,,National Meeting,ppp,1:00:00,Ch1 & 2,0,4072,0,0,0,25,0,,0,Reel #2,NDF,2:00:00,,,,,,,,,,0,,1000705,0,0.000,0.000,,,,,0,0,0,1198502514,1093009598,0,0,0,0,SUPERVISOR,SUPERVISOR,0,,,,,,,,,,,,0,,,,,,,,,,,0,0,0.000,0.000,,,,,,,,
And here are the results:
    12/31/1969
    8897,17,0,99,8888,,0,0,1005,36405,ABCD,0,,11/03/2007,1000824,ABCD,Live Presentation CAM2,Jenifer Smith ,60:00,Ch1 & 2,0,,0,0,0,3,0,,0,Reel #3,NDF,TOD,,,,,,,,,,0,,0,0,0.000,0.000,,,,,0,0,0,1197399969,1197399769,0,0,0,0,SUPERVISOR,SUPERVISOR,0,,,,,,,,,,,,0,,,,,,,,,,,0,0,0.000,0.000,,,,,,,,
    8898,17,0,99,8889,,0,0,1014,36406,ABCD,0,,03/06/2007,1000824,ABCD,Segments,Marcus,60:00,Ch1 & 2,0,,0,0,0,14,0,,0,Reel #1,DF,01:00:00:00,,,,,,,,,,0,,0,0,0.000,0.000,,,,,0,0,0,1197400130,1197399769,0,0,0,0,SUPERVISOR,SUPERVISOR,0,,,,,,,,,,,,0,,,,,,,,,,,0,0,0.000,0.000,,,,,,,,
    8899,17,0,99,8890,,0,0,1023,36407,ABCD,0,,10/31/2007,1000665,,Buyout,,19:00,Ch1 & 2,0,,0,0,0,2,0,,0,Reel 1,NDF,1:00:00:00,,,,,,,,,,0,,910138,0,0.000,0.000,,,,,0,0,0,1197400324,0,0,0,0,0,,SUPERVISOR,0,,,,,,,,,,,,0,,,,,,,,,,,0,0,0.000,0.000,,,,,,,,
    8900,17,0,99,8891,,0,0,1005,36408,ABCD,0,,02/25/2007,1000797,ppp,Skiing,Mountain,01:00:00,Ch1 & 2,0,4070,0,0,0,25,0,,0,Reel #4,DF,00:00:00,,,,,,,,,,0,,910138,0,0.000,0.000,,,,,0,0,0,1197897083,1169216384,0,0,0,0,SUPERVISOR,SUPERVISOR,0,,,,,,,,,,,,0,,,,,,,,,,,0,0,0.000,0.000,,,,,,,,
    8901,17,0,99,8892,,0,0,1005,36409,ABCD,0,,12/21/2007,1000707,ttt,National Meeting,ppp,1:00:00,Ch1 & 2,0,4071,0,0,0,25,0,,0,Reel #1,NDF,1:00:00,,,,,,,,,,0,,1000705,0,0.000,0.000,,,,,0,0,0,1198502505,1093009598,0,0,0,0,SUPERVISOR,SUPERVISOR,0,,,,,,,,,,,,0,,,,,,,,,,,0,0,0.000,0.000,,,,,,,,
    8902,17,0,99,8893,,0,0,1005,36410,ABCD,0,,12/21/2007,1000707,,National Meeting,ppp,1:00:00,Ch1 & 2,0,4072,0,0,0,25,0,,0,Reel #2,NDF,2:00:00,,,,,,,,,,0,,1000705,0,0.000,0.000,,,,,0,0,0,1198502514,1093009598,0,0,0,0,SUPERVISOR,SUPERVISOR,0,,,,,,,,,,,,0,,,,,,,,,,,0,0,0.000,0.000,,,,,,,,
What am I doing wrong?

In my Learning Perl book there is a footnote about having to use anchors to match a specific number of digits (i.e., that the regexp /\d{3}/ will also match inside groups of digits that are 4, 5, or more digits long), but I can't for the life of me see how any of the anchors would enable that. None of my other books mention this quirk. Am I missing something? Is there another anchor I don't know about?

Also, notice at the beginning of the results it's putting in a date of 12/31/1969, even though there's no timestamp there. What's causing this and how do I prevent it? Does it have anything to do with these warnings I'm getting?

    Use of uninitialized value in localtime at DataConvert.pl line 22, <SOURCE> line 7.
    Use of uninitialized value in regexp compilation at DataConvert.pl line 41, <Source> line 7.
One more question: is it OK to use $& the way I'm using it, or should I be assigning it to a conventional variable?

Any help would be greatly appreciated! Thanks!

Replies are listed 'Best First'.
Re: Matching multiple digits
by jwkrahn (Abbot) on Feb 03, 2008 at 14:42 UTC

    You want something like this:

    #!/usr/bin/perl
    use warnings;
    use strict;
    use POSIX qw/ strftime /;

    my $source = 'libsource.txt';
    my $dest = 'library2.txt';

    open SOURCE, '<', $source or die "Cannot open '$source' $!";
    open DEST, '>', $dest or die "Cannot open '$dest' $!";

    while ( <SOURCE> ) {
        s{\b(\d{9,10})\b}{ strftime '%m/%d/%Y', localtime $1 }eg;
        print DEST;
    }

    close SOURCE;
    close DEST;
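    On the anchor question: the \b in the substitution above is a word-boundary assertion, and it is what stops \d{9,10} from matching inside a longer run of digits. Here is a minimal sketch of the difference, using three-digit patterns and made-up sample strings (the classify helper is hypothetical, only for illustration). The lookaround form is an alternative for data where digits can sit directly next to letters.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: report which of three patterns match a string.
sub classify {
    my ($s) = @_;
    return (
        # Unanchored: succeeds whenever three digits appear in a row anywhere.
        loose   => ($s =~ /\d{3}/              ? 1 : 0),
        # Word boundaries: the run may not touch another word character.
        bounded => ($s =~ /\b\d{3}\b/          ? 1 : 0),
        # Lookarounds: the run may not touch another digit (letters are fine).
        exact   => ($s =~ /(?<!\d)\d{3}(?!\d)/ ? 1 : 0),
    );
}

# Made-up sample strings, not taken from the .csv file.
for my $s ('123', '12345', 'id 123,', 'abc123def') {
    my %r = classify($s);
    printf "%-12s loose=%d bounded=%d exact=%d\n",
        "'$s'", @r{qw(loose bounded exact)};
}
```

    In a comma-separated file every field is delimited by a non-word character, so \b is enough there.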
Re: Matching multiple digits
by pc88mxer (Vicar) on Feb 03, 2008 at 18:26 UTC
    Just to elaborate on what you're doing wrong, when you use the /g modifier on regular expression matches as in:

    $stamp=~/\d{9,10}/g;

    you need to use a while loop to iterate through all the matches like this:

    while ($stamp =~ /\d{9,10}/g) { ... }

    Besides executing your code for every match, this will also make your code work correctly when the input line contains no timestamps. Your original code uses $& in localtime and in the substitution even when the match failed and $& is undefined, which is why you are getting the uninitialized value warnings and the spurious '12/31/1969' output.
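    A rough sketch of that loop, assuming you capture the digits in $1 instead of reading $&. The dates_in_line helper and the shortened sample line are made up for illustration, and gmtime stands in for localtime so the output does not depend on the machine's time zone.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: return every 9- or 10-digit field in a line as an
# MM/DD/YYYY string. gmtime is an assumption made here for repeatable
# output; the original script used localtime.
sub dates_in_line {
    my ($line) = @_;
    my @dates;
    # In scalar context /g resumes after the previous match, so the loop
    # body runs once per timestamp and not at all when nothing matches.
    while ($line =~ /\b(\d{9,10})\b/g) {
        push @dates, strftime('%m/%d/%Y', gmtime($1));
    }
    return @dates;
}

# Shortened, made-up line reusing timestamps from the sample data.
my $line = "8897,17,ABCD,1194116569,1000824,0,1197399969,1197399769,0\n";
print join(', ', dates_in_line($line)), "\n";

# A line without timestamps never enters the loop, so nothing undefined
# is ever passed to gmtime.
my @none = dates_in_line("no timestamps here\n");
print scalar(@none), " matches\n";
```

    The seven-digit field 1000824 is skipped because \d{9,10} needs at least nine digits between the word boundaries.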

      jwkrahn:

      *Slaps hand to forehead* I should have known there'd be a module to handle this! Although, having looked at the module documentation, I have to admit I never would have come up with your solution. None of my books have anything about using the substitution regexp with two sets of braces. Thanks for the help - it ran perfectly!

      pc88mxer:

      Thanks for the explanation! Now it all makes sense. At one point I tried a while (<SOURCE>) loop but it wasn't working, so I abandoned that and went with what (little) I know. I'm going to go back and play with this some more so I understand the concept.

      Thank you both for helping me add another tool to my toolkit!

        None of my books have anything about using the substitution regexp with two sets of braces.

        Using braces isn't what made the solution possible. Using braces is just a stylistic choice. Using the "e" modifier after the replacement is what made the solution possible.

        s/this/that/
        is equivalent to any of these:
        s{this}{that}
        s(this)(that)
        s#this#that#

        For more on the "e" modifier, see the s/// entry in perlop.
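        To make the distinction concrete, here is a small sketch with a made-up input string. The first substitution has no "e", so the replacement is just an interpolated string. The other two use "e" and differ only in their delimiters.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up input; any string containing numbers would do.
my $text = 'a=2, b=10';

# Without /e the replacement is a plain interpolated string,
# so "*2" is inserted as literal text after the number.
(my $plain = $text) =~ s/(\d+)/$1*2/g;

# With /e the replacement is evaluated as Perl code and its
# result is inserted.
(my $slashes = $text) =~ s/(\d+)/$1*2/eg;

# The same substitution with brace delimiters; only the punctuation differs.
(my $braces = $text) =~ s{(\d+)}{ $1 * 2 }eg;

print "$plain\n$slashes\n$braces\n";
```

        The last two results are identical, which shows that the braces are only a matter of style.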

        If you want to learn what is possible in Perl, start reading the perldoc. Honestly it sounds silly, but take any kind of reference works to the toilet with you (like perldoc printouts), and read them whenever you are NOT trying to solve a problem. You'll gain familiarity with many features, and you'll be able to go back for more details when you need them at your desk.

        --
        [ e d @ h a l l e y . c c ]
