PerlMonks  

Perl split on regex match

by Eshan_k (Acolyte)
on Jan 03, 2017 at 08:15 UTC ( [id://1178827]=perlquestion )

Eshan_k has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks, I am a beginner in Perl and have a fairly simple problem. My file is structured as shown below.

    cls1,37,Media.vdenc.abcunit,media_vd
    cls2,7,Media.Wigig.plsunit,media_vd
    cls3,27,Media.vdenc,media_vd
    cls4,47,Media.hevc,media_vd
    cls5,57,Media.ENC,media_vd

I am splitting each line on "," and storing the 3rd element in an array (for example: Media.vdenc.abcunit). I want to print each line in which the third dot-separated part of that element is empty (for example: Media.vdenc). My idea is to split each string in the array (cluster) on "." and, if the third part is empty, print that line. Can anyone help me with this? So far I have tried:

    #!usr/bin/perl
    my $file = $ARGV[0] or die "Please provide csv file \n";
    open(my $data , '<', $file ) or die "Cannot open csv file $!\n";
    while (my $line = <$data>){
        chomp $line;
        my @col = split ",", $line;
        push (my @cluster, $col[2]);
        foreach my $field (@cluster){
            my @cls = split ".", $field;
            print @cls;
        }
    }
Output should be:

    cls3,27,Media.vdenc,media_vd
    cls4,47,Media.hevc,media_vd
    cls5,57,Media.ENC,media_vd

Replies are listed 'Best First'.
Re: Perl split on regex match
by Athanasius (Archbishop) on Jan 03, 2017 at 08:39 UTC

    Hello Eshan_k,

    Building on Corion’s answer:

        use strict;
        use warnings;

        while (my $line = <DATA>) {
            chomp $line;
            my @cols  = split /,/,  $line;
            my @third = split /\./, $cols[2];
            print join(',', @cols), "\n" if @third < 3;
        }

        __DATA__
        cls1,37,Media.vdenc.abcunit,media_vd
        cls2,7,Media.Wigig.plsunit,media_vd
        cls3,27,Media.vdenc,media_vd
        cls4,47,Media.hevc,media_vd
        cls5,57,Media.ENC,media_vd

    Output:

        18:37 >perl 1736_SoPW.pl
        cls3,27,Media.vdenc,media_vd
        cls4,47,Media.hevc,media_vd
        cls5,57,Media.ENC,media_vd
        18:37 >

    Hope that helps,

    Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,
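The same test can be applied to a file named on the command line, as in the original question, rather than to the DATA section. The following is only a sketch under that assumption; the helper name `lines_with_short_third_column` is made up for illustration and is not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the lines whose third comma-separated
# column has fewer than three dot-separated parts.
sub lines_with_short_third_column {
    my ($fh) = @_;
    my @keep;
    while (my $line = <$fh>) {
        chomp $line;
        my @cols = split /,/, $line;
        next unless defined $cols[2];    # skip blank or too-short lines
        my @third = split /\./, $cols[2];
        push @keep, $line if @third < 3;
    }
    return @keep;
}

# Usage sketch: take the file name from the command line, as in the question.
if (@ARGV) {
    my $file = $ARGV[0];
    open(my $data, '<', $file) or die "Cannot open '$file': $!\n";
    print "$_\n" for lines_with_short_third_column($data);
    close $data;
}
```

Passing a filehandle instead of a file name keeps the filter independent of where the lines come from, so it works equally with a real file, DATA, or an in-memory handle.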

      Hi Athanasius

      Why do you print join ',', @cols when you already have $line in scope, with the complete line in it?

      I would be tempted to drop the chomp too, as we are never looking that far down the line. Then we have this code...

          use strict;
          use warnings;

          while (my $line = <DATA>) {
              my @cols  = split /,/,  $line;
              my @third = split /\./, $cols[2];
              print $line if @third < 3;
          }

      Cheers,
      R.

      Pereant, qui ante nos nostra dixerunt!
        while (my $line = <DATA>) {
            print $line if (split /\./,$line) < 3;
        }
        But God demonstrates His own love toward us, in that while we were yet sinners, Christ died for us. Romans 5:8 (NASB)

      Hi Athanasius

      Maybe nitpicking, but the OP said

      > I want to print a line in which third field in the element is empty

      and not "less than 3 elements" (though he might have meant it)

      And it may be possible to split "a.b.c..e" where the 3rd element is empty.

      If this is a legal edge case I'd rather prefer unless length $cols[2]; over if @third < 3;

      (N.B. length undef and length "" are both false)

      Cheers Rolf
      (addicted to the Perl Programming Language and ☆☆☆☆ :)
      Je suis Charlie!
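The difference between "fewer than three parts" and "third part is empty" can be seen in a small sketch. It assumes the length test is applied to the third dot-separated part (`$third[2]` in the code earlier in this thread); the two helper names and the sample string `a.b..d` are made up for illustration. On Perl older than 5.12, `length undef` also emits a warning.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration; $name is the third CSV column.
sub fewer_than_three_parts {
    my ($name) = @_;
    my @third = split /\./, $name;
    return @third < 3 ? 1 : 0;
}

sub third_part_is_empty {
    my ($name) = @_;
    my @third = split /\./, $name;
    # length returns undef for undef (Perl 5.12+) and 0 for "", both false.
    return length($third[2]) ? 0 : 1;
}

for my $name ('Media.vdenc', 'Media.vdenc.abcunit', 'a.b..d') {
    printf "%-20s fewer=%d empty=%d\n",
        $name, fewer_than_three_parts($name), third_part_is_empty($name);
}
```

Both tests agree on the data from the question. They differ only for a name such as `a.b..d`, which has four parts, the third of which is the empty string.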

Re: Perl split on regex match
by Corion (Patriarch) on Jan 03, 2017 at 08:23 UTC
    my @cls = split ".", $field;

    split operates on a regular expression, not a string. Most likely, you don't get the results you expect from that line.

    Maybe try with

    my @cls = split /\./, $field;

    Also, it looks as if you're trying to parse CSV formatted text. Have a look at Text::CSV_XS to create a more robust CSV parser instead of manually trying to split the lines.
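A quick way to see the difference is to count the fields each form returns. This is a minimal sketch; the helper name `count_fields` is made up for illustration. A string given as the first argument to split is still interpreted as a pattern, so "." means "any character" rather than a literal dot.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration: number of fields split returns.
sub count_fields {
    my ($pattern, $text) = @_;
    my @fields = split $pattern, $text;
    return scalar @fields;
}

my $field = 'Media.vdenc.abcunit';

# "." is used as the regex /./, so every character is a separator;
# all resulting fields are empty, and split drops trailing empty fields.
print count_fields('.', $field), "\n";       # prints 0

# An escaped dot matches only a literal "."
print count_fields(qr/\./, $field), "\n";    # prints 3
```

This is why the original loop prints nothing: `@cls` is always empty.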

Re: Perl split on regex match
by Random_Walk (Prior) on Jan 03, 2017 at 08:33 UTC

    You could do it all with a regex, a bit like this...

        use strict;
        use warnings;

        while (<DATA>) {
            print unless /[^,]+,\d+,([^.]+\.){2}[^,.]+/;
        }

        __DATA__
        cls1,37,Media.vdenc.abcunit,media_vd
        cls2,7,Media.Wigig.plsunit,media_vd
        cls3,27,Media.vdenc,media_vd
        cls4,47,Media.hevc,media_vd
        cls5,57,Media.ENC,media_vd

    Output

        cls3,27,Media.vdenc,media_vd
        cls4,47,Media.hevc,media_vd
        cls5,57,Media.ENC,media_vd

    Though that is not the most readable way :)

    Cheers,
    R.

    Pereant, qui ante nos nostra dixerunt!
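For readers who want to follow the regex piece by piece, here is the same pattern spread out with the /x modifier. It is only a sketch: the variable and sub names are made up for illustration, and the capturing group is swapped for a non-capturing one, which does not change what matches.

```perl
use strict;
use warnings;

# The one-line pattern, laid out with /x so each piece can be commented.
my $three_part_name = qr/
    [^,]+ ,               # first column, e.g. cls1
    \d+   ,               # second column, digits only
    (?: [^.]+ \. ){2}     # two parts, each ending in a literal dot
    [^,.]+                # the third part of the dotted name
/x;

# Hypothetical helper: true if the line contains a three-part dotted name.
sub has_three_part_name {
    my ($line) = @_;
    return $line =~ $three_part_name ? 1 : 0;
}

my @sample = (
    'cls1,37,Media.vdenc.abcunit,media_vd',
    'cls3,27,Media.vdenc,media_vd',
);

# Print only the lines without a three-part name, as in the loop above.
print "$_\n" for grep { !has_three_part_name($_) } @sample;
```

The pattern relies on the dotted name being the only column that contains dots, which holds for the sample data in the question.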
      ... that is not the most readable way ...

      Yea and amen to that, brother!


      Give a man a fish:  <%-{-{-{-<

Re: Perl split on regex match
by kcott (Archbishop) on Jan 04, 2017 at 06:43 UTC

    G'day Eshan_k,

    When dealing with CSV data, reach for Text::CSV. If you also have Text::CSV_XS installed, it will run faster. It makes this sort of task trivial.

        #!/usr/bin/env perl -l
        use strict;
        use warnings;

        use Text::CSV;

        my $csv = Text::CSV::->new;

        while (my $row = $csv->getline(\*DATA)) {
            next unless $row->[2] =~ /^[^.]*[.][^.]*$/;
            $csv->print(\*STDOUT, $row);
        }

        __DATA__
        cls1,37,Media.vdenc.abcunit,media_vd
        cls2,7,Media.Wigig.plsunit,media_vd
        cls3,27,Media.vdenc,media_vd
        cls4,47,Media.hevc,media_vd
        cls5,57,Media.ENC,media_vd

    Output:

        cls3,27,Media.vdenc,media_vd
        cls4,47,Media.hevc,media_vd
        cls5,57,Media.ENC,media_vd

    — Ken
