PerlMonks  

Using the Substr

by Anonymous Monk
on Mar 04, 2014 at 19:20 UTC ( [id://1076941]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi, first-time poster. I have a string like 1:13:17:6:5854:0x00E37F06:0x00D1314C and I need everything before the 3rd colon (1:13:17). I use SUBSTR(g.column,1,7) AS myField. But at times the string comes in like 1:13:1:6:5854:0x00E37F06:0x00D1314C, and then it returns 1:13:1: with a trailing colon. I need 1:13:1 when the third colon is in the 7th position, or 1:13:17 when it is in the usual 8th position. Thanks, pargo

Replies are listed 'Best First'.
Re: Using the Substr
by hdb (Monsignor) on Mar 04, 2014 at 19:29 UTC

    Splitting the string at the colons and joining the first three fields seems the easiest way to me:

    join( ':',(split /:/,$string)[0..2] )
Re: Using the Substr
by roboticus (Chancellor) on Mar 04, 2014 at 19:35 UTC

    In your problem statement, you say that you need everything before the third colon. So your solution should be expressed in a similar form.

    You're mentioning substr in the title, but that will require you to find the position of the third colon so you know what length to specify.

    Rather than doing that, I'd suggest using split and join: use split on colons to break the field into chunks, then keep the first three chunks, gluing them back together with colons to make your result. Like:

    my @chunks = split /:/, $input_string;
    my $answer = join(':', $chunks[0], $chunks[1], $chunks[2]);

    You could use regular expressions, too. But if you really want to use substr, then look up perldoc -f index to see how you might find the position of a string. (Of course, since you want the third one, you'll need to call it three times.)
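The index/substr route mentioned above could be sketched roughly as follows. The helper name before_nth is hypothetical, and returning the whole string when there are fewer than N separators is an assumption, not something the original question specifies.

```perl
use strict;
use warnings;

# Hypothetical helper: return everything before the Nth occurrence of $sep.
# Assumption: if the string holds fewer than $n separators, return it whole.
sub before_nth {
    my ($string, $sep, $n) = @_;
    my $pos = -1;
    for (1 .. $n) {
        # Search again, starting just past the previous hit.
        $pos = index($string, $sep, $pos + 1);
        return $string if $pos < 0;    # ran out of separators
    }
    return substr($string, 0, $pos);
}

print before_nth('1:13:17:6:5854:0x00E37F06:0x00D1314C', ':', 3), "\n";  # 1:13:17
print before_nth('1:13:1:6:5854:0x00E37F06:0x00D1314C',  ':', 3), "\n";  # 1:13:1
```

Because the length passed to substr comes from the position of the third colon, it no longer matters whether that colon sits in the 7th or the 8th position.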

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

Re: Using the Substr
by Ratazong (Monsignor) on Mar 04, 2014 at 19:28 UTC

    Is substr mandatory? Or can you use a regex, like

    ($result) = $string =~ /^(\d+:\d+:\d+):/;   # $result is undef if the match fails
    (assuming there are only digits between the colons).

    HTH, Rata
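If the fields are not guaranteed to be digits, the same regex idea can be relaxed to "anything that is not a colon". This is only a sketch: the helper name first_three is hypothetical, and returning undef for strings with fewer than three colons is an assumption about the desired behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper: capture the first three colon-separated fields,
# whatever characters they contain. Returns undef when the string has
# fewer than three colons.
sub first_three {
    my ($string) = @_;
    # List-context match: $result gets the capture, or undef on failure.
    my ($result) = $string =~ /\A([^:]*:[^:]*:[^:]*):/;
    return $result;
}

print first_three('1:13:17:6:5854:0x00E37F06:0x00D1314C'), "\n";  # 1:13:17
print first_three('1:13:1:6:5854:0x00E37F06:0x00D1314C'),  "\n";  # 1:13:1
```

Taking the capture from a list-context match avoids reading a stale $1 left over from an earlier successful match.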

      And assuming they're not always digits but a colon is always the separator, a combination of split, array slicing, and join:

      $result = join( ':', (split( /:/, $string))[0..2]);

      -derby

      update: I like hdb's better ... less parens.

Re: Using the Substr
by Kenosis (Priest) on Mar 04, 2014 at 21:22 UTC

    If you decide to go with split/join and have quite a few strings to process, consider setting split's LIMIT parameter to 4 (since you're only interested in the first three resulting elements), as this can significantly speed up processing without compromising readability:

    use strict;
    use warnings;
    use Benchmark qw/cmpthese/;

    my $string = '1:13:1:6:5854:0x00E37F06:0x00D1314C';

    sub _split {
        my $result = join( ':', ( split /:/, $string )[ 0 .. 2 ] );
    }

    sub _split_LIMIT {
        my $result = join( ':', ( split /:/, $string, 4 )[ 0 .. 2 ] );
    }

    cmpthese(
        -5,
        {
            _split       => sub { _split() },
            _split_LIMIT => sub { _split_LIMIT() }
        }
    );

    Output (faster times are lower in the table):

                      Rate       _split _split_LIMIT
    _split        938083/s           --         -25%
    _split_LIMIT 1248002/s          33%           --
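To see why a LIMIT of 4 is enough, it may help to look at what split actually returns in each case. The snippet below is a small illustration using the sample string from the benchmark; the variable names @all and @limited are made up for this sketch.

```perl
use strict;
use warnings;

my $string = '1:13:1:6:5854:0x00E37F06:0x00D1314C';

my @all     = split /:/, $string;       # every field is split out
my @limited = split /:/, $string, 4;    # at most 4 fields; the last is the unsplit rest

print scalar(@all), "\n";                  # 7
print scalar(@limited), "\n";              # 4
print $limited[3], "\n";                   # 6:5854:0x00E37F06:0x00D1314C
print join(':', @limited[0 .. 2]), "\n";   # 1:13:1
```

With the limit in place, split stops after the third colon and leaves the remainder in the fourth element, so the first three elements are the same as in the unlimited case.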

      The advantage of limited over unlimited split is more pronounced if a redundant subroutine call and lexical creation are not included, with regex extraction thrown in for good measure:

      c:\@Work\Perl>perl -wMstrict -le
      "use Benchmark qw/cmpthese/;
       ;;
       my $string = '1:13:1:6:5854:0x00E37F06:0x00D1314C';
       my $result;
       ;;
       sub _split {
         $result = join ':', (split /:/, $string)[0 .. 2];
         }
       ;;
       sub _split_LIMIT {
         $result = join ':', (split /:/, $string, 4)[0 .. 2];
         }
       ;;
       sub _regex {
         ($result) = $string =~ m{ \A \d+ : \d+ : \d+ }xmsg;
         }
       ;;
       cmpthese(-5, {
         _split       => \&_split,
         _split_LIMIT => \&_split_LIMIT,
         _regex       => \&_regex,
         });
      "
                       Rate       _split _split_LIMIT       _regex
      _split       261407/s           --         -39%         -54%
      _split_LIMIT 425387/s          63%           --         -24%
      _regex       562521/s         115%          32%           --

        I wondered if index and substr would be any quicker but, no, it came in a bit behind &_split_LIMIT.

        use strict;
        use warnings;
        use Benchmark qw{ cmpthese };

        my $string = q{1:13:1:6:5854:0x00E37F06:0x00D1314C};
        my $result;

        sub _index {
            my $pos3 = -1;
            $pos3 = index $string, q{:}, $pos3 + 1 for 1 .. 3;
            $result = substr $string, 0, $pos3;
        }

        sub _split {
            $result = join ':', (split /:/, $string)[0 .. 2];
        }

        sub _split_LIMIT {
            $result = join ':', (split /:/, $string, 4)[0 .. 2];
        }

        sub _regex {
            ($result) = $string =~ m{ \A \d+ : \d+ : \d+ }xmsg;
        }

        cmpthese(
            -5,
            {
                _index       => \&_index,
                _split       => \&_split,
                _split_LIMIT => \&_split_LIMIT,
                _regex       => \&_regex,
            }
        );

                          Rate  _split  _index _split_LIMIT _regex
        _split        657900/s      --    -26%         -31%   -38%
        _index        890097/s     35%      --          -7%   -17%
        _split_LIMIT  952857/s     45%      7%           --   -11%
        _regex       1066336/s     62%     20%          12%     --
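A benchmark like this only compares like with like if all four approaches return the same value. A quick sanity check along these lines can confirm that for the sample string; the %got hash is a name invented for this sketch, and the regex uses an explicit capture group rather than the /g form used in the benchmark.

```perl
use strict;
use warnings;

my $string = '1:13:1:6:5854:0x00E37F06:0x00D1314C';

my %got;
$got{split}       = join ':', (split /:/, $string)[0 .. 2];
$got{split_LIMIT} = join ':', (split /:/, $string, 4)[0 .. 2];

# Explicit capture group, so the list-context match returns the matched text.
($got{regex}) = $string =~ m{ \A ( \d+ : \d+ : \d+ ) }xms;

# Walk to the third colon with index, then cut with substr.
my $pos = -1;
$pos = index $string, ':', $pos + 1 for 1 .. 3;
$got{index} = substr $string, 0, $pos;

print "$_ => $got{$_}\n" for sort keys %got;
```

Note that the digit-only regex would return undef for fields that are not purely numeric, whereas the split and index versions would not, so the four only agree on input shaped like the sample.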

        I hope this is of interest.

        Cheers,

        JohnGG

Node Type: perlquestion [id://1076941]
Approved by hdb