PerlMonks  

Display shortened paragraph

by Anonymous Monk
on Feb 01, 2006 at 04:52 UTC ( [id://526961]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

hi all!
just wondering how i can have perl display part of my long memo. basically i want the first, let's say, 255 characters of the paragraph. i'm really new to perl so i don't know how i would go about this. a regex perhaps? just started reading about that today. so basically something like:

just wondering how i can have perl display part of my long memo. basically i want the first, let's say, 255 characters of the paragraph. i'm really new to perl so i don't know how i would go about this?...[ read more ]


thank you all

Replies are listed 'Best First'.
Re: Display shortened paragraph
by graff (Chancellor) on Feb 01, 2006 at 05:27 UTC
    I think you want "substr" (see "perldoc -f substr"):
    my $maxlen = 20;  # could be 255 if you like
    my $memo = "This is a really long string and I only need to show the first $maxlen characters...";
    print substr( $memo, 0, $maxlen ), "\n";
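
A minimal sketch of the same substr idea wrapped in a helper, assuming you only want a marker when the text was actually cut. The name `shorten_plain` and the `$suffix` parameter are made up for illustration; they are not from graff's post.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): keep the first $max characters
# and append $suffix only when something was actually cut off.
sub shorten_plain {
    my ($text, $max, $suffix) = @_;
    $suffix = '...' unless defined $suffix;
    return $text if length($text) <= $max;      # short enough: leave untouched
    return substr($text, 0, $max) . $suffix;
}

my $memo = "This is a really long string and I only need to show the first few characters";
print shorten_plain($memo, 20), "\n";                      # hard cut after 20 characters
print shorten_plain($memo, 20, '...[ read more ]'), "\n";  # marker like the one in the question
print shorten_plain("short memo", 20), "\n";               # returned unchanged
```

Note that this cuts at an exact character count, so it can still split a word in the middle.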
      Indeed his request sounds quite strange, since breaking at 255 characters may be suboptimal for some text to be shown. However, it's not entirely clear to me whether he wants to read in a file in chunks of fixed length. If so, then curiously I wrote a note just a few minutes ago about how to use $/ to do so. E.g.:
      $/=\255;
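
The one-liner above sets the input record separator to a reference to an integer, which makes the readline operator return fixed-size records. A rough sketch of how that might be used, assuming the memo is read from a filehandle; the in-memory handle and the 600-character dummy memo are stand-ins chosen only to keep the example self-contained.

```perl
use strict;
use warnings;

my $memo = 'x' x 600;    # dummy data standing in for a long memo
open my $fh, '<', \$memo or die "Cannot open in-memory handle: $!";

my @chunks;
{
    local $/ = \255;     # each read now returns a record of at most 255 bytes
    while ( my $chunk = <$fh> ) {
        push @chunks, $chunk;
    }
}
close $fh;

# The last record is simply whatever is left over.
printf "%d chunks, sizes: %s\n", scalar @chunks, join( ', ', map { length } @chunks );
```

Using `local` keeps the change to `$/` confined to the block, so other reads in the program still work line by line.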
Re: Display shortened paragraph
by duff (Parson) on Feb 01, 2006 at 05:54 UTC

    Here's an off-the-cuff version that uses regular expressions:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $str = qq(just wondering how i can have perl display part of my long memo. basically i want the first lets say 255 charachters of the paragraph. im really new to perl so i don't know how i would come about this?);
    my $max = 50;
    (my $copy = $str) =~ s/(.{1,$max})\b.*/$1.../;
    print "$copy\n";
    __END__

    It matches at most $max characters and stores them in $1, then requires a word boundary, then matches the rest of the string. The substitution essentially replaces the entire string with whatever was captured in $1 plus an ellipsis. Matching the word boundary is nice so that you don't chop off the text in the middle of a word.

    Update: Here's another way that does essentially the same thing:

    my ($short) = $str =~ m/(.{1,$max}\b)/;
    $short .= '...';

    This is probably how I would do it in my code, I just happened to think of the other way first for some reason.
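
Two details of the regex versions above are worth knowing: both append the ellipsis even when the string already fits within $max, and \b also matches next to an apostrophe, so a word like "don't" can be cut after "don". The following is a sketch of one possible variation, not duff's code: it returns short strings untouched and cuts only where whitespace follows. The name `shorten_regex` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical variation on the regex approach (not from the thread).
sub shorten_regex {
    my ($text, $max) = @_;
    return $text if length($text) <= $max;           # nothing to cut

    # Greedily take up to $max characters that are followed by whitespace.
    my ($short) = $text =~ /\A(.{1,$max})(?=\s)/s;

    # No whitespace in range (one very long word): fall back to a hard cut.
    $short = substr($text, 0, $max) unless defined $short;

    $short =~ s/\s+\z//;                             # drop trailing whitespace
    return "$short...";
}

print shorten_regex("just wondering how i can have perl display part of my long memo", 17), "\n";
print shorten_regex("i don't know how", 5), "\n";
print shorten_regex("short", 10), "\n";
```

The lookahead `(?=\s)` is the swapped-in piece here; whether whitespace or \b is the better cut point depends on the text being shortened.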

Re: Display shortened paragraph
by GrandFather (Saint) on Feb 01, 2006 at 06:11 UTC

    and the best answer is:

                Rate  jbrug2    duff   duff2   jbrug  graff2   graff
    jbrug2    1348/s      --    -99%    -99%   -100%   -100%   -100%
    duff    101001/s   7393%      --    -59%    -73%    -94%    -95%
    duff2   245077/s  18082%    143%      --    -35%    -84%    -89%
    jbrug   379315/s  28041%    276%     55%      --    -76%    -82%
    graff2 1556035/s 115341%   1441%    535%    310%      --    -28%
    graff  2162638/s 160344%   2041%    782%    470%     39%      --

    This is a touch of apples and oranges, however, because duff's solution performs a slightly different (and likely more useful) task.

    Updated for duff's second solution
    Beah - and jbrug2
    and graff2 (rindex version)


    DWIM is Perl's answer to Gödel
      To complete the stats (extremely useful :) )
                  Rate  jbrug2    duff   jbrug   graff
      jbrug2    3596/s      --    -98%    -99%   -100%
      duff    153325/s   4164%      --    -78%    -94%
      jbrug   689852/s  19086%    350%      --    -73%
      graff  2525239/s  70130%   1547%    266%      --
      "We all agree on the necessity of compromise. We just can't agree on when it's necessary to compromise." - Larry Wall.

      <rant>
      While it is interesting to note that duff's solution is reasonably fast (compared to the other ones) even if it does a more complex task, and definitely a more appropriate one to the needs of the OP, let's please try not to go mad about doing Benchmarks for every single thing that is posted here. This may convey the impression that premature optimization does matter, while it doesn't. By any means. In particular in this case as soon as I have some code that works, I would rather "rate" it based on readability or perhaps adaptability but for heaven's sake: not on speed!
      </rant>

        I agree that premature optimization is an evil thing. Having some knowledge about the relative merits of different approaches to solving a problem is an essential part of providing a good solution. Execution efficiency is one parameter that should be considered when evaluating different approaches.

        That is not to say that you should "benchmark early and often". In fact, generally you don't need to benchmark at all in the normal course of writing code (in any language). However, at PerlMonks we are in a special situation where all facets of a piece of code come under scrutiny for the edification of the monks.

        I say, in the context of PerlMonks, where many different approaches to solving a problem are suggested, that benchmarking should be done "early and often". In this particular case there are in essence three different approaches suggested with wildly different execution speeds. The benchmark should strongly convey that dealing retail with characters is SLOW (jbrug2) and that narrow purpose functions (substr, rindex) are faster than general purpose functions (regex). That is a valuable lesson in any book!
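
For readers who want to repeat this kind of comparison, here is a rough sketch using the core Benchmark module. It is not the script that produced the tables in this thread; the test string, the labels, and the exact bodies of the four subs are assumptions made for illustration, and, as noted earlier, the subs do not all perform the same task.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Assumed test data: the question's first sentence repeated a few times.
my $text = join ' ', ('just wondering how i can have perl display part of my long memo.') x 4;
my $max  = 50;

my %approach = (
    # narrow-purpose functions
    substr => sub { substr( $text, 0, $max ) },
    rindex => sub { substr( $text, 0, rindex( $text, ' ', $max ) ) },
    # general-purpose regex, cutting at a word boundary
    regex  => sub { my ($short) = $text =~ /(.{1,$max})\b/; $short },
    # character-at-a-time: split into single characters, then rejoin
    split  => sub { join '', ( split //, $text )[ 0 .. $max - 1 ] },
);

# A negative count means "run each sub for at least that many CPU seconds".
cmpthese( -1, \%approach );
```

The absolute rates depend on the machine and the perl version, so only the relative ordering is worth reading from the output.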


        DWIM is Perl's answer to Gödel
Re: Display shortened paragraph
by jbrugger (Parson) on Feb 01, 2006 at 05:58 UTC
    TIMTOWTDI
    update: More or less the same idea as duff i see :)
    #!/usr/bin/perl -w
    use strict;
    my $txt = "just wondering how i can have perl display part of my long memo. basically i want the first lets say 255 charachters of the paragraph. im really new to perl so i don't know how i would come about this? a regex perhaps? just started reading about that today. so basically something like";
    my $n = 20;
    $txt =~ m/(.{$n})/gs;
    print $1;


    ok, yet another way then :)
    #!/usr/bin/perl -w
    use strict;
    my $txt = "just wondering how i can have perl display part of my long memo. basically i want the first lets say 255 charachters of the paragraph. im really new to perl so i don't know how i would come about this? a regex perhaps? just started reading about that today. so basically something like";
    my @a = split("",$txt);
    for (my $i=0; $i<20; $i++) {
        print $a[$i];
    }


    "We all agree on the necessity of compromise. We just can't agree on when it's necessary to compromise." - Larry Wall.
      hi! works, but how would i shorten the paragraph so that it does not cut off a word if it reaches the max char. limit? for example, "just wondering how..." compared to "just wondering h..." (notice how it cuts off the "ow" in "how"). thanks
        Since people have shown how the use of a regex approach tends to be slower, here's a way to observe word boundaries (well, spaces between words, anyway) without using a regex:
        my $maxlen = 20;
        my $longtext = "This is some very long string that needs to be truncated to $maxlen characters...";
        my $trunctext = substr( $longtext, 0, rindex( $longtext, " ", $maxlen ));
        print "$longtext\n$trunctext\n";
        The rindex function, like substr, is faster than a regex match.
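
One thing to watch with the rindex approach: rindex returns -1 when no space is found at or before the given position, and substr with a length of -1 then returns everything but the last character. A minimal sketch of a guarded version follows, assuming a hard cut is acceptable as the fallback; the name `shorten_rindex` is hypothetical and not from the post above.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): cut at the last space at or
# before position $max, without using a regex.
sub shorten_rindex {
    my ($text, $max) = @_;
    return $text if length($text) <= $max;     # already short enough

    my $cut = rindex( $text, ' ', $max );      # -1 if there is no such space
    $cut = $max if $cut <= 0;                  # no usable space: hard cut instead

    return substr( $text, 0, $cut ) . '...';
}

print shorten_rindex( "This is some very long string that needs to be truncated", 20 ), "\n";
print shorten_rindex( "abcdefghijklmnopqrstuvwxyz", 5 ), "\n";
print shorten_rindex( "short", 20 ), "\n";
```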
        By using duff's solution: matching the word boundary \b means you don't chop off the text in the middle of a word.

        update:
        A clumsy other way to do this :)
        use strict;
        use warnings;
        my $txt = "just wondering how i can have perl display part of my long memo. basically i want the first lets say 255 charachters of the paragraph. im really new to perl so i don't know how i would come about this? a regex perhaps? just started reading about that today. so basically something like";
        my @a = split("",$txt);
        my $l = 7;
        for (my $i=0; $i < $l; $i++) {
            print $a[$i];
            $l++ if ($i ==($l-1) && ($a[$i] ne " " )) ;
        }


        "We all agree on the necessity of compromise. We just can't agree on when it's necessary to compromise." - Larry Wall.
        actually the code given by duff works just fine:

        my $max = 230;
        (my $copy = $string) =~ s/(.{1,$max})\b.*/$1.../;
        print "$copy\n";
        Thanks!
Re: Display shortened paragraph
by planetscape (Chancellor) on Feb 03, 2006 at 01:32 UTC

    For a totally different approach you may wish to take a look at Lingua::EN::Keywords and Lingua::EN::Summarize. These modules may allow you to get a better sense of what your memo is actually about...

    HTH,

    planetscape

Node Type: perlquestion [id://526961]
Approved by graff