http://qs321.pair.com?node_id=390417

kiat has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks,

Please suggest how I can improve the code below, which inserts line breaks into a long paragraph to achieve a desired vertical spacing between the lines.

    use strict;

    my $string = q~Suppose you run a Web site that includes a banner advertisement service. You contract with companies that want their ads displayed when poeple visit the pages on your site. Each time a visitor hits one of your pages, you serve an ad embedded in the page that is sent to the visitor's browser and assess the company a small fee. To represent this information, you maintain three tables. One table, company, has columns for company name, number, address, and telephone number. Another table, ad lists ad numbers, the number for the company that owns the ad, and the amount you charge per hit. The third table, hit, logs each ad hit by number and the date on which the ad was served.~;

    my @words = split / /, $string;
    my ($string_len, $new_string);

    foreach (@words) {
        my $space = 1;
        $string_len += length($_);
        if ($string_len >= 50) {
            $new_string .= "$_<br /><br />";
            $string_len = 0;
            $space = 0;
        }
        else {
            $new_string .= $_;
            $new_string .= ' ' if $space;
        }
    }

    # output
    Suppose you run a Web site that includes a banner advertisement<br /><br />service. You contract with companies that want their ads displayed<br /><br />when poeple visit the pages on your site. Each time a visitor<br /><br />hits one of your pages, you serve an ad embedded in the page that<br /><br />is sent to the visitor's browser and assess the company a small<br /><br />fee. To represent this information, you maintain three tables.<br /><br />One table, company, has columns for company name, number, address,<br /><br />and telephone number. Another table, ad lists ad numbers, the<br /><br />number for the company that owns the ad, and the amount you charge<br /><br />per hit. The third table, hit, logs each ad hit by number and the<br /><br />date on which the ad was served.

As usual, many thanks in advance :)

Update: Thanks all for your wonderful solutions :)

Replies are listed 'Best First'.
Re: Insert breaklines into string
by lhoward (Vicar) on Sep 12, 2004 at 14:18 UTC
    Get rid of the br tags and use a cascading style sheet; it's what CSS was made to do. With all those BRs the paragraph won't scale well for narrow or wide browser windows, or for people using a large font for accessibility reasons. L
      Thanks, lhoward!

      I'll check out what I can do with CSS. In the meantime, do you have any suggestions on what the CSS should look like?

      Update: Don't bother. Did it with line-height :)
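
For readers wondering what the line-height approach might look like from the Perl side, here is a minimal sketch. The helper name paragraph_html and the default value of 2 are illustrative assumptions, not kiat's actual code. The text is emitted as one unbroken paragraph, and the browser handles both the wrapping and the vertical spacing.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap text in a <p> whose line spacing is set by CSS
# instead of hard-coded <br /> tags. The default of 2 (double spacing) is
# an assumption chosen for illustration. $text is assumed to be HTML-safe
# already; no escaping is done here.
sub paragraph_html {
    my ($text, $line_height) = @_;
    $line_height = 2 unless defined $line_height;
    return qq{<p style="line-height: $line_height;">$text</p>};
}

print paragraph_html('Suppose you run a Web site that includes a banner advertisement service.'), "\n";
```
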

Re: Insert breaklines into string
by graff (Chancellor) on Sep 12, 2004 at 15:20 UTC
    For the general case of breaking any long line into a bunch of lines having some maximum width, you want Text::Wrap. Others have stated better ideas for your specific case, but you could also do it like this:
    use Text::Wrap;
    $Text::Wrap::columns = 50;
    $new_string = Text::Wrap::wrap( '', '<br /><br />', $string );
    You'll notice that the wrap function will always include a "\n" as part of the line break (it appends "\n" at the end of each output chunk), and the width of the "indent" args (the <br /><br /> for non-initial lines, in this case) will be counted as part of the final line-width. I haven't studied the entire man page myself yet, and there's probably lots of flexibility...

    But if you're really picky and you believe that Text::Wrap won't do exactly what you want, then something like the following would be simpler than the code you posted:

    my $new_string;
    my $width = 50;
    while ( length( $string ) > $width ) {
        # find the last space at or before position $width
        my $brksp = rindex( $string, ' ', $width );
        last if $brksp < 0;    # no space to break at; leave the rest as is
        $new_string .= substr( $string, 0, $brksp ) . '<br /><br />';
        $string = substr( $string, $brksp+1 );
    }
    $new_string .= $string;
    (update: replaced literal "50" with $width in "while" condition)
      Thanks for that simpler code, graff!
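
On graff's point that the "\n" and the width of the continuation prefix both end up in the output: one way around this, sketched here rather than taken from the thread, is to leave both prefix arguments empty and set $Text::Wrap::separator to the break markup. The helper name wrap_with_br is made up for this example; $Text::Wrap::columns, $Text::Wrap::separator and $Text::Wrap::unexpand are documented Text::Wrap variables.

```perl
use strict;
use warnings;
use Text::Wrap ();

# Wrap $text so that every line is shorter than $width characters, joining
# the lines with <br /><br /> instead of the default "\n".
sub wrap_with_br {
    my ($text, $width) = @_;
    local $Text::Wrap::columns   = $width;          # lines stay shorter than this
    local $Text::Wrap::separator = '<br /><br />';  # used between lines in place of "\n"
    local $Text::Wrap::unexpand  = 0;               # do not turn runs of spaces into tabs
    return Text::Wrap::wrap('', '', $text);
}

print wrap_with_br('Suppose you run a Web site that includes a banner advertisement service.', 50), "\n";
```

Because the markup is the separator rather than a line prefix, it should not count toward the column limit.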
Re: Insert breaklines into string
by gaal (Parson) on Sep 12, 2004 at 14:33 UTC
    I agree completely with lhoward that this is something your browser should do if the output is for the web. However, if you're interested in this kind of formatting for other applications as well, take a look at the unix fmt command (which even has a Perl port). It does clever things like find the best way to arrange line breaks so that each line in the paragraph is more or less the same length as the ones around it.
      Thanks, gaal!

      I'll look it up.
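
To illustrate the idea gaal describes (choosing breaks so that lines come out roughly even, rather than filling each line greedily), here is a small sketch. It is not fmt's actual algorithm; it is a plain "minimum raggedness" dynamic program in which the cost of a line is the square of its unused width and the last line is free. The name balanced_wrap is invented for this example.

```perl
use strict;
use warnings;

# Return the lines of $text wrapped to at most $width characters, choosing
# break points that minimise the sum of squared trailing slack over all
# lines except the last.
sub balanced_wrap {
    my ($text, $width) = @_;
    my @words = split ' ', $text;
    my $n = @words;
    return unless $n;

    # $best[$i]: minimal cost of laying out words $i .. $n-1
    # $next[$i]: index of the first word on the line after the one starting at $i
    my @best = (0) x ($n + 1);
    my @next = ($n) x ($n + 1);

    for (my $i = $n - 1; $i >= 0; $i--) {
        my $len = -1;    # cancels the leading space counted for the first word
        my $min;
        for (my $j = $i; $j < $n; $j++) {
            $len += 1 + length $words[$j];
            last if $len > $width && $j > $i;    # an over-long word still gets a line of its own
            my $slack = $len > $width ? 0 : $width - $len;
            my $cost  = ($j == $n - 1 ? 0 : $slack ** 2) + $best[$j + 1];
            if (!defined $min || $cost < $min) {
                $min = $cost;
                $next[$i] = $j + 1;
            }
        }
        $best[$i] = $min;
    }

    my @lines;
    for (my $i = 0; $i < $n; $i = $next[$i]) {
        push @lines, join ' ', @words[$i .. $next[$i] - 1];
    }
    return @lines;
}

print "$_\n" for balanced_wrap('aaa bb cc ddddd', 6);
```

A greedy wrapper would put "aaa bb" on the first line and leave "cc" alone on the second; the sketch prefers the more even split.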

Re: Insert breaklines into string
by jbware (Chaplain) on Sep 12, 2004 at 15:13 UTC
    I agree with the above posts that CSS and the browser are the best solution to this problem. But since I can't resist a good regex challenge, the code below breaks at column 50, or earlier at a space so that words aren't split.
    $string =~ s|(.{0,49}[^\s])\s+|$1<br /><br />|g;

    -jbWare
      A bug? I ran your code. Towards the end of the paragraph, the line break landed in the wrong place (the words "was" and "served" were split onto separate lines when they shouldn't have been):

      Suppose you run a Web site that includes a banner<br /><br />advertisement service. You contract with companies<br /><br />that want their ads displayed when poeple visit<br /><br />the pages on your site. Each time a visitor hits<br /><br />one of your pages, you serve an ad embedded in the<br /><br />page that is sent to the visitor's browser and<br /><br />assess the company a small fee. To represent this<br /><br />information, you maintain three tables. One table,<br /><br />company, has columns for company name, number,<br /><br />address, and telephone number. Another table, ad<br /><br />lists ad numbers, the number for the company that<br /><br />owns the ad, and the amount you charge per hit.<br /><br />The third table, hit, logs each ad hit by number<br /><br />and the date on which the ad was<br /><br />served.
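
The stray break before "served." most likely happens because jbware's pattern insists on trailing whitespace after every chunk, so the final chunk can never run to the end of the string. One possible adjustment, offered as a sketch and not as jbware's own fix, is to let a chunk end at either whitespace or end-of-string and only emit the break in the first case. The name br_wrap is made up here.

```perl
use strict;
use warnings;

# Break $text into chunks of at most $width characters at whitespace.
# A chunk may also end at end-of-string (\z), so a final line that already
# fits is left alone. Words longer than $width are not split.
sub br_wrap {
    my ($text, $width) = @_;
    my $max = $width - 1;    # the \S supplies the last character of the chunk
    $text =~ s{(.{0,$max}\S)(\s+|\z)}{ $1 . (length $2 ? '<br /><br />' : '') }ge;
    return $text;
}

print br_wrap('logs each ad hit by number and the date on which the ad was served.', 50), "\n";
```

Note that if the input ends in whitespace, this version still appends one trailing break; trimming the string first would avoid that.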
Re: Insert breaklines into string
by bart (Canon) on Sep 12, 2004 at 15:26 UTC
    For wrapping lines to a maximum character count per line (for fixed pitch fonts), check out Text::Wrap.

    For variable pitch fonts, the results would depend on the font used. I once built a custom solution for PostScript fonts, using Font::AFM, but I don't know of a similar, more generic (and better tested) module on CPAN. In short, what is needed is to replace the simple-minded test for string length with a test based upon stringwidth() of a substring, which calculates the physical width of the text when printed.
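
As a rough sketch of what bart describes, the wrapper below takes the width measurement as a parameter instead of calling length(). The names wrap_by_width and $toy_width are hypothetical, and the toy metric is invented purely so the example runs without font files; with Font::AFM one would presumably pass a closure around its stringwidth method instead.

```perl
use strict;
use warnings;

# Greedy wrap driven by a caller-supplied width function rather than
# length(). $width_of is a coderef returning the rendered width of a
# string, in whatever unit $max_width is expressed in.
sub wrap_by_width {
    my ($text, $max_width, $width_of) = @_;
    my (@lines, $line);
    for my $word (split ' ', $text) {
        my $candidate = defined $line ? "$line $word" : $word;
        if (defined $line && $width_of->($candidate) > $max_width) {
            push @lines, $line;    # current line is full; start a new one
            $line = $word;
        }
        else {
            $line = $candidate;
        }
    }
    push @lines, $line if defined $line;
    return @lines;
}

# Toy metric for demonstration only: narrow glyphs count 1, all others 2.
my $toy_width = sub {
    my ($s) = @_;
    my $w = 0;
    $w += /[il.,' ]/ ? 1 : 2 for split //, $s;
    return $w;
};

print "$_\n" for wrap_by_width('ill will mill wall', 14, $toy_width);
```

The same loop with sub { length $_[0] } as the metric reduces to ordinary fixed-pitch wrapping.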