Re: process multiline text and print in desired format

by jcb (Parson)
on Mar 17, 2021 at 23:33 UTC ( [id://11129860] )

in reply to process multiline text and print in desired format

This is an example of a problem that formats can solve quite nicely:

#!/usr/bin/perl

use strict;
use warnings;

our %rec;  # formats need global variables, not mere lexicals

format STDOUT_TOP =
    Product Product      basic       all                 overseas
    Release Type         color       colors              shipping
------------------------------------------------------------------------------
.
;

format STDOUT =
^>>>>>>>>>> ^<<<<<<<<<<< ^<<<<<<<<<< ^<<<<<<<<<<<<<<<<<< ^<<<<<<<< ~~
$rec{rel},  $rec{type},  $rec{bascol}, $rec{allcol},     $rec{ovship}
.
;

# to read a file instead, use this and change "<DATA>" below to "<IN>"
# open IN, '<', $ARGV[0] or die "open $ARGV[0]: $!";

$: .= ',';  # split filled lines also on comma

while (<DATA>) {
    chomp;

    if (m/^Release Date([\d\/]+$)/) {
        %rec = (type => 'none',
                map { $_ => 'N/A' } qw(bascol allcol ovship));
        $rec{rel} = $1;
    }

    $rec{type}   = $1 if m/^\s+product [^(]+\(([^)]+)\)$/;
    $rec{bascol} = $1 if m/^\s+color basic (.*)$/;
    $rec{allcol} = $1 if m/^\s+color all (.*)$/;
    $rec{ovship} = $1 if m/^\s+overseas shipping (.*)$/;

    write if m/^$/;
}
write  # emit the last record if there was no trailing blank line

__DATA__
Release Date2/2/2019
  product clock1(analog)
  color basic white
  color all white,black,silver
  warranty 1 year
  not sold yet

Release Date2/2/2020
  product none

Release Date2/2/2021
  product clock1(digital)
  color basic black
  color all black,silver
  warranty 1 year
  not sold yet

Release Date2/2/2022
  product clock2(digital)
  color basic white
  color all white
  overseas shipping yes
  shipping charges yes
  warranty 1 year
  not sold yet

See perlform for more information about the format mechanism, although it is somewhat obscure and most suited to simple scripts like this. If this is part of a larger system as some of my fellow monks suspect, this is probably a sub-optimal solution.

Re^2: process multiline text and print in desired format
by ak_mmx (Novice) on Mar 18, 2021 at 01:49 UTC
    Thanks for the response. Yes, the input data can be a few thousand lines or more; I pasted only a portion to give an idea of how to approach the problem. Would you suggest a different solution if there are more input lines? The script logic is unlikely to change much, though.

      More input is no problem; just change the script as indicated to read from a file instead, and delete the __DATA__ section. Problems would arise only if the script logic were embedded in a larger program: formats are a very old feature and predate many of the facilities that support modern Perl programming. Most notably, formats are normally written against global variables (a lexical is visible to a format only if the format is declared within that lexical's scope), and format names themselves live in a global namespace.
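
If the global-variable limitation does become a problem, one possible workaround is to call the builtin formline directly, as the swrite example in perlform does. The following is only a minimal sketch and not part of the script above: format_record is a hypothetical helper name, the picture string simply copies the field layout of the STDOUT format, and the record keys are assumed to be the same ones the script uses.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): render one record
# with formline() instead of a named format, using only lexicals.
sub format_record {
    my (%rec) = @_;

    # Same field layout as the STDOUT format; "~~" repeats the line
    # until every "^" field has been used up.
    my $picture = '^>>>>>>>>>> ^<<<<<<<<<<< ^<<<<<<<<<< '
                . '^<<<<<<<<<<<<<<<<<< ^<<<<<<<< ~~' . "\n";

    # "^" fields consume their arguments, so work on copies.
    my @fields = @rec{qw(rel type bascol allcol ovship)};

    local $: = $: . ',';   # also allow line breaks after commas
    $^A = '';              # formline() appends to the accumulator $^A
    formline($picture, @fields);
    my $out = $^A;
    $^A = '';
    return $out;
}

print format_record(
    rel    => '2/2/2019',
    type   => 'analog',
    bascol => 'white',
    allcol => 'white,black,silver',
    ovship => 'N/A',
);
```

Because the helper returns a string, the caller decides where the text goes, and nothing depends on a package variable such as %rec or on the currently selected output handle. The header lines would have to be printed separately, since STDOUT_TOP is not involved here.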