PerlMonks  

Processing tagged text

by diggernz (Sexton)
on Jul 19, 2006 at 09:37 UTC

diggernz has asked for the wisdom of the Perl Monks concerning the following question:

I want to process the following tagged text file for use with Adobe InDesign CS. The data is about a given horse race.
1 p FRODO HEST q 1r Fronts C J Campbell,Mrs M J Pault B G 3yrs Earl-Amy Hest Driver:...........u Trainer: Phil Williamson, Oamaruvx Black, Blue & White Chequered Sashw oFirst Start at a Race Meeting. Qualified: 25/04/2006y y 2 p SMOKEY MICKPOT q 2r Front s (1 Starts: 0(0) - 0 - 0 - Lt$0($0) - W$0($0))t W R Low,E T Murphyt BR G 3yrs Wrestle-Tough Whiz Driver:...........u Trainer: Wayne Low, Waimatev (Last Driver:Wayne Low)x Green, Brown & White Striped Braces, White Sleevesw 0 o15Jun06 Forbury Pk 2200 Std Ft 11 of 14 Wnr:Daddy Warbucks +y  y
I want to read this data into variables so I can create a new tagged text file to be imported into InDesign. InDesign has its own tagged format.

Desired result
$horse = "FRODO HEST";
$position = "1";
$detail = "FRONT";
$history = '1 Starts: 0(0) - 0 - 0 - Lt$0($0) - W$0($0))';
$
etc.

Here's a sample of my previous program, which I used to process each line:
    if ($line =~ /v/) {
        # B M 5yrs Straphanger-Amy Hest Driver:...........u
        chomp $line;
        my ($trainer, $last_driver) = split //, $line;
        # Remove whitespaces from start
        $trainer =~ s/^\s+//;
        # Remove the first chars of variable
        $last_driver = substr($last_driver, 1);
        # Remove whitespaces from start
        $last_driver =~ s/^\s+//;
        print OUTPUT "<ParaStyle:><pHyphenationLadderLimit:0><pHyphenationZone:22.700000><pTabRuler:28.350000\,Left\,.\,0\,\;251.050000\,Right\,.\,0\,\;><pMaxWordSpace:1.500000><pMinWordSpace:0.750000><pMaxLetterspace:0.250000><pMinLetterspace:-0.050000><pKeepFirstNLines:1><pKeepLastNLines:1><pRuleAboveColor:Black><pRuleAboveTint:100.000000><pRuleBelowColor:Black><pRuleBelowTint:100.000000><cSize:5.500000><cBaselineShift:12.000000><cLeading:5.500000><cFont:Switzerland> $trainer $last_driver
    <cSize:><cBaselineShift:><cLeading:><cFont:><pHyphenationLadderLimit:><pHyphenationZone:><pTabRuler:><pMaxWordSpace:><pMinWordSpace:><pMaxLetterspace:><pMinLetterspace:><pKeepFirstNLines:><pKeepLastNLines:><pRuleAboveColor:><pRuleAboveTint:><pRuleBelowColor:><pRuleBelowTint:>";
    }
The "OUTPUT" filehandle writes to the other text file I mentioned earlier. One of the problems I am facing is that the source tag file can often change (e.g. an extra field is added). I would like some advice on a better approach to the processing. Can anyone help me with a clear way to extract this data into appropriate variables or arrays using pattern matching? Will I always need some hard-coding for the types of tags used? I want to have these variables at my fingertips, so that I can prompt the user to choose an appropriate layout.
I think I've blabbed on enough by now.
Thanks
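
One possible way to approach the "source tags can change" problem described above is to keep the tag-to-field mapping in a single lookup table and build the pattern from it, so that a new field means a new table entry rather than a new if block. The following is only a minimal sketch under loud assumptions: it assumes every field is terminated by a one-character tag that never appears inside the field text, and the control bytes, field names, layout names, and the helpers parse_record and render are all hypothetical placeholders, not the real tags of the race file or anything InDesign defines. Only the <ParaStyle:> and <cFont:Switzerland> tags are taken from the print statement above.

```perl
use strict;
use warnings;

# ASSUMPTION: every field ends with a one-character tag that never occurs
# inside the field text. The control bytes and field names below are
# placeholders, not the real tags of the race file.
my %FIELD_FOR_TAG = (
    "\x01" => 'horse',
    "\x02" => 'position',
    "\x03" => 'detail',
);

# Build the field pattern from the table, so a new tag needs only a new entry.
my $tag_class = join '', map { quotemeta($_) } sort keys %FIELD_FOR_TAG;
my $field_re  = qr/([^$tag_class]*)([$tag_class])/;

# Split one record into a hash keyed by field name.
sub parse_record {
    my ($line) = @_;
    my %record;
    while ($line =~ /$field_re/g) {
        my ($text, $tag) = ($1, $2);
        $text =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
        $record{ $FIELD_FOR_TAG{$tag} } = $text;
    }
    return \%record;
}

# Hypothetical layouts: output templates with {field} placeholders.
my %LAYOUT = (
    plain  => "{position} {horse} ({detail})",
    tagged => "<ParaStyle:><cFont:Switzerland>{position}\t{horse}<cFont:>",
);

# Fill a layout template from a parsed record; missing fields become ''.
sub render {
    my ($name, $record) = @_;
    die "Unknown layout '$name'\n" unless exists $LAYOUT{$name};
    my $out = $LAYOUT{$name};
    $out =~ s/\{(\w+)\}/defined $record->{$1} ? $record->{$1} : ''/ge;
    return $out;
}

my $record = parse_record("FRODO HEST \x01 1\x02 Front\x03");
print render('plain', $record), "\n";
```

With this arrangement the parsed fields live in one hash per horse, and the layout the user picks is just a key into %LAYOUT. Some hard-coding of the tag table is still needed, but it is confined to one place.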

Replies are listed 'Best First'.
Re: Processing tagged text
by planetscape (Chancellor) on Jul 19, 2006 at 11:45 UTC
