PerlMonks  

Re^4: convert tags to punctuation

by BillKSmith (Prior)
on Jan 16, 2021 at 16:47 UTC


in reply to Re^3: convert tags to punctuation
in thread convert tags to punctuation

You should ask the person who prepares your input file whether he can point you to either a specification of the file format or the documentation of the program that created it. If that fails, I would write a Perl program to list all the tags. The only way I know to get the values is to use an editor to examine the tags in context and make your best guess. (It will usually be obvious.)
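A minimal sketch of such a tag-listing program follows. The thread does not say what a tag looks like, so this assumes, purely for illustration, that a tag is any single byte in the range 0x80-0xFF. The names `count_tags` and `report` are hypothetical; change `$tag_re` to match the real tag syntax.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count every "tag" read from a filehandle.
# ASSUMPTION: a tag is one byte in the range 0x80-0xFF.
sub count_tags {
    my ($fh) = @_;
    my $tag_re = qr/([\x80-\xFF])/;
    my %count;
    while ( my $line = <$fh> ) {
        # Tally each match on the line.
        $count{$1}++ while $line =~ /$tag_re/g;
    }
    return \%count;
}

# Print each tag as a hex escape with its count, most frequent first.
sub report {
    my ($count) = @_;
    for my $tag ( sort { $count->{$b} <=> $count->{$a} || $a cmp $b } keys %$count ) {
        printf "\\x{%02X}  %d\n", ord $tag, $count->{$tag};
    }
}

# Run only when a file name is given on the command line.
if (@ARGV) {
    open my $fh, '<:raw', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    report( count_tags($fh) );
    close $fh;
}
```

Opening the file with the `:raw` layer keeps Perl from translating anything, so the counts reflect the bytes actually stored in the file.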

It is nearly impossible to guess what will or will not make a Perl program faster. The usual advice is to profile your program and work only on the parts that use the most time. Use the Benchmark module to measure any possible improvement. In your case, I/O is probably taking much longer than processing. Slurping the entire file into memory is probably not an option. Reading the file in large blocks may help, but it is not easy to get right. I recommend against any optimization unless it is absolutely necessary.
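As a rough illustration of measuring rather than guessing, the sketch below uses the core Benchmark module to compare three ways of doing the same substitution. The byte-to-punctuation mapping, the sample text, and the names `with_chain`, `with_hash`, `with_tr` and `run_benchmark` are all assumptions made for the example, not anything taken from the real input file.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# ASSUMED mapping, for illustration only.
my %punct = ( "\x93" => '"', "\x94" => '"', "\x95" => '*' );

# Candidate 1: one substitution per tag.
sub with_chain {
    my ($text) = @_;
    $text =~ s/\x93/"/g;
    $text =~ s/\x94/"/g;
    $text =~ s/\x95/*/g;
    return $text;
}

# Candidate 2: a single substitution with a hash lookup.
sub with_hash {
    my ($text) = @_;
    $text =~ s/([\x93-\x95])/$punct{$1}/g;
    return $text;
}

# Candidate 3: transliteration (only possible for one-byte-to-one-byte maps).
sub with_tr {
    my ($text) = @_;
    $text =~ tr/\x93\x94\x95/""*/;
    return $text;
}

# Run each candidate for at least one CPU second and print a comparison table.
sub run_benchmark {
    my $sample = "He said \x93hello\x94 \x95 item\n" x 1000;
    cmpthese( -1, {
        chain => sub { with_chain($sample) },
        hash  => sub { with_hash($sample) },
        tr    => sub { with_tr($sample) },
    } );
}

run_benchmark() if @ARGV and $ARGV[0] eq '--bench';
```

Run it with `--bench` to see the table. The numbers only mean something when the sample resembles the real data, and if I/O dominates the run time, none of the three will change much.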

Bill

Replies are listed 'Best First'.
Re^5: convert tags to punctuation
by Anonymous Monk on Jan 16, 2021 at 17:26 UTC

    I noticed something interesting about this document: if I view it with the 'more' filter, I see a bunch of black rectangles with the tags inside them. If I view it with gedi or ptked, I see \x{93}, \x{94}, \x{95}, etc. Does it matter what chars go in my s/ ... / line? What does Perl see?

      Could you post a representative sample of your input file, following the advice given here and here?
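While waiting for a sample, one way to check what Perl sees is to print the data with every non-printable byte written as an escape. The sketch below is only an illustration: the sample line is made up, `visible` is a hypothetical helper, and the final step assumes the file is Windows-1252, where 0x93 and 0x94 are curly double quotes and 0x95 is a bullet. That encoding is a guess until the real input is posted.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: show a byte string with every non-printable or
# non-ASCII byte written as a \x{..} escape.
sub visible {
    my ($bytes) = @_;
    ( my $out = $bytes ) =~ s/([^\x20-\x7E])/sprintf '\\x{%02X}', ord $1/ge;
    return $out;
}

my $line = "\x93quoted\x94 \x95 item";    # made-up sample, not real input

print visible($line), "\n";

# In a pattern, \x93 matches the byte 0x93 no matter how a pager or
# editor chooses to draw it.
( my $fixed = $line ) =~ s/[\x93\x94]/"/g;
print visible($fixed), "\n";

# ASSUMPTION: the bytes are Windows-1252. Decoding turns the first byte
# into the character U+201C (left double quotation mark).
my $text = decode( 'cp1252', $line );
printf "U+%04X\n", ord $text;
```

Writing the escapes (`\x93` and so on) in the `s/.../.../` line is usually safer than pasting the characters themselves, because the pasted form depends on the encoding the editor uses when it saves the script.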
