parsing text file

by Anonymous Monk
on Apr 08, 2002 at 16:57 UTC (#157481=perlquestion)

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have a text file that has multiple tabs between entries, and I need to get rid of all but one so that there is a common delimiter, such as "\t". That way, when I parse this file into a MySQL database, it will not add any blank spots where there should be a data entry.

Any help is appreciated; this is for a case study program, not homework. Here is a sample line entry from the text file that is being parsed:

These are the heading variables that I want the text entries put into. I have a program that actually inserts the text file into the database, but there are problems when there are extra tabs or whitespace inside an entry.
'$id_num','$firstname','$lastname','$grade','$grad','$addr','$city','$state','$zip','$phone','$dob','$sex','$id_num2'

Here is a dummy entry:
1415 John Doe 10 2004 1380 SE 8TH AVE ANYTOWN OR 90210 5556611 11/16/85 M 1415

Replies are listed 'Best First'.
Re: parsing text file
by ViceRaid (Chaplain) on Apr 08, 2002 at 17:09 UTC


    Assuming that

    1. $line is a line you've read from your text file, and that
    2. a newline is your record delimiter, and that
    3. your fields' content never contains tabs:

    $line =~ s/\t+/\t/g;

    The + quantifier after the \t tells the regular expression to match one or more of the preceding thingie (in this case, a tab). The g modifier at the end tells the regular expression to perform the substitution on all occurrences of the match.
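To see the substitution in context, here is a minimal sketch that collapses the tab runs in one line and then splits it into fields. The sub name collapse_tabs and the exact placement of tabs in the sample line are assumptions for illustration; the original post's tabs did not survive, so the line below is only a guess at where they fell.

```perl
use strict;
use warnings;

# Hypothetical helper: collapse every run of one or more tabs into one tab.
sub collapse_tabs {
    my ($line) = @_;
    $line =~ s/\t+/\t/g;
    return $line;
}

# Assumed sample line, modelled on the dummy entry in the question.
# The doubled and tripled tabs are invented to show the effect.
my $line = "1415\t\tJohn\tDoe\t\t\t10\t2004\t1380 SE 8TH AVE\t\t"
         . "ANYTOWN\tOR\t90210\t5556611\t11/16/85\tM\t1415\n";

chomp $line;                      # drop the record delimiter (newline)
my $clean  = collapse_tabs($line);
my @fields = split /\t/, $clean;  # one field per heading variable

print scalar(@fields), "\n";      # prints 13
print "$fields[5]\n";             # prints 1380 SE 8TH AVE
```

Spaces inside a field (as in the address) are left alone, because only tabs are touched. One thing to keep in mind: if a field can be genuinely empty, collapsing the tabs around it will shift the later fields one column to the left.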


Re: parsing text file
by schumi (Hermit) on Apr 08, 2002 at 18:03 UTC
    On the basis of the same assumptions ViceRaid makes above, you could also do the following:

    $line =~ s/\t{2,}/\t/g;

    This matches only on occurrences of two or more tabs, so any single tabs in your data are left untouched. The resulting line is the same as with \t+, since replacing a single tab with a single tab changes nothing.
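A quick way to compare the two substitutions is to run both over the same strings. This is only a sketch; the sample strings and the variable names are made up for the comparison.

```perl
use strict;
use warnings;

# Invented test strings: a single tab, a double tab, and a mix.
my @samples = ("a\tb", "a\t\tb", "a\t\t\tb\tc");

my $all_same = 1;
for my $s (@samples) {
    # Copy the string, then apply each substitution to its own copy.
    (my $plus = $s) =~ s/\t+/\t/g;
    (my $two  = $s) =~ s/\t{2,}/\t/g;
    $all_same = 0 if $plus ne $two;
}

print $all_same ? "same result\n" : "different result\n";  # prints "same result"
```

Both forms leave exactly one tab between fields. The {2,} version simply skips the substitution where there is already only one tab.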


    There are nights when the wolves are silent and only the moon howls. - George Carlin
