http://qs321.pair.com?node_id=157481

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have a text file with multiple tabs between entries. I need to collapse each run of tabs into a single one so that the file has a common delimiter, such as "\t". That way, when I parse the file into a MySQL database, it will not add blank fields where there should be data.

Any help is appreciated. This is for a case study program, not homework.

These are the variables I want each field of an entry to go into. I have a program that inserts the text file into the database, but it runs into problems when there are extra tabs or whitespace inside an entry.
'$id_num','$firstname','$lastname','$grade','$grad','$addr','$city','$state','$zip','$phone','$dob','$sex','$id_num2'


Here is a dummy entry:
1415 John Doe 10 2004 1380 SE 8TH AVE ANYTOWN OR 90210 5556611 11/16/85 M 1415

Replies are listed 'Best First'.
Re: parsing text file
by ViceRaid (Chaplain) on Apr 08, 2002 at 17:09 UTC

    hi

    Assuming that

    1. $line is a line you've read from your text file, and that
    2. a newline is your record delimiter, and that
    3. your fields' content never contains tabs:

    $line =~ s/\t+/\t/g;

    The + quantifier after the \t tells the regular expression to match one or more of the preceding thingie (in this case, a tab). The g modifier at the end tells it to perform the substitution on every match in the line, not just the first.
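
    A minimal sketch of that substitution in action, under the same assumptions. The sample record and the variable names $clean and @fields are made up for illustration; they are not from the original file:

```perl
use strict;
use warnings;

# Hypothetical sample record: runs of one to three tabs between fields.
my $line = "1415\t\tJohn\tDoe\t\t\t10\t2004";

# Copy the line, then collapse every run of one or more tabs into one tab.
(my $clean = $line) =~ s/\t+/\t/g;

# Split on the now-uniform delimiter.
my @fields = split /\t/, $clean;

print join('|', @fields), "\n";   # prints 1415|John|Doe|10|2004
```

    Note that collapsing tabs this way only makes sense if no field is ever legitimately empty; otherwise the fields after an empty one would shift left.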

    HTH

    //=\\
Re: parsing text file
by schumi (Hermit) on Apr 08, 2002 at 18:03 UTC
    On the basis of the same assumptions ViceRaid makes above, you could also do the following:

    $line =~ s/\t{2,}/\t/g;

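    This matches only runs of two or more tabs, so single tabs are left untouched instead of being replaced with themselves. The resulting line is the same as with \t+; the pattern just skips the substitutions that would change nothing.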
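
    A quick way to compare the two patterns, as a rough sketch. The helper names squeeze_plus and squeeze_multi and the sample strings are hypothetical, chosen only for this check:

```perl
use strict;
use warnings;

# Hypothetical helpers wrapping the two substitutions from this thread.
sub squeeze_plus  { my ($s) = @_; $s =~ s/\t+/\t/g;    return $s }
sub squeeze_multi { my ($s) = @_; $s =~ s/\t{2,}/\t/g; return $s }

# Made-up samples: a single tab, a double tab, mixed runs, and no tabs.
my @samples = ("a\tb", "a\t\tb", "a\t\t\tb\tc", "no tabs here");

for my $s (@samples) {
    my ($p, $m) = (squeeze_plus($s), squeeze_multi($s));
    print $p eq $m ? "same\n" : "different\n";
}
```

    For these samples both versions give identical output, so choosing between them is mostly a matter of taste.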

    --cs

    There are nights when the wolves are silent and only the moon howls. - George Carlin