Beefy Boxes and Bandwidth Generously Provided by pair Networks
Problems? Is your data what you think it is?
 
PerlMonks  

Parsing Pipe Delimited Multi-line File

by LinuxBoxRocks (Initiate)
on Feb 04, 2004 at 18:21 UTC ( [id://326548]=perlquestion: print w/replies, xml ) Need Help??

LinuxBoxRocks has asked for the wisdom of the Perl Monks concerning the following question:

I will be receiving a text file from an outside application that is pipe delimited. Basically, there will be 3 lines of text in the following format:
|John Smith|123 Any Street||Any City|KS|23568| |Mary Smith|123 Any Avenue||Any Town|KS|32546| etc......
Once the information has been parsed, I will need to insert it into a database. What would be the best way to grab this information? How do you parse a file that is more than one line? Should I place this information into an array?Thanks!!!!

Replies are listed 'Best First'.
Re: Parsing Pipe Delimited Multi-line File
by valentin (Abbot) on Feb 04, 2004 at 18:41 UTC
    pseudo code
    #!/usr/bin/perl use DBI; my $dbh = DBI->connect; my $sth = $dbh->prepare("INSERT INTO mytable ( name, street, city ) VA +LUES ( ?, ?, ? )"); $sth->execute; while(defined ( $_ = <DATA>) ) { chomp; # ??, name, street, ?, city, ida, id, ?? my @d = split /\|/, $_; $sth->execute($d[1], $d[2], $d[4]); } __DATA__ |John Smith|123 Any Street||Any City|KS|23568| |Mary Smith|123 Any Avenue||Any Town|KS|32546|
Re: Parsing Pipe Delimited Multi-line File
by revdiablo (Prior) on Feb 04, 2004 at 18:58 UTC

    There are many ways to accomplish this task. If it's a one-time deal, you can do something quick-n-dirty:

    #!/usr/bin/perl -nl chomp; @record = split /\|/, $_; print "INSERT INTO table (foo, bar, baz)" , " VALUES ('$record[0]', '$record[1]', '$record[2]')";

    Then run it with something like:

        ./yourscript < inputdata > output.sql

    And you'd end up with a .sql file you could run against the database.

    Now, if you need something more reliable and reusable, you'll probably want to wrap your own I/O code rather than letting perl -n take care of it for you (for more on this, you can see PerlMonks' Basic Input and Output tutorial). You'd probably want to insert the data directly into the database with DBI (make sure you use placeholders, as documented in the DBI docs). The basic idea about splitting the lines, as my quick-n-dirty snippet demonstrates, would be the same.

    Update: Oh yeah, also forgot to mention. For anything other than a one-off throwaway script, you should probably use strict and warnings, unless there's a good reason not to.

Re: Parsing Pipe Delimited Multi-line File
by b10m (Vicar) on Feb 04, 2004 at 18:41 UTC

    You can just loop through the lines, like the code below. See split on this too.

    use Data::Dumper; while(<DATA>) { @array = split('\|', $_); # printing to Dumper, just to show what @array looks like now print Dumper @array; # insert DBI stuff here ... } __DATA__ |John Smith|123 Any Street||Any City|KS|23568| |Mary Smith|123 Any Avenue||Any Town|KS|32546|
    --
    b10m

    All code is usually tested, but rarely trusted.
Re: Parsing Pipe Delimited Multi-line File
by blue_cowdawg (Monsignor) on Feb 04, 2004 at 18:47 UTC

        What would be the best way to grab this information? How do you parse a file that is more than one line? Should I place this information into an array?

    What have you tried and what has failed?

    First off, you have to open the file, read each line in, split it on the pipes and then depending on the database in question use DBI or something similar to access the database with the appropriate SQL statements to insert the data into the proper table.

    As I asked in my first paragraph, what have you tried and what has failed? The Perl Monks are a helpful lot but we kinda insist on others at least attemtpting helping themselves first.


    Peter L. Berghold -- Unix Professional
    Peter at Berghold dot Net
       Dog trainer, dog agility exhibitor, brewer of fine Belgian style ales. Happiness is a warm, tired, contented dog curled up at your side and a good Belgian ale in your chalice.
Re: Parsing Pipe Delimited Multi-line File
by ysth (Canon) on Feb 04, 2004 at 19:04 UTC

    Do you have the "insert it into a database" part designed yet? What you want to do with the data generally determines what structure to store it in. It's easy enough to put it into either an array or separate variables.

    If there is some form of encoding for allowing | to be in the data of one of the fields, or allowing fields to be optionally quoted in some way, you want to look at one of the CSV modules. Otherwise, use perl's split function. (Remember that split takes a regular expression as the first argument; since | is a special character in regular expressions, you'd need to use /\|/).

Re: Parsing Pipe Delimited Multi-line File
by Plankton (Vicar) on Feb 04, 2004 at 18:42 UTC
    Sorry if this is off topic, but the answer might depend on the database you are planning to use. For example if you where going to load this data into an Oracel database I would suggest using Oracle's sqlldr instead of a Perl script.

    Plankton: 1% Evil, 99% Hot Gas.
      Also, if you are using MySQL read up on the LOAD DATA INFILE command. This will also remove perl as a middle man.


      -silent11

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://326548]
Approved by blue_cowdawg
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others learning in the Monastery: (5)
As of 2024-04-19 13:55 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found