Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl-Sensitive Sunglasses
 
PerlMonks  

Strange MySQL Problem

by tyndyll (Novice)
on Aug 17, 2004 at 15:15 UTC ( [id://383703]=perlquestion: print w/replies, xml ) Need Help??

tyndyll has asked for the wisdom of the Perl Monks concerning the following question:

Good Afternoon all,

I'm having problems with a script that I am writing. The purpose of it is to take a dbf file and convert it into SQL for insertion into a MySQL database. The process is

Open File Read in columns and derive CREATE TABLE syntax While(Reading in dbf file) Optimise column length print OUT data into a text file Load data into database using LOAD INFILE syntax

The script by and large works, in that it creates the appropriate table and commences the LOAD INFILE, the problem is that it does not complete the job - there are always fewer records in the table than the original file. I have changed the code to print out the statement it sends to the database and entered it straight into the MySQL console, which uploads the correct amount of records..? Can anyone give me any guidance on this please??

Many thanks in advance

Tyndyll

(PS - still new to Perl so any style guidelines also appreciated)

use strict; use XBase; use DBI; use Getopt::Std; sub buildTable; sub parse_data; sub alter_table; sub query($); sub mod; getopt('ijd'); our($opt_i,$opt_j,$opt_d); if(!(defined($opt_i) && defined($opt_j) && defined($opt_d))){ exit; } my $host = 'localhost'; my $user = '###user###'; my $pass = '###password###'; my $dbh = DBI->connect("DBI:mysql:database=$opt_d;host=$host", $user, +$pass ); if(!defined($dbh)){exit(-1);} my $table = 'jn'.$opt_j; # Table Name my @cols = $dbh->tables; # Using cols as a temp holder foreach(@cols){ $_=~s/`//g; if($_ eq $table){ exit(-2); } } @cols = (); $opt_i=~s/\\{1,2}/\//g; my $dbf = new XBase $opt_i or die Xbase->errstr; my $temp = ""; if($opt_i=~/^(.+)\/(.+?)$/){ $temp = $1."\/".$table."_".$^T; } mod(); sub buildTable{ my @names = $dbf->field_names; my @size = $dbf->field_lengths; my $sql = 'CREATE TABLE '.$table.' (GENID mediumint UNSIGNED AUTO_IN +CREMENT NOT NULL,'; for(my $i = 0; $i <= $#names; $i++){ $sql .= $names[$i].' char('.$size[$i].') null,'; $cols[$i] = {'name'=> $names[$i], 'type'=>'n','value'=>0}; } return $sql.='PRIMARY KEY(GENID))'; } sub parse_data(){ open (TEMP, ">".$temp) or die("Cannot create temp file"); my $cursor = $dbf->prepare_select; my $i = 1; my $sql .= ''; my @queries = (); while(my @data = $cursor->fetch){ my $j = 0; for(@data){ s/([\'\"\;\\\/\*])/\\$1/g; if($cols[$j]{'type'} eq 'n'){ if(/^\d$/){ if($cols[$j]{'value'} < $_){ $cols[$j]{'value'} = $_; } }else{ $cols[$j]{'type'} = 'c'; if($cols[$j]{'value'} < length($_)){ $cols[$j]{'value'} = le +ngth($_); } } }else{ if($cols[$j]{'value'} < length($_)){ $cols[$j]{'value'} = leng +th($_); } } $j++; } print TEMP $i.',"'.join('","', @data)."\"\n"; $i++; } } sub alter_table(){ my @new_cols = (); my $type = ""; for(my $i=0;$i<$#cols;$i++){ if($cols[$i]{'type'} eq 'n'){ $type = pick_type($cols[$i]{'value'}, 'mysql'); #if undefined reset and return false; }else{ $type = 'varchar('.$cols[$i]{'value'}.')'; } if(!defined $type){ return undef; }else{ $new_cols[$i] = ' MODIFY '.$cols[$i]{'name'}.' '.$type.' NULL'; } if($cols[$i]{'value'} == 0){ $new_cols[$i] = ' DROP COLUMN '.$cols[$i]{'name'}; } } return 'ALTER TABLE '.$table.' '.join(',',@new_cols); } sub pick_type($$){ my @type = (); if($_[1] eq 'mysql'){ @type = ( {'name' => 'tinyint' , 's' => 128, 'u' => 255, }, {'name' => 'smallint' , 's' => 32767, 'u' => 65535, }, {'name' => 'mediumint' , 's' => 8388607, 'u' => 16777215 +, }, {'name' => 'int' , 's' => 2147483647, 'u' => 4294967295, + } ) }else{ return undef; } #only checking if smaller than unsigned - later check for negative a +nd floats for(my $i=0; $i <= $#type; $i++){ if($_[0] < $type[$i]{'u'}){ return $type[$i]{'name'}; } } return undef; } sub query($){ my $sth = $dbh->prepare($_[0]); $sth->execute(); } sub mod{ query(buildTable()); parse_data(); query('LOAD DATA INFILE "'.$temp.'" INTO TABLE '.$table.' FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY "\"" LINES TERMINATED BY "\r\n"'); query(alter_table()); }

Replies are listed 'Best First'.
Re: Strange MySQL Problem
by davido (Cardinal) on Aug 17, 2004 at 15:40 UTC
    It's probably not a good idea to post your database username and password to a public forum.

    Dave

      doh!!! lets pretend that never happened....
Re: Strange MySQL Problem
by jZed (Prior) on Aug 17, 2004 at 17:10 UTC
    I'm not quite sure why you're using XBase instead of DBD::XBase. Also, you might consider using DBD::CSV for writing the file to avoid the hand parsing and ensure proper embedding and quoting. By using DBI for all three purposes (reading the .dbf, writing the CSV, loading the MySQL data), your code should be easier to maintain.
Re: Strange MySQL Problem
by dragonchild (Archbishop) on Aug 18, 2004 at 02:54 UTC
    When I have done this type of task, I've broken it up into several discrete portions and stitched them together using a shell script. Right now, I have a daily load from a sub-optimally normalized Oracle database into a MySQL database (for generating reports). I use the following process:
    1. Oracle's SQL*Plus connections kicked off from command-line to generate tab-delimited data files
    2. A cleanup step to make sure all backslashes are backslashes, there's a \N for empty datafields, and the like. This a shell script that calls Perl in commandline.
    3. MySQL's client is kicked off in batch mode. I have to do some massaging of the data, as well, so I need to use MySQL.

    While all of those things could've been done in Perl, it's a lot quicker to do it in the native client. And, the reason I wrote the second step as a shell script around commandlines is that it's easier to do

    #!/bin/sh foreach f in *.xsv loop perl -pi -e 's/asdf/jkl;/g' $f endloop
    than it is for me to open the file, loop through all the lines, and write the file back out. Why make it harder for yourself?

    ------
    We are the carpenters and bricklayers of the Information Age.

    Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose

    I shouldn't have to say this, but any code, unless otherwise stated, is untested

      What is the best way to implement the shell script? I am using my desktop machine for this at the moment (proof of concept) so will hopefully be moving to a Linux server afterwards... Is there a method of doing this so I don't have to rewrite the script?

      Also, does anyone have any idea why the Insert is failing? I've tried using sleep to no avail. I have noticed however that the average row length is 582 when inserted using MySQL.. don't know if that helps at all..

        Use Cygwin.

        ------
        We are the carpenters and bricklayers of the Information Age.

        Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose

        I shouldn't have to say this, but any code, unless otherwise stated, is untested

Re: Strange MySQL Problem
by Plankton (Vicar) on Aug 17, 2004 at 15:29 UTC
    I don't see a $dbh->disconnect(); anywhere?

    Plankton: 1% Evil, 99% Hot Gas.
Re: Strange MySQL Problem
by blm (Hermit) on Aug 18, 2004 at 12:12 UTC

    This may not be a perl problem. I am currently doing a similar thing trying to pull data out of foxpro dbf files and put them in other SQL databases. Some lines are not coming across because there are subtle differences in the data types between the data source (foxpro dbf) and the destination.

    To figure out what was going haywire, in my case, I tried importing the data into a MS Access database using a system DSN. Some rows generated an error because the value in an integer feild in the row on the source was too large for integer feild on the destination table.

    Maybe checking the differences between the values allowed for each datatype in dbf's Vs Mysql might shed light?

      differences between the values allowed for each datatype in dbf's Vs Mysql might shed light?

      When the script starts I go through each of the columns and create a base table of a char type, which in MySQL goes holds up to 255 characters. As I go through the data and dump it into a text file I check whether the column can be converted into a number type, and if not check to see if it has the longest string length (if it does update the length). At the end the tables are updated to optimise them for space etc.

      I'm preety sure the problem isn't a type one - using the MySQL client it inserts fine. I thought it was maybe a timing issue - Looking at the last line of the shortened data shows that the last few columns are null.. is the script moving faster than it is writing? If so why did

      parse_data(); sleep 10; query('LOAD DATA INFILE "'.$temp.'" INTO TABLE '.$table.' FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY "\"" LINES TERMINATED BY "\r\n"'); unlink($temp);
      not work?

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://383703]
Approved by been42
Front-paged by castaway
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others about the Monastery: (5)
As of 2024-03-28 14:10 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found