PerlMonks  

Spaces Between Chars While Reading File

by Godsrock37 (Sexton)
on Aug 01, 2008 at 13:15 UTC ( [id://701689]=perlquestion )

Godsrock37 has asked for the wisdom of the Perl Monks concerning the following question:

Hello fellow perlmonks... I've got a simple one (I hope) today that I could use some help with, if you're willing.

I'm reading the contents of a file... which I've been doing the same way for months now...

say("Dequeuing ... ");
my @lines = <MYQUEUE>;
if (@lines){
    foreach my $line (@lines){
        #$line =~ s/"(.+)"/$1/gix;
        #say("$line \n");
        schedule($line);
    }
    $is_queued = 1;
    close (MYQUEUE);
}

I changed the contents from URLs, one per line, to part numbers, one per line. I open the file in Excel 2007 and save it as a .txt, and it looks fine to me in vi... but when I read it in, a space appears between every pair of adjacent characters, so "1071901" becomes "1 0 7 1 9 0 1".

Any thoughts?

Replies are listed 'Best First'.
Re: Spaces Between Chars While Reading File
by massa (Hermit) on Aug 01, 2008 at 14:31 UTC
    I would say you have an encoding problem (for example, your file is in UTF-16 encoding and your program is garbling it). In vi, what encoding does your file say it is in? (":set fenc?" will tell you.) If it really is UTF-16 (or UCS-2, etc.), the solution is to
    binmode MYQUEUE, ':encoding(utf16)';
    before the read...
    []s, HTH, Massa (κς,πμ,πλ)
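
The effect described in this reply can be reproduced with a small sketch. It assumes the exported file is UTF-16 with a byte-order mark; the temporary file and its contents are made up for illustration, and a lexical filehandle stands in for MYQUEUE. Read as raw bytes, every ASCII digit is followed by a NUL byte, which is most likely what shows up as the "spaces". Read through an encoding layer, the line comes back clean.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Encode qw(encode);

# Hypothetical test file: one part number, UTF-16LE with a BOM.
# (Assumption: this resembles what the spreadsheet export produced.)
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
binmode $tmp_fh;    # write raw bytes, no newline translation
print {$tmp_fh} encode('UTF-16LE', "\x{FEFF}1071901\n");
close $tmp_fh or die "close failed: $!";

# Read 1: no encoding layer, so we see the underlying bytes.
open my $raw, '<', $path or die "open failed: $!";
binmode $raw;
my $raw_bytes = do { local $/; <$raw> };    # slurp the whole file
close $raw;
my $nul_count = () = $raw_bytes =~ /\0/g;   # count the NUL bytes
print "raw length: ", length($raw_bytes), " bytes, $nul_count NULs\n";

# Read 2: decode UTF-16 on the way in, as suggested above.
open my $fh, '<', $path or die "open failed: $!";
binmode $fh, ':encoding(UTF-16)';
my $line = <$fh>;
close $fh;
chomp $line;
print "decoded: '$line'\n";
```

The BOM lets the UTF-16 decoder pick the byte order. If a file has no BOM, an explicit name such as UTF-16LE would be needed instead; that case is not covered by this sketch.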

      The encoding was in fact ucs-2... thank you very much, sir

      I'll try the code included... thanks again

Re: Spaces Between Chars While Reading File
by igelkott (Priest) on Aug 01, 2008 at 13:33 UTC

    Using ".+" in a regular expression is generally going to lead to trouble.

    Given the Excel source and the desired removal of quotes, it seems rather likely that you're reading from a CSV file. As such, you might consider using Text::CSV to save you from the complications of the format.

    Alternatively, you might even consider reading from the Excel file directly with Spreadsheet::ParseExcel, though I must admit that I've never used it on 2007.

      Using ".+" in a regular expression is generally going to lead to trouble.

      Why is that (generally)? — I mean, it all depends on what you want to achieve, doesn't it?

        "Greedily garble everything" is rarely what you want to achieve. In the OP case, it will change
        "ab", "abc", "cde"
        into
        ab", "abc", "cde
        which is (possibly) not the intended result. The ideal would be
        s/"([^"]+)"/$1/g
        that would yield
        ab, abc, cde
        []s, HTH, Massa (κς,πμ,πλ)
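
The difference between the two substitutions above can be checked with a short sketch, using the sample string from this reply. The non-greedy variant at the end is an addition for comparison and was not part of the original suggestion.

```perl
use strict;
use warnings;

my $input = '"ab", "abc", "cde"';

# Greedy: .+ runs from the first quote to the last one on the line.
(my $greedy = $input) =~ s/"(.+)"/$1/g;

# Negated class: each match stops at the next quote.
(my $negated = $input) =~ s/"([^"]+)"/$1/g;

# Non-greedy quantifier (added for comparison only).
(my $lazy = $input) =~ s/"(.+?)"/$1/g;

print "greedy:  $greedy\n";     # ab", "abc", "cde
print "negated: $negated\n";    # ab, abc, cde
print "lazy:    $lazy\n";       # ab, abc, cde
```

For real CSV data with embedded or escaped quotes, neither regex is a full parser, which is why Text::CSV was suggested earlier in the thread.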
