Missing \t in print output

by Sophienz (Acolyte)
on Jul 14, 2015 at 11:25 UTC

Sophienz has asked for the wisdom of the Perl Monks concerning the following question:

Dear PerlMonks,

I'm having an issue with printing long strings. I build an array of 251 elements and join it with \t to print to a file. The problem is that the output is randomly missing \t's, and sometimes both the \t and the last letter of the previous value are missing.

I've tried using autoflush and printflush but the problems remain. Any suggestion would be much appreciated. This is the problem line at the moment:

$normal_fh->printflush(join("\t",@arrayToPrint)."\n");

UPDATED: Here is the code I am using

open my( $normal_fh ), ">>", $normalOut or die("Couldn't open NORMAL $normalOut $!\n");
my @arrayToPrint;
if ($nbGABlocks == 0) {
    @arrayToPrint = ();
    #print ("NoAlignmentBlocks!\n");
    my @subArray = ($chrInterval,$startInterval,$endInterval,"NA","NA","NA");
    my @naArray = ("NA","NA","NA","NA","NA","NA","NA") x scalar(@species);
    push(@arrayToPrint,@subArray);
    push(@arrayToPrint,@naArray);
}
NORMALBLOCK: foreach my $block (sort{$blocksHash{$a}{"orderBlock"} <=> $blocksHash{$b}{"orderBlock"}} keys %blocksHash) {
    my $spNb = 0;
    my $order = $blocksHash{$block}{"orderBlock"};
    @arrayToPrint = ();
    my @subArray = ($chrInterval,$startInterval,$endInterval,$block,$order,$warning);
    push(@arrayToPrint,@subArray);
    @subArray = ();
    SP: foreach my $sp (@species) {
        my $newSp = $speciesCorrespond{$sp};
        $spNb++;
        if (exists($blocksHash{$block}{$newSp})) {
            my $chr = $blocksHash{$block}{$newSp}{"chr"};
            my $start = $blocksHash{$block}{$newSp}{"start"};
            my $end = $blocksHash{$block}{$newSp}{"end"};
            my $strand = $blocksHash{$block}{$newSp}{"strand"};
            my $size = $blocksHash{$block}{$newSp}{"sizeBlock"};
            my $comment = $spBlocksHash{$newSp}{$block}{"comment"};
            @subArray = ($newSp,$chr,$start,$end,$strand,$size,$comment);
            push(@arrayToPrint,@subArray);
            @subArray = ();
        }
        else {
            @subArray = ($newSp,"NA","NA","NA","NA","NA","NA");
            push(@arrayToPrint,@subArray);
            @subArray = ();
        }
    }
    my $printing = join("\t",@arrayToPrint);
    $" = "\t";
    $normal_fh->printflush("$printing\n");
}

The output is very long, but here is a sample, where you can see that for some lines, instead of getting: spermophilus_tridecemlineatus\tNA, I get spermophilus_tridecemlineatuNA:

rattus_norvegicus NA NA NA NA NA NA dipodomys_ordii NA NA NA NA NA NA spermophilus_tridecemlineatus NA NA NA NA NA NA ochotona_princeps NA NA NA NA NA NA oryctolagus_cuniculus NA NA NA NA NA NA
rattus_norvegicus 5 13171176 13038994 -1 132183 NotContiguous_326154Gap dipodomys_ordii NA NA NA NA NA NA spermophilus_tridecemlineatus NA NA NA NA NA NA ochotona_princeps NA NA NA NA NA NA oryctolagus_cuniculus NA NA NA NA NA NA
rattus_norvegicus NA NA NA NA NA NA dipodomys_ordii NA NA NA NA NA NA spermophilus_tridecemlineatuNA NA NA NA NA NA ochotona_princeps NA NA NA NA NA NA oryctolagus_cuniculus NA NA NA NA NA NA
rattus_norvegicus NA NA NA NA NA NA dipodomys_ordii NA NA NA NA NA NA spermophilus_tridecemlineatuNA NA NA NA NA NA ochotona_princeps NA NA NA NA NA NA oryctolagus_cuniculus NA NA NA NA NA NA
rattus_norvegicus 5 13004812 12917777 -1 87036 NotContiguous_253399Gap dipodomys_ordii NA NA NA NA NA NA spermophilus_tridecemlineatus NA NA NA NA NA NA ochotona_princeps NA NA NA NA NA NA oryctolagus_cuniculuNA NA NA NA NA NA
rattus_norvegicus 5 12917776 12899724 -1 18053 Contiguous dipodomys_ordii NA NA NA NA NA NA spermophilus_tridecemlineatus NA NA NA NA NA NA ochotona_princeps NA NA NA NA NA NA oryctolagus_cuniculus NA NA NA NA NA NA

Thanks a lot for your help,

Sophie.

Replies are listed 'Best First'.
Re: Missing \t in print output
by pme (Monsignor) on Jul 14, 2015 at 11:52 UTC
    Hi Sophienz

    First of all I would check the elements of the array. You can use Data::Dumper to print the array like this:

    use Data::Dumper; ... print Dumper(\@arrayToPrint) . "\n";
      Thanks for your suggestion, the array is completely fine, so it seems like it is the printing or joining step.
Re: Missing \t in print output
by Discipulus (Canon) on Jul 14, 2015 at 12:03 UTC
    Welcome to the monastery Sophienz

    At first glance this does not seem to be related to flushing or autoflushing the filehandle. About this method (which I have never used) I read: "Turns on autoflush, print ARGS and then restores the autoflush status of the IO::Handle object." That also does not seem very efficient while iterating over an array.

    Anyway, as the famous motto says, 'Bad data ruins your day': check your array using something like the dump or dd function of Data::Dump; it is better than the core Data::Dumper.

    HtH
    L*

    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
Re: Missing \t in print output
by 1nickt (Canon) on Jul 14, 2015 at 11:52 UTC

    Hi, have you validated the data before you print? Eg by dumping it with Data::Dumper ...

    Remember: Ne dederis in spiritu molere illegitimi!
Re: Missing \t in print output
by ww (Archbishop) on Jul 14, 2015 at 18:24 UTC

    As I asked earlier,

    HAVE YOU RE-READ perldoc -f join?
            and
    WHERE DOES printflush COME FROM?

    Answers:

    1. Here's what the join doc says (NB: "list"!):
      "join EXPR,LIST
      
      "Joins the separate strings of LIST into a single string with fields separated by the value of EXPR, and returns that new string. Example:
      
           $rec = join(':', $login,$passwd,$uid,$gid,$gcos,$home,$shell);
      
      "

      <UPDATE: In some cases, the use of double quotes instead of single quotes around the EXPR produces unexpected results. I'll try to create some relevant examples. </UPDATE>
       
    2. I still can't tell.
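
    To illustrate the quoting point in the update above, here is a minimal sketch of how the separator passed to join differs between double and single quotes. The @fields data is made up for illustration and is not taken from the original script.

```perl
use strict;
use warnings;

# Hypothetical sample data, not from the original script.
my @fields = ('rattus_norvegicus', 'NA', 'NA');

# Double quotes interpolate escapes: the separator is one TAB character.
my $with_tab = join("\t", @fields);

# Single quotes do not: the separator is a backslash followed by the letter t.
my $with_literal = join('\t', @fields);

# tr/\t// with an empty replacement only counts the tabs; it changes nothing.
printf "double-quoted: %d chars, %d tabs\n",
    length($with_tab), ($with_tab =~ tr/\t//);
printf "single-quoted: %d chars, %d tabs\n",
    length($with_literal), ($with_literal =~ tr/\t//);
```

    With single quotes the output contains the two characters \ and t between fields, and no tab characters at all.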

    If my previous attempt (in the CB) to help was unclear or misleading, my apologies for the sharp tone here.

      Hi,

      Sorry if I misunderstood what you were asking earlier.

      1. I have read the documentation for join, and as far as I can tell, I'm doing it properly. I have tried changing the double quotes to single quotes, but that printed a literal \t instead of the tabs themselves.

      2. The printflush method comes from http://perldoc.perl.org/IO/Handle.html, which states that "$io->printflush ( ARGS ) Turns on autoflush, print ARGS and then restores the autoflush status of the IO::Handle object. Returns the return value from print."

      However, this doesn't make any difference to my output from when I just use print:

      my $printing = join("\t",@arrayToPrint); $normal_fh->print("$printing\n");

      Thanks for your help, I will update with any progress.
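
      The comparison between printflush and plain print described above can be checked with a small sketch like the following. It assumes a temporary file created with File::Temp and a made-up 251-element row; neither comes from the original script.

```perl
use strict;
use warnings;
use IO::Handle;                 # provides the printflush method on filehandles
use File::Temp qw(tempfile);

# Hypothetical stand-in for the real row of 251 values.
my @row  = map { "field$_" } 1 .. 251;
my $line = join("\t", @row) . "\n";

my ($fh, $path) = tempfile(UNLINK => 1);
$fh->printflush($line);         # autoflush on, print, restore the old setting
$fh->print($line);              # ordinary buffered print of the same string
close $fh or die "close failed: $!";

# Read both lines back and compare them byte for byte.
open my $in, '<', $path or die "Cannot read $path: $!";
my @written = <$in>;
close $in;

print $written[0] eq $written[1] ? "identical\n" : "different\n";
```

      Flushing only affects when the bytes reach the file, not which bytes are written, so both lines should come back identical.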

        Thank you for the clarification.

        But you never showed us parts of your code -- the hashbang, if any, and the use IO::Handle;. That would have forestalled my concern about the appearance of printflush without a predicate. Discipulus sussed that out, but obviously, I didn't... and, in any case, we often see problems such as using a function from a module without useing the module. Those cases make it very hard to help if the code presented isn't an exact copy of the code which generated anomalies or errors.

        BTW, the preceding para may have some value for you but I hope it also provides some benefit for future newcomers who stumble upon it.

        And, as has already been said, welcome to PM.

Re: Missing \t in print output
by akuk (Beadle) on Jul 14, 2015 at 11:50 UTC

    Hi

    You might need to try :

    $" = "\t"; $normal_fh->printflush("@arrayToPrint\n");
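
    As a rough sketch of what this suggestion does: $" is the list separator Perl uses when an array is interpolated into a double-quoted string, so with $" set to a tab, "@arrayToPrint" should build the same string as join("\t", ...). The sample values below are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample row: one species name followed by six NA values.
my @arrayToPrint = ('dipodomys_ordii', ('NA') x 6);

my $joined = join("\t", @arrayToPrint);

my $interpolated;
{
    # local restricts the change to this block, so other code still sees
    # the default separator (a single space).
    local $" = "\t";
    $interpolated = "@arrayToPrint";
}

print $joined eq $interpolated ? "same string\n" : "different strings\n";
```

    Since both forms produce the same string, switching from one to the other would not be expected to change the output.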
Re: Missing \t in print output
by Sophienz (Acolyte) on Jul 15, 2015 at 13:10 UTC

    SOLVED

    Thank you all for the suggestions and advice. I now know how to post properly for potential next posts.

    The issue seems to have been the way I was viewing the output, and not a fault in the Perl script itself. I was viewing the output in the terminal (either by printing directly to STDOUT or by calling head on the output file) and somehow that created issues (missing tabs) that were not in the actual output file. I ended up downloading the output file locally and this showed that the file had all the required tabs.

    Many thanks, and sorry for the multiple confusions. I really expected the issue to be a Perl one, not a matter of the way I was viewing the output.
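
    One way to check a file without depending on how a terminal renders it is to count the tabs on each line and print an escaped version of the text. The following is a minimal sketch; the visible helper is a made-up name, and the script expects the output file as its first argument.

```perl
use strict;
use warnings;

# Make whitespace and control characters visible, so the result does not
# depend on how a terminal or editor renders them.
sub visible {
    my ($text) = @_;
    $text =~ s/\\/\\\\/g;                                    # escape backslashes first
    $text =~ s/\t/\\t/g;                                     # TAB     -> \t
    $text =~ s/\n/\\n/g;                                     # newline -> \n
    $text =~ s/([^\x20-\x7e])/sprintf('\\x%02x', ord $1)/ge; # anything else unprintable
    return $text;
}

my $path = shift @ARGV;
if (defined $path) {
    open my $fh, '<', $path or die "Cannot read $path: $!";
    while (my $line = <$fh>) {
        my $tabs = ($line =~ tr/\t//);                       # count only
        printf "line %d: %d tabs\n", $., $tabs;
    }
    close $fh;
}
else {
    # No file given: show the escaping on a short made-up sample.
    print visible("spermophilus_tridecemlineatus\tNA\n"), "\n";
}
```

    For rows of 251 fields, every line should report 250 tabs; a line reporting fewer would point at the data rather than at the viewer.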

      Glad to hear it, although I do still think it's strange that characters just went missing in the terminal, so I still have a small suspicion that you may have control characters in your strings (they usually don't affect editors as much as they can affect the terminal). Using $Data::Dumper::Useqq=1; or Data::Dump to look at the strings may still be worth it.

        Thanks, I agree that it is strange as the distribution of missing \t seems very random. But I've printed the array using $Data::Dumper::Useqq=1; and there are no odd characters there.

        On the same run, looking at the output through the terminal (head outputFile) resulted in missing tabs, but viewing the file locally in a different text editor didn't.

Re: Missing \t in print output
by Anonymous Monk on Jul 15, 2015 at 11:35 UTC

    First, did you use $Data::Dumper::Useqq=1; when using Data::Dumper? (or just use Data::Dump) That will make it easier to spot any funny control characters.
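
    A minimal sketch of that check, using a made-up array in which one element hides a trailing carriage return:

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Useqq = 1;   # show control characters as escapes such as \r or \t
$Data::Dumper::Terse = 1;   # omit the "$VAR1 =" prefix

# Hypothetical data: the second element ends in a carriage return, which
# many terminals would silently act on instead of displaying.
my @arrayToPrint = ('rattus_norvegicus', "NA\r", 'NA');

my $dump = Dumper(\@arrayToPrint);
print $dump;
```

    With Useqq set, the stray character appears in the dump as a visible \r inside the quoted string instead of moving the cursor.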

    The problem with helping you out is that we need to be able to reproduce the issue. You'll need to boil down your code and input to something that still runs and reproduces the same issue and post it here; see also http://sscce.org/. If the sample input data is too large, you can try replacing it with some code that generates enough fake input data to reproduce the same problem.

Re: Missing \t in print output
by Sophienz (Acolyte) on Jul 14, 2015 at 15:11 UTC
    Thanks all for your suggestions, I've checked the array and it looks fine, unfortunately. I've also tried the following suggestion, but that didn't help either.
    $" = "\t"; $normal_fh->printflush("@arrayToPrint\n");

      If one of your (presumed) tabs in the original data is actually a space, your regex will fail. Likewise, if the offending element is two tabs... KA-BOOM! So, as advised in the consideration, you should add code tags around your sample data (for our ease of helping) and use any one of a host of tools to doublecheck the separators.

      I've checked the array and it looks fine

      Can you show us the output of the print Dumper?

        I cannot reproduce the issue at a small scale; it only happens when the array has a large number of elements.

Re: Missing \t in print output
by Anonymous Monk on Jul 15, 2015 at 11:01 UTC

    Hi Sophie.

    I don't even understand how your script is able to produce that output. For example:
    rattus_norvegicus NA NA NA NA NA NA (etc)...
    rattus_norvegicus 5 13171176 13038994 -1 132183 NotContiguous_326154Gap (etc)...
    rattus_norvegicus 5 13004812 12917777 -1 87036 NotContiguous_253399Gap (etc)...
    As far as I can tell, 'rattus norvegicus' at the beginning of a line should come from
    @arrayToPrint = ();
    my @subArray = ($chrInterval,$startInterval,$endInterval,$block,$order,$warning);
    push(@arrayToPrint,@subArray);
    But $chrInterval, $startInterval, $endInterval and $warning don't even change during the 'NORMALBLOCK' loop. Why does the output change then? Am I missing something?

    And $chrInterval seems an unlikely name for a variable that contains 'rattus norvegicus'?

      Hi,

      Thanks for your question. You are correct, I've cropped the output to only show where the problem appears.

      So the ($chrInterval,$startInterval,$endInterval,$block,$order,$warning) elements don't appear, and they do not change within the NORMALBLOCK loop.

      However, for each iteration of the SP loop, the elements ($newSp,$chr,$start,$end,$strand,$size,$comment) do change. The problem only happens when I print those elements separated by \t.

      The whole line is 251 elements; it would be unreadable on here and apparently won't fit either. Unfortunately, the problem only arises when I use a large number of species, i.e. a large array.

      Also note that this happened without any join step as well, when I was just printing the elements as they were defined.

      Please let me know if I can make things clearer.

        Hi Sophienz,

        Please let me know if I can make things clearer.

        Post a complete, working test script that shows your problem. Fine if it needs a 251-element array, just make one:
        push @long_list, $_ for (1001..1251);

        Cut out all the code that doesn't affect your problem. Start a new test script with just the loop that is giving you trouble.

        Isolate the problem. Make it happen when you don't have your data in the script. That will prove whether or not the data are to blame.

        The monks don't need to see all your code, but neither do they need to see an arbitrary subsection of it. They do not need your variable names (unless that is the problem) and they do not need your specific data (unless that is the problem). When the problem is programming, your logic, or, Mysteriously Unidentified, the monks need a small, self-contained, working test script that demonstrates the problem.

        The benefit for you is that while you are making the test script, you will usually discover the problem and see how to fix it. And if not, you'll present the monks with something with which they can help you.

        The way forward always starts with a minimal test.
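
        As a sketch of what such a minimal test could look like, assuming the 251-element stand-in list suggested above and a temporary file in place of the real output file:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Stand-in data: 251 elements, no real input needed.
my @long_list;
push @long_list, $_ for (1001 .. 1251);

# Write many tab-joined lines, as the real script does for each block.
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} join("\t", @long_list), "\n" for 1 .. 1000;
close $out or die "close failed: $!";

# Read the file back and count lines that do not split into 251 fields.
open my $in, '<', $path or die "Cannot read $path: $!";
my $bad = 0;
while (my $line = <$in>) {
    chomp $line;
    my @fields = split /\t/, $line, -1;   # -1 keeps trailing empty fields
    $bad++ unless @fields == 251;
}
close $in;

print "lines with a wrong field count: $bad\n";
```

        If this reports zero bad lines while the real output still looks wrong on screen, the data or the viewer becomes the more likely suspect than join or print.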

        I see. You're saying the example output shows only middle parts of lines. Well...

        Thinking logically, there are four possibilities:

        1. Bug in your program
        2. Bug in your tool that you use to view the output file
        3. Bug in Perl
        4. User error
        So,
        1. I don't see any apparent bugs in the snippet you provided. Although some things are strange, for example, if ($nbGABlocks == 0) {... doesn't seem to have any effect, since @arrayToPrint is emptied before appending more elements anyway (in the NORMALBLOCK loop). Bugs are not impossible, but they must be somewhere else...
        2. How are you viewing the file?
        3. That kind of bug in Perl seems unlikely... What is the Perl version?
        4. Yeah, how are you viewing the file? For example, you're opening the file for appending, are you making sure you're viewing the new part and not the old one?
        There are a number of 'paste your code here' sites on the Internet, e.g. hastebin.com. Would it be possible to paste the output of Data::Dumper and of your program using it or a similar site? And more of your script, too (ideally something that we could actually run).
