Sophienz has asked for the wisdom of the Perl Monks concerning the following question:
Dear PerlMonks,
I'm having an issue with printing long strings. I build an array of 251 elements and join it with \t to print to a file. The problem is that the output randomly drops \t's, and sometimes both the \t and the last letter of the preceding value.
I've tried using autoflush and printflush but the problems remain. Any suggestion would be much appreciated. This is the problem line at the moment:
$normal_fh->printflush(join("\t",@arrayToPrint)."\n");
UPDATED: Here is the code I am using
open my( $normal_fh ), ">>", $normalOut or die("Couldn't open NORMAL $normalOut $!\n");
my @arrayToPrint;
if ($nbGABlocks == 0) {
    @arrayToPrint = ();
    #print ("NoAlignmentBlocks!\n");
    my @subArray = ($chrInterval,$startInterval,$endInterval,"NA","NA","NA");
    my @naArray = ("NA","NA","NA","NA","NA","NA","NA") x scalar(@species);
    push(@arrayToPrint,@subArray);
    push(@arrayToPrint,@naArray);
}
NORMALBLOCK: foreach my $block (sort{$blocksHash{$a}{"orderBlock"} <=> $blocksHash{$b}{"orderBlock"}} keys %blocksHash) {
    my $spNb = 0;
    my $order = $blocksHash{$block}{"orderBlock"};
    @arrayToPrint = ();
    my @subArray = ($chrInterval,$startInterval,$endInterval,$block,$order,$warning);
    push(@arrayToPrint,@subArray);
    @subArray = ();
    SP: foreach my $sp (@species) {
        my $newSp = $speciesCorrespond{$sp};
        $spNb++;
        if (exists($blocksHash{$block}{$newSp})) {
            my $chr = $blocksHash{$block}{$newSp}{"chr"};
            my $start = $blocksHash{$block}{$newSp}{"start"};
            my $end = $blocksHash{$block}{$newSp}{"end"};
            my $strand = $blocksHash{$block}{$newSp}{"strand"};
            my $size = $blocksHash{$block}{$newSp}{"sizeBlock"};
            my $comment = $spBlocksHash{$newSp}{$block}{"comment"};
            @subArray = ($newSp,$chr,$start,$end,$strand,$size,$comment);
            push(@arrayToPrint,@subArray);
            @subArray = ();
        } else {
            @subArray = ($newSp,"NA","NA","NA","NA","NA","NA");
            push(@arrayToPrint,@subArray);
            @subArray = ();
        }
    }
    my $printing = join("\t",@arrayToPrint);
    $" = "\t";
    $normal_fh->printflush("$printing\n");
}
The output is very long, but here is a sample, where you can see that for some lines, instead of getting: spermophilus_tridecemlineatus\tNA, I get spermophilus_tridecemlineatuNA:
rattus_norvegicus NA NA NA NA NA NA dipodomys_ordii NA NA NA NA NA NA spermophilus_tridecemlineatus NA NA NA NA NA NA ochotona_princeps NA NA NA NA NA NA oryctolagus_cuniculus NA NA NA NA NA NA
rattus_norvegicus 5 13171176 13038994 -1 132183 NotContiguous_326154Gap dipodomys_ordii NA NA NA NA NA NA spermophilus_tridecemlineatus NA NA NA NA NA NA ochotona_princeps NA NA NA NA NA NA oryctolagus_cuniculus NA NA NA NA NA NA
rattus_norvegicus NA NA NA NA NA NA dipodomys_ordii NA NA NA NA NA NA spermophilus_tridecemlineatuNA NA NA NA NA NA ochotona_princeps NA NA NA NA NA NA oryctolagus_cuniculus NA NA NA NA NA NA
rattus_norvegicus NA NA NA NA NA NA dipodomys_ordii NA NA NA NA NA NA spermophilus_tridecemlineatuNA NA NA NA NA NA ochotona_princeps NA NA NA NA NA NA oryctolagus_cuniculus NA NA NA NA NA NA
rattus_norvegicus 5 13004812 12917777 -1 87036 NotContiguous_253399Gap dipodomys_ordii NA NA NA NA NA NA spermophilus_tridecemlineatus NA NA NA NA NA NA ochotona_princeps NA NA NA NA NA NA oryctolagus_cuniculuNA NA NA NA NA NA
rattus_norvegicus 5 12917776 12899724 -1 18053 Contiguous dipodomys_ordii NA NA NA NA NA NA spermophilus_tridecemlineatus NA NA NA NA NA NA ochotona_princeps NA NA NA NA NA NA oryctolagus_cuniculus NA NA NA NA NA NA
Thanks a lot for your help,
Sophie.
Re: Missing \t in print output
by pme (Monsignor) on Jul 14, 2015 at 11:52 UTC
|
Hi Sophienz
First of all I would check the elements of the array.
You can use Data::Dumper to print the array like this:
use Data::Dumper;
...
print Dumper(\@arrayToPrint) . "\n";
| [reply] [d/l] |
|
Thanks for your suggestion, the array is completely fine, so it seems like it is the printing or joining step.
| [reply] |
Re: Missing \t in print output
by Discipulus (Canon) on Jul 14, 2015 at 12:03 UTC
|
| [reply] [d/l] |
Re: Missing \t in print output
by 1nickt (Canon) on Jul 14, 2015 at 11:52 UTC
|
Hi, have you validated the data before you print? Eg by dumping it with Data::Dumper ...
Remember: Ne dederis in spiritu molere illegitimi!
| [reply] [d/l] |
Re: Missing \t in print output
by ww (Archbishop) on Jul 14, 2015 at 18:24 UTC
|
As I asked earlier,
HAVE YOU RE-READ perldoc -f join? and WHERE DOES printflush COME FROM?
Answers:
- Here's what the join doc says (NB: "list"!):
"join EXPR,LIST
"Joins the separate strings of LIST into a single string with fields separated by the value of EXPR, and returns that new string. Example:
$rec = join(':', $login,$passwd,$uid,$gid,$gcos,$home,$shell);
"
<UPDATE: In some cases, the use of double quotes instead of single quotes around the EXPR produces unexpected results. I'll try to create some relevant examples. </UPDATE>
- I still can't tell.
If my previous attempt (in the CB) to help was unclear or misleading, my apologies for the sharp tone here.
| [reply] [d/l] [select] |
|
Hi,
Sorry if I misunderstood what you were asking earlier.
1. I have read the documentation for join, and as far as I can tell, I'm doing it properly. I have tried changing the double quotes to single quotes, but that actually printed a literal \t instead of the tabs themselves.
2. The printflush method comes from http://perldoc.perl.org/IO/Handle.html, which states that "$io->printflush ( ARGS )
Turns on autoflush, print ARGS and then restores the autoflush status of the IO::Handle object. Returns the return value from print."
However, this doesn't make any difference to my output from when I just use print:
my $printing = join("\t",@arrayToPrint);
$normal_fh->print("$printing\n");
Thanks for your help, I will update with any progress.
| [reply] [d/l] |
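The two points in this reply can be checked in isolation. The following is a minimal sketch, assuming a scratch file from File::Temp in place of the real output file and made-up sample values: with single quotes the separator is a backslash followed by the letter t, with double quotes it is one real tab, and printflush is available once IO::Handle is loaded.

```perl
use strict;
use warnings;
use IO::Handle;                 # provides $fh->printflush and $fh->autoflush
use File::Temp qw(tempfile);

my @fields = ('rattus_norvegicus', 'NA', 'NA');   # made-up sample values

my $literal = join('\t', @fields);   # single quotes: backslash + "t"
my $tabbed  = join("\t", @fields);   # double quotes: one real tab character

# Scratch file standing in for the real output file (an assumption).
my ($fh, $path) = tempfile(UNLINK => 1);
$fh->printflush("$tabbed\n");        # print with autoflush on for this call
close $fh or die "close failed: $!";

# Read the line back and count the tabs that actually reached the file.
open my $in, '<', $path or die "Couldn't open $path: $!";
my $line = <$in>;
close $in;
chomp $line;

my $tabs_written = () = $line =~ /\t/g;
print "tabs in single-quoted join: ", ($literal =~ tr/\t//), "\n";
print "tabs read back from file:   $tabs_written\n";
```

If the second count matches the number of separators expected, the join and the print are doing their job and the cause has to be looked for elsewhere.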
|
Thank you for the clarification.
But you never showed us parts of your code -- the hashbang, if any, and the use IO::Handle;. That would have forestalled my concern about the appearance of printflush without a predicate. Discipulus sussed that out, but obviously, I didn't... and, in any case, we often see problems such as using a function from a module without useing the module. Those cases make it very hard to help if the code presented isn't an exact copy of the code which generated anomalies or errors.
BTW, the preceding para may have some value for you but I hope it also provides some benefit for future newcomers who stumble upon it.
And, as has already been said, welcome to PM.
| [reply] [d/l] [select] |
Re: Missing \t in print output
by akuk (Beadle) on Jul 14, 2015 at 11:50 UTC
|
Hi
You might need to try :
$" = "\t";
$normal_fh->printflush("@arrayToPrint\n");
| [reply] [d/l] |
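For reference, interpolating an array into a double-quoted string joins its elements with the list separator variable, so this suggestion should build the same string as the explicit join. A small sketch with made-up values; the use of local is an addition that keeps the change to the separator scoped to one block:

```perl
use strict;
use warnings;

my @arrayToPrint = ('dipodomys_ordii', 'NA', 'NA', 'NA');  # made-up values

# Explicit join, as in the original post.
my $joined = join("\t", @arrayToPrint);

# Array interpolation, as suggested in this reply.
my $interpolated;
{
    local $" = "\t";                 # list separator used by "@array"
    $interpolated = "@arrayToPrint";
}

print $joined eq $interpolated ? "identical\n" : "different\n";
```

Since both forms produce the same string, switching between them is unlikely to change the output.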
Re: Missing \t in print output
by Sophienz (Acolyte) on Jul 15, 2015 at 13:10 UTC
|
SOLVED
Thank you all for the suggestions and advice.
I now know how to post properly for potential next posts.
The issue seems to have been the way I was viewing the output, and not a fault in the Perl script itself. I was viewing the output in the terminal (either by printing directly to STDOUT or by calling head on the output file) and somehow that created issues (missing tabs) that were not in the actual output file. I ended up downloading the output file locally and this showed that the file had all the required tabs.
Many thanks, and sorry for the multiple confusions. I really expected the issue to be a Perl one, not a matter of the way I was viewing the output.
| [reply] |
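One way to confirm that the file itself is intact, independent of how a terminal renders it, is to count the tab-separated fields on each line. A minimal sketch; bad_lines is a hypothetical helper, and the demo writes its own scratch file rather than reading the real output:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return the numbers of the lines in $path that do not
# have exactly $expected tab-separated fields.
sub bad_lines {
    my ($path, $expected) = @_;
    open my $in, '<', $path or die "Couldn't open $path: $!";
    my @bad;
    while (my $line = <$in>) {
        chomp $line;
        my @fields = split /\t/, $line, -1;   # -1 keeps trailing empty fields
        push @bad, $. if @fields != $expected;
    }
    close $in;
    return @bad;
}

# Demo on a scratch file: line 2 is deliberately missing one tab.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} join("\t", 'a', 'NA', 'NA'), "\n";
print {$fh} "aNA\tNA\n";
close $fh or die "close failed: $!";

my @bad = bad_lines($path, 3);
print "lines with a wrong field count: @bad\n";
```

On the real file the call would look something like bad_lines($normalOut, 251), given the 251 elements per line mentioned in the question.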
|
Glad to hear it, although I do still think it's strange that characters just went missing in the terminal, so I still have a small suspicion that you may have control characters in your strings (they usually don't affect editors as much as they can affect the terminal). Using $Data::Dumper::Useqq=1; or Data::Dump to look at the strings may still be worth it.
| [reply] [d/l] |
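A sketch of that check, using invented data with a carriage return hidden in one element to show what a problem would look like: with Useqq set, Data::Dumper prints control characters as escapes, and a POSIX character class can flag the affected elements.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Useqq  = 1;   # show control characters as escapes such as \r
$Data::Dumper::Indent = 1;

# Invented data: the second element carries a hidden carriage return.
my @arrayToPrint = ('ochotona_princeps', "NA\r", 'NA');

print Dumper(\@arrayToPrint);

# Indexes of elements that contain any control character.
my @suspect = grep { $arrayToPrint[$_] =~ /[[:cntrl:]]/ } 0 .. $#arrayToPrint;
print "suspect indexes: @suspect\n";
```

An empty list of suspect indexes on the real array would support the conclusion that the strings themselves are clean.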
|
Thanks, I agree that it is strange as the distribution of missing \t seems very random. But I've printed the array using $Data::Dumper::Useqq=1; and there are no odd characters there.
On the same run, looking at the output through the terminal (head outputFile) resulted in missing tabs, but viewing the file locally in a different text editor didn't.
| [reply] |
Re: Missing \t in print output
by Anonymous Monk on Jul 15, 2015 at 11:35 UTC
|
First, did you use $Data::Dumper::Useqq=1; when using Data::Dumper? (or just use Data::Dump) That will make it easier to spot any funny control characters.
The problem with helping you out is that we need to be able to reproduce the issue. You'll need to boil down your code and input to something that still runs and reproduces the same issue and post it here; see also http://sscce.org/. If the sample input data is too large, you can try replacing it with some code that generates enough fake input data to reproduce the same problem.
| [reply] [d/l] [select] |
Re: Missing \t in print output
by Sophienz (Acolyte) on Jul 14, 2015 at 15:11 UTC
|
Thanks all for your suggestions, I've checked the array and it looks fine, unfortunately.
I've also tried the following suggestion, but that didn't help either.
$" = "\t";
$normal_fh->printflush("@arrayToPrint\n");
| [reply] [d/l] |
|
If one of your (presumed) tabs in the original data is actually a space, your regex will fail. Likewise, if the offending element is two tabs... KA-BOOM! So, as advised in the consideration, you should add code tags around your sample data (for our ease of helping) and use any one of a host of tools to doublecheck the separators.
| [reply] [d/l] |
Re: Missing \t in print output
by Anonymous Monk on Jul 15, 2015 at 11:01 UTC
|
Hi Sophie.
I don't even understand how your script is able to produce that output. For example:
rattus_norvegicus NA NA NA NA NA NA (etc)...
rattus_norvegicus 5 13171176 13038994 -1 132183 NotContiguous_326154Gap (etc)...
rattus_norvegicus 5 13004812 12917777 -1 87036 NotContiguous_253399Gap (etc)...
As far as I can tell, 'rattus norvegicus' at the beginning of a line should come from
@arrayToPrint = ();
my @subArray = ($chrInterval,$startInterval,$endInterval,$block,$order,$warning);
push(@arrayToPrint,@subArray);
But $chrInterval, $startInterval, $endInterval and $warning don't even change during the 'NORMALBLOCK' loop. Why does the output change then? Am I missing something?
And $chrInterval seems an unlikely name for a variable that contains 'rattus norvegicus'?
| [reply] [d/l] [select] |
|
Hi,
Thanks for your question. You are correct, I've cropped the output to only show where the problem appears.
So the ($chrInterval,$startInterval,$endInterval,$block,$order,$warning) elements don't appear, and they do not change within the NORMALBLOCK loop.
However, for each iteration of the SP loop, the elements ($newSp,$chr,$start,$end,$strand,$size,$comment) do change. The problem only happens when I print those elements separated by \t.
The whole line is 251 elements, which would be unreadable here and apparently won't fit either. Unfortunately, the problem only arises when I use a large number of species, i.e. a large array.
Also note that this happened without any join step as well, when I was just printing the elements as they were defined.
Please let me know if I can make things clearer.
| [reply] |
|
Hi Sophienz,
Please let me know if I can make things clearer.
Post a complete, working test script that shows your problem. Fine if it needs a 251-element array, just make one:
push @long_list, $_ for (1001..1251);
Cut out all the code that doesn't affect your problem. Start a new test script with just the loop that is giving you trouble.
Isolate the problem. Make it happen when you don't have your data in the script. That will prove whether or not the data are to blame.
The monks don't need to see all your code, but neither do they need to see an arbitrary subsection of it. They do not need your variable names (unless that is the problem) and they do not need your specific data (unless that is the problem). When the problem is programming, your logic, or, Mysteriously Unidentified, the monks need a small, self-contained, working test script that demonstrates the problem.
The benefit for you is that while you are making the test script, you will usually discover the problem and see how to fix it. And if not, you'll present the monks with something with which they can help you.
The way forward always starts with a minimal test.
| [reply] [d/l] [select] |
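As an illustration of such a test script, here is a sketch that uses generated data only: it builds the 251-element list, joins it with tabs, writes a few rows to a scratch file and reads them back. The file handling via File::Temp and the row count of five are assumptions made for the example, not part of the original script.

```perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw(tempfile);

# Fake data with the same length as the real rows: 251 elements.
my @long_list;
push @long_list, $_ for (1001 .. 1251);

my $expected = join("\t", @long_list);

# Write a few identical rows, roughly as the real loop would.
my ($fh, $path) = tempfile(UNLINK => 1);
$fh->printflush("$expected\n") for 1 .. 5;
close $fh or die "close failed: $!";

# Read the file back and compare every row with what was written.
open my $in, '<', $path or die "Couldn't open $path: $!";
my $mismatches = 0;
while (my $line = <$in>) {
    chomp $line;
    $mismatches++ if $line ne $expected;
}
my $rows = $.;                       # number of lines read from $in
close $in;

print "rows: $rows, mismatches: $mismatches\n";
```

If a script like this reports no mismatches while the terminal still shows missing tabs, the data and the print call are ruled out and attention can move to how the output is being viewed.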