http://qs321.pair.com?node_id=11118322


in reply to Re^4: Calculated position incorrect when using regex in text file that also contains binary info
in thread Calculated position incorrect when using regex in text file that also contains binary info

I really have to learn to be more precise (and maybe concise too???) in my answers. Really sorry for that, I'm feeling a bit uncomfortable and embarrassed now...

Don't worry! Even the pros sometimes need to be reminded of SSCCE (Short, Self-Contained, Correct Example), often because one becomes so deeply buried in a problem that one forgets that not everyone else is as deep into it :-)

Could you tell me how to attach a file to a message, if that's allowed?

Everything goes in <code> tags, individual ones for input, code, output, etc., so it's easier to download. As long as it's ASCII, it'll work fine, hence my suggestion for showing binary data via hexdump -C file or od -tx1c file (Update: or on Windows, I like this little tool, see the "Releases" for a single-exe download). There are other potential formats too: for example, Data::Dump will automatically use pack or MIME::Base64 for binary data as necessary. However, if you use this module to show binary data, make sure you've read the data from the file in "raw" format, that is, by opening it with mode '<:raw', by calling binmode on the filehandle before reading, or by an equivalent method like slurp_raw from Path::Tiny.
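As a rough sketch of the two core-Perl ways of reading "raw" mentioned above: the sample bytes, the temporary file, and the sub names read_raw_open and read_raw_binmode are made up for illustration, and the hex output is only a crude stand-in for hexdump -C.

```perl
use strict;
use warnings;
use File::Temp qw/tempfile/;

# Hypothetical sample file: a few bytes including CR LF, NUL and a high-bit byte.
my ($tmp, $filename) = tempfile(UNLINK => 1);
binmode $tmp;
print {$tmp} "AB\r\n\x00\xFF";
close $tmp or die $!;

# Variant 1: ask for the :raw layer directly in the open mode.
sub read_raw_open {
    my ($file) = @_;
    open my $fh, '<:raw', $file or die "$file: $!";
    local $/;                 # slurp mode: read the whole file at once
    my $data = <$fh>;
    close $fh;
    return $data;
}

# Variant 2: open normally, then binmode the handle before the first read.
sub read_raw_binmode {
    my ($file) = @_;
    open my $fh, '<', $file or die "$file: $!";
    binmode $fh;
    local $/;
    my $data = <$fh>;
    close $fh;
    return $data;
}

my $one = read_raw_open($filename);
my $two = read_raw_binmode($filename);

# Show the bytes as two-digit hex values, one per byte.
printf "%d bytes: %s\n", length($one), join ' ', unpack '(H2)*', $one;
print "both variants agree\n" if $one eq $two;
```

Either variant should give you the exact bytes of the file, including any CR characters that a text-mode read on Windows would otherwise swallow.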

So for example, you can use the following to take the binary file $filename and output its contents in a Perl format, suitable for pasting into your SSCCE:

use Data::Dump qw/pp/;
print 'my $data = ', pp(do {
    open my $fh, '<:raw', $filename or die $!;
    local $/;  # slurp the whole file
    <$fh>
}), ";\n";
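Once you have such a my $data = ...; line, the SSCCE itself can be very small. The following is only a sketch of what that might look like: the contents of $data and the regex are invented here and are not taken from your file, so substitute your own.

```perl
use strict;
use warnings;

# Hypothetical data, standing in for the pasted pp() output of a real file.
my $data = "HDR\x00\x01\xFF\r\nKEY=value\r\n";

my $offset;
if ($data =~ /KEY=(\w+)/) {
    $offset = $-[0];   # offset at which the whole match starts
    printf "found '%s' at offset %d\n", $1, $offset;
}
else {
    print "no match\n";
}
```

Because the data is embedded in the source, anyone can run this and compare the offset they get with the one you expected.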

Sometimes, if the problem might be related to the UTF-8 encoding of the Perl source code itself (use utf8;), the source can be posted inside <pre> instead of <code> tags, but only if all HTML special characters are escaped - one way is "perl -MHTML::Entities -CSD -pe 'encode_entities $_' source.pl". Personally I try to avoid this, and use escapes like "\N{U+0000}" or "\N{CHARNAME}", so the source code can stay in ASCII. And as hippo said, use PerlMonks' <readmore> tags if necessary, although you should try to keep things as short as possible - but still representative.
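To illustrate the escapes: a minimal sketch, with an arbitrarily chosen example word, showing that the source can stay pure ASCII while the string still contains a non-ASCII character.

```perl
use strict;
use warnings;
use charnames ':full';   # needed for \N{CHARNAME} on Perls before 5.16

# ASCII-only source: the non-ASCII character is written as an escape.
my $by_codepoint = "caf\N{U+00E9}";
my $by_name      = "caf\N{LATIN SMALL LETTER E WITH ACUTE}";

binmode STDOUT, ':encoding(UTF-8)';
print "both spellings give the same string\n" if $by_codepoint eq $by_name;
printf "length %d, last code point U+%04X\n",
    length($by_codepoint), ord(substr($by_codepoint, -1));
```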

If the only way to demonstrate a problem is with a fairly large file, then sometimes it's possible to post a short script here that generates such a file, instead of the file itself (it shouldn't use rand or similar though, for reproducibility). And in the very worst case, large files can be uploaded to third-party sites, although that's the least preferable method because it's not permanent.
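Such a generator could look roughly like the following sketch. The file name sample.bin and the record layout are pure assumptions for the sake of the example; the point is only that every byte is derived from the loop counter, so each run produces an identical file.

```perl
use strict;
use warnings;

my $filename = 'sample.bin';   # hypothetical output name
open my $out, '>:raw', $filename or die "$filename: $!";

for my $i (1 .. 1000) {
    # A text tag, then five binary bytes derived from $i, then CR LF.
    print {$out} sprintf("REC%04d;", $i), pack('N C', $i, $i % 256), "\r\n";
}

close $out or die "$filename: $!";
printf "%s: %d bytes\n", $filename, (-s $filename);
```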

In your case, I think you should be able to edit down your input file to something short enough to post here that still reproduces the issue, following the above guidelines. And as usual in such cases, this may even help you pinpoint the problem better.