PerlMonks  

Dear monks, I'd like to ask a favour: help troubleshooting a Perl script on a Mac.

The part that's broken is the following: the script tells the user to drag and drop the two input files into the console window, parses the resulting strings to find the folder, filename and extension, checks that the files actually exist, and then opens the files and does its thing.
This works for me on Windows XP and 7, but on OS X Leopard it fails to find the files even though the filepath seems to be parsed correctly. I don't have a Mac, so I can't troubleshoot it myself. Perhaps the problem is the encoding of the filepath string, or the if (-e "file") test I used is wrong in some mysterious way... to be honest, I have no idea what's going on.
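To check the encoding theory without a Mac, something like the following could show exactly what the console hands to the script. This is only a sketch: visible() is a helper name I made up, and the sample path is invented, not real output from OS X.

```perl
use strict;
use warnings;

# Hypothetical helper: render a string so that every space, backslash,
# control character and non-ASCII character becomes visible as \x{..}.
sub visible {
    my ($raw) = @_;
    return join '', map {
        my $code = ord;
        ($code >= 0x21 && $code <= 0x7E && $_ ne '\\')
            ? $_                            # plain printable ASCII, shown as-is
            : sprintf('\\x{%02X}', $code);  # everything else, shown as a hex escape
    } split //, $raw;
}

# Invented sample: a backslash before a space, plus a trailing space.
print visible("/Users/me/My\\ File.pdf "), "\n";
# prints: /Users/me/My\x{5C}\x{20}File.pdf\x{20}
```

Calling visible() on the dropped string right after chomp, and logging the result, would show whether anything unexpected (a trailing space, an escape character, odd bytes in accented letters) is in the path before the -e test runs.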

Backstory: The program is a text aligner that creates parallel corpora out of texts and their translations.
It is an open source project that will end up on SourceForge in Windows, Linux and Mac flavours (I'm using PAR::Packer to package the script and the modules it relies on into an executable). The script is broadly the same for all three platforms, with if/else branching to account for platform-specific issues.
The Windows executable is done and works fine, but I ran into problems getting it to work on a Mac. As I don't have a Mac, I asked a friend to test it. He tells me the script says it can't find the files, but that's about as far as we got with troubleshooting. I would appreciate any help.

A short test script to check basic functionality:
    #!/usr/bin/perl
    use strict;
    use warnings;
    use utf8;

    my $inputfile;

    print "\nDrag and drop a file here:\n";
    chomp ($inputfile = <STDIN>);
    $inputfile =~ s/^ *[\"\']?([^\"\']*)[\"\']?/$1/;
    print "\nFilepath with quotes and spaces stripped: >$inputfile<\n";

    print "\n--------------------------------------------\n";
    print "\nTest 1: no parsing, just checking if file is found:\n";
    if (-e "$inputfile") {print "\nOK, file found\n";} else {print "\nERROR: file not found\n";}

    print "\n--------------------------------------------\n\nTest 2, checking if file can be opened:\n\n";
    open(FILE, "<", "$inputfile") or die "Can't open file: $!";
    print "OK, file opened successfully. \n\nPress enter to quit\n";
    <STDIN>;

Could some kind monk run this and let me know if one or both tests pass on a Mac?
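One thing that might be worth trying, assuming (unconfirmed, I can't test it) that the Mac terminal escapes spaces with backslashes instead of quoting the path, and may leave a trailing space after a dropped path. clean_dropped_path() is a hypothetical helper, not part of the aligner:

```perl
use strict;
use warnings;

# Hypothetical helper. The escaping behaviour it undoes is an assumption,
# not something confirmed on a real Mac.
sub clean_dropped_path {
    my ($path, $os) = @_;
    $os = $^O unless defined $os;   # default to the platform we are running on

    $path =~ s/^\s+//;              # leading whitespace
    $path =~ s/\s+$//;              # trailing whitespace (also a leftover \r)

    if ($path =~ /^(["'])(.*)\1$/s) {
        # One pair of matching surrounding quotes: keep what is inside.
        $path = $2;
    }
    elsif ($os ne 'MSWin32') {
        # Assumed shell-style escaping: "\ " stands for a literal space.
        # Skipped on Windows, where backslash is the directory separator.
        $path =~ s/\\(.)/$1/g;
    }
    return $path;
}

# Invented sample input, pretending to run on a Mac ('darwin').
print clean_dropped_path("/Users/me/My\\ File.pdf ", 'darwin'), "\n";
# prints: /Users/me/My File.pdf
```

In the test script above this would stand in for the s/// line, as in $inputfile = clean_dropped_path($inputfile);. A filename that really ends in a space would not survive the trailing-whitespace strip, so this is a compromise rather than a complete parser.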

The actual aligner script is almost 2K lines and needs a couple of other bits and pieces to run, so I uploaded the full package to MediaFire. It's my first Perl project and I'm not a programmer... Let me know if you notice something hideous that should be improved. Everything in there is tested and working on Windows, though.
To test on Mac OS X, just start LF_aligner_10_12.pl and drag and drop the two pdf files from testfiles. You will get some feedback in the console, and a log will also be created in aligner/scripts. The error message it threw in testing is "ERROR! File 1 not found".
If you want to have a look at the code itself, the relevant bit starts at "# DRAG & DROP FILES (t, h, p)" around line 400.
The code looks roughly like this (this is just a simplified, cleaned-up sample that won't actually run; to run the script, please get the MediaFire package):
    # DRAG & DROP FILES
    do {
        print "\n\n-------------------------------------------------";
        print "\n\nDrag and drop file 1 here and press enter.\n";
        chomp ($file1_full = <STDIN>);
        # windows doesn't add quotes if there is no space in the path, linux adds single quotes
        # strip any leading and trailing spaces and quotes; $1=everything up to last / or \, $2= everything from there up to the end except spaces and "'.
        $file1_full =~ /^ *[\"\']?(.*)[\/\\]([^\"\']*)[\"\']? *$/;
        $folder = $1;
        $file1 = $2;
        $file1 =~ /(.*)\.(.*)/;
        $f1 = $1;
        $ext = lc($2);

        print "\nDrag and drop file 2 here and press enter. (This file has to be in the same folder as file 1!)\n";
        chomp ($file2_full = <STDIN>);
        $file2_full =~ /^ *[\"\']?(.*)[\/\\]([^\"\']*)[\"\']? *$/;
        $folder2 = $1;
        $file2 = $2;
        $file2 =~ /(.*)\.(.*)/;
        $f2 = $1;
        $ext2 = lc($2);

        print LOG "\nInput files dropped in: $file1 (${file1_full}), $file2 (${file2_full})";

        unless ("$folder" eq "$folder2") {
            print "\n\n\nERROR! The two files are not in same folder. Try again!\n($folder vs ${folder2})\n";
            print LOG "\nERROR: The two files are not in same folder. $folder, $folder2";
        }
        unless ("$file1_full" ne "$file2_full") {
            print "\n\n\nERROR! You dragged in the same file twice. Try again!\n";
            print LOG "\nERROR: Same file dropped in twice";
        }
        unless ("$ext" eq "$ext2") {
            print "\n\n\nERROR! The file extensions don't match. Try again!\n($ext vs. $ext2)\n";
            print LOG "\nERROR: Extensions don't match: $ext vs. $ext2";
        }
        unless (-e "$folder/$file1") {
            print "\n\n\nERROR! File 1 not found (maybe its path or its filename contains accented letters). Try again!\n(file: $folder/$file1)\n";
            print LOG "\nERROR: File 1 not found; folder: $folder, file: $file1";
        }
        unless (-e "$folder2/$file2") {
            print "\n\n\nERROR! File 2 not found (maybe its path or its filename contains accented letters). Try again!\n(file: $folder2/$file2)\n";
            print LOG "\nERROR: File 2 not found; folder: $folder2, file: $file2";
        }
        if ($ext eq "doc") {
            print "\n\n\nERROR! Doc files are not supported. Convert to docx or txt and try again!\n";
            print LOG "\nERROR: doc file dropped in";
        }

        $alignfilename = "${f1}-${f2}";
        close LOG;
        open (LOG, ">>:encoding(UTF-8)", "$scriptpath/scripts/log.txt") or print "\nCan't create log file: $!\nContinuing anyway.\n";
    } until (("$folder" eq "$folder2")
          && ("$file1_full" ne "$file2_full")
          && ("$ext" eq "$ext2")
          && (-e "$folder/$file1")
          && (-e "$folder/$file2")
          && ($ext ne "doc"));
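For comparison, the folder / filename / extension split could probably also be done with the core File::Basename module instead of the hand-written regexes. A minimal sketch, where split_path() is a made-up helper name and the sample path is invented:

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical helper: split an already cleaned path into
# (folder, file name without extension, lowercased extension).
sub split_path {
    my ($full) = @_;
    # fileparse returns (name, directory, suffix); the pattern matches the last ".xyz".
    my ($name, $folder, $suffix) = fileparse($full, qr/\.[^.]*/);
    $folder =~ s{[/\\]$}{} if length($folder) > 1;   # drop the trailing separator
    (my $ext = lc $suffix) =~ s/^\.//;               # ".PDF" becomes "pdf"
    return ($folder, $name, $ext);
}

my ($folder, $f1, $ext) = split_path('/Users/me/texts/report.final.PDF');
print "$folder | $f1 | $ext\n";
# prints: /Users/me/texts | report.final | pdf
```

fileparse() follows the path rules of the platform the script runs on, so on a Unix-like system a backslash is treated as part of the name, unlike the [\/\\] class in the regex above. It also returns an empty extension when the name has no dot, whereas a failed /(.*)\.(.*)/ match leaves $1 and $2 holding the values from the previous successful match.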

In reply to OS X troubleshooting help needed - parse filename & open file by elef
