PerlMonks  

Re^3: Regex with Backslashes

by haukex (Archbishop)
on May 18, 2020 at 19:51 UTC


in reply to Re^2: Regex with Backslashes (updated)
in thread Regex with Backslashes

If you have control over the format the string is generated in, then why not use a well-established format like CSV? By default, Text::CSV separates fields with commas; a field that contains commas (or whitespace) is surrounded by double quotes; and a double quote inside a quoted field is escaped by doubling it. For example:

use warnings;
use strict;
use Text::CSV;

my $data = <<'END';
1,"Text,with,commas and ""quotes""",X,99
END
open my $fh, '<', \$data or die $!;
my $csv = Text::CSV->new({ binary=>1, auto_diag=>2 });
while ( my $row = $csv->getline($fh) ) {
    print "<<$_>>\n" for @$row;
}
$csv->eof or $csv->error_diag;
close $fh;
__END__
<<1>>
<<Text,with,commas and "quotes">>
<<X>>
<<99>>

Maybe I will take the plunge and post my 'lcd daemon with battery meter script' once it is completed.

Yes, that'd be interesting!

Replies are listed 'Best First'.
Re^4: Regex with Backslashes
by anita2R (Scribe) on May 18, 2020 at 20:20 UTC

    The comma separated data is entered by a user and I want to keep it as simple as possible, so extra quoting is something I want to avoid.

    I felt that escaped commas and backslashes were just about OK, or two commas and two backslashes also just about OK, but the more complex it gets, the harder it is for the user. I am happy to add extra load to the script to help the user.

    I have included some code to handle simple input errors such as a space inserted in a command: '-- text' instead of '--text'.
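
A minimal sketch of what that kind of cleanup could look like, assuming options are introduced by `--` followed directly by a word. This is not the poster's actual code; the sub name `fix_option_spacing` is hypothetical, and the regex is a heuristic that would also rewrite a free-standing `--` followed by a word in ordinary text.

```perl
use strict;
use warnings;

# Hypothetical helper: remove whitespace accidentally typed between
# a leading '--' and the option name, so '-- text' becomes '--text'.
sub fix_option_spacing {
    my ($input) = @_;
    # '--' not preceded by a dash or word character, then whitespace,
    # then a word character (kept via lookahead).
    $input =~ s/(?<![-\w])--\s+(?=\w)/--/g;
    return $input;
}

print fix_option_spacing('-- text'), "\n";
```

Input that is already well formed, such as `--text`, passes through unchanged.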

      The comma separated data is entered by a user and I want to keep it as simple as possible, so extra quoting is something I want to avoid.

      Ok, I see, although there are of course other alternatives. For example, it may not be so difficult on the user if you require all fields to be quoted; that's one less rule for the user to remember. In the end, it'll be up to you to decide what is easiest for the user and for the implementation. I agree with AnomalousMonk's point that doubling up the commas leads to ambiguity, so if you really don't like the quoting, perhaps the backslashes are not such a bad idea (the only real issue was the misunderstandings about the format). The parser I showed here has some pretty simple rules: commas are field separators, backslashes and commas can be escaped with backslashes, plus there is support for the \x... sequence.
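
The comma and backslash rules can be sketched roughly as follows. This is not the parser referred to above, only an illustration under stated assumptions: the sub name `split_escaped` is hypothetical, any backslash not followed by a comma or another backslash is treated as an error, and the \x... sequence is deliberately left out because its exact form is not given here.

```perl
use strict;
use warnings;

# Hypothetical helper: split a line into fields.
# Assumed rules: ',' separates fields, '\,' is a literal comma,
# '\\' is a literal backslash. The \x... sequence is NOT handled.
sub split_escaped {
    my ($line) = @_;
    my @fields = ('');
    pos($line) = 0;
    while ( pos($line) < length($line) ) {
        if    ( $line =~ /\G\\([\\,])/gc ) { $fields[-1] .= $1 }  # escaped backslash or comma
        elsif ( $line =~ /\G,/gc )         { push @fields, '' }   # unescaped comma: new field
        elsif ( $line =~ /\G([^\\,]+)/gc ) { $fields[-1] .= $1 }  # run of ordinary characters
        else  { die "Invalid escape at offset " . pos($line) . "\n" }
    }
    return @fields;
}

# In single quotes, '\\\\' is two backslashes, so the input is: a\,b,c\\d,,e
print "<<$_>>\n" for split_escaped('a\,b,c\\\\d,,e');
__END__
<<a,b>>
<<c\d>>
<<>>
<<e>>
```

Scanning with `\G` and the `/gc` modifiers keeps the current position across failed match attempts, so each branch only ever looks at the next unread character. A plain `split /,/` cannot tell an escaped comma from a separator.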

        Thanks, I've downloaded your parser script and will see how it fits with what I have now. I think I prefer the escaped comma & escaped backslash option so it could be a solution.
