PerlMonks  


I believe Modern Perl should have a module that can easily parse these simple Unicode CSV records. It should handle them in any Unicode character encoding scheme: UTF-8, UTF-16, or UTF-32. And it should handle the Unicode byte order mark seamlessly.

Why not?

🎥Film🎥🎬🎥Year🎥🎬🎥Awards🎥🎬🎥Nominations🎥🎬🎥Director🎥
🎥12 Years a Slave🎥🎬2013🎬3🎬9🎬🎥🎥🎥 Steve McQueen🎥
🎥Argo🎥🎬2012🎬3🎬7🎬🎥🎥🎥 Ben Affleck🎥
🎥The Artist🎥🎬2012🎬5🎬10🎬🎥🎥🎥 Michel Hazanavicius🎥
🎥The King's Speech🎥🎬2010🎬4🎬12🎬🎥🎥🎥 Tom Hooper🎥
🎥The Hurt Locker🎥🎬2009🎬6🎬9🎬🎥🎥🎥 Kathryn Bigelow🎥
🎥Slumdog Millionaire🎥🎬2008🎬8🎬10🎬🎥🎥🎥 Danny Boyle🎥
🎥No Country for Old Men🎥🎬2007🎬4🎬8🎬🎥🎥🎥 Joel Coen
🎥🎥 Ethan Coen🎥
🎥The Departed🎥🎬2006🎬4🎬5🎬🎥🎥🎥 Martin Scorsese🎥

sep_char     🎬  U+1F3AC CLAPPER BOARD  (UTF-8: F0 9F 8E AC)
quote_char   🎥  U+1F3A5 MOVIE CAMERA   (UTF-8: F0 9F 8E A5)
escape_char  🎥  U+1F3A5 MOVIE CAMERA   (UTF-8: F0 9F 8E A5)

The same records in conventional CSV:

"Film","Year","Awards","Nominations","Director"
"12 Years a Slave",2013,3,9,"🎥 Steve McQueen"
"Argo",2012,3,7,"🎥 Ben Affleck"
"The Artist",2012,5,10,"🎥 Michel Hazanavicius"
"The King's Speech",2010,4,12,"🎥 Tom Hooper"
"The Hurt Locker",2009,6,9,"🎥 Kathryn Bigelow"
"Slumdog Millionaire",2008,8,10,"🎥 Danny Boyle"
"No Country for Old Men",2007,4,8,"🎥 Joel Coen
🎥 Ethan Coen"
"The Departed",2006,4,5,"🎥 Martin Scorsese"
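
The encoding requirement (UTF-8, UTF-16, or UTF-32, with an optional byte order mark) could be handled before any CSV parsing happens. Here is a minimal sketch using the core Encode module. The helper name decode_with_bom is made up for illustration, and falling back to UTF-8 when no BOM is present is an assumption, not something any existing CSV module promises.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: turn raw bytes into a character string, using a
# leading byte order mark to pick the encoding. Assumption: input without
# a BOM is UTF-8.
sub decode_with_bom {
    my ($bytes) = @_;

    # UTF-32 is tested first because the UTF-32LE BOM (FF FE 00 00)
    # begins with the UTF-16LE BOM (FF FE).
    my @boms = (
        [ "\x00\x00\xFE\xFF", 'UTF-32BE' ],
        [ "\xFF\xFE\x00\x00", 'UTF-32LE' ],
        [ "\xEF\xBB\xBF",     'UTF-8'    ],
        [ "\xFE\xFF",         'UTF-16BE' ],
        [ "\xFF\xFE",         'UTF-16LE' ],
    );

    for my $entry (@boms) {
        my ($bom, $encoding) = @$entry;
        next unless index($bytes, $bom) == 0;
        my $body = substr($bytes, length $bom);    # drop the BOM itself
        return decode($encoding, $body, Encode::FB_CROAK | Encode::LEAVE_SRC);
    }

    # No BOM found: assume UTF-8 and croak on malformed input.
    return decode('UTF-8', $bytes, Encode::FB_CROAK | Encode::LEAVE_SRC);
}
```

One known weakness of sniffing like this: UTF-16LE data whose first character after the BOM is U+0000 is indistinguishable from UTF-32LE, so a real module would probably also let the caller name the encoding explicitly.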

I recognize that the current XS module for parsing CSV records, Text::CSV_XS (marvelously maintained by Tux), may not be the right module to use as the basis for a new, fully Unicode-capable module. But because Perl's native Unicode capabilities exceed those of most other programming languages, Perl should have a proper FSM-based Unicode CSV parser, even if it's pure Perl and not XS.
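
To make the FSM idea concrete, here is a rough pure-Perl sketch that works on decoded characters rather than bytes, so 🎬 and 🎥 are each just one character to it. This is not Text::CSV_XS and not a proposal for its API: the function name parse_csv is invented, the option names are borrowed from the table above, and it assumes escape_char is the same as quote_char (a doubled quote inside a quoted field means one literal quote).

```perl
use strict;
use warnings;
use utf8;    # this source contains literal non-ASCII characters

# Hypothetical function: parse a decoded character string into an array
# of records with a four-state machine. Assumes escape_char == quote_char.
sub parse_csv {
    my ($text, %opt) = @_;
    my $sep   = $opt{sep_char}   // ',';
    my $quote = $opt{quote_char} // '"';

    my (@records, @fields);
    my $field = '';
    my $state = 'START';    # START, UNQUOTED, QUOTED, or QUOTE_SEEN

    my $end_field  = sub { push @fields, $field; $field = ''; $state = 'START' };
    my $end_record = sub { $end_field->(); push @records, [@fields]; @fields = () };

    # Simplification: line endings are normalised everywhere, including
    # inside quoted fields.
    $text =~ s/\r\n?/\n/g;

    for my $char (split //, $text) {
        if ($state eq 'QUOTED') {
            if ($char eq $quote) { $state = 'QUOTE_SEEN' }
            else                 { $field .= $char }    # sep and newline are data here
        }
        elsif ($state eq 'QUOTE_SEEN' && $char eq $quote) {
            $field .= $quote;    # doubled quote: one literal quote
            $state  = 'QUOTED';
        }
        elsif ($char eq $sep)  { $end_field->() }
        elsif ($char eq "\n")  { $end_record->() }
        elsif ($state eq 'START' && $char eq $quote) { $state = 'QUOTED' }
        elsif ($state eq 'QUOTE_SEEN') {
            die "unexpected character after closing quote\n";
        }
        else { $field .= $char; $state = 'UNQUOTED' }
    }

    die "unterminated quoted field\n" if $state eq 'QUOTED';
    $end_record->() if $state ne 'START' || @fields;    # last record without newline
    return \@records;
}

# A shortened version of the sample above, including the two-line field.
my $sample = join "\n",
    '🎥Film🎥🎬🎥Year🎥🎬🎥Director🎥',
    '🎥No Country for Old Men🎥🎬2007🎬🎥🎥🎥 Joel Coen',
    '🎥🎥 Ethan Coen🎥',
    '';
my $records = parse_csv($sample, sep_char => '🎬', quote_char => '🎥');
```

Because the loop compares whole characters, nothing in it cares how many bytes the separator or quote occupies in UTF-8. A character-at-a-time loop like this will be slow next to XS, which is exactly the trade-off I'm willing to make.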

I long ago accepted that Unicode conformance and comparative slowness go hand in hand 👫. So what? Look what you're trading a few seconds here and there for:  the technological foundation of World Peace ☮ and Universal Love 💕.

UPDATE:  Removed references to core module. I don't care about that. I just want a Unicode-capable Perl CSV module.


In reply to Re^4: Speeds vs functionality by Jim
in thread Speeds vs functionality by Tux
