PerlMonks  

Hi Crosis,

I'm going to write provocatively about a series of topics, one comment per topic. My thesis for each will be that Perl is strong in some particular area in which Python is weak.

This one's about text processing that involves text segmentation (i.e. character or substring processing) of Unicode text.

In a nutshell, Perl is a world leader in getting this right. The Perl 5 community blazed a trail in helping devs deal with all the fiddly details in as practical a manner as it could manage, given its existing runtime and standard library functions. Perl 6 has blazed its own trail by developing a new runtime and standard library that make it easy for mere mortals to get the right results without needing a degree in Emoji data science.

Meanwhile, the Python language, its string type, its standard library, and its documentation all entirely ignore the pieces needed to get text segmentation right per Unicode Standard Annex #29 (linked above), so it is all but impossible for an ordinary dev to correctly segment arbitrary Unicode text in Python 3.7.
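
To make that concrete, here is a minimal sketch of the symptom, using only Python's built-in `str` and the standard `unicodedata` module. The sample strings and the `samples` name are my own illustration, not taken from any documentation. Each sample is what a reader would call one character (one extended grapheme cluster in UAX #29 terms), yet `len` and slicing work on code points.

```python
import unicodedata

# Each value below displays as ONE user-perceived character.
# These samples are illustrative assumptions chosen for this sketch.
samples = {
    "e_acute_decomposed": "e\u0301",             # 'e' + COMBINING ACUTE ACCENT
    "flag_france": "\U0001F1EB\U0001F1F7",       # two regional indicator symbols
    "family": "\U0001F468\u200D\U0001F469\u200D\U0001F467",  # emoji joined by ZWJ
}

# len() counts code points, not grapheme clusters.
code_point_counts = {name: len(text) for name, text in samples.items()}

# Normalizing to NFC helps only where a precomposed code point exists:
# the accented 'e' collapses to one code point, the flag does not.
nfc_counts = {
    name: len(unicodedata.normalize("NFC", text))
    for name, text in samples.items()
}

# Reversing by code point detaches the accent from its base letter.
reversed_accent = samples["e_acute_decomposed"][::-1]

for name in samples:
    print(name, code_point_counts[name], nfc_counts[name])
```

None of these counts is 1 before normalization, and nothing used here offers a way to ask for the grapheme-cluster count instead.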

Feel free to ask what the heck I'm talking about if it's not obvious from what I've written and the link I provided.

If you follow up on this comment I'll post another topic so we can keep things rolling. And if you comment on that, I'll post on another topic. I think I've got maybe 10 if you've got the stamina...

Hi monks, hope you're all doing well.


In reply to Re: Curious about Perl's strengths in 2018 by raiph
in thread Curious about Perl's strengths in 2018 by Crosis
