PerlMonks  

Re: Unicode vulgar fraction composition

by ikegami (Patriarch)
on Sep 24, 2020 at 02:14 UTC ( [id://11122146]=note )


in reply to Unicode vulgar fraction composition

The "C" and "D" transformations are inverses of each other, but there's no inverse to "K".

It's a destructive transformation. For example, the ANGSTROM SIGN (Å, U+212B) and the LATIN CAPITAL LETTER A WITH RING ABOVE (Å, U+00C5) are independent symbols with distinct meanings, but they have the same KC form and KD form. Once a string has been normalized, there's no way to tell which of the two it originally contained, so there's no way to reverse the transformation and restore the original meaning.
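A minimal sketch of that collapse, assuming the core Unicode::Normalize module is available (the variable names are only illustrative):

```perl
use strict;
use warnings;
use charnames ':full';
use Unicode::Normalize qw(NFKC NFKD);

my $angstrom = "\N{ANGSTROM SIGN}";                           # U+212B
my $a_ring   = "\N{LATIN CAPITAL LETTER A WITH RING ABOVE}";  # U+00C5

# Two different code points going in...
my $distinct_before = $angstrom ne $a_ring;

# ...but the same string coming out of either K form.
my $same_kc = NFKC($angstrom) eq NFKC($a_ring);
my $same_kd = NFKD($angstrom) eq NFKD($a_ring);

print "inputs differ\n"  if $distinct_before;
print "same NFKC form\n" if $same_kc;
print "same NFKD form\n" if $same_kd;
```

Given only the normalized string, nothing records which of the two inputs produced it.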

Re^2: Unicode vulgar fraction composition
by tobyink (Canon) on Sep 26, 2020 at 09:55 UTC

    One way of thinking about it, in a simplified ASCII world, would be if you lowercased words to do a case comparison:

    chomp( my $name = lc <$fh> );
    if ( $name eq 'bob jones' ) {
        die 'rejecting annoying person';
    }
    # Now I want to restore $name to its original mixture of upper and lower case

      Good analogy (though you really want fc instead of lc to perform a case-insensitive comparison).

        For ASCII, fc does the same thing as lc though. And I specified ASCII for that reason.
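A small sketch of both points, assuming Perl 5.16 or later (where the fc feature exists). The sharp s is just one convenient non-ASCII character where the two functions diverge:

```perl
use strict;
use warnings;
use charnames ':full';
use feature qw(fc unicode_strings);   # fc requires Perl 5.16+

# For ASCII-only input, fc and lc give the same result.
my $ascii_same = fc('Bob Jones') eq lc('Bob Jones');

# Outside ASCII they can differ: sharp s case-folds to "ss".
my $sharp_s  = "\N{LATIN SMALL LETTER SHARP S}";
my $lc_match = lc($sharp_s) eq lc('SS');   # false: lc leaves sharp s alone
my $fc_match = fc($sharp_s) eq fc('SS');   # true: both fold to "ss"

print "fc and lc agree on ASCII\n"     if $ascii_same;
print "lc misses sharp s versus SS\n"  unless $lc_match;
print "fc matches sharp s versus SS\n" if $fc_match;
```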

      Sure, I think it's intuitive why lc('Boaty McBoat') is conceptually a "lossy" transformation (in terms of being able to restore the original string).

      But NFKC("\N{VULGAR FRACTION THREE EIGHTHS}") is conceptually "lossless": there is only one Unicode character the resultant string "3\N{FRACTION SLASH}8" could be "composed" into.

      As I wrote, I get now why NFKC is conceptually lossy in general. But—unlike with lc—some specific decompositions are exceptions.
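To make the vulgar fraction case concrete, here is a hedged sketch using the core Unicode::Normalize module. It shows what NFKC produces for the fraction and that running the composing normalizations over the result does not bring the original character back:

```perl
use strict;
use warnings;
use charnames ':full';
use Unicode::Normalize qw(NFC NFKC);

my $fraction = "\N{VULGAR FRACTION THREE EIGHTHS}";   # U+215C
my $expanded = NFKC($fraction);

# The compatibility mapping yields DIGIT THREE, FRACTION SLASH, DIGIT EIGHT.
my $is_expected = $expanded eq "3\N{FRACTION SLASH}8";

# Composition only reverses canonical decompositions, so neither
# NFC nor a second NFKC pass restores the single-character fraction.
my $restored_by_nfc  = NFC($expanded)  eq $fraction;
my $restored_by_nfkc = NFKC($expanded) eq $fraction;

# Print code points rather than the characters themselves.
print join( ' ', map { sprintf 'U+%04X', ord } split //, $expanded ), "\n";
print "not restored\n" unless $restored_by_nfc || $restored_by_nfkc;
```

So even where only one precomposed character could plausibly match, the normalization forms themselves offer no way back; any recomposition would have to be a separate, hand-written mapping.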

        Consider:
        • 123\N{FRACTION SLASH}8
        • 12\N{VULGAR FRACTION THREE EIGHTHS}
        I would read the former as "one hundred twenty-three eighths", but the latter as "twelve (plus) three eighths", so it's not completely a one-to-one relationship.
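The two strings above can be checked directly. A minimal sketch, again assuming Unicode::Normalize, with illustrative variable names:

```perl
use strict;
use warnings;
use charnames ':full';
use Unicode::Normalize qw(NFKC NFKD);

my $plain = "123\N{FRACTION SLASH}8";                  # reads as 123/8
my $mixed = "12\N{VULGAR FRACTION THREE EIGHTHS}";     # reads as 12 and 3/8

# Different strings with different readings...
my $differ = $plain ne $mixed;

# ...yet decomposing the mixed number yields exactly the plain string,
# and both share a single NFKC form.
my $kd_collides = NFKD($mixed) eq $plain;
my $kc_collides = NFKC($mixed) eq NFKC($plain);

print "inputs differ\n"          if $differ;
print "NFKD forms are equal\n"   if $kd_collides;
print "NFKC forms are equal\n"   if $kc_collides;
```

A reverse mapping that turned every "3, FRACTION SLASH, 8" sequence back into the vulgar fraction would therefore rewrite 123/8 as well.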

        There's no way to know that 3/8 means three eighths. For example, it could mean March 8th. As such, there are two possible compositions for 3/8: VULGAR FRACTION THREE EIGHTHS and 3/8 itself.
