PerlMonks  

Re^4: Global substitution of non-base-plane Unicode characters

by Jim (Curate)
on Feb 24, 2014 at 04:00 UTC ( [id://1075930]=note )


in reply to Re^3: Global substitution of non-base-plane Unicode characters
in thread Global substitution of non-base-plane Unicode characters

In this case, using printf instead of print is justified and, in fact, smart. The default value of the predefined variable $\ ($OUTPUT_RECORD_SEPARATOR) is undef, which is what Peter wants and expects here. But a very surprising and potentially elusive bug can be introduced into Peter's program when the value of $\ is later changed, because print appends $\ to everything it prints. Using printf in this admittedly unusual way ensures that the Unicode byte order mark is never followed by any unexpected character such as a newline (\n): printf, unlike print, never appends $\.
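
The difference can be seen in a minimal sketch. It assumes output is captured through an in-memory filehandle with a UTF-8 layer; capture() is a hypothetical helper written for this illustration and is not part of Peter's program.

```perl
use strict;
use warnings;

# capture() is a hypothetical helper: it runs a code ref against an
# in-memory filehandle and returns the UTF-8 encoded bytes written to it.
sub capture {
    my ($code) = @_;
    my $buffer = '';
    open my $fh, '>:encoding(UTF-8)', \$buffer
        or die "Cannot open in-memory handle: $!";
    $code->($fh);
    close $fh;
    return $buffer;
}

my ($with_print, $with_printf);
{
    local $\ = "\n";    # simulate a later change to $OUTPUT_RECORD_SEPARATOR
    $with_print  = capture(sub { my $fh = shift; print  {$fh} "\x{FEFF}" });
    $with_printf = capture(sub { my $fh = shift; printf {$fh} "\x{FEFF}" });
}

# The BOM is three bytes in UTF-8; print adds one more byte for $\.
printf "print:  %d bytes\n", length($with_print);
printf "printf: %d bytes\n", length($with_printf);
```

With $\ set to "\n", the print version should come out one byte longer than the printf version, which is exactly the stray newline after the BOM described above.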

Jim

Replies are listed 'Best First'.
Re^5: Global substitution of non-base-plane Unicode characters
by kcott (Archbishop) on Feb 24, 2014 at 04:29 UTC

    If "the value of $\ is later changed" was a genuine concern, a better way would be to explicitly code the following rather than expecting a subsequent maintainer to automatically realise why printf was used here:

    ...
    {
        local $\;
        print "\x{FEFF}";
    }
    ...

    And, of course, a much better way to change $\ in the middle of the program would be along these lines:

    ... code as it is now ...
    # later changes:
    ...
    {
        local $\ = "\n";
        ... code using changed $\ ...
    }
    ...
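
    The scoping that both snippets rely on can be checked with a small sketch. The helper note_ors() and the @seen array are hypothetical names used only for this illustration; the string 'undef' stands in for the default value of $\.

    ```perl
    use strict;
    use warnings;

    # Record the value of $\ each time note_ors() is called.
    my @seen;
    sub note_ors { push @seen, defined $\ ? $\ : 'undef' }

    note_ors();                 # before the block: default (undef)
    {
        local $\ = "\n";        # the change is confined to this block
        note_ors();             # inside the block: "\n"
    }
    note_ors();                 # after the block: previous value restored

    # Show the three recorded values, spelling the newline as "\n".
    print join(', ', map { $_ eq "\n" ? '"\n"' : $_ } @seen), "\n";
    ```

    Because local restores the previous value when the block exits, code outside the block never sees the changed $\.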

    -- Ken

      I like your solution making the printing of the BOM a local block better than my use of printf, thank you. I will use that.

      Peter

      TIMTOWTDI.

      In Perl, printing exactly one character—a Unicode byte order mark—and nothing else is a special case of formatted printing, vis-à-vis generalized printing of lines of text with built-in programming conveniences (e.g., automatic newline handling).

      Would you find this troublesome?

      printf '%s', "\N{U+FEFF}";

      Or this?

      printf '%c', 0xfeff;
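
      As a quick check that the two forms above produce the same single character as the "\x{FEFF}" literal used elsewhere in the thread, here is a minimal sketch. It uses sprintf in place of printf so the results can be compared rather than printed; the variable names are illustrative only.

      ```perl
      use strict;
      use warnings;

      my $literal = "\x{FEFF}";                   # the BOM as a string literal
      my $from_s  = sprintf '%s', "\N{U+FEFF}";   # string conversion
      my $from_c  = sprintf '%c', 0xfeff;         # character from a code point

      my $all_same = ($from_s eq $literal && $from_c eq $literal) ? 'yes' : 'no';

      printf "length %d, code point U+%04X, all same: %s\n",
          length($literal), ord($literal), $all_same;
      ```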

      Jim

        This really has nothing to do with what, if anything, I find "troublesome".

        You hit the nail on the head with your earlier post: "Using printf in this admittedly unusual way ...".

        Writing code in an unusual way (without any indication of why this was done) makes future maintenance error-prone.

        Yes, there's more than one way to do it. Here's another, that simply adds a comment, that I would consider better:

        printf "\x{FEFF}"; # printf() so $\ is not appended

        -- Ken
