http://qs321.pair.com?node_id=851785


in reply to passing data structures from java to perl

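If you use JSON or XML, be mindful of your character sets. Java strings are UTF-16 internally, which can represent everything in ASCII, so pure-ASCII data is unlikely to cause trouble. Once your data goes outside of ASCII, be explicit about the encoding on both the writing and the reading side.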
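To make that concrete, here is a minimal sketch of how the same Java string sizes up once it leaves ASCII. The class name CharsetDemo and the sample strings are made up for illustration; only the standard library is used.

```java
import java.nio.charset.StandardCharsets;

class CharsetDemo {
    // Encode explicitly as UTF-8 instead of relying on the platform default charset.
    public static byte[] toUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Describe a string by UTF-16 code units, code points, and UTF-8 bytes.
    public static String describe(String s) {
        return s.length() + " UTF-16 units, "
             + s.codePointCount(0, s.length()) + " code points, "
             + toUtf8(s).length + " UTF-8 bytes";
    }

    public static void main(String[] args) {
        System.out.println(describe("perl"));          // pure ASCII
        System.out.println(describe("caf\u00e9"));     // e-acute is outside ASCII
        System.out.println(describe("\uD83D\uDE00"));  // U+1F600, a surrogate pair in UTF-16
    }
}
```

For ASCII all three counts agree; outside ASCII they drift apart, which is why the receiving Perl side has to be told which encoding the bytes are in.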

XML can declare its character encoding in the document itself, which is to its advantage, but it takes more work to rig up.
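As a rough sketch of what that rigging looks like on the Java side, the standard StAX writer lets you name the encoding both for the byte stream and for the XML declaration. The class name XmlEncodingDemo and the element name "item" are placeholders, not anything from the original question.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

class XmlEncodingDemo {
    // Write a one-element document as UTF-8 bytes with a matching XML declaration.
    public static byte[] writeXml(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            XMLStreamWriter w = XMLOutputFactory.newInstance()
                    .createXMLStreamWriter(out, "UTF-8");   // encoding of the bytes
            w.writeStartDocument("UTF-8", "1.0");            // encoding named in the declaration
            w.writeStartElement("item");                     // hypothetical element name
            w.writeCharacters(text);
            w.writeEndElement();
            w.writeEndDocument();
            w.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] xml = writeXml("caf\u00e9");
        System.out.println(new String(xml, StandardCharsets.UTF_8));
    }
}
```

The point of passing the encoding twice is that the declaration and the actual bytes must agree; a parser on the Perl side will trust whatever the declaration says.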

Re^2: passing data structures from java to perl
by almut (Canon) on Jul 28, 2010 at 20:15 UTC

    It should be no real problem to have Java create UTF-8 encoded files/streams (both JSON and XML), and JSON can read UTF-8 data just fine... (Similarly for XML)

      Agreed. As long as no characters outside ASCII are used (which is really easy to do by accident in Java, since strings are all UTF-16), it's all gravy.

      Java strings are UTF-16

      A good XML writer that prevents you from going outside of the declared encoding will protect you from future mistakes. The downside is the extra effort of using some XML APIs over others. JSON is usually gravy in all languages.

      Gravy.. mmmm...

        Java strings are UTF-16

        The way strings are stored internally doesn't really matter.

        While Perl stores Unicode strings internally as UTF-8 (or something very close to it), it can encode those strings to many other encodings for output. The same holds for Java: while it stores strings internally as UTF-16, there's no problem creating UTF-8 output, for example.

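        // needs: import java.io.*;
        Writer utf8out = new BufferedWriter(
            new OutputStreamWriter(new FileOutputStream("outfile"), "UTF-8"));
        utf8out.write("some unicode data");
        utf8out.close();  // flushes the buffer; without this the file may stay empty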
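
To check the claim that internal storage doesn't matter, here is a small round-trip sketch: encode a string to UTF-8 bytes through a Writer, decode it again through a Reader, and compare. The class name Utf8RoundTrip and the in-memory streams are assumptions for illustration; a real exchange would use a file or socket instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class Utf8RoundTrip {
    // Encode a (UTF-16 backed) Java string as UTF-8 bytes via a Writer.
    public static byte[] write(String s) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(bytes, StandardCharsets.UTF_8)) {
            w.write(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Decode UTF-8 bytes back into a Java string via a Reader.
    public static String read(byte[] data) {
        StringBuilder sb = new StringBuilder();
        try (Reader r = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            char[] buf = new char[1024];
            int n;
            while ((n = r.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "caf\u00e9 \uD83D\uDE00";
        byte[] encoded = write(original);
        System.out.println(encoded.length + " bytes, round trip ok: "
                + read(encoded).equals(original));
    }
}
```

As long as both ends name the same encoding, the bytes in between are all that matters; the Perl side would decode them as UTF-8 the same way the Reader does here.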