Thanks again. This code was a bit of a mess, and your comments and the others' have helped me see what was going wrong. I apologise for not providing better information, but there was a lot of code for something which should have been quite simple. This is what the original code did:
- Opened data file with encoding(UTF-8)
- Read a line of comma separated strings from it and split them on the comma
- Put the split fields into a hash with keys describing the data
- Passed the hash to a hand-written function that tried to produce an application/x-www-form-urlencoded string, but this function was broken and instead just stuck an '&' between each key=value, so it wasn't form encoded at all
- Passed the resulting string into NFKD and did the substitution as I described earlier
- Passed the resulting string into encode to encode as UTF-8
- Passed the resulting string into a LWP POST
So it was horribly broken because it did not form encode properly, and the NFKD was a workaround he discovered, which I suspect only works because the API does normalization itself (which would not surprise me). I replaced the hand-written (incorrect) form encoding with build_urlencoded from WWW::Form::UrlEncoded and, as you both state, the NFKD is a no-op, as is the substitution, and it works. The confusion arose because, when it originally didn't work (without the NFKD), API support apparently told him to turn diacritics into normal characters. The actual code was a lot more complicated than this, and the more I looked at it the more problems I found, so I've spent most of the day rewriting it.
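For anyone who finds this thread later, here is a minimal sketch of what proper form encoding involves, using only core modules. It is neither the original code nor the internals of WWW::Form::UrlEncoded; form_escape, form_encode and the sample record are names and data I made up for illustration.

```perl
use strict;
use warnings;
use utf8;                  # the string literals below contain non-ASCII characters
use Encode qw(encode);

# Hypothetical helper: percent-encode one key or value for
# application/x-www-form-urlencoded.
sub form_escape {
    my ($text) = @_;
    my $bytes = encode('UTF-8', $text);    # characters -> UTF-8 bytes
    # Escape every byte except unreserved characters and the space.
    $bytes =~ s/([^A-Za-z0-9\-._~ ])/sprintf('%%%02X', ord $1)/ge;
    $bytes =~ tr/ /+/;                     # form encoding writes a space as '+'
    return $bytes;
}

# Hypothetical helper: escape each key and value, then join the pairs.
# Keys are sorted only to make the output deterministic.
sub form_encode {
    my (%fields) = @_;
    return join '&',
        map { form_escape($_) . '=' . form_escape($fields{$_}) }
        sort keys %fields;
}

# Made-up sample record with diacritics and reserved characters.
my %record = (name => 'Zoë Müller', note => 'a&b=c');

# What the broken function effectively did: join without escaping.
my $naive = join '&', map { "$_=$record{$_}" } sort keys %record;

my $body = form_encode(%record);
print "$body\n";
```

In the naive join, the '&' and '=' inside the note value are indistinguishable from the separators, and the diacritics go out as raw bytes. With each key and value escaped first, the body is plain ASCII and no NFKD step or diacritic stripping is needed on the client side.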
Thanks again for your insights.