PerlMonks
Matching non-ASCII file contents with file name
by mldvx4 (Friar) on Dec 22, 2022 at 11:56 UTC ( [id://11149027]=perlquestion )
mldvx4 has asked for the wisdom of the Perl Monks concerning the following question:

My goal is to use the substitution operator (s///) to replace occurrences of a question mark (?) with an inverted question mark (¿) on a specific line in a large number of files. I am having trouble because what actually gets substituted inside the file does not match what ends up in the file name in the file system. I am grateful for any tips or guidance on how to make what is inside the files match the various file names out in the file system. Perhaps it is a matter of encoding, again?

It is claimed that the inverted question mark is U+00BF, which, strangely, is C2 BF in UTF-8 according to a "Unicode Character Table" site. In the shell (Bash) on an EXT4 file system, that seems to be the case, and the Perl utility rename seems to work that way, too.
And those files show up in Apache2's access logs containing the escape sequence "%C2%BF" in the URL in place of the inverted question mark.
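For reference, the three forms mentioned so far (U+00BF, C2 BF, and %C2%BF) can be reproduced from one another. The following is a minimal sketch using only the core Encode module; the variable names are illustrative. It shows one character becoming two bytes when encoded as UTF-8, and those two bytes becoming the escape sequence seen in the access log.

```perl
use strict;
use warnings;
use Encode qw(encode);

# One character: U+00BF INVERTED QUESTION MARK.
my $char = "\x{BF}";

# Encoding that character as UTF-8 yields two bytes, C2 BF.
my $bytes = encode('UTF-8', $char);

# Hex dump of the bytes, and the same bytes percent-encoded as in a URL.
my $hex     = join ' ', map { sprintf '%02X',   ord } split //, $bytes;
my $escaped = join '',  map { sprintf '%%%02X', ord } split //, $bytes;

printf "characters: %d, bytes: %d\n", length $char, length $bytes;
print  "UTF-8 bytes: $hex\n";
print  "percent-encoded: $escaped\n";
```

So the code point and the byte sequence do not contradict each other: U+00BF names the character, while C2 BF is how UTF-8 stores it on disk, in file names, and in URLs.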
Though if I leave out the use utf8 part, then I kind of get the "right" result only according to xxd,
While keeping UTF-8, how can I get "¿" inside the files to match the "¿" out in the file name and still look right?
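One possible approach, offered as a sketch rather than a definitive answer: keep use utf8 so that the literal ¿ in the source is a single character, read and write file contents through an :encoding(UTF-8) layer, and explicitly encode file names to UTF-8 bytes before handing them to the file system. The "title:" line, the sample title, and the temporary directory below are all hypothetical stand-ins for the real files, and the sketch assumes the file system stores names as UTF-8 bytes, as on the EXT4 setup described above.

```perl
use strict;
use warnings;
use utf8;                        # the literal ¿ below is one character, U+00BF
use Encode qw(encode);
use File::Temp qw(tempdir);
use File::Spec;

my $dir = tempdir(CLEANUP => 1);

# Hypothetical title. File names reach the OS as bytes, so encode explicitly.
my $title = '¿Qué?';
my $path  = File::Spec->catfile($dir, encode('UTF-8', "$title.txt"));

# Write a sample file through a UTF-8 output layer.
open my $out, '>:encoding(UTF-8)', $path or die "open for write: $!";
print {$out} "title: ?Qué?\n";
close $out or die "close: $!";

# Read it back as characters, not bytes.
open my $in, '<:encoding(UTF-8)', $path or die "open for read: $!";
my $text = do { local $/; <$in> };
close $in or die "close: $!";

# Replace the leading ? on the (hypothetical) title line only.
$text =~ s/^(title: )\?/${1}¿/m;

# Write the result back, again through the UTF-8 layer.
open $out, '>:encoding(UTF-8)', $path or die "open for rewrite: $!";
print {$out} $text;
close $out or die "close: $!";

# Check the raw bytes: contents and file name should both contain C2 BF.
open my $raw, '<:raw', $path or die "open raw: $!";
my $bytes = do { local $/; <$raw> };
close $raw or die "close: $!";

my $needle = encode('UTF-8', '¿');
print "contents contain C2 BF\n"  if index($bytes, $needle) >= 0;
print "file name contains C2 BF\n" if index($path,  $needle) >= 0;
```

The design choice here is to decode at input, work with characters inside the program, and encode at every output boundary, including file names. With use utf8 but no output layer, Perl may write the character as the single Latin-1 byte BF, which would explain contents that differ from the C2 BF in the file name.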