in reply to Re: Printing the first letter of the Hebrew alphabet (U05D0) kills script?
++
xterm -en UTF-8 made everything work. No lost/hidden output and I even see the Hebrew glyphs. Yeah! Now I not only have an explanation for the weird behavior, but a way to get everything to work just as I want it. -wc was giving me the output, but not the glyphs.
It has been a good day. Thank you.
Update: I thought I'd add a couple of notes on configuring xterm so that one need not type xterm -en UTF-8 every time one starts a shell.
Each flavor of Linux seems to have its own locations for xterm configuration files, and figuring out the ones that were right for my system took some searching. Web pages are also a bit confusing on this matter because xterm appears to have undergone some development. -u8 is part of an older way of managing UTF-8 and is not well integrated into the way xterm currently handles encoding issues. Newer versions of xterm use -en on the command line and locale in a configuration file.
For Debian (Lenny) the important facts are:
- machine/site-wide configurations are in /etc/X11/app-defaults/XTerm
- personal xterm configurations are in ~/.Xdefaults. Note: some web pages say the personal configuration file is ~/.Xresources. Ignore them if you are using Debian. For non-Debian systems YMMV. You may be able to figure out what your own system requires by checking the end of the man page for xterm that ships with your system.
- The following line needs to be added to either the site or personal configuration file: XTerm*locale: UTF-8. That one line is equivalent to -en UTF-8 on the command line.
- By default, xterm assumes that any input to the terminal via keyboard or via program output will be UTF-8 characters. If that is the case, one need not set LANG, LANGUAGE, LC_ALL or LC_CTYPE to make xterm happy (other applications may need them, just not xterm).
- If characters are represented as something other than UTF-8, then one must set LC_CTYPE to the encoding used by the keyboard/program output to the terminal. xterm uses the value of this variable to help it process non-UTF-8 input.
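
For anyone who would rather script the configuration step, here is a minimal sketch of adding the resource line described above. The helper name add_xterm_locale is my own invention (it is not part of xterm), and the choice of ~/.Xdefaults in the usage comment assumes the Debian layout from these notes; substitute whatever file your system actually reads.

```shell
#!/bin/sh
# Sketch only: append the UTF-8 locale resource to an X resource file,
# unless that exact line is already present.
# add_xterm_locale is a hypothetical helper name, not an xterm tool.
add_xterm_locale() {
    file=$1
    line='XTerm*locale: UTF-8'

    # Create the file if it does not exist yet, so grep has something to read.
    [ -e "$file" ] || : > "$file"

    # -F: treat the pattern as a fixed string (the * is literal)
    # -x: match whole lines only, -q: no output, just an exit status
    if ! grep -Fxq "$line" "$file"; then
        # If the file is non-empty and does not end in a newline, add one
        # first so the new resource starts on its own line.
        if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
            printf '\n' >> "$file"
        fi
        printf '%s\n' "$line" >> "$file"
    fi
}

# Usage (personal file on Debian, per the notes above):
#   add_xterm_locale "$HOME/.Xdefaults"
```

Running the helper twice on the same file leaves a single copy of the line. Going by the notes above, an xterm started afterwards should behave as if -en UTF-8 had been given on the command line.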