An anonymous reader asks: "Like many others these past few weeks, I took the time to download the latest RedHat (replace with your favorite distro here) and upgrade my system. Despite the usual mail hangovers (corporate mail is still Outlook through POP3, etc.), the new *Office suites are great and I can almost dump Windows. However I was amazed at the sorry state of Linux instant messaging. Before you flame me, mod me down or doom me to a lifetime of Windows usage, allow me to explain: I am not a native English speaker, and it seems that every single Windows IM client I use (except Trillian) can deal properly with accented characters. Worse, every third-party Linux client I have tried deals with them differently, resulting in garbled (vaguely Unicode-ish) junk! Does the Slashdot crowd (especially the non-English folk) have a solution for this? (Short of VMware and Win32 clients, that is. Wine doesn't work at all for me)"
"Portuguese (Continental) is my native language, and I speak French, Spanish, some German and (after spending quite a few months in Poland on a project) passable Polish. I speak those languages practically every other day, and besides e-mail, I have taken to using MSN and Yahoo to discuss work with my clients and colleagues.
I have thus far tried GAIM (both the RedHat 8.0 bundled and the CVS versions), Kopete, Everybuddy, the native Yahoo! (which crashes more often than not and does not even deserve the 'beta' moniker), and none of them are suitable. IRC does the job, sure, but most of the people I have to reach can't use it (firewalls, usually) and won't install another IM client, so the solution has got to be at my end."
Why it works on windows (Score:5, Informative)
Ideally, every text would either be unicode or declare the encoding, but of course that's not going to happen. So we want an IM client that knows about different encodings and can be told that a certain person sends and receives messages in a particular encoding (and probably a way to specify a relation between encodings, keyboard bindings, and fonts - or is that the responsibility of the operating system or window manager or KDE or whatever the architecture is).
As far as I know, no such client exists.
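For what it's worth, the per-contact idea above could be sketched roughly like this. It is a minimal illustration in Python, not code from any existing client: the contact names, the table, and the UTF-8 fallback are all made-up assumptions.

```python
# Hypothetical per-contact encoding table. The contact names and the
# choice of encodings are invented for illustration only.
CONTACT_ENCODINGS = {
    "colleague_pl": "iso-8859-2",  # Polish contact on a Latin-2 client
    "client_pt": "iso-8859-1",     # Portuguese contact on a Latin-1 client
    "friend_ru": "koi8-r",
}
DEFAULT_ENCODING = "utf-8"  # assumed fallback for unknown contacts


def encoding_for(contact):
    """Return the encoding configured for a contact, or the default."""
    return CONTACT_ENCODINGS.get(contact, DEFAULT_ENCODING)


def decode_incoming(contact, raw):
    """Turn bytes received from a contact into text."""
    # errors="replace" substitutes U+FFFD for undecodable bytes
    # instead of raising an exception.
    return raw.decode(encoding_for(contact), errors="replace")


def encode_outgoing(contact, text):
    """Turn text into the bytes this contact's client expects."""
    # errors="replace" substitutes "?" for characters the target
    # encoding cannot represent.
    return text.encode(encoding_for(contact), errors="replace")
```

The point of the design is only that the encoding is looked up per person rather than taken from the system default; the keyboard and font side of the question is left out entirely.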
Re:Why it works on windows (Score:4, Insightful)
Trillian, for one, does not support accented characters properly either, and it's a Win32 app. If it handled ISO-8859-1 encoding properly, it might just about work.
Gaim, AFAIK, suffers from the same malaise, and I could never get the Linux Yahoo client to run twice properly (I have to keep removing the ~/.ymessenger dir)
Fire (http://www.epicware.com, for the Mac) allows you to select different character encodings, and I have nearly perfect interaction with Win 32 clients once I figure out their settings (except for Yahoo, who are notoriously finicky and keep changing their protocol)
And as for the "Just speak English" replies... Well... That's an attitude problem I certainly am glad half the world doesn't have.
Re:Why it works on windows (Score:5, Informative)
Windows IS 8-bit for the most part. (Score:2)
Effectively, yes. Windows does use an 8-bit character set for average consumers. If you code your apps to use 16-bit Unicode characters, they will only run on an NT-based version, and not at all on Win95/98/98se/ME... (Remember, there is not really a CreateWindow function in win32, just a macro that will resolve to either CreateWindowA or CreateWindowW )
One of the problems is the evil TCHAR that MS and MFC push on unknowing developers. They themselves do not use either, but instead had their own proprietary means. Recently, however, they did go and publish these libraries [microsoft.com] as a standard extension... (IIRC, this could be one of those "hidden API" things that the Office team created out of need)
The main problem usually comes in with the mix of controls, protocols, and approaches that people take.
(However, programmer laziness is a lot of the cause of the problems you encounter)
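To make the 8-bit versus 16-bit split above concrete, here is a rough sketch in Python. It does not call any Win32 API; Python's cp1252 and cp1250 codecs stand in for the 8-bit code pages, utf-16-le stands in for the wide-character form, and the function names are made up.

```python
def encode_ansi(text, code_page="cp1252"):
    """Approximate what an 8-bit code page can carry.

    Characters missing from the code page become "?".
    """
    return text.encode(code_page, errors="replace")


def encode_wide(text):
    """Approximate the 16-bit form: UTF-16 code units, little-endian."""
    return text.encode("utf-16-le")


portuguese = "olá"
polish = "żółw"

# Western European text fits the Western code page.
print(encode_ansi(portuguese))

# Polish letters such as "ż" and "ł" are not in cp1252, so they are
# lost unless the Central European code page (cp1250) is used.
print(encode_ansi(polish))
print(encode_ansi(polish, "cp1250"))

# The wide form represents both without picking a code page.
print(encode_wide(polish))
```

With an 8-bit character set, both ends have to agree on the code page out of band, which is where mixed controls and protocols tend to go wrong.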
Gaim and GnomeICU work fine (Score:4, Informative)
I'm an English speaker, but I just opened a window in each and sent a friend a message copied and pasted from a Spanish website. The messages went through just fine with accented characters and all.
Re:Gaim and GnomeICU work fine (Score:3, Informative)
Take care - RL
Re:Gaim and GnomeICU work fine (Score:5, Insightful)
Try it with, say, Hindi or Hebrew or Thai - a language with completely different characters, not just accents.
Re:Gaim and GnomeICU work fine (Score:5, Informative)
I found no way to change your character set in either of these programs, although they may use your system's default character set.
Mozilla has no problem displaying any glyphs from these character sets on my computer. It would be nice if other Linux programs got to that point too. (I tested this by copying text from the World categories of the open directory project [dmoz.org].)
Re:Gaim and GnomeICU work fine (Score:1)
How do you deal with all languages? (Score:5, Informative)
There are three levels: bytes, character sets, and glyphs. Your program receives a stream of bytes and you have to display those bytes to the user as text in the correct language. That display is done using glyphs (font characters). A character set maps bytes to glyphs.
There are tens if not hundreds of different character sets. A character set might map each byte to a different glyph (latin 1 and 2), only some bytes to glyphs (ASCII), multiple bytes to a glyph (Unicode), or varying numbers of bytes to glyphs and some bytes to no glyph (UTF-8).
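A small Python sketch of what goes wrong when the two ends disagree about the character set. The sample word and function names are only for illustration; the "sender uses UTF-8, receiver assumes Latin-1" scenario is an assumption about one common failure, not a diagnosis of any particular client.

```python
def misread(text, sent_as, read_as):
    """Encode text with one character set and decode it with another."""
    return text.encode(sent_as).decode(read_as)


original = "ação"

# Sender uses UTF-8, receiver assumes Latin-1: every accented letter
# turns into two junk characters.
garbled = misread(original, "utf-8", "iso-8859-1")
print(garbled)  # aÃ§Ã£o

# Because Latin-1 maps every byte to a character, this particular
# mistake is reversible.
repaired = garbled.encode("iso-8859-1").decode("utf-8")
print(repaired == original)  # True

# The opposite mistake is worse: Latin-1 bytes are usually not valid
# UTF-8 at all, so a strict decoder raises an error.
try:
    misread(original, "iso-8859-1", "utf-8")
except UnicodeDecodeError as exc:
    print("cannot decode:", exc.reason)
```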
Java APIs handle this by representing all Strings internally in Unicode. Unicode is the granddaddy of all character sets. Almost every glyph has a value in Unicode. When you get a stream of bytes in Java you can use a Reader to translate that stream into Unicode. The Reader is constructed with the name of a character set. If no character set is specified, the system's default is used. The character set to be used usually comes from meta-data. In HTML, for example, the character set for the page is transmitted by the server in the data that comes before the page itself is delivered (the HTTP header).
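The same idea, sketched in Python rather than Java: take the character set name from the metadata and use it to decode the bytes, instead of relying on the system default. The function names and the ISO-8859-1 fallback are assumptions made for this example.

```python
import codecs
from email.message import Message

FALLBACK = "iso-8859-1"  # assumed fallback when no usable charset is given


def charset_from_content_type(header_value, default=FALLBACK):
    """Pull the charset parameter out of a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns the charset lowercased, or None
    # when the header carries no charset parameter.
    return msg.get_content_charset() or default


def decode_body(body, content_type):
    """Decode bytes using the charset declared in the header."""
    name = charset_from_content_type(content_type)
    try:
        codecs.lookup(name)  # make sure Python knows this charset
    except LookupError:
        name = FALLBACK
    return body.decode(name, errors="replace")


print(decode_body(b"\xb3\xf3d\xbc", "text/html; charset=ISO-8859-2"))  # łódź
```

An IM protocol without such a header leaves the receiver guessing, which is the situation the programmer has to handle explicitly.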
Once you have a Unicode string it is straightforward to find a glyph to display for each character. This all depends on the right fonts being installed, but usually APIs handle it for you.
I18n problems usually occur when a programmer doesn't know how to translate bytes into Unicode characters. The programmer may always use the default character set, ignoring any meta-data. Similarly on sending data, the programmer must tell the other end with which character set the data is sent. Other problems may occur when a needed font is not installed.
Often, a system works with a specific character set that doesn't support all characters (such as latin 1). When more characters are desired in such an instance, escape characters are often used. There are \uXXXX style escape sequences in source code, and &#XXX; escape sequences in HTML. Such escape sequences may be able to retrofit an older system in which a specific non-inclusive character set is assumed.
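A short Python sketch of both escape styles, pushing text through an ASCII-only channel and getting it back. The function names are made up; the error handlers and html.unescape are standard library features.

```python
import html


def to_html_ascii(text):
    """Replace non-ASCII characters with numeric HTML references."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


def to_source_ascii(text):
    """Replace non-ASCII characters with backslash escapes.

    Python writes \\xNN for code points below 256, \\uXXXX up to
    U+FFFF, and \\UXXXXXXXX beyond that.
    """
    return text.encode("ascii", errors="backslashreplace").decode("ascii")


def from_html(text):
    """Turn HTML character references back into characters."""
    return html.unescape(text)


escaped = to_html_ascii("łódź")
print(escaped)             # &#322;&#243;d&#378;
print(from_html(escaped))  # łódź
print(to_source_ascii("ł"))
```

Both ends still have to agree that the escapes are in use, otherwise the receiver just sees the raw escape text.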
AMSN (Score:4, Informative)
Licq (Score:3, Informative)
My experience with Licq is that it works perfectly in ISO 8859-1 (I use mainly French). In the options you can select other encodings as well (UTF-8, ISO 8859-2, ISO 8859-6, CP 1256, KOI8-R, JIS7, etc.), but I never used them. Just make sure the proper fonts are installed, and you should be good to go.
The only problem I see is that you don't mention ICQ in your possible choices...
Instead of railing against English speaking... (Score:1, Flamebait)
Licq (Score:2, Informative)
So I would recommend it for western languages and gg or ekg (two excellent gadu-gadu clients) for chats with your Polish friends.
Raf
Kinkatta (Score:2)
State of Linux IM (Score:4, Insightful)
Let's take a look.
Pros: no Linux IM client that I know of has ads or relies extensively on client-side security (*cough* ICQ).
Cons:
* There is no Linux ICQ client that can consistently transfer files to Windows users. GnomeICU doesn't support it with any recent ICQ version. Licq requires you to convince the other person to manually flip a protocol switch and only sometimes works then. Licq's file transfer does not work with Trillian (which is growing in popularity). There's a host of IM dead projects, all of which only partly work.
* Protocol lag. I don't know of anything that can *fully* speak the new ICQ protocols.
* Periodically doesn't work. Not sure how bad this is on GAIM these days, but it used to be awful as AOL and MS duked it out. Sudden, unexpected protocol changes are the name of the game.
* Only reasonable ICQ front end requires Qt: Licq is the most usable Linux ICQ client I know of, and the best gtk front end *sucks*, with only one feature over the qt version (auto-establishment of secure connections). It's unstable, doesn't have middle-button-opening of message windows, doesn't highlight usernames in the username clist...
* Gabber *could* be an answer, but isn't. There's a really nice, secure Jabber client called Gabber. Unfortunately, no one *uses* Jabber, and the gateways to other protocols for it are flaky as hell.
* talk is dead. It was the most reliable system I've ever used to talk to people on any computer system, VMS or UNIX.
Meanwhile, IM continues to be a crucial part of the desktop, even entering the business world. I'd like to see Red Hat on some of the groups shaping this stuff (as Microsoft and AOL are). I'd *hate* the standard protocol to be unencrypted or have a message size limit, or rely on proprietary Windows libraries.
Re:State of Linux IM (Score:3, Informative)
* Re: periodically not working: We have not had any problems with oscar for nearly a year. Any problem has been immediately resolved.
Re:State of Linux IM (Score:1, Interesting)
-- dr.Noah
Perfect timing! i18n chat is coming to Linux! (Score:5, Informative)
Oh ho HO! International chat certainly is a problem in the Linux community, owing to many factors; not the least of which being that the developers of the major IM forces out there seem to largely be from ISO-8859-1 locales. Thus ISO-8859-1 works pretty well, and other, more ASCII-deviant (CJK) locales work with virtually no success.
The good news is that an answer is here! I've been on a crusade to make Gaim [sourceforge.net] the penguin-pimpin'est international chat machine available, and it's really paying off! For the stable series of gaim (0.59.x, currently at 0.59.4 with 0.59.5 to be released possibly as I type this) (I just looked and 0.59.5 is out), if your locale is set correctly you should be able to chat in whatever language your little heart desires... (I have personally successfully used it with English and Japanese) as long as you aren't chatting over the AOL Instant Messenger service or the ICQ service, both of which use the Oscar protocol. However, MSN and Jabber, for instance, should be substantially correct.
The fabulous news is that the development version of gaim coming at us right now has first class i18n support on the whole gamut of protocols! With a timely move to Gtk+ 2.0 [gtk.org] and the Pango text formatting system, Gaim now has international text formatting second to none in the OSS community and hardly rivalled in the commercial world. Images like this shot of gaim displaying Japanese, Russian, and English simultaneously [sourceforge.net] display what I'm talking about very nicely. Not only can we do non-English text, thanks to UTF-8 we can do all of the modern languages of the world simultaneously. In addition, support for internationalization on the troublesome Oscar (AIM and ICQ) protocol has been added and is coming along very nicely.
In short, look for the next major release of gaim to clear up these issues in a big way. For those hardy souls wanting to test the code that's currently in CVS, please note that it is NOT currently complete, and issues that you have are most likely transient.
Also, please be aware that your locale MUST be set correctly for internationalized programs to work the way you expect. Programs that only deal with your system can be more forgiving, but programs that communicate over the network absolutely must know more about your locale, including your character set. If the output of your 'locale' command lists LC_CTYPE as, for instance, "C", it's no wonder i18n isn't working! Set your LANG or LC_CTYPE correctly for your language (en_US for English with ISO-8859-1, es_ES or es_MX for Spanish, pt_PT or pt_BR for Portuguese, ja_JP for Japanese, zh_CN for Chinese, etc.) and you might see general i18n support improving dramatically.
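As a rough aid for the check described above, here is a Python sketch that works out which setting governs LC_CTYPE and whether it is still the bare "C" locale. It only inspects environment-style dictionaries and applies the usual precedence (LC_ALL, then LC_CTYPE, then LANG); it does not ask the C library, and the function names are made up.

```python
def effective_ctype(env):
    """Return the locale name that governs LC_CTYPE.

    Precedence: LC_ALL, then LC_CTYPE, then LANG. Empty values count
    as unset, and "C" is used when nothing is set.
    """
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:
            return value
    return "C"


def charset_of(locale_name):
    """Extract the codeset from a name like "pt_PT.ISO-8859-1@euro".

    Returns None when the name carries no explicit codeset.
    """
    if "." not in locale_name:
        return None
    codeset = locale_name.split(".", 1)[1]
    return codeset.split("@", 1)[0] or None


def looks_misconfigured(env):
    """True when the character type locale is still "C" or "POSIX"."""
    return effective_ctype(env) in ("C", "POSIX")


print(effective_ctype({"LANG": "pt_PT", "LC_CTYPE": "C"}))  # C
print(charset_of("ja_JP.UTF-8"))                            # UTF-8
```

To check a real session, pass os.environ (after import os) instead of a hand-written dictionary.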
PSI (Score:2)
Psi features the following:
* Message (ICQ-style) and Chat (AIM-style) modes
* Drag and drop to send to multiple contacts
* Full Unicode support
* Secure connections
* Saving contact list locally, and server sync on login
* Icon Themes
* Agent registration and searching
* Retrieving and updating User Info
* Sound support for incoming events
* Auto-away after a configurable amount of time
* Tray/dock icon for KDE/GNOME environments
* Language plugins
It is also available for MS Windows and Mac OS X as well as X (Linux).
Try pebrot (Score:1, Informative)
Jabber has Unicode support (Score:2)