Recently I tried to read the contents of a drive on my machine with
Dolphin XP. I found that it does not display Russian letters correctly. I am not sure where exactly the problem starts. Is it in external interfacing? Any solutions?

The following code demonstrates the problem:

    c := OrderedCollection new.
    File forAll: '*' in: 'C:\Russian' do: [:each |
        each fileName ~= '..' ifTrue: [c addLast: each fileName]].
    c inspect

By the way, is there any better way to go through the directory structure? Not that it is better, but I use recursion in C# or Java.

Thank you
"Sergei Gnezdov" <[hidden email]> wrote in message
news:1095479764.VwjC0ADW8T4FF4BmTgTw4g@teranews...

> Recently I tried to read contents of the drive on my machine with Dolphin
> XP. I found that it does not display Russian letters correctly. I am not
> sure where exactly the problem starts. Is it in external interfacing?
> Any solutions?

I think it probably has to do with the default font saved down into the RichEdit control used as the source editor. This thread from comp.lang.smalltalk.dolphin may help:

http://groups.google.co.uk/groups?hl=en&lr=&ie=UTF-8&selm=aq8j61%24dof2%241%40as201.hinet.hr

In order to change the default workspace font, double-click the 'User Preferences' item in the main system launcher window. Locate 'Workspace' at the end of the list, and expand the tree node. You can then double-click the defaultFont aspect and change the script appropriately.

> ...
> By the way, is the any better way to go through the directory structure?
> Not that it is better, but I use recursion in C# or Java.

We don't provide a comprehensive object model for the file system in Dolphin. See, however, the method #example2 on the class side of AXTypeLibraryAnalyzer. If you try to run the example it will fail on Windows XP, since it assumes that the system directory will be WINNT on NT-class systems (this is easily corrected). If you do a search in Google Groups for 'comp.lang.smalltalk.dolphin IFileSystem' it will bring back a number of postings that may be helpful.

Regards

Blair
Blair McGlashan wrote:
> > Recently I tried to read contents of the drive on my machine with
> > Dolphin XP. I found that it does not display Russian letters correctly.
> > I am not sure where exactly the problem starts. Is it in external
> > interfacing? Any solutions?
>
> I think it probably has to do with the default font saved down into the
> RichEdit control used as the source editor.

More than that, I think. As far as I can tell, the underlying Windows findFirstFile/findNextFile stuff (or whatever it's called) is answering data where the characters in filenames that cannot be represented in 8 bits are replaced by $?. That's to say that the actual bytes pointed to by the WIN32_FIND_DATA have 63 as their value.

Sergei, you /might/ be able to use the #cAlternateFileName from the same data (the old DOS-style name), but it does depend on what you are trying to do.

I did take a look at what would be involved in duplicating Dolphin's file enumeration using the "wide" API, and it looks doable with some work. More work than I fancied just for an experiment, though. Of course, once you have defined wide versions of the existing code, you'd still have problems if you need, say, to display a list of those filenames to a user.

-- chris
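The '?' substitution described in the post above can be reproduced outside Dolphin. The following is only a rough sketch in Python, not Dolphin code: it assumes that the 8-bit ("ANSI") file API behaves much like encoding the Unicode name into the active code page with unrepresentable characters replaced by '?'. The filename and the choice of cp1252 (the Western European code page) are illustrative assumptions.

```python
# Sketch: model an 8-bit ("ANSI") API returning a filename that the active
# code page cannot represent. Not Dolphin code; the name is hypothetical.
name = "привет.txt"  # six Cyrillic letters plus an ASCII extension

# cp1252 has no Cyrillic letters, so each one is replaced by "?" (byte 63).
ansi_bytes = name.encode("cp1252", errors="replace")

print(list(ansi_bytes))
```

If the bytes for the non-ASCII part of a name are all 63, the information was already lost before any font or display code ran, so changing the font alone cannot bring the letters back.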
"Chris Uppal" <[hidden email]> wrote in message
news:[hidden email]...

> Blair McGlashan wrote:
>
>> > Recently I tried to read contents of the drive on my machine with
>> > Dolphin XP. I found that it does not display Russian letters correctly.
>> > I am not sure where exactly the problem starts. Is it in external
>> > interfacing? Any solutions?
>>
>> I think it probably has to do with the default font saved down into the
>> RichEdit control used as the source editor.
>
> More than that, I think. As far as I can tell, the underlying Windows
> findFirstFile/findNextFile stuff (or whatever it's called) is answering
> data where the characters in filenames that cannot be represented in 8bits
> are replaced by $?. That's to say that the actual bytes pointed to by the
> WIIN32_FIND_DATA have 63 as their value.
>...

Are wide characters actually needed for Russian though? I don't know, just asking.

Regards

Blair
Blair McGlashan wrote:
> Are wide characters actually needed for Russian though? I don't know, just
> asking.

Good question, I'm not sure. It may only be that the small-caps characters I used for the test, which I found on some random Russian web page, use code-points that are outside the range normally used by a Russian Windows installation. Or maybe Windows recognises that my machine is basically English (albeit with a fair amount of "foreign" language support installed) and is unable to represent the name in 8 bits using /my/ code page, but would have been able to do it on a Russian installation.

Anyway, the following ugly hack of a loop will dump the byte representation of the names of all the files in a folder to the transcript. It may help Sergei isolate the problem:

    File for: '*' in: 'C:\Temp\' do:
        [:each | | name altName addr len bytes |
        name := each cFileName.
        altName := each cAlternateFileName.
        "44 is the byte offset of cFileName within WIN32_FIND_DATA"
        addr := each bytes yourAddress + 44.
        len := name size.
        bytes := ByteArray fromAddress: addr length: len.
        Transcript
            display: altName;
            display: ': ';
            print: name;
            display: ' = ';
            print: bytes;
            cr].

On my machine, this writes the following to my Transcript:

    : '.' = #[46]
    : '..' = #[46 46]
    AAATXT~1.BZ2: 'aaa.txt.bz2' = #[97 97 97 46 116 120 116 46 98 122 50]
    ...
    XYSTXT~1.BZ2: 'xys.txt.bz2' = #[120 121 115 46 116 120 116 46 98 122 50]
    221B~1.TXT: '???????????????.txt' = #[63 63 63 63 63 63 63 63 63 63 63 63 63 63 63 46 116 120 116]
    3572~1.TXT: '???? ????????.txt' = #[63 63 63 63 32 63 63 63 63 63 63 63 63 46 116 120 116]
    2C05~1.TXT: '????????????.txt' = #[63 63 63 63 63 63 63 63 63 63 63 63 46 116 120 116]

Notice that the last three files have names that are mostly made up out of $?, and that that is what Windows has supplied in the raw byte data. (Those files have names created by cut-and-pasting a random string from a Russian, Urdu, and Japanese website respectively. They do display correctly in Explorer.)

Sergei, if you try that and the byte data isn't all 63s, then you probably only have a display issue; if it is, then you have a more difficult problem to deal with. I'd be interested to know which.

BTW, it's a little disturbing that some of the entries don't have "alternate" filenames. I know you can turn that off, but I'm surprised to find that Windows hasn't generated alternate names for all the files, even though that feature is deliberately left turned on on this box :-(

-- chris
A follow-up:
> Or maybe Windows recognises that my
> machine is basically English (albeit with a fair amount of "foreign"
> language support installed) and is unable to repreent the name in 8bits
> using /my/ code page, but would have been able to do it on a Russian
> installation.

I got interested enough to take a risk and reset my system (after all, I only spent /five hours/ yesterday getting Windows bloody Update to work, so what's a little more messing around going to hurt).

It seems that the speculation is correct. I switched my machine to use a Cyrillic code page -- at least, that's what I assume "Control Panel / Regional and Language Settings / Advanced / Language for non-Unicode programs" means ("This system setting enables non-Unicode programs to display menus and dialogs in their native language. It does not affect Unicode programs, but it does apply to all users of this computer."). I chose "Serbian (Cyrillic)" arbitrarily, and rebooted. (I hadn't changed anything else; all the Windows menus etc. were still in English.)

After doing that, my loop (see earlier post) was producing meaningful byte values for the Cyrillic filename, and -- somewhat to my surprise, since I hadn't told Dolphin about the change -- the name even displayed correctly in the Transcript. The Urdu and Japanese filenames were still coming back as all 63s. (So naturally I tried changing that setting to Urdu then Japanese too; both worked the same way.)

So now I'm back with a proper British computer again, and the only lasting consequence of my rashness seems to be that Outlook Express's flashing text cursor now has a little flag at the top. Anyone know how to fix that ;-) ?

-- chris
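The effect of switching the "Language for non-Unicode programs" setting can be modelled in the same approximate way. This is a hedged Python sketch, not Dolphin code. The helper `to_code_page` and both filenames are hypothetical, and the code page numbers are assumptions about what Windows selects for each language: cp1252 for an English system, cp1251 for a Cyrillic one, cp932 for Japanese.

```python
# Sketch: the same Unicode filenames seen through different 8-bit code pages.
# to_code_page is a hypothetical helper that stands in for the ANSI file API.
def to_code_page(name: str, code_page: str) -> bytes:
    """Encode a filename, replacing unrepresentable characters with '?'."""
    return name.encode(code_page, errors="replace")

cyrillic = "файл.txt"      # hypothetical Cyrillic filename
japanese = "ファイル.txt"  # hypothetical Japanese filename

west = to_code_page(cyrillic, "cp1252")          # English system: data lost
cyr = to_code_page(cyrillic, "cp1251")           # Cyrillic system: real bytes
jp_under_cyr = to_code_page(japanese, "cp1251")  # still lost under Cyrillic
jp_under_932 = to_code_page(japanese, "cp932")   # representable under Japanese

for label, data in [("cp1252", west), ("cp1251", cyr),
                    ("cp1251/ja", jp_under_cyr), ("cp932/ja", jp_under_932)]:
    print(label, list(data))
```

Only one 8-bit code page is active at a time, which matches the observation that each setting rescued one script and left the others as 63s.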
In reply to this post by Blair McGlashan-3
> Are wide characters actually needed for Russian though? I don't know, just
> asking.

I am probably reiterating what Chris Uppal found out already.

Russian characters are traditionally represented with 8 bits. Normally, a non-English system is configured to use the Cyrillic code page, or whatever else they name it. Traditionally, Russian letters are encoded in the range 128-255. I assume that a Russian computer should not have any problem, because of the 8-bit length.

The newest systems (Windows 2K, XP) seem to be capable of displaying Russian (and many other languages) even if the encoding is not enabled. I assume that they store file names in Unicode.

...

I prefer not to enable Russian encoding, because it has some negative font effects (font sizes are not always the same). Chris Uppal found another one of these problems :(

Unicode is not a big deal to me. It is just that Unicode support is taken for granted these days (C#, Java).

Thank you
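The 8-bit claim above can be checked against one common Cyrillic code page. This is a small Python sketch, assuming Windows-1251 (`cp1251`) is the code page meant; other Cyrillic encodings such as KOI8-R use different byte values within the same upper half.

```python
# Sketch: every Russian letter fits in one byte in Windows-1251 (assumed
# code page), and all of those bytes fall in the upper half, 128-255.
lower = "".join(chr(c) for c in range(ord("а"), ord("я") + 1)) + "ё"
upper = lower.upper()
letters = lower + upper  # 33 lowercase + 33 uppercase letters

encoded = letters.encode("cp1251")  # raises if any letter is unrepresentable

print(len(letters), len(encoded))   # one byte per letter
print(min(encoded), max(encoded))   # lowest and highest byte values used
```

By contrast, the names that NTFS stores are UTF-16, two bytes per letter here, which is what the "wide" API hands back.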
Sergei Gnezdov wrote:
> Unicode is not a big deal to me. It is just that Unicode support is
> taken for granted these days (C#, Java).

I agree that Unicode is important from the point of view of compatibility with other environments.

Thanks
-Panu Viljamaa
In reply to this post by Sergei Gnezdov-4
Hello Sergei,
> Recently I tried to read contents of the drive on my machine with
> Dolphin XP. I found that it does not display Russian letters correctly.
> I am not sure where exactly the problem starts. Is it in external
> interfacing? Any solutions?

1. Change the script of the font in "User preferences>Workspace>defaultFont", and possibly in "User preferences>Development System>defaultFont", to Cyrillic.
2. Find the key \HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage.
3. Change all values named 1250-1258 to "c_1251.nls".
4. Reboot; everything should be all right.

Dmitry Zamotkin