Hi,
I think UnicodeString>>asString has a bug. In its code (copied below), buf is initialized as a multi-byte string of the same length as the Unicode string itself. That assumption is correct only when every character converts to a single byte, so the code does not work in environments such as Japanese and Korean, where a multi-byte character can take two bytes.

Hideo Mizoguchi

asString
    "Answer a byte string representation of the receiver."

    | buf size |
    size := self size.
    buf := String new: size.
    size == 0 ifTrue: [^buf].    "Avoid 'The Parameter is Incorrect' error"
    (KernelLibrary default
        wideCharToMultiByte: 0
        dwFlags: 0
        lpWideCharStr: self
        cchWideChar: size
        lpMultiByteStr: buf
        cchMultiByte: size
        lpDefaultChar: nil
        lpUsedDefaultChar: nil) == 0
            ifTrue: [KernelLibrary default systemError].
    ^buf

"Hideo Mizoguchi" <[hidden email]> wrote in message
news:9a25sh$36dh1$[hidden email]...
>
> I think UnicodeString>>asString has a bug. In its code (copied below), buf
> is initialized as a multi-byte string of the length same as the unicode
> string itself. That assumption is correct when only single byte characters
> are used, i.e., the code does not work with such environment as Japanese and
> Korean, where multi-byte characters can be indeed double-bytes.
> ... code snip...

Indeed that method is incorrect, and could perhaps be fixed as in the attached. However, Dolphin does not currently support multi-byte characters in general, so I'm not sure this will help much, because it is widely assumed elsewhere in the image and VM that characters require only a single byte to represent a code point. In fact the Character class has 256 fixed instances.

Full support for multi-byte character sets will not be available until a future release (unless someone knows how to work around the limitations in the image, but I would expect that to be pretty difficult). Sorry.

Regards

Blair

Attachment (UnicodeString_asString.st):

!UnicodeString methodsFor!

asString
    "Answer a byte string representation of the receiver."

    | buf size bytes |
    size := self size.
    buf := String new: size*2.
    size == 0 ifTrue: [^buf].    "Avoid 'The Parameter is Incorrect' error"
    bytes := KernelLibrary default
        wideCharToMultiByte: 0
        dwFlags: 0
        lpWideCharStr: self
        cchWideChar: size
        lpMultiByteStr: buf
        cchMultiByte: buf size
        lpDefaultChar: nil
        lpUsedDefaultChar: nil.
    bytes == 0 ifTrue: [^KernelLibrary default systemError].
    buf resize: bytes.
    ^buf! !

Hi,
I can't write English with the Japanese well.
Therefore, forgive it in poor English though I am sorry.

There was this problem from the time of Dolphin3.
It appeared when COM was used in the case of me.
Then, I modified two methods of the UnicodeString class.
One is asString.
One more is UnicodeString>>replaceFrom:to:with:startingAt:.
One byte of the Japanese ends is removed only with asString.
I modified it in the end of trial and error as follows.

    ^super replaceFrom: start+start-1 to: stop+stop with: aString startingAt: startAt

There is no problem in the range of the use of me.
(COM access (especially, ADO and DTS)).

However, what in fact do you do?
Is it impossible to have had it answer already?
Teach if it is good.

Regards

Takeya Suzuki

Takeya Suzuki
You wrote in message news:9a3pir$3eqdm$[hidden email]...
> Hi,
> I can't write English with the Japanese well.
> Therefore, forgive it in poor English though I am sorry.
>
> There was this problem from the time of Dolphin3.
> It appeared when COM was used in the case of me.
> Then, I modified two methods of the UnicodeString class.
> One is asString.
> One more is UnicodeString>>replaceFrom:to:with:startingAt:.
> One byte of the Japanese ends is removed only with asString.
> I modified it in the end of trial and error as follows.
>
> ^super replaceFrom: start+start-1 to: stop+stop with: aString startingAt:
> startAt

Thank you. I think there are other methods that UnicodeString may strictly need to override if it were to be a full String implementation; however, as its class comment says, it is a "minimal" class.

>
> There is no problem in the range of the use of me.
> (COM access (especially, ADO and DTS)).

If I understand you correctly, you found no problem with that fix in your own use.

>
> However, what in fact do you do?
> Is it impossible to have had it answer already?
> Teach if it is good.

I'm sorry, but I cannot understand that. Can you try again and rephrase slightly?

Regards

Blair

Blair,
Thank you.

I am sorry in poor English.

An UnicodeString is made [UnicodeString>>fromAddress:length:].

ex) The case of 'ab'

1)UnicodeString>>fromAddress:length:
    | answer |
    answer := self new: anInteger.
    ^answer replaceFrom: 1 to: anInteger
        with: anAddress asExternalAddress startingAt: 1

2)UnicodeString>>replaceFrom:to:with:startingAt:
    ^super replaceFrom: start+start-1 to: stop+stop-1
        with: aString startingAt: startAt

The value which is actually delivered is this.

    ^super replaceFrom: 1 to: 3 with: 'ab'(unicode) startingAt: 1
    ^super replaceFrom: 1 to: 4 with: 'ab'(unicode) startingAt: 1

1 to 3 ?
1 to 4 ?

Correct answer?

Therefore.

    ^super replaceFrom: start+start-1 to: stop+stop with: aString startingAt: startAt

Regards

Takeya Suzuki

Takeya Suzuki
You wrote in message news:9abomp$4d24k$[hidden email]...
>
> Thank you.
>
> I am sorry in poor English.
>
> An UnicodeString is made [UnicodeString>>fromAddress:length:].
>
> ex) The case of 'ab'
>
> 1)UnicodeString>>fromAddress:length:
> | answer |
> answer := self new: anInteger.
> ^answer replaceFrom: 1 to: anInteger
> with: anAddress asExternalAddress startingAt: 1
>
> 2)UnicodeString>>replaceFrom:to:with:startingAt:
> ^super replaceFrom: start+start-1 to: stop+stop-1
> with: aString startingAt: startAt
>
> The value which is actually delivered is this.
>
> ^super replaceFrom: 1 to: 3 with: 'ab'(unicode) startingAt: 1
> ^super replaceFrom: 1 to: 4 with: 'ab'(unicode) startingAt: 1
>
> 1 to 3 ?
> 1 to 4 ?
>
> Correct answer?
>
> Therefore.
>
> ^super replaceFrom: start+start-1 to: stop+stop with: aString startingAt:
> startAt

Now I understand, thank you. There is an off-by-one error in UnicodeString>>replaceFrom:to:with:startingAt:. Your fix is correct, thank you. We will incorporate this in the new patch level.

Regards

Blair

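The off-by-one error confirmed here comes down to index arithmetic: with two bytes per character and 1-based inclusive indices, characters start..stop occupy bytes 2*start-1 through 2*stop. The sketch below, in Python rather than Dolphin Smalltalk and with hypothetical helper names, compares the original and corrected ranges. It also suggests why the problem only shows up with text such as Japanese: for ASCII characters the dropped final byte is zero anyway, so a zero-filled destination hides the error.

```python
# Illustrative only: models the byte ranges passed to the superclass copy.
# Indices are 1-based and inclusive, as in Smalltalk.

def byte_range_buggy(start, stop):
    return (start + start - 1, stop + stop - 1)   # stops one byte short


def byte_range_fixed(start, stop):
    return (start + start - 1, stop + stop)       # includes the last byte


def roundtrip(text, range_fn):
    """Copy the UTF-16 bytes of text into a zero-filled buffer over the
    byte range computed by range_fn, then decode the buffer."""
    src = text.encode("utf-16-le")
    dest = bytearray(len(src))
    first, last = range_fn(1, len(src) // 2)
    dest[first - 1:last] = src[first - 1:last]
    return dest.decode("utf-16-le")


print(byte_range_buggy(1, 2), byte_range_fixed(1, 2))   # the 'ab' case
print(roundtrip("日本", byte_range_buggy) == "日本")
print(roundtrip("日本", byte_range_fixed) == "日本")
```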