About 7051 and Multilingual-TextConversion

Edgar J. De Cleene
Folks:

I am now researching how to help build a new, real "Minimal.image", using
Pavel's kernel as the starting point.
The attached picture shows what happened when I tried to file out TextMorph
from a 7051 image.
The .sources and .changes files now contain several different encodings (see
John's recent post on the FFI wide-character type).
I fixed a couple of bugs caused by missing converters.
What about converting everything to a single format? UTF-8 with the Unix
end-of-line convention? Or another one, but only one.
Squeak could then make a smart guess, based on the OS, when handling the
rewritten .changes and .sources, so there would be no more trouble.

What do you think?

Edgar

Attachment: Picture 2.png (20K)

Re: About 7051 and Multilingual-TextConversion

Marcus Denker

On 03.08.2006, at 13:24, Lic. Edgar J. De Cleene wrote:

> Folks:
>
> I am now researching how to help build a new, real "Minimal.image", using
> Pavel's kernel as the starting point.
> The attached picture shows what happened when I tried to file out TextMorph
> from a 7051 image.

- I have no problem with "TextMorph fileOut" on 7051 (the new one).
- .changes and .sources do have different encodings right now... this
  will be fixed as soon as we write a new .sources (yet another reason
  to do so).

> The .sources and .changes files now contain several different encodings
> (see John's recent post on the FFI wide-character type).

What does that have to do with FFI?

      Marcus

Re: About 7051 and Multilingual-TextConversion

Edgar J. De Cleene
Marcus Denker wrote:

> - I have no problem with "TextMorph fileOut" on 7051 (the new one).
> - .changes and .sources do have different encodings right now... this
>   will be fixed as soon as we write a new .sources (yet another reason
>   to do so).

OK, I will download the new one.

> What does that have to do with FFI?

Nothing. I meant what he wrote in his answer:

> I'll note that if you have a string (8 bits) in Squeak you must
> decide what the bits mean, is that a
> latin 1 string, a mac roman, or something else.
>
>
>     converter := Smalltalk platformName = 'Mac OS'
>         ifTrue:  [MacRomanUnicodeTextConverter new]
>         ifFalse: [Latin1TextConverter new].
>     wideStringMangled := string convertFromWithConverter: converter.

I was able to do the fileOut in my image with the following change:

'From Squeak3.9alpha of 4 July 2005 [latest update: #7051] on 3 August 2006 at 9:00:16 am'!

!PositionableStream methodsFor: 'fileIn/Out' stamp: 'edc 8/3/2006 08:27'!
copyMethodChunkFrom: aStream
    "Copy the next chunk from aStream (must be different from the receiver)."
    | chunk |
    aStream converter: Latin1TextConverter new. "this fixes the problem for me."
    chunk := aStream nextChunkText.
    chunk runs values size = 1 "Optimize for unembellished text"
        ifTrue: [self nextChunkPut: chunk asString]
        ifFalse: [self nextChunkPutWithStyle: chunk]! !

Thanks again, and yes, a new .changes and .sources in a single format would
help everyone.
Which format would you choose?

Edgar

Re: About 7051 and Multilingual-TextConversion

Yoshiki Ohshima
  Edgar,

>     aStream converter: Latin1TextConverter new. "this fixes the problem for me."

  Is it really about the file out of Squeak code, or are you just
taking it as an example?

  If it is the former, this line can't be a fix for any real problems.
You would have wanted to import the created fileout to some other
application, but if it doesn't support UTF-8 with byte order mark or
MacRoman, that is the problem with it, not Squeak's.

  Squeak's file out should be in UTF-8 with byte order mark or
MacRoman (latter is only supported for backward compatibility).  Even
before 3.8, Latin1 had never been an accepted encoding.

  In short, don't do it.  Treat the file out as a binary file, if you
don't want to mess it up.

-- Yoshiki
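
A minimal workspace sketch of why the choice of converter matters for the same 8-bit data. It assumes a Squeak 3.8-or-later image in which both MacRomanTextConverter and Latin1TextConverter are present (the first class name is an assumption, not taken from this thread), and it reuses the convertFromWithConverter: call quoted earlier. The byte value 16r8E is only an illustrative choice.

```smalltalk
"Sketch under the assumptions stated above; evaluate in a workspace."
| raw asMacRoman asLatin1 |

"One 8-bit character whose meaning has not been decided yet."
raw := String with: (Character value: 16r8E).

"Read the same byte under two different assumptions about its encoding."
asMacRoman := raw convertFromWithConverter: MacRomanTextConverter new.
asLatin1 := raw convertFromWithConverter: Latin1TextConverter new.

"Compare the resulting character codes. If the two converters disagree
 about this byte, the comparison answers false, which is the situation
 a loader faces when a file was written with one converter and read
 with another."
asMacRoman first asInteger = asLatin1 first asInteger
```

Inspecting asMacRoman and asLatin1 side by side shows whether the image interprets the byte differently under each converter; nothing here changes how fileOut itself writes files.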

Reply | Threaded
Open this post in threaded view
|

Re: About 7051 and Multilingual-TextConversion

Edgar J. De Cleene
Yoshiki Ohshima wrote:

> Is it really about the file out of Squeak code, or are you just
> taking it as an example?
>
>   If it is the former, this line can't be a fix for any real problems.
> You would have wanted to import the created fileout to some other
> application, but if it doesn't support UTF-8 with byte order mark or
> MacRoman, that is the problem with it, not Squeak's.
>
>   Squeak's file out should be in UTF-8 with byte order mark or
> MacRoman (latter is only supported for backward compatibility).  Even
> before 3.8, Latin1 had never been an accepted encoding.
>
>   In short, don't do it.  Treat the file out as a binary file, if you
> don't want to mess it up.
>
> -- Yoshiki

Yoshiki:

Believe it or not, what I was using is a fileOut from the 7051 image.
That image was a 7044 with all the updates applied, which take hours to
complete.
I put only a couple of things into this image: the Tracing Message Browser,
a modified SARBuilder, and a few classes of my own.
When Marcus said that something was wrong, I downloaded a new, "trusted"
7051.

But the question is why my own particular fix works.
So, if that encoding was never used, why did the fileOut fail before the
change, and why does it work now?
The other classes in this image that I need to file out do not have this
problem.

Many thanks.

Edgar

Re: About 7051 and Multilingual-TextConversion

Yoshiki Ohshima
  Edgar,

> But the question is why my own particular fix works.
> So, if that encoding was never used, why did the fileOut fail before the
> change, and why does it work now?
> The other classes in this image that I need to file out do not have this
> problem.

  Probably, TextMorph had a MacRoman character in it (perhaps in the
class comment), and you probably filed it out in a non-standard way.
When that happens, the loader gets confused by the resulting file out.

  The first .png file you sent around was the notifier from when you
tried to read the fileOut, right?  This confused me to some
extent...

-- Yoshiki