Hi Sven,

The web site I was using removed the file for my book, so I copied the file to GitHub. When I open the file with TextMate, it says that the encoding is Western (Latin-1), but when I try to load it as follows I get an illegal UTF-8 error.

| lines |
lines := (ZnDefaultCharacterEncoder
    value: ZnCharacterEncoder latin1
    during: [
        ZnClient new
            get: 'https://raw.githubusercontent.com/SquareBracketAssociates/LearningOOPWithPharo/master/resources/listeDeMotsFrancaisFrGut.txt' ]) lines.

Do you have any idea?

Tx

Stef
> On 26 Sep 2017, at 17:25, Stephane Ducasse <[hidden email]> wrote:
>
> Do you have any idea?

Any chance you can point me to the original file?

The file is indeed Latin-1 encoded, but GitHub serves it as UTF-8 (it did not change the contents, only the metadata). The default encoder option only applies when the server says nothing; it does not override what the server says.

The only way to read it is to read it as binary (which basically ignores the metadata) and then convert it manually:

(ZnCharacterEncoder latin1 decodeBytes:
    (ZnClient new
        beBinary;
        get: 'https://raw.githubusercontent.com/SquareBracketAssociates/LearningOOPWithPharo/master/resources/listeDeMotsFrancaisFrGut.txt')) lines.

But this is very ugly. Best to convert the original file to UTF-8 before uploading it to GitHub.

Sven
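As a side illustration of why Latin-1 bytes trip a UTF-8 decoder, here is a minimal sketch that can be evaluated in a Pharo playground. The three-byte sample is made up for the example and is not taken from the word list; the sketch only assumes the `ZnCharacterEncoder` class-side accessors `latin1` and `utf8` and the `decodeBytes:` message.

```smalltalk
| bytes |
"Hypothetical sample: the word 'été' as Latin-1 bytes, where é is the single byte 233 (16rE9)."
bytes := #[233 116 233].

"Decoding with the matching encoder gives back the accented string."
ZnCharacterEncoder latin1 decodeBytes: bytes.

"Decoding the same bytes as UTF-8 should signal an error: byte 233 announces a
multi-byte sequence, but 116 (the letter t) is not a valid continuation byte.
The handler catches the generic Error class and returns its message text."
[ ZnCharacterEncoder utf8 decodeBytes: bytes ]
    on: Error
    do: [ :error | error messageText ].
```

This is the same situation as above: the bytes on GitHub are fine, but the response metadata makes the client pick the UTF-8 decoder for them.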
> Any chance you can point me to the original file ?

No, they removed it. Maybe I could try to convert it to UTF-8 (I do not know how to do it).

> The file is indeed in Latin1 encoded, but GitHub serves it as UTF-8 (it did not change the contents, but the meta data).

OK, I see the problem.

> The default encoder option only works when the server says nothing, it does not override what the server says.

Ah, OK.

> Best convert the original file to UTF-8 before uploading to GitHub.

OK, I will try to learn how to do it.
You can convert it in Pharo, of course:

(FileLocator desktop / 'mots.txt') writeStreamDo: [ :out |
    out << (ZnCharacterEncoder latin1 decodeBytes:
        (ZnClient new
            beBinary;
            get: 'https://raw.githubusercontent.com/SquareBracketAssociates/LearningOOPWithPharo/master/resources/listeDeMotsFrancaisFrGut.txt')) ].

You just take the string as it is in Pharo and write it out to a file; by default it is written UTF-8 encoded.

> On 26 Sep 2017, at 17:40, Stephane Ducasse <[hidden email]> wrote:
>
> May be I could try to convert it to utf-8 (I do not know how to do it)
I'm reading your chapter :)

Now I understand the file I found is totally bogus :) But the first one I found is indeed encoded in Latin-1, so I'm trying to convert it.
Now inspecting the file contents opens the GTInspector and freezes Pharo :(

I think that I will remove this part of my book. It is simpler.

On Tue, Sep 26, 2017 at 5:53 PM, Stephane Ducasse <[hidden email]> wrote:
> So I'm trying to convert it.
Here is a script that should work to convert from Latin-1 to UTF-8. Thanks to your book and trial and error.

| str wstr |
str := ('listeDeMotsFrancaisFrGut.txt' asFileReference readStreamDo: [ :in |
    (ZnCharacterReadStream on: in binary encoding: #latin1) upToEnd ]) lines.

'listeDeMotsFrancaisFrGutUTF8.txt' asFileReference writeStreamDo: [ :out |
    wstr := ZnCharacterWriteStream on: out binary encoding: #utf8.
    str do: [ :each | wstr nextPutAll: each. wstr crlf ] ].

On Tue, Sep 26, 2017 at 6:01 PM, Stephane Ducasse <[hidden email]> wrote:
> I think that I will remove this part of my book.
> On 26 Sep 2017, at 18:09, Stephane Ducasse <[hidden email]> wrote:
>
> Here is a script that should work to convert from latin1 to utf-8.

Yes, that is correct (and it uses the newer encoders in both directions).
So it works now. I converted the files to UTF-8 and now the book is not broken anymore.

Thanks, Sven, for at least taking the time to reply to my email. It helps my spirits.

Stef