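Hi all, I just found another bug. If you get tired of them, just tell me :-)
Steps to reproduce:
Print it:
class := Object subclass: #CTTèstClass "sic (with accent in name)!"
	instanceVariableNames: ''
	classVariableNames: ''
	poolDictionaries: ''
	category: 'CT-Experiments'.
class compile: 'foo ^ #foo'.
(class >> #foo) timeStamp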
Expected output: Something like 'ct 12/21/2019 15:13'.
Actual output: ''.
Please note that everything would have worked fine if we had named the class #CTTestClass (without accent) instead.
Do we want to support special class names in general? If yes, this is a bug in my opinion. If no, we should raise an error in the first statement.
Cause of infection not yet investigated.
Best, Christoph
Carpe Squeak!
|
> On 21.12.2019, at 15:16, Thiede, Christoph <[hidden email]> wrote:
>
> Cause of infection not yet investigated.

Please check whether \00 bytes appear at some point in your .changes file.

Best regards
-Tobias
|
Hi Tobias,
what do you mean in detail?
If I create the class via the System Browser and add the method, my changes file ends with:
Object subclass: #CTTéstClass
instanceVariableNames: ''
classVariableNames: ''
poolDictionaries: ''
category: 'CT-Experiments'!
!CTTéstClass methodsFor: 'no messages' stamp: 'ct 12/21/2019 17:18'!
foo! !
However, CompiledMethod >> #timeStamp returns ''.
Here is a snapshot of the #timeStamp stackframe:
Please note that "tokens at: tokenCount" returns the correct timestamp, and yet stamp is nil. What is this???
I'm not sure if I understand you correctly, but if you meant that I should search a hex dump of my changes file for a "zero word", the only occurrence I could find is:
Which led me to this:
Does not seem related, but still looks somehow wrong ^^
Best,
Christoph

From: Squeak-dev <[hidden email]> on behalf of Tobias Pape <[hidden email]>
Sent: Saturday, 21 December 2019, 15:44
To: The general-purpose Squeak developers list
Subject: Re: [squeak-dev] [BUG] Timestamps don't work for classes with special character names
|
Ah ok, the latter was already fixed in Multilingual-nice.249 from the Inbox, never mind :)

From: Squeak-dev <[hidden email]> on behalf of Thiede, Christoph
Sent: Saturday, 21 December 2019, 17:36:26
To: The general-purpose Squeak developers list
Subject: Re: [squeak-dev] [BUG] Timestamps don't work for classes with special character names
|
> On 21.12.2019, at 17:36, Thiede, Christoph <[hidden email]> wrote:
>
> If I create the class via System Browser and add the method, my change file ends with:
>
> Object subclass: #CTTéstClass
> instanceVariableNames: ''
> classVariableNames: ''
> poolDictionaries: ''
> category: 'CT-Experiments'!
> !CTTéstClass methodsFor: 'no messages' stamp: 'ct 12/21/2019 17:18'!
> foo! !

Good. That was what I thought was important.

> However, CompiledMethod >> #timeStamp returns ''.

What is the result of the following?

(CTTéstClass compiledMethodAt: #foo) preamble

> Here is a snapshot of the #timeStamp stackframe:
>
> Please note that "tokens at: tokenCount" returns the correct timestamp, but however, stamp is nil. What is this???

I see what the problem is. The .changes file is apparently written UTF-8 coded, but read Latin-1 coded.
This is BAD.

You end up with 7 tokens, because you have three for the class name instead of one. This is because the Latin-1 copyright symbol is classified as a binary selector, and thus separates the first part of the class name from the second part. This happens only because of UTF-8 vs. Latin-1.

But the code path for 7-element tokens is different, and it looks for the #stamp: at a different position.

Hence stamp is nil.

A wrong but easy fix would be to call #utf8ToSqueak on the preamble.

Best regards
-Tobias
|
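The diagnosis above can be reproduced in a workspace without touching the .changes file. What follows is only a minimal sketch: it assumes that String >> #squeakToUtf8 answers a byte string holding the UTF-8 bytes, so that its characters are exactly what a Latin-1 reader would see. The variable names are made up for illustration. The actual token counts depend on the Scanner in the image; going by Tobias's analysis, the misread preamble should yield 7 tokens because the class name splits into three.

```smalltalk
| preamble misread |
"The preamble as it is meant to be read."
preamble := 'CTTéstClass methodsFor: ''no messages'' stamp: ''ct 12/21/2019 17:18'''.
"The same preamble as UTF-8 bytes, viewed one byte per character (é becomes two characters)."
misread := preamble squeakToUtf8.
"Print it: token counts for the intact and the misread preamble."
{ (Scanner new scanTokens: preamble) size.
  (Scanner new scanTokens: misread) size }
```

Select the whole snippet and print it to compare the two counts.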
> On 21.12.2019, at 19:11, Tobias Pape <[hidden email]> wrote:
>
> I see what the problem is. The .changes file is apparently written UTF-8 coded, but read Latin-1 coded.
> This is BAD.

Oh, and we were warned:

CompiledMethod
getPreambleFrom: aFileStream at: endPosition
	"This method is an ugly hack. This method assumes that source files have ASCII-compatible encoding and that preambles contain no non-ASCII characters."

	| chunkSize chunk |
	chunkSize := 160 min: endPosition.
	[
		| index |
		chunk := aFileStream
			position: (endPosition - chunkSize + 1 max: 0);
			basicNext: chunkSize.
		(index := chunk lastIndexOf: $! startingAt: chunk size) ~= 0 ifTrue: [
			^chunk copyFrom: index + 1 to: chunk size ].
		chunkSize := chunkSize * 2.
		chunkSize <= endPosition ] whileTrue.
	^chunk

I have the feeling that the problematic send is #basicNext: in line 10 or so. This seems to circumvent the conversion done by MultiByteFileStream.

Best regards
-Tobias
|
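Tobias's "wrong but easy" suggestion of sending #utf8ToSqueak can be tried on a plain string first. This is a sketch only, and not the change from Christoph's attachment: rawChunk is a hypothetical stand-in for what an undecoded read via #basicNext: would hand back, and it is produced here with #squeakToUtf8 on the assumption that this answers one character per UTF-8 byte.

```smalltalk
| original rawChunk decoded |
original := 'CTTéstClass methodsFor: ''no messages'' stamp: ''ct 12/21/2019 17:18'''.
"Hypothetical stand-in for the bytes read without any decoding."
rawChunk := original squeakToUtf8.
"Decode the UTF-8 bytes back into Squeak's internal representation."
decoded := rawChunk utf8ToSqueak.
"Print it: should answer true if the round trip restores the preamble."
decoded = original
```

Decoding after the fact only helps if the chunk boundaries were found correctly in the first place, which is presumably why Tobias calls this fix wrong.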
Hi Tobias, thanks for the pointers!
> (CTTéstClass compiledMethodAt: #foo) preamble

Like you said:
I made the following change:
This seems to fix the conversion issues.
Outputs are:
The next problem is the trailing ! for the CTTéstClass preamble.
Here, the integer returned by expandedSourceFileArray >> #filePositionFromSourcePointer: is too large by one.
I have no idea where these constants come from, but as this is a constant method, I don't see how this calculation could be wrong.
I also tried the following:
yielding correctly:
But that seems hacky again.
Looking forward to your reply!
Best,
Christoph

From: Squeak-dev <[hidden email]> on behalf of Tobias Pape <[hidden email]>
Sent: Saturday, 21 December 2019, 19:22:38
To: The general-purpose Squeak developers list
Subject: Re: [squeak-dev] [BUG] Timestamps don't work for classes with special character names

Attachment: pastedImage.png (307K)
|
> On 21.12.2019, at 20:23, Thiede, Christoph <[hidden email]> wrote:
>
> The next problem is the trailing ! for the CTTéstClass preamble.
> Here, the integer returned by expandedSourceFileArray >> #filePositionFromSourcePointer: is too large by one.
> I have no idea where these constants come from, but as this is a constant method, I don't see how this calculation could be wrong.

Because of UTF-8: the position counts raw bytes, but what gets returned is counted in Unicode code points. Hence the + 1...

> I also tried the following:
>
> yielding correctly:

Seems lucky...

> But that seems hacky again.
>
> Looking forward to your reply!

Best regards
-Tobias

PS: Maybe copy the code instead of images? It's easier to see things then, for me at least :)
|
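The off-by-one can be made visible by comparing the character count and the byte count of the class name. A small workspace sketch, assuming that String >> #squeakToUtf8 answers a string with one character per UTF-8 byte:

```smalltalk
| name |
name := 'CTTéstClass'.
"Print it: 11 characters versus 12 UTF-8 bytes, since é is encoded as two bytes."
{ name size. name squeakToUtf8 size }
```

Each further non-ASCII character in a preamble would widen the gap between a byte offset and a character offset by at least one more.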
Hi Tobias, sorry for the long delay!
> PS: maybe copy the code instead of images? it's easier to see things then, for me at least :)

Sorry, you're right. Code is bad for showing the diffs, screenshots are bad for editability :(
Please find the attachment.
Best,
Christoph
From: Squeak-dev <[hidden email]> on behalf of Tobias Pape <[hidden email]>
Sent: Saturday, 21 December 2019, 20:47:50
To: The general-purpose Squeak developers list
Subject: Re: [squeak-dev] [BUG] Timestamps don't work for classes with special character names

Attachment: CompiledMethod-getPreambleFromat.st (1K)
|
Anyone willing to look into this? I have been testing this for the last few months and did not get any errors from it. :)
--
Sent from: http://forum.world.st/Squeak-Dev-f45488.html