Hi,
A stupid question: why does evaluating
    #("comment")
lead to an empty array instead of an array with a single element #'"comment"'?

I guess that this is somehow bound to the fact that
    #"comment"
evaluates to a symbol with a single hidden character (ASCII 30).

BTW, I tested this on a 3.9-7032 image.

Noury
--------------------------------------------------------------
Dr. Noury Bouraqadi - Enseignant/Chercheur
Ecole des Mines de Douai - Dept. G.I.P
http://csl.ensm-douai.fr/noury

European Smalltalk Users Group Board
http://www.esug.org

Squeak: an Open Source Smalltalk
http://www.squeak.org
--------------------------------------------------------------
On 23.05.2006 at 16:42, Noury Bouraqadi wrote:
> Hi,
>
> A stupid question, why evaluating
> #("comment") leads to an empty array instead of an array with a
> single element #'"comment"'?

Because a comment is parsed as whitespace, not as a token.

> I guess that this is somehow bound to the fact that
> #"comment"
> evaluates to a symbol with a single hidden character (Ascii 30)

No, that's because ASCII 30 signifies the end of input (see Scanner>>step). It's the same as if you just evaluate a single # character.

- Bert -
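A minimal workspace sketch of the "comment is whitespace" point, assuming a Squeak image like the 3.9 one mentioned in the thread. The variable names are made up for illustration, and the sizes in the comments follow from the behaviour reported above rather than from a fresh run.

```smalltalk
"Workspace sketch: comments inside a literal array are skipped like whitespace."
| empty mixed |
empty := #("comment").
mixed := #("first" 1 "second" 2).
Transcript show: empty size printString; cr.    "0: the comment contributes no element"
Transcript show: mixed size printString; cr.    "2: only the numbers 1 and 2 are tokens"
```

To get the text itself into the array, it has to be written as a string or quoted symbol, for example #('comment').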
In reply to this post by Noury Bouraqadi
On Tuesday 23 May 2006 16:42, Noury Bouraqadi wrote:
> Hi,
>
> A stupid question, why evaluating
> #("comment") leads to an empty array instead of an array with a single
> element #'"comment"'?

Isn't it a good thing, this ability to insert comments inside long literal arrays?

    stupidExample := #(
        "keys are stored in first sub array"
        #( 'one' 'two' 'three')
        "values are stored in second sub array"
        #(1 2 3)
    ).

> I guess that this is somehow bound to the fact that
> #"comment"
> evaluates to a symbol with a single hidden character (Ascii 30)
>
> BTW, I test this on a 3.9-7032 image

This one is a bad behaviour indeed, a side effect of the Scanner/Parser internal implementation (ASCII 30 being used with the meaning "end of input").

Behind #, I would expect a letter [a-z][A-Z], a string quote ', or an opening parenthesis (. Maybe a second # in the Dolphin Smalltalk extension...

What else makes sense according to the Smalltalk formal definition?

Nicolas
In reply to this post by Noury Bouraqadi
Nicolas,
you asked:
>On Tuesday 23 May 2006 16:42, Noury Bouraqadi wrote:
>> Hi,
>>
>> A stupid question, why evaluating
>> #("comment") leads to an empty array instead of an array with a single
>> element #'"comment"'?
>
>This one is a bad behaviour indeed, a side effect of Scanner/Parser internal
>implementation... (Ascii 30 being used with meaning "end of input").

As long as it is "internal", I can't see anything wrong with it.

>Behind #, I would expect a letter [a-z][A-Z], a string quote ', or an opening
>parenthesis (. Maybe a second # in Dolphin Smalltalk extension...
>
>What else makes sense according to Smalltalk formal definition?

According to the syntax diagrams in the Book (choose the book's color from blue, yellow or purple), the sharp character may occur as the first character of an array constant or a symbol constant. In these positions it is followed by a left parenthesis if it marks an array constant; otherwise it marks a symbol constant and is followed by a letter, a special character or a minus character. Remember, special characters are the ones that make up a binary selector.

Inside a string or a comment, the sharp character may be followed by any of the 95 graphic characters.

And finally, inside a character constant, the sharp character may be followed by any character.

This holds for the language as defined formally by the syntax diagrams, but not for the Smalltalk programming language as described informally by the Blue Book, where "any character" may occur inside comments, strings and character constants, that is, not only the graphic characters but ASCII control characters as well, like carriage return, horizontal tabulator or record separator, which is ASCII 30.

And this again differs from the language as accepted by the compiler in the V2 image of Smalltalk-80. For example, the ASCII 0 character inside a character constant gets you an index error. But this is another thread :-)

Greetings,
Wolfgang

--
Weniger, aber besser.
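The rules from the syntax diagrams can be illustrated with a handful of literals. This is only a sketch of the cases named in the message above; each line is a plain expression statement that can be evaluated or inspected on its own in a workspace.

```smalltalk
"Where the sharp character may appear, following the syntax-diagram rules above."
#(1 2 3).     "array constant: # followed by a left parenthesis"
#size.        "symbol constant: # followed by a letter"
#+.           "symbol constant: # followed by a special character"
#-.           "symbol constant: # followed by the minus character"
'a # b'.      "inside a string, # is followed by ordinary graphic characters"
$#.           "character constant whose character is the sharp sign itself"
```

A lone # followed by nothing, or by a comment, matches none of these forms.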
In reply to this post by Nicolas Cellier-3
On 23 May 06, at 19:51, nicolas cellier wrote:

> On Tuesday 23 May 2006 16:42, Noury Bouraqadi wrote:
>> Hi,
>>
>> A stupid question, why evaluating
>> #("comment") leads to an empty array instead of an array with a single
>> element #'"comment"'?
>
> Isn't it a good thing, this ability to insert comments inside long
> literal arrays?
>
> stupidExample := #(
>     "keys are stored in first sub array"
>     #( 'one' 'two' 'three')
>     "values are stored in second sub array"
>     #(1 2 3)
> ).
>
> Maybe a second # in Dolphin Smalltalk extension...

What is this?

I read somewhere in the Squeak Scanner code that ##() may be given the ANSI Smalltalk semantics, without further explanation. Any hint?

Noury
In reply to this post by Wolfgang Helbig-2
On Wednesday 24 May 2006 01:16, Wolfgang Helbig wrote:
> Nicolas,
>
> you asked:
> >On Tuesday 23 May 2006 16:42, Noury Bouraqadi wrote:
> >> Hi,
> >>
> >> A stupid question, why evaluating
> >> #("comment") leads to an empty array instead of an array with a single
> >> element #'"comment"'?
> >
> >This one is a bad behaviour indeed, a side effect of Scanner/Parser
> > internal implementation... (Ascii 30 being used with meaning "end of
> > input").
>
> As long as it is "internal", I can't see anything wrong with it.

Hi Wolfgang,
Just try:
    (Compiler evaluate: '#') inspect.
and you will see this ASCII 30 dangerously leaking from the internals...

If # alone were really valid syntax, then:
    (Compiler evaluate: '# inspect').
should inspect it...

It does not, because the space is just ignored:
    (Compiler evaluate: '# inspect') inspect.

So are extra sharp signs:
    (Compiler evaluate: '# # # # inspect') inspect.

Do you agree with such behavior?

> >Behind #, I would expect a letter [a-z][A-Z], a string quote ', or an
> > opening parenthesis (. Maybe a second # in Dolphin Smalltalk extension...
> >
> >What else makes sense according to Smalltalk formal definition?
>
> According to the syntax diagrams in the Book (choose the book's color from
> blue, yellow or purple), the sharp character may occur as the first
> character of an array constant or a symbol constant. In these positions it
> is followed by a left parenthesis, if it marks an array constant, otherwise
> it marks a symbol constant and is followed by a letter, a special character
> or a minus character. Remember, special characters are the ones that make a
> binary selector.

Oh yes, I should not have forgotten... #* #-
In the latest Squeak, this also works with any number of special characters, like #***.
In VW you can have a ByteArray with #[ 0 0 ]

> Inside a string or a comment, the sharp character may be followed by any of
> the 95 graphic characters.
>
> And finally, inside a character constant, the sharp character may be
> followed by any character.

I do not understand this sentence. Isn't it the dollar that is used in character constants?
Or is it inside a literal array like #( ^x:=y@z ), in which case each character is interpreted as a single-character symbol...

For fun, note that Squeak does not complain when you write
    # $a

> This holds for the language as defined formally by the syntax diagrams, but
> not for the Smalltalk programming language as described informally by the
> Blue Book, where "any character" may occur inside comments, strings and
> character constants, that is not only the graphic characters but ASCII
> control characters as well, like carriage return, horizontal tabulator or
> record separator which is ASCII 30.
>
> And this again differs from the language as accepted by the compiler in the
> V2 image of Smalltalk-80. For example, the ASCII 0 character inside a
> character constant gets you an index error. But this is another thread :-)

You mean using the ASCII value as an index into the scanner character table?
I started with ST-80 v2.3 but just don't remember such details...

> Greetings,
> Wolfgang
>
> --
> Weniger, aber besser.

Nicolas
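Instead of opening an inspector, the result of evaluating a lone # can also be dumped to the Transcript. This is a sketch that assumes a Squeak image of roughly the vintage discussed here, where Compiler evaluate: '#' answers a collection-like object; in an image that rejects '#' as a syntax error, the second line never returns a value. The variable name is made up for illustration.

```smalltalk
"Workspace sketch: print what evaluating a lone # hands back."
| result |
result := Compiler evaluate: '#'.
Transcript show: result class name; cr.
Transcript show: result size printString; cr.
"Print the ASCII value of every character, so control characters become visible."
result do: [:each | Transcript show: each asciiValue printString; cr].
```

According to the reports in this thread, the loop should show a single value, 30.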
In reply to this post by Noury Bouraqadi
On Wednesday 24 May 2006 01:56, Noury Bouraqadi wrote:
> > Maybe a second # in Dolphin Smalltalk extension...
>
> What is this?
>
> I read somewhere in the Squeak Scanner code that ##() may be given the
> Ansi Smalltalk semantics without further explanation. Any hint?

AFAIR, in Dolphin, it means that the literal can be any valid Smalltalk expression. It is evaluated once at compile time (when you accept in the browser, or load from a file).

For example:
    ##(1@2)
    ##(Dictionary with: $a -> 0)

Nicolas
In reply to this post by Noury Bouraqadi
Nicolas,
You want me to try something out:
>
>Hi Wolfgang,
>Just try:
>    (Compiler evaluate: '#') inspect.
>and you will see this ascii 30 dangerously leaking from internal...

And I did. And I see. And this is wrong. A symbol constant is the sharp character followed by an identifier or binary selector, both of which are nonempty strings. It looks like the compiler in this case accepts an empty string, with the record separator, that is ASCII 30, internally marking the end of the string (not the end of the file). Anyway, the compiler must not accept '#'.

>If # alone were really a valid syntax, then:
>    (Compiler evaluate: '# inspect').
>should inspect it...

But it isn't, so it should not be accepted. In this case, the compiler is right in not accepting it.

>It does not, because space is just ignored:
>    (Compiler evaluate: '# inspect') inspect.
>
>So are extra sharp signs:
>    (Compiler evaluate: '# # # # inspect') inspect.
>
>Do you agree with such behavior ?

No. There seems to be an error in the compiler's handling of empty strings. They are represented internally by the record separator. Inspecting nonempty strings reveals that the record separator does not mark the end of the string, as I originally thought. Squeak inherited this error from Smalltalk-80 V2, which I tested using the Hobbes emulator.

Aside: Since zero is not a natural number, you have to handle empty strings as special cases, making the program unnatural and inviting errors. Couldn't resist :-) End of Aside

>> >Behind #, I would expect a letter [a-z][A-Z], a string quote ', or an
>> > opening parenthesis (. Maybe a second # in Dolphin Smalltalk extension...
>> >
>> >What else makes sense according to Smalltalk formal definition?
>>
>> According to the syntax diagrams in the Book (choose the book's color from
>> blue, yellow or purple), the sharp character may occur as the first
>> character of an array constant or a symbol constant. In these positions it
>> is followed by a left parenthesis, if it marks an array constant, otherwise
>> it marks a symbol constant and is followed by a letter, a special character
>> or a minus character. Remember, special characters are the ones that make a
>> binary selector.
>
>Oh yes, I should not have forgotten... #* #-
>In latest squeak, also work with any number of special characters like #***.
>In VW you can have a ByteArray with #[ 0 0 ]
>
>> Inside a string or a comment, the sharp character may be followed by any of
>> the 95 graphic characters.
>>
>> And finally, inside a character constant, the sharp character may be
>> followed by any character.
>
>I do not understand this sentence. Isn't it the dollar that is used in
>character constants ?

Yes, it is. And $# can be followed by any character.

>Or is it inside a literal array like #( ^x:=y@z ), in which case each
>character is interpreted as a single character symbol...
>
>For fun, note that Squeak does not complain when you write
>    # $a

This is not a valid expression.

>> This holds for the language as defined formally by the syntax diagrams, but
>> not for the Smalltalk programming language as described informally by the
>> Blue Book, where "any character" may occur inside comments, strings and
>> character constants, that is not only the graphic characters but ASCII
>> control characters as well, like carriage return, horizontal tabulator or
>> record separator which is ASCII 30.
>>
>> And this again differs from the language as accepted by the compiler in the
>> V2 image of Smalltalk-80. For example, the ASCII 0 character inside a
>> character constant gets you an index error. But this is another thread :-)
>
>You mean using ascii value as an index in the scanner character table?
>I started with st-80 v2.3 but just don't remember such details...

Months ago, I stumbled across this error in ST-80 V2 when I tried to read back a form from the output of aForm storeOn: aStream. The stream then contained ASCII control characters, like ASCII 0, which on reading back triggered the error. Here is the report I sent Dan on April 5th:

Report:

    | f s n|
    f _ StandardFileStream oldFileNamed: 'DefaultTextStyle.so' .
    s _ String new: f size.
    n _ 0.
    f do: [ :v | n _ n + 1. s at: n put: v].
    TextConstants at: #ST80DefaultTextStyle put: (Compiler evaluate: s)

The above gives me a "subscript is out of bounds: 0". Debug shows:

    xBinary
        tokenType _ #binary.
        token _ Symbol internCharacter: self step.
        ((typeTable at: hereChar asciiValue) = #xBinary and: [hereChar ~= $-])
            ifTrue: [token _ (token , (String with: self step)) asSymbol]

And, of course, "hereChar asciiValue" is zero.
(End of Report)

By the way, I still don't have those venerable ST-80 fonts in a Squeak image.

Greetings,
Wolfgang

--
Weniger, aber besser.
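The index error in the report comes from using a character's ASCII value directly as an index into a table whose first valid index is 1. The sketch below is not the Scanner's actual code: the table contents, the block name lookup and the symbols #xDefault and #xIllegal are hypothetical, chosen only to show one way such a lookup could be guarded. It assumes a Squeak image, where Array supports at:ifAbsent:.

```smalltalk
"Hypothetical sketch of a guarded type-table lookup, not the real Scanner code."
| typeTable lookup |
typeTable := Array new: 256 withAll: #xDefault.    "stand-in table; valid indices are 1..256"
lookup := [:char |
    "ASCII 0 maps to index 0, which is out of bounds, so fall back explicitly."
    typeTable at: char asciiValue ifAbsent: [#xIllegal]].
lookup value: $a.                     "index 97 is in range: answers #xDefault"
lookup value: (Character value: 0).   "index 0 is out of range: answers #xIllegal"
```

An unguarded typeTable at: 0 is what produces the "subscript is out of bounds: 0" message quoted in the report.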