Execute the following piece of code:

"-------------------------------------------------------------------"
|stream stream2|
stream := WriteStream with: 'a test '.
stream reset.
stream nextPutAll: 'to test'.
self assert: [stream contents = 'to test'].

"On the following line, you can remove 'copy':"
"the problem is present without it. It's here to prevent"
"the streams from using the same collection,"
"because the compiler tries to avoid creating 2 identical"
"strings"
stream2 := WriteStream with: 'a test ' copy.
stream2 nextPutAll: 'to test'.

"This assert passes but this is abnormal"
self assert: [stream2 contents = 'to testto test'].

"This assert passes and this is abnormal too"
"because the strings MUST be equal !!"
self assert: [stream2 contents ~= 'a test to test']
"-------------------------------------------------------------------"

On my image, all three asserts pass. This is completely abnormal in my opinion. In the second test, where does 'to testto test' come from?

VM:
Squeak VM version: 3.9-8 #2 Tue Oct 10 21:41:34 PDT 2006 gcc 4.0.1
Built from: Squeak3.9alpha of 4 July 2005 [latest update: #7021]
Build host: Darwin margaux 8.8.0 Darwin Kernel Version 8.8.0: Fri Sep 8 17:18:57 PDT 2006; root:xnu-792.12.6.obj~1/RELEASE_PPC Power Macintosh powerpc
default plugin location: /usr/local/lib/squeak/3.9-8/*.so

Image: squeak-dev-76

I will try with other images and the new compiler and let you know. Can you test with your system, please, and let us know?

Bye

-- Damien Cassou
Same with:

Squeak VM version: 3.7-7 #1 Sat Mar 19 13:23:20 PST 2005 gcc 3.3
Built from: Squeak3.7 of '4 September 2004' [latest update: #5989]
Build host: Darwin emilia.local 7.8.0 Darwin Kernel Version 7.8.0: Wed Dec 22 14:26:17 PST 2004; root:xnu/xnu-517.11.1.obj~1/RELEASE_PPC Power Macintosh powerpc
default plugin location: /usr/local/lib/squeak/3.7-7/*.so

-- Damien Cassou
In reply to this post by Damien Cassou-3
On Feb 14, 2007, at 16:18 , Damien Cassou wrote:

> On my image, all the 3 tests pass. This is completely abnormal in
> my opinion.

It is normal.

You are modifying the 'a test ' literal into 'to test'. This modified string gets copied in the second test.

Lesson: never modify string literals.

- Bert -
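Bert's explanation can be followed step by step. The sketch below is an annotated reading of Damien's snippet, assuming the Squeak 3.x behaviour discussed in this thread: equal string literals within one method share a single object, and that object is mutable. The comments are an interpretation of the code, not output captured from an image.

```smalltalk
| stream stream2 |

"Assumption: both occurrences of 'a test ' below compile to the SAME literal object."
stream := WriteStream with: 'a test '.  "the stream writes into the literal itself; position starts at the end"
stream reset.                           "position goes back to the start"
stream nextPutAll: 'to test'.           "7 characters overwrite the literal's 7 characters in place"

"The literal that was typed as 'a test ' now holds 'to test'."
stream2 := WriteStream with: 'a test ' copy.  "copies the already-modified literal, i.e. 'to test'"
stream2 nextPutAll: 'to test'.                "appended after the existing 7 characters"
stream2 contents                              "hence 'to testto test'"
```

Streaming over a fresh collection instead, for example `WriteStream on: String new`, leaves every literal untouched.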
All,
This is a really good example of a string literal problem. I'm a fanatic about using copy after any hard-coded string in code. So much so that a number of developers have made fun of my code because of it. Whenever I'm teaching someone Smalltalk I always include the "just use copy on all string literals" suggestion, but I normally show the problem with a character replacement in a string instead. This is a good example of how someone might make a big mistake and then spend a lot of time trying to figure out why everything is so messed up.

I was indoctrinated into the just-use-copy club by Versant. If you had a string literal in a method, Versant stored your code in the DB. Then if you tried to change your code without being connected to the database, everything blew up!

So this can be fixed by:

|stream stream2|
stream := WriteStream with: 'a test ' copy.
stream reset.
stream nextPutAll: 'to test' copy.
self assert: [stream contents = 'to test' copy].

Ok, the last copy is not really needed, but "JUST USE COPY" works for me!

Ron Teitelbaum

> From: Bert Freudenberg
> Sent: Wednesday, February 14, 2007 10:36 AM
>
> It is normal.
>
> You are modifying the 'a test ' literal into 'to test'. This modified
> string gets copied in the second test.
>
> Lesson: never modify string literals.
>
> - Bert -
In reply to this post by Bert Freudenberg
Bert
> It is normal.

No, this is not. You get used to it and accept it.

> You are modifying the 'a test ' literal into 'to test'. This
> modified string gets copied in the second test.
>
> Lesson: never modify string literals.

It shows that the compiler's optimization of certain literals, such as booleans and numbers, is good for immutable objects but wrong for mutable objects such as strings.

In the semantics of Smalltalk, nothing says that two strings with the same representation in the same method point to the same object. I did not check in which books, but the difference between strings and symbols is really that two strings point to two different objects, while symbols refer to the same object (and are immutable).

Stef
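The distinction Stef draws between strings and symbols can be illustrated with a few workspace expressions. This is a minimal sketch using only standard messages (`copy`, `=`, `==`, `at:put:`); the variable names `a` and `b` are made up for the illustration.

```smalltalk
| a b |
a := 'abc' copy.
b := 'abc' copy.

a = b.            "true: same characters"
a == b.           "false: two distinct objects, as Stef expects of strings"
#abc == #abc.     "true: a symbol with given characters exists only once"

a at: 1 put: $X.  "changes a only"
b.                "still 'abc'"
```

The explicit copies are what guarantee two distinct objects here; whether two bare 'abc' literals in one method are distinct is exactly the point under dispute.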
> > Lesson: never modify string literals.
Oh god, how much I wish we had immutability on a per-object basis.

Cheers,
Lukas

--
Lukas Renggli
http://www.lukas-renggli.ch
In reply to this post by stephane ducasse
On Feb 14, 2007, at 16:56 , stephane ducasse wrote:

> It shows that the fact that the compiler optimizes the use of
> certain literals such as boolean and number is good
> for immutable objects but is wrong for mutable object such as strings.
>
> In the semantics of Smalltalk nothing says that two strings with
> the same representation in the same
> methods are pointing to the same object. I did not check in which
> books but the difference between strings and symbols
> is really that two strings are pointing to two different objects,
> while symbols are referring to the same objects (and are immutable).

The sharing is not the primary problem, the mutability is.

- Bert -
In reply to this post by Bert Freudenberg
On Feb 14, 2007, at 4:36 PM, Bert Freudenberg wrote:

> It is normal.

This is normal for you because you know how the compiler works. But do you think the compiler works normally? Is it normal that a compiler considers two equal strings as identical? I would agree for symbols, because symbols are immutable. I think this is a first bug, a bug in the compiler.

In my opinion, there is another bug. When the collection of a stream becomes full, it is replaced by another, bigger collection. So, first, the stream uses the collection you passed to the constructor; then, at some point, this collection is replaced by a new one. I don't think this is normal behavior. In my opinion, the collection must always be the one you gave at the beginning OR it must always be a copy. I prefer the second solution.

So, what should be done? I can write tests for the compiler and tests for streams to show the behavior. These tests will fail because they expose a bug that has not been fixed.

> Lesson: never modify string literals.

Lesson: Use a correct compiler :-)

-- Damien Cassou
In reply to this post by stephane ducasse
Stef,
why should strings be immutable, and why blame the compiler? The same situation arises, for example, with Associations. In order to prevent mutation, someone invented ReadOnlyVariableBinding.

Literals have nothing much to do with compiler optimization (see senders of #encodeLiteral:), just with determining the correct bytecode for pushing them onto the stack.

But of course the compiler could emit code for always copying string literals, if you can afford the performance penalty.

As Bert wrote: it's normal :)

[Okay, okay, other languages have immutable strings, but this is Smalltalk.]

/Klaus
> As Bert wrote: it's normal :)
I agree, looks completely normal to me.

> [Okay okay other languages have immutable string, but this is Smalltalk.]

If we had an immutability bit that the compiler would set for objects in the literal array (and with which we could do a lot of other cool stuff), then people would not run into such problems.

Cheers,
Lukas

--
Lukas Renggli
http://www.lukas-renggli.ch
In reply to this post by Damien Cassou-3
Hi Damien,
on Wed, 14 Feb 2007 17:19:13 +0100, you wrote:

> In my opinion, there is another bug. When the collection of a stream
> becomes full, it is replaced by another bigger collection. So,
> first, the stream uses the collection you passed to the constructor,
> then, at a given time, this collection is replaced by a new one. I
> don't think it's a normal behavior.

Whatever the stream does with the collection, it is encapsulated. Imagine the stream always uses a highly optimized species for its internal job (or a file on your hard disk!). You should not make any code depend on the internals of (in this case) the stream.

I suggest you use (aStream contents asArray) and then #= for comparing aStream's contents to your expectations.

/Klaus
In reply to this post by Lukas Renggli
Hi Lukas,
on Wed, 14 Feb 2007 17:48:40 +0100, you wrote:

> If we had an immutability bit, that the compiler would set for objects
> in the literal array (and with what we could do a lot of other cool
> stuff), then people would not run into such problems.

Since I cannot (stupid me ;-) think of a counterexample which would break existing code, I second you (perhaps in the new compiler? without modifying the VM?).

/Klaus
In reply to this post by Klaus D. Witzel
On Feb 14, 2007, at 5:48 PM, Klaus D. Witzel wrote:

> Whatever the stream does with the collection, it is encapsulated.
> Imagine the stream always uses a highly optimized species for its
> internal job (or a file on your harddisk!). You should not depend
> any code on the internals of (in this case) stream.

I really don't want to depend on the implementation. And in my opinion, this is not encapsulated, because this is MY String, not a String created internally. What I see is that the String I give to the new Stream is modified. Then, at some point, the String does not reflect the stream anymore. This doesn't sound coherent to me. And if you all agree with the current behavior, then documentation should be written: "Don't use the collection after having created a stream on it!"

-- Damien Cassou
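The behaviour Damien objects to can be sketched as follows. This assumes the Squeak WriteStream implementation discussed in the thread, where the stream writes into the caller's collection while it fits and switches to a larger collection of its own once it overflows. The names `mine` and `stream` are made up for the illustration, and the comments describe the expected effect rather than captured output.

```smalltalk
| mine stream |
mine := String new: 4 withAll: $-.  "a 4-character string owned by the caller"
stream := WriteStream on: mine.

stream nextPutAll: 'ab'.      "fits: written into mine itself"
mine.                         "now starts with 'ab'"

stream nextPutAll: 'cdefgh'.  "does not fit: the stream continues in a larger collection"
stream contents.              "'abcdefgh'"
mine.                         "still only 4 characters long; it no longer reflects the stream"
```

Reading the result through `stream contents`, rather than through the collection originally handed to the stream, avoids depending on when that switch happens.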
In reply to this post by Lukas Renggli
On Feb 14, 2007, at 5:48 PM, Lukas Renggli wrote:
Do you agree with the fact that the compiler merges two different strings into one single variable? Then, how do you explain this?

'test' == 'test'  => true
'test' == #test asString  => false

"The compiler optimizes... same variable...": I know the explanation, but I don't think it should be accepted.

-- Damien Cassou
In reply to this post by Damien Cassou-3
Hi Damien,
on Wed, 14 Feb 2007 17:58:02 +0100, you wrote:

> I really don't want to depend on the implementation. And in my
> opinion, this is not encapsulated because this is MY String, not a
> String created internally.

Not really. If you pass a boxed object (other than a SmallInteger), the recipient can #become: it to anything he/she likes. This is reality.

> What I see is that the String I give to
> the new Stream is modified. Then at a moment, the String does not
> reflect the stream anymore. This doesn't sound coherent to me.

It is not coherent because you passed an explicitly written *constant* which, in other languages, is believed to be immutable.

> And if
> you all agree to the current behavior, then a documentation should be
> written: "Don't use the collection after having created a stream on
> it !"

Easier: don't pass constant collections to the streamers :)

Another example, which cannot be blamed on any streamer:

| tmp |
tmp := 'lowercase'.
tmp translateToUppercase == tmp

/Klaus
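Klaus's example becomes more visible when the literal lives in a method rather than in a one-off workspace expression. The sketch below is hypothetical: `Demo` and `label` are names invented for the illustration, and it assumes an image like the Squeak versions discussed here, where literals are mutable.

```smalltalk
"Hypothetical method, compiled on an invented class Demo:"
Demo >> label
    ^ 'lowercase'
```

```smalltalk
| d |
d := Demo new.
d label translateToUppercase.  "uppercases, in place, the literal stored in the compiled method"
d label.                       "answers 'LOWERCASE' from now on, until the method is recompiled"
```

Answering `'lowercase' copy` from the method keeps the literal intact, at the cost of one allocation per call. An image with immutable literals, as wished for elsewhere in this thread, would reject the in-place change.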
In reply to this post by Klaus D. Witzel
On Feb 14, 2007, at 5:37 PM, Klaus D. Witzel wrote:

> Stef,
>
> why should strings be immutable, and why blame the compiler.

It's not about making strings immutable; rather, when you use a string as a literal, it should be immutable. This shows us that literals that are not immutable cause really bad, tedious side effects. (The same reasoning applies to #(a b c).)

Math
In reply to this post by Bert Freudenberg
You can do an optimization (having only one string in the literal frame) when this does not impact the semantics of the language. If strings were immutable, then there would be no problem in having only one, because we would not see the difference.

> The sharing is not the primary problem, the mutability is.
>
> - Bert -
In reply to this post by Lukas Renggli
Do you have an example? VW introduced immutable objects, and I would like to educate my taste on this topic.

On 14 Feb. 07, at 17:03, Lukas Renggli wrote:

> Ohh god, how much I wish to have immutability on a per object bases.
In reply to this post by Klaus D. Witzel
> Stef,
>
> why should strings be immutable, and why blame the compiler.

I'm not saying that. I'm saying that since strings are not immutable, the compiler should not optimize the strings in a compiled method.

> As Bert wrote: it's normal :)
>
> [Okay okay other languages have immutable string, but this is
> Smalltalk.]

This is not my point. Do you think that, from a language point of view, it is good to say that two strings are not identical if they are typed in different methods, but are identical if they are typed in the same method? This is why some of my slides stopped working when I switched from VisualWorks to Squeak. I do not have an old version of VisualWorks, but it seems that it was consistent in the view of strings it offered to programmers. And consistency is important.

Stef
In reply to this post by Lukas Renggli
On 14 Feb. 07, at 17:48, Lukas Renggli wrote:

>> As Bert wrote: it's normal :)
>
> I agree, looks completely normal to me.

Is it normal for you that the same strings typed in two different methods are different, but if they are typed in the same method they are the same? I never saw that in the books I read on Smalltalk, but they can all be wrong and Squeak right.

Stef

>> [Okay okay other languages have immutable string, but this is
>> Smalltalk.]
>
> If we had an immutability bit, that the compiler would set for objects
> in the literal array (and with what we could do a lot of other cool
> stuff), then people would not run into such problems.

Exactly. Or the compiler could just optimize the immutable objects we have: #symbol, booleans, integers...

Stef