> > 'test' == 'test' => true
> 'test' == #test asString => false
> | str1 str2 |
> str1 := 'abc'.
> str2 := 'abc'.
> str1 at: 1 put: $x.
> self assert: [str2 = 'xbc'] "pass"
In reply to this post by stephane ducasse
Hi Stef,
on Wed, 14 Feb 2007 18:40:31 +0100, you wrote: >> Stef, >> >> why should strings be immutable, and why blame the compiler. > > I'm not saying that. I'm saying that since strings are not immutable > then the compiler should not optimize the strings in a compiled method. > >> The same situation is, for example, with Associations. In order to >> prevent mutation, someone invented ReadOnlyVariableBinding. >> >> Literals have nothing much to do with compiler optimization, see >> senders of #encodeLiteral:, just with determining the correct bytecode >> for pushing them onto the stack. >> >> But of course the compiler could emit code for always copying string >> literals, if you can afford the performance penalty. >> >> As Bert wrote: it's normal :) >> >> [Okay okay other languages have immutable string, but this is >> Smalltalk.] > > This is not my point. > > do you think that from a language point of view this is good to say > > ok two strings are not identical if they are typed in different methods > but if there are typed in the same methods > they are identical? Ah :) Now you know why I asked :) [BTW: if you are able to compare string from different methods, why should anybody suddenly lack this capability when it comes to strings in the same method? The more general (comparision method) subsumes the more specific (comparision method). What else would someone need?] Any other examples besides strings? /Klaus > This is why I got some of my slides not working when I switch from > Visualworks to Squeak I do not have an old version of visualworks > but it seems that they were consistent with the view they offered to > strings to programmers. > and consistent is important. > > Stef > > |
In reply to this post by stephane ducasse
I have the impression that the point of the initial question was lost
in this thread, spinning into mutability, etc. Let me try to phrase it in another way:

'foo' = 'foo'     true "ok"
'foo' == 'foo'    true "NOT OK"

The underlying reason is that whenever I execute one of these expressions in a workspace, a method is created behind my back and then executed. In the case of comparing two strings, the compiler creates *a single literal for both strings*. This is plain wrong: a string is a collection of characters, and therefore these are two different instances of this collection that happen to have the same contents.

Now some more experiments. Take any class (say a new class A). First add the following two methods:

foo
    ^'foo'

foo2
    ^'foo'

Now execute the following code:

A new foo == A new foo     "true ?!?!?!?!?!?!"
A new foo == A new foo2    "false"

To top it all off, add one more method:

isIdentical: arg1 to: arg2
    ^arg1 == arg2

So consider the following now:

A new isIdentical: A new foo to: A new foo2    "false"
A new isIdentical: 'foo' to: 'foo'             "true"

I can only conclude that this is really not what you want.....

On 14 Feb 2007, at 14 February/18:33, stephane ducasse wrote:

> Do you have example?
> Because VW introduced immutable objects and I would like to educate
> my taste on this topic.
>
>
> On 14 Feb 07, at 17:03, Lukas Renggli wrote:
>
>>> > Lesson: never modify string literals.
>>
>> Ohh god, how much I wish to have immutability on a per object bases.
>>
>> Cheers,
>> Lukas
>>
>> --
>> Lukas Renggli
>> http://www.lukas-renggli.ch
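The distinction between equality and identity that the message above relies on can be sketched in a workspace. This is only an illustration: the variable names are made up, and it deliberately uses #copy to obtain a second instance instead of relying on how the compiler treats two equal literals.

```smalltalk
"Workspace sketch; the variable names are illustrative only."
| literal fresh |
literal := 'foo'.
fresh := 'foo' copy.    "#copy answers a new String with the same characters"
Transcript
    show: (literal = fresh) printString; cr;     "equality compares contents"
    show: (literal == fresh) printString; cr.    "identity compares the objects themselves"
```

Because `fresh` is an explicit copy, the equality test should answer true and the identity test false, whatever the compiler does with the literals.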
On Feb 14, 2007, at 20:27 , Roel Wuyts wrote:
> 'foo' = 'foo' true "ok" > 'foo' == 'foo' true "NOT OK" > [...] > I can only conclude that this is really not what you want..... Why? If you want to test for identity, use a Symbol. IMHO this is splitting hairs over a non-issue. The issue is mutability of literals. - Bert - |
On Wed, 14 Feb 2007 20:58:21 +0100, Bert Freudenberg wrote:
> On Feb 14, 2007, at 20:27 , Roel Wuyts wrote:
>
>> 'foo' = 'foo' true "ok"
>> 'foo' == 'foo' true "NOT OK"
>> [...]
>> I can only conclude that this is really not what you want.....
>
> Why? If you want to test for identity, use a Symbol.
>
> IMHO this is splitting hairs over a non-issue. The issue is mutability
> of literals.

... which are not constants but objects created from literal descriptions, hence their name :)

IMHO Lukas had the best suggestion so far, something like a preference which demands to compile literals as if they were constants.

/Klaus

> - Bert -
On Feb 14, 2007, at 9:08 PM, Klaus D. Witzel wrote: > On Wed, 14 Feb 2007 20:58:21 +0100, Bert Freudenberg wrote: > >> On Feb 14, 2007, at 20:27 , Roel Wuyts wrote: >> >>> 'foo' = 'foo' true "ok" >>> 'foo' == 'foo' true "NOT OK" >>> [...] >>> I can only conclude that this is really not what you want..... >> >> Why? If you want to test for identity, use a Symbol. >> >> IMHO this is splitting hairs over a non-issue. The issue is >> mutability of literals. > > ... which are not constants but objects created from literally > descriptions, therefore their name :) > > IMHO Lukas had the best suggestion so far, something like a > preference which demands to compile literals as if they where > constants. Yes but this will be messy since you don't know when you have to turn it on or off. Isn't it? Math > > /Klaus > >> - Bert - >> >> >> >> > > > |
In reply to this post by Lukas Renggli
<Lukas Renggli>
If we had an immutability bit, that the compiler would set for objects in the literal array (and with what we could do a lot of other cool stuff), then people would not run into such problems. </Lukas Renggli> As a principle of language design, literals should be immutable. And immutability needs to be independently settable for each named instance variable, and independently settable for the indexable slots (as a group, not for each index.) The reason is because some named instance variables may need to be "caching variables" whose values are lazily computed only when needed--in fact, such variables may need to be weak references so that the garbage collector can set them to nil. --Alan |
In reply to this post by Mathieu SUEN
Mathieu Suen wrote:
>
> On Feb 14, 2007, at 9:08 PM, Klaus D. Witzel wrote:
>
>> On Wed, 14 Feb 2007 20:58:21 +0100, Bert Freudenberg wrote:
>>
>>> On Feb 14, 2007, at 20:27 , Roel Wuyts wrote:
>>>
>>>> 'foo' = 'foo' true "ok"
>>>> 'foo' == 'foo' true "NOT OK"
>>>> [...]
>>>> I can only conclude that this is really not what you want.....
>>>
>>> Why? If you want to test for identity, use a Symbol.
>>>
>>> IMHO this is splitting hairs over a non-issue. The issue is
>>> mutability of literals.
>>
>> ... which are not constants but objects created from literally
>> descriptions, therefore their name :)
>>
>> IMHO Lukas had the best suggestion so far, something like a preference
>> which demands to compile literals as if they where constants.
>
> Yes but this will be messy since you don't know when you have to turn it
> on or off. Isn't it?
>
> Math
>

I would hate for the same code to lead to different results depending on a global preference set somewhere in the image...

Unless you use:

a) a compiler directive in a pragma (I know, Lukas doesn't like this use of annotations)

    <thisCompiler literalAreMutable: false>
    WriteStream on: 'test'.

b) a message explicitly stating that the literal should be mutable.

    WriteStream on: 'test' beMutable.

Nicolas

>>
>> /Klaus
>>
>>> - Bert -
In reply to this post by Bert Freudenberg
No, you did not get the point.
Would you say that:

(Collection new add: $f; add: $o; add: $o; yourself) == (Collection new add: $f; add: $o; add: $o; yourself) ?

Besides, the last example in my mail is also worth explaining...

On 14 Feb 2007, at 14 February/20:58, Bert Freudenberg wrote:

> On Feb 14, 2007, at 20:27 , Roel Wuyts wrote:
>
>> 'foo' = 'foo' true "ok"
>> 'foo' == 'foo' true "NOT OK"
>> [...]
>> I can only conclude that this is really not what you want.....
>
> Why? If you want to test for identity, use a Symbol.

:-)

> IMHO this is splitting hairs over a non-issue. The issue is
> mutability of literals.

If Squeak is the only Smalltalk that has this behaviour for Strings, then that shows it is definitely an issue...........

I ported T-Gen and the ParserCompiler, and suddenly this non-trivial issue becomes vital. We are still unable to port the logic language Soul to Squeak because of this issue, because, sorry, symbols use a flyweight pattern and are unique while Strings are collections of characters and should behave as such. It is a simple issue in itself.

Besides, if it were only splitting hairs, then why are all beginner's books full of warnings about this issue ? Ever tried to teach Smalltalk to a class of newbies ? Ever had students come up to you because they found some examples in a book or on the web, tried them in Squeak, and got different results ? Think about Smalltalk being this nice and clean language where everything is logical and then having to remember by heart some stupid rules because I am splitting hairs ???????

Besides, have a look at the last part of my mail. Would you not consider this wrong ? Depending on whether you call the behaviour from a method or not you get different behaviour ???????????????

[PS: Yes, you hit a sore spot there]

--
Roel
In reply to this post by Damien Cassou-3
Sure, I fully agree.
The essence of the discussion boils down to whether you consider 'foo' to be a literal or not in the current system.

On 14 Feb 2007, at 14 February/21:59, Alan Lovejoy wrote:

> <Lukas Renggli>
> If we had an immutability bit, that the compiler would set for
> objects in
> the literal array (and with what we could do a lot of other cool
> stuff),
> then people would not run into such problems.
> </Lukas Renggli>
>
> As a principle of language design, literals should be immutable. And
> immutability needs to be independently settable for each named
> instance
> variable, and independently settable for the indexable slots (as a
> group,
> not for each index.) The reason is because some named instance
> variables
> may need to be "caching variables" whose values are lazily computed
> only
> when needed--in fact, such variables may need to be weak references
> so that
> the garbage collector can set them to nil.
>
> --Alan
In reply to this post by Klaus D. Witzel
Hi Klaus,
I'm not talking about security, only about an inconsistent side effect.

It's not about constant collections at all here.

myCollection := String new: 3.
myStream := WriteStream on: myCollection.
myStream nextPutAll: 'abcd' copy.

Here, myCollection is left untouched, which sounds normal (it's still an empty string of size 3). Now, let's replace #nextPutAll: by four #nextPut: sends:

myCollection := String new: 3.
myStream := WriteStream on: myCollection.
myStream nextPut: $a; nextPut: $b; nextPut: $c; nextPut: $d.

This should have exactly the same behavior... however myCollection now equals 'abc' !!! Why the first 3 characters ? Why not everything or nothing at all ? This is why I think it's not coherent.

I've read the source code and I understand why it happens, but I don't think it's coherent.
And this has nothing to do with literals nor with immutability. This is a completely different problem (this is why I changed the thread title).
Hi Damien,
I understand your argument and the fresh light that you've thrown on it. BTW "having a side effect" is not what happens; after all you *want* the streamer to write, don't you? And in your "myStream nextPutAll: 'abcd' copy" the #copy is superfluous (has no effect).

But anyway, let's not argue about the thread title. For sure I do like consistency and friends like coherence. My concern is that #become: would be a gun instead of using a pigeon transporting a peace message (so to speak).

So, what's your solution? Perhaps, like we have heard from the VW folks, the streamer should be adapted to *also* work on an OrderedCollection (which automagically grows), so that people can expect that, if they pass anOrderedCollection, all is fine with its contents and identity (i.e. because of the behavior which is already in OrderedCollection).

/Klaus

On Thu, 15 Feb 2007 09:14:50 +0100, Damien Cassou wrote:

>
> Hi Klaus,
>
>
> Klaus D. Witzel wrote:
>>
>>> I really don't want to depend on the implementation. And in my
>>> opinion, this is not encapsulated because this is MY String, not a
>>> String created internally.
>>
>> Not really. If you pass a boxed object (other than a SmallInteger) the
>> recipient can #become: it to anything he/she likes. This is reality.
>>
>
> I'm not talking about security, only unconsistent side effect.
>
>
> Klaus D. Witzel wrote:
>>
>>> What I see is that the String I give to
>>> the new Stream is modified. Then at a moment, the String does not
>>> reflect the stream anymore. This doesn't sound coherent to me.
>>
>> It is not coherent because you passed an explicitly written *constant*
>> which, in other languages, is believed to be immutable.
>>
>>> And if
>>> you all agree to the current behavior, then a documentation should be
>>> written: "Don't use the collection after having created a stream on
>>> it !"
>>
>> Easier: don't pass constant collections to the streamers :)
>>
>
> It's not about constant collections at all here.
>
> myCollection := String new: 3.
> myStream := WriteStream on: myCollection.
> myStream nextPutAll: 'abcd' copy.
>
> Here, myCollection is left untouched which sounds normal (it's still an
> empty string of size 3). Now, lets replace #nextPutAll: by 4 #nextPut:
>
> myCollection := String new: 3.
> myStream := WriteStream on: myCollection.
> myStream nextPut: $a; nextPut: $b; nextPut: $c; nextPut: $d.
>
> This should have exactly the same behavior... however myCollection now
> equals 'abc' !!! Why the first 3 characters ? Why not everything or
> nothing
> at all ? This is why I think it's not coherent.
>
> I've read the source code and I understand why it happens but I don't
> think
> it's coherent.
> And this as nothing to do with literals nor with immutability. This is a
> completely different problem (this is why I changed the thread title).
I've written unit-tests for the current behavior. I do not agree with what we have currently, but this is not going to change so:
==============
testStreamUseGivenCollection
	"self debug: #testStreamUseGivenCollection"

	"When a stream is created on a collection, it tries to keep using that collection instead of copying. See thread with title 'Very strange bug on Streams and probably compiler' (Feb 14 2007) on the squeak-dev mailing list."

	| string stream |
	string := String withAll: 'erased'.
	stream := WriteStream on: string.
	self assert: string = 'erased'.
	stream nextPutAll: 'test'.
	"Beginning of 'erased' has been replaced by 'test'"
	self assert: string = 'tested'.
==============
==============
testNextPutAllDifferentFromNextPuts
	"self debug: #testNextPutAllDifferentFromNextPuts"

	"When a stream is created on a collection, it tries to keep using that collection instead of copying. See thread with title 'Very strange bug on Streams and probably compiler' (Feb 14 2007) on the squeak-dev mailing list."

	"nextPutAll verifies the size of the parameter and directly grows the underlying collection of the required size."

	| string stream |
	string := String withAll: 'z'.
	stream := WriteStream on: string.
	stream nextPutAll: 'abc'.
	"string hasn't been modified because #nextPutAll: detects that 'abc' is bigger than the underlying collection. Thus, it starts by creating a new collection and doesn't modify our variable."
	self assert: string = 'z'.

	string := String withAll: 'z'.
	stream := WriteStream on: string.
	stream nextPut: $a; nextPut: $b; nextPut: $c.
	"The first #nextPut: has no problem and replaces $z by $a in the string. Others will detect that string is too small."
	self assert: string = 'a'.
==============
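The two tests above pin down what happens to the collection handed to the stream. A caller who only needs the characters that were written can avoid depending on that by asking the stream itself for #contents. The following is a minimal workspace sketch, not a proposed fix to WriteStream; the variable names are illustrative.

```smalltalk
"Workspace sketch: read the result from the stream, not from the original collection."
| stream result |
stream := WriteStream on: (String new: 3).
stream nextPut: $a; nextPut: $b; nextPut: $c; nextPut: $d.
result := stream contents.    "a String holding everything written so far"
Transcript show: result; cr.
```

Here #contents answers all four characters even though the stream had to grow past the three slots it was given, so the result does not depend on whether #nextPut: or #nextPutAll: was used.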
In reply to this post by Damien Cassou-3
I've written unit tests for these compiler "optimizations". I think
they show a bug. See attached file (CompilerTest.st, 3K).

--
Damien Cassou
Hi Damien,
I still think that something is not correct with your analysis. In your comments you write:

"Current compiler uses only one variable for both strings. I think this is a bug."

But you do not code a variable; instead you code *literals*, which, as was mentioned earlier, are also not constants.

?

/Klaus

On Sat, 24 Feb 2007 19:14:52 +0100, Damien Cassou wrote:

> I've written unit tests for this compiler "optimizations". I think
> they show a bug. See attached file.
>
Ok, I may not use the right word. What comment would you write ?
2007/2/24, Klaus D. Witzel <[hidden email]>: > Hi Damien, > > I still think that something is not correct with your analysis. In you > comments you write: > > "Current compiler uses only one variable for both strings. I think this > is a bug." > > But you do not code a variable, instead you code *literals* which, as was > mentioned earlier, are also not a constant. > > ? > > /Klaus > > On Sat, 24 Feb 2007 19:14:52 +0100, Damien Cassou wrote: > > > I've written unit tests for this compiler "optimizations". I think > > they show a bug. See attached file. > > > > > > -- Damien Cassou |
Hi Damien,
on Sat, 24 Feb 2007 23:29:18 +0100, you wrote: > Ok, I may not use the right word. What comment would you write ? I cannot write you that comment. Smalltalk was built with a minimum set of unchangeable parts, see "Design Principles Behind Smalltalk" just the sentence after "Good Design" - http://users.ipa.net/~dwighth/smalltalk/byte_aug81/design_principles_behind_smalltalk.html Moreover, the design principle does not say minimal, it says minimum. It turned out the only unchangeable parts are instances of SmallInteger (they are their own oop). Every object (but the SmallIntegers) can be changed and I disagree with your blaming of the compiler and streams. /Klaus > 2007/2/24, Klaus D. Witzel <[hidden email]>: >> Hi Damien, >> >> I still think that something is not correct with your analysis. In you >> comments you write: >> >> "Current compiler uses only one variable for both strings. I >> think this >> is a bug." >> >> But you do not code a variable, instead you code *literals* which, as >> was >> mentioned earlier, are also not a constant. >> >> ? >> >> /Klaus >> >> On Sat, 24 Feb 2007 19:14:52 +0100, Damien Cassou wrote: >> >> > I've written unit tests for this compiler "optimizations". I think >> > they show a bug. See attached file. >> > >> >> >> >> > > |
FWIW, Objective-C solves this by making literals instances of special subclasses of
NSString, because they reference data compiled into the static data segment of the executable. Also, NSStrings are immutable; only instances of the subclass NSMutableString are mutable. So modification of literals doesn't come up, as it produces a DNU-type error.

-Todd Blanchard

On Feb 25, 2007, at 2:26 AM, Klaus D. Witzel wrote:

> Hi Damien,
>
> on Sat, 24 Feb 2007 23:29:18 +0100, you wrote:
>> Ok, I may not use the right word. What comment would you write ?
>
> I cannot write you that comment. Smalltalk was built with a minimum
> set of unchangeable parts, see "Design Principles Behind Smalltalk"
> just the sentence after "Good Design"
>
> - http://users.ipa.net/~dwighth/smalltalk/byte_aug81/
> design_principles_behind_smalltalk.html
>
> Moreover, the design principle does not say minimal, it says minimum.
>
> It turned out the only unchangeable parts are instances of
> SmallInteger (they are their own oop).
>
> Every object (but the SmallIntegers) can be changed and I disagree
> with your blaming of the compiler and streams.
>
> /Klaus
>
>> 2007/2/24, Klaus D. Witzel <[hidden email]>:
>>> Hi Damien,
>>>
>>> I still think that something is not correct with your analysis.
>>> In you
>>> comments you write:
>>>
>>> "Current compiler uses only one variable for both
>>> strings. I think this
>>> is a bug."
>>>
>>> But you do not code a variable, instead you code *literals*
>>> which, as was
>>> mentioned earlier, are also not a constant.
>>>
>>> ?
>>>
>>> /Klaus
>>>
>>> On Sat, 24 Feb 2007 19:14:52 +0100, Damien Cassou wrote:
>>>
>>> > I've written unit tests for this compiler "optimizations". I think
>>> > they show a bug. See attached file.
>>> >
Thank you Todd,
I meant the same when mentioning literals made with read-only associations earlier. But Damien is also concerned about passing a subinstance of ArrayedCollection (no other type works at the moment) to a WriteStream, which, on size overflow, allocates a suitable sized new subinstance ... So this also means, it does that for strings. /Klaus On Sun, 25 Feb 2007 12:07:54 +0100, you wrote: > FWIW, ObjectiveC solves this by making literals special subclasses of > NSString because they reference data compiled into the static data > segment of the executable. Also NSString's are immutable, subclasses > NSMutableString are mutable. So modification of literals doesn't come > up as it produces a DNU type error. > > -Todd Blanchard > > On Feb 25, 2007, at 2:26 AM, Klaus D. Witzel wrote: > >> Hi Damien, >> >> on Sat, 24 Feb 2007 23:29:18 +0100, you wrote: >>> Ok, I may not use the right word. What comment would you write ? >> >> I cannot write you that comment. Smalltalk was built with a minimum set >> of unchangeable parts, see "Design Principles Behind Smalltalk" just >> the sentence after "Good Design" >> >> - http://users.ipa.net/~dwighth/smalltalk/byte_aug81/ >> design_principles_behind_smalltalk.html >> >> Moreover, the design principle does not say minimal, it says minimum. >> >> It turned out the only unchangeable parts are instances of SmallInteger >> (they are their own oop). >> >> Every object (but the SmallIntegers) can be changed and I disagree with >> your blaming of the compiler and streams. >> >> /Klaus >> >>> 2007/2/24, Klaus D. Witzel <[hidden email]>: >>>> Hi Damien, >>>> >>>> I still think that something is not correct with your analysis. In you >>>> comments you write: >>>> >>>> "Current compiler uses only one variable for both strings. I >>>> think this >>>> is a bug." >>>> >>>> But you do not code a variable, instead you code *literals* which, as >>>> was >>>> mentioned earlier, are also not a constant. >>>> >>>> ? 
>>>> >>>> /Klaus >>>> >>>> On Sat, 24 Feb 2007 19:14:52 +0100, Damien Cassou wrote: >>>> >>>> > I've written unit tests for this compiler "optimizations". I think >>>> > they show a bug. See attached file. >>>> > >>>> >>>> >>>> >>>> >>> >>> >> >> >> > > > |
In reply to this post by Roel Wuyts
Roel Wuyts <[hidden email]> writes:
> 'foo' = 'foo' true "ok"
> 'foo' == 'foo' true "NOT OK"
>
> The underlying reason is that whenever I execute one of these
> expressions in a workspace, a method is created behind my back and
> then executed. In the case of comparing two strings, the compiler
> creates *a single literal for both strings*. This is plain wrong: a
> string is a collection of characters, and therefore these are two
> different instances of this collection that happen to have the same
> contents.

Indeed it is good to get this straight.

The semantics you propose make me think of cons in Scheme, where cons is guaranteed to give you a fresh object every time it is executed. Additionally, the result is guaranteed to be mutable, so it's especially important that separate calls to cons return separate objects!

The other viewpoint also seems to make sense, though: literals describe a *read-only* object, and thus the compiler may reuse them if it likes.

FWIW, the ANSI standard supports read-only literals, but does not require them to be. It says that you get undefined behavior if you try to modify an object created via a literal. See section 3.4.6.3, "String literals". We do not have to conform to ANSI, of course, but other Smalltalks might consider it.

Also, FWIW, if you like the ANSI version, it is easy to implement. Here is a version using a dead forked dialect of Squeak; it should be easy to dust it off should anyone want. This version goes further than discussed in this thread, and even makes floats and large integers be immutable. :) The code is in islands.zip on the following page; look inside the zip for immutLits1.5.cs and immutLits2.2.cs.

http://wiki.squeak.org/squeak/2074

If anyone is passionate about this issue, by all means open up a Mantis entry! Judging from the discussion so far, however, it may be hard to come to a decision....

-Lex
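Whichever reading of literals one prefers, code that needs to change a string can stay clear of the undefined behavior mentioned above by mutating a private copy instead of the literal itself. A minimal workspace sketch, with an illustrative variable name:

```smalltalk
"Workspace sketch: mutate a copy, never the literal stored with the method."
| greeting |
greeting := 'abc' copy.      "private copy; the literal 'abc' is left alone"
greeting at: 1 put: $x.
Transcript show: greeting; cr.
```

The copy ends up holding 'xbc', and any other code that evaluates the same literal still sees 'abc'.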