Very strange bug on Streams and probably compiler


Very strange bug on Streams and probably compiler

Damien Cassou-3
Execute the following piece of code:


"-------------------------------------------------------------------"
|stream stream2|
stream := WriteStream with: 'a test '.
stream reset.
stream nextPutAll: 'to test'.
self assert: [stream contents = 'to test'].

"On the following line, you can remove 'copy'"
"the problem is present without. It's here to prevent"
" the streams from using the same collection"
"because the compiler tries to avoid creating 2 identical"
"strings"
stream2 := WriteStream with: 'a test ' copy.
stream2 nextPutAll: 'to test'.

"This assert passes but this is abnormal"
self assert: [stream2 contents = 'to testto test'].

"This assert passes and this is abnormal too"
"because the strings MUST be equal !!"
self assert: [stream2 contents ~= 'a test to test']
"-------------------------------------------------------------------"



In my image, all three assertions pass, which seems completely abnormal to me.
In the second assertion, where does 'to testto test' come from?

VM:
Squeak VM version: 3.9-8 #2 Tue Oct 10 21:41:34 PDT 2006 gcc 4.0.1
Built from: Squeak3.9alpha of 4 July 2005 [latest update: #7021]
Build host: Darwin margaux 8.8.0 Darwin Kernel Version 8.8.0: Fri Sep  8 17:18:57 PDT 2006; root:xnu-792.12.6.obj~1/RELEASE_PPC Power Macintosh powerpc
default plugin location: /usr/local/lib/squeak/3.9-8/*.so

Image: squeak-dev-76

I will try with other images and the new compiler and let you know.

Can you test with your system please and let us know?

Bye

-- 
Damien Cassou






Re: Very strange bug on Streams and probably compiler

Damien Cassou-3
> I will try with other images and the new compiler and let you know.

Same with:

Squeak VM version: 3.7-7 #1 Sat Mar 19 13:23:20 PST 2005 gcc 3.3
Built from: Squeak3.7 of '4 September 2004' [latest update: #5989]
Build host: Darwin emilia.local 7.8.0 Darwin Kernel Version 7.8.0: Wed Dec 22 14:26:17 PST 2004; root:xnu/xnu-517.11.1.obj~1/RELEASE_PPC  Power Macintosh powerpc
default plugin location: /usr/local/lib/squeak/3.7-7/*.so


-- 
Damien Cassou






Re: Very strange bug on Streams and probably compiler

Bert Freudenberg
In reply to this post by Damien Cassou-3

On Feb 14, 2007, at 16:18 , Damien Cassou wrote:

> Execute the following piece of code:
>
>
> "-------------------------------------------------------------------"
> |stream stream2|
> stream := WriteStream with: 'a test '.
> stream reset.
> stream nextPutAll: 'to test'.
> self assert: [stream contents = 'to test'].
>
> "On the following line, you can remove 'copy'"
> "the problem is present without. It's here to prevent"
> " the streams from using the same collection"
> "because the compiler tries to avoid creating 2 identical"
> "strings"
> stream2 := WriteStream with: 'a test ' copy.
> stream2 nextPutAll: 'to test'.
>
> "This assert passes but this is abnormal"
> self assert: [stream2 contents = 'to testto test'].
>
> "This assert pass and this is abnormal too"
> "because the strings MUST be equal !!"
> self assert: [stream2 contents ~= 'a test to test']
> "-------------------------------------------------------------------"
>
>
>
> On my image, all the 3 tests pass. This is completely abnormal in  
> my opinion.

It is normal.

You are modifying the 'a test ' literal into 'to test'. This modified  
string gets copied in the second test.

Lesson: never modify string literals.
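
A minimal sketch of the mechanism, assuming an image like the ones in this thread, where string literals are writable and nextPutAll: writes in place while the text still fits. The variable name "literal" is only for illustration:

```smalltalk
| literal stream |
literal := 'a test '.                 "the string object stored in the compiled method"
stream := WriteStream with: literal.  "the stream writes into that same object"
stream reset.                         "position goes back to the start"
stream nextPutAll: 'to test'.         "7 characters fit, so the literal is overwritten"
literal                               "expected to answer 'to test' now"
```

Any later use of an equal 'a test ' literal in the same method sees the modified object, which is why the copy in the second test starts out as 'to test'.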

- Bert -




RE: Very strange bug on Streams and probably compiler

Ron Teitelbaum
All,

This is a really good example of a string literal problem.  I'm a fanatic
about using copy after any hard coded string in code.  So much so that I've
had a number of developers make fun of my code because of it.  Whenever I'm
teaching someone Smalltalk I always include the "just use copy on all string
literals" suggestion.  But I normally show the problem with a character
replacement in a string instead.  This is a good example of how someone
might make a big mistake and then spend a lot of time trying to figure out
why everything is so messed up.
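
A minimal sketch of the character-replacement variant, assuming an image with writable literals. The blocks and variable names are hypothetical, and the two examples deliberately use different strings because equal literals in one method share a single object:

```smalltalk
| label first second safeLabel |
label := ['hello'].             "answers the same literal object on every call"
first := label value.
first at: 1 put: $j.            "modifies the literal itself"
second := label value.          "expected to answer 'jello', not 'hello'"

safeLabel := ['world' copy].    "answers a fresh copy on every call"
safeLabel value at: 1 put: $j.  "only the copy is modified"
safeLabel value                 "expected to answer 'world'"
```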

I was indoctrinated into the just-use-copy club by Versant.  If you had a
string literal in a method Versant stored your code in the DB.  Then if you
tried to change your code without being connected to the database everything
blew up!

So this can be fixed by:

|stream stream2|
        stream := WriteStream with: 'a test ' copy.
        stream reset.
        stream nextPutAll: 'to test' copy.
        self assert: [stream contents = 'to test' copy].

Ok the last copy is not really needed but "JUST USE COPY" works for me!

Ron Teitelbaum






Re: Very strange bug on Streams and probably compiler

stephane ducasse
In reply to this post by Bert Freudenberg
Bert

> It is normal.

No, it is not. You just get used to it and accept it.
>
> You are modifying the 'a test ' literal into 'to test'. This  
> modified string gets copied in the second test.
>
> Lesson: never modify string literals.

It shows that the compiler's optimization of sharing certain literals,
such as booleans and numbers, is fine for immutable objects but wrong
for mutable objects such as strings.

In the semantics of Smalltalk, nothing says that two strings with the
same representation in the same method point to the same object. I did
not check in which books, but the difference between strings and symbols
is really that two equal strings are two different objects, while equal
symbols refer to the same object (and are immutable).

Stef


Re: Very strange bug on Streams and probably compiler

Lukas Renggli
> > Lesson: never modify string literals.

Oh god, how much I wish we had immutability on a per-object basis.

Cheers,
Lukas

--
Lukas Renggli
http://www.lukas-renggli.ch


Re: Very strange bug on Streams and probably compiler

Bert Freudenberg
In reply to this post by stephane ducasse

On Feb 14, 2007, at 16:56 , stephane ducasse wrote:

> Bert
>
>> It is normal.
>
> No this is not. You get used to it and accept it.
>>
>> You are modifying the 'a test ' literal into 'to test'. This  
>> modified string gets copied in the second test.
>>
>> Lesson: never modify string literals.
>
> It shows that the fact that the compiler optimizes the use of  
> certain literals such as boolean and number is good
> for immutable objects but is wrong for mutable object such as strings.
>
> Iin the semantics of Smalltalk nothing says that two strings with  
> the same representation in the same
> methods are pointing to the same object. I did not check in which  
> books but the difference between strings and symbols
> is really that two strings are pointing to two different objects,  
> while symbols are referring to the same objects (and are immutable).

The sharing is not the primary problem, the mutability is.

- Bert -




Re: Very strange bug on Streams and probably compiler

Damien Cassou-3
In reply to this post by Bert Freudenberg

On Feb 14, 2007, at 4:36 PM, Bert Freudenberg wrote:


> It is normal.


This is normal for you because you know how the compiler works. But do you think the compiler works normally? Is it normal that a compiler treats two equal strings as identical? I would agree for symbols, because symbols are immutable. I think this is a first bug, a bug in the compiler.

In my opinion, there is another bug. When the collection of a stream becomes full, it is replaced by another, bigger collection. So the stream first uses the collection you passed to the constructor and then, at some point, replaces it with a new one. I don't think this is normal behavior. In my opinion, the collection must either always be the one you gave at the beginning or always be a copy. I prefer the second solution.
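
A minimal sketch of that second behavior, assuming a WriteStream that writes in place while the data fits and switches to a bigger collection once it does not. The name "buffer" and the sizes are chosen only for illustration:

```smalltalk
| buffer stream |
buffer := String new: 4.
stream := WriteStream on: buffer.
stream nextPutAll: 'abcd'.   "fits, so it is written into buffer itself"
buffer.                      "expected to answer 'abcd'"
stream nextPutAll: 'efgh'.   "does not fit, so the stream grows a new collection"
buffer.                      "expected to still answer 'abcd'"
stream contents              "expected to answer 'abcdefgh'"
```

After the second write, buffer and the stream no longer refer to the same data.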

So, what should be done? I can write tests for the compiler and tests for streams to show the behavior. These tests will fail because they expose a bug that has not been fixed.


> Lesson: never modify string literals.

Lesson: Use a correct compiler :-)

-- 
Damien Cassou






Re: Very strange bug on Streams and probably compiler

Klaus D. Witzel
In reply to this post by stephane ducasse
Stef,

why should strings be immutable, and why blame the compiler?

The same situation is, for example, with Associations. In order to prevent  
mutation, someone invented ReadOnlyVariableBinding.

Literals have nothing much to do with compiler optimization, see senders  
of #encodeLiteral:, just with determining the correct bytecode for pushing  
them onto the stack.

But of course the compiler could emit code for always copying string  
literals, if you can afford the performance penalty.

As Bert wrote: it's normal :)

[Okay, okay, other languages have immutable strings, but this is Smalltalk.]

/Klaus





Re: Very strange bug on Streams and probably compiler

Lukas Renggli
> As Bert wrote: it's normal :)

I agree, looks completely normal to me.

> [Okay okay other languages have immutable string, but this is Smalltalk.]

If we had an immutability bit that the compiler would set for objects
in the literal array (and with which we could do a lot of other cool
stuff), then people would not run into such problems.

Cheers,
Lukas

--
Lukas Renggli
http://www.lukas-renggli.ch


Re: Very strange bug on Streams and probably compiler

Klaus D. Witzel
In reply to this post by Damien Cassou-3
Hi Damien,

on Wed, 14 Feb 2007 17:19:13 +0100, you wrote:

> On Feb 14, 2007, at 4:36 PM, Bert Freudenberg wrote:
>
>
>> It is normal.
>
>
> This is normal for you because you know how the compiler works. But
> do you think the compiler works normally? Is it normal that a
> compiler considers two equal strings as identical? I would agree with
> symbols because symbols are immutable. I think this is a first bug, a
> bug in the compiler.
>
> In my opinion, there is another bug. When the collection of a stream
> becomes full, its is replaced by another bigger collection. So,
> first, the stream uses the collection you passed to the constructor,
> then, at a given time, this collection is replaced by a new one. I
> don't think it's a normal behavior.

Whatever the stream does with the collection, it is encapsulated. Imagine
the stream always used a highly optimized species for its internal job (or
a file on your hard disk!). No code of yours should depend on the internals
of (in this case) the stream.

I suggest you use (aStream contents asArray) and then #= for comparing  
aStream's contents to your expectations.

/Klaus

> In my opinion, the collection
> must always be the one you gave at the beginning OR it must always be
> a copy. I prefer the second solution.
>
> So, what should be done ? I can write tests for the compiler and
> tests for streams to show the behavior. This tests will fail because
> they show a non corrected bug.
>
>
>> Lesson: never modify string literals.
>
> Lesson: Use a correct compiler :-)
>




Re: Very strange bug on Streams and probably compiler

Klaus D. Witzel
In reply to this post by Lukas Renggli
Hi Lukas,

on Wed, 14 Feb 2007 17:48:40 +0100, you wrote:

>> As Bert wrote: it's normal :)
>
> I agree, looks completely normal to me.
>
>> [Okay okay other languages have immutable string, but this is  
>> Smalltalk.]
>
> If we had an immutability bit, that the compiler would set for objects
> in the literal array (and with what we could do a lot of other cool
> stuff), then people would not run into such problems.

Since I cannot (stupid me ;-) think of a counterexample which would break  
existing code, I second you (perhaps in the new compiler? without  
modifying the VM?).

/Klaus

> Cheers,
> Lukas
>




Re: Very strange bug on Streams and probably compiler

Damien Cassou-3
In reply to this post by Klaus D. Witzel

On Feb 14, 2007, at 5:48 PM, Klaus D. Witzel wrote:

> Hi Damien,
>
> on Wed, 14 Feb 2007 17:19:13 +0100, you wrote:
>> On Feb 14, 2007, at 4:36 PM, Bert Freudenberg wrote:
>>
>>> It is normal.
>>
>> This is normal for you because you know how the compiler works. But
>> do you think the compiler works normally? Is it normal that a
>> compiler considers two equal strings as identical? I would agree with
>> symbols because symbols are immutable. I think this is a first bug, a
>> bug in the compiler.
>>
>> In my opinion, there is another bug. When the collection of a stream
>> becomes full, its is replaced by another bigger collection. So,
>> first, the stream uses the collection you passed to the constructor,
>> then, at a given time, this collection is replaced by a new one. I
>> don't think it's a normal behavior.
>
> Whatever the stream does with the collection, it is encapsulated.
> Imagine the stream always uses a highly optimized species for its
> internal job (or a file on your harddisk!). You should not depend
> any code on the internals of (in this case) stream.


I really don't want to depend on the implementation. And in my opinion this is not encapsulated, because this is MY String, not a String created internally. What I see is that the String I give to the new Stream is modified. Then, at some point, the String no longer reflects the stream. This doesn't sound coherent to me. And if you all agree with the current behavior, then it should be documented: "Don't use the collection after having created a stream on it!"

-- 
Damien Cassou






Re: Very strange bug on Streams and probably compiler

Damien Cassou-3
In reply to this post by Lukas Renggli

On Feb 14, 2007, at 5:48 PM, Lukas Renggli wrote:

>> As Bert wrote: it's normal :)
>
> I agree, looks completely normal to me.


Do you agree that the compiler should merge two different strings into one single object?

Then, how do you explain that?

'test' == 'test'                        => true
'test' == #test asString       => false

"The compiler optimizes... same variable...", I know the explanation, but I don't think it should be accepted.

-- 
Damien Cassou






Re: Very strange bug on Streams and probably compiler

Klaus D. Witzel
In reply to this post by Damien Cassou-3
Hi Damien,

on Wed, 14 Feb 2007 17:58:02 +0100, you wrote:

> On Feb 14, 2007, at 5:48 PM, Klaus D. Witzel wrote:
>> Hi Damien,
>> on Wed, 14 Feb 2007 17:19:13 +0100, you wrote:
>>> On Feb 14, 2007, at 4:36 PM, Bert Freudenberg wrote:
>>>
>>>
>>>> It is normal.
>>>
>>>
>>> This is normal for you because you know how the compiler works. But
>>> do you think the compiler works normally? Is it normal that a
>>> compiler considers two equal strings as identical? I would agree with
>>> symbols because symbols are immutable. I think this is a first bug, a
>>> bug in the compiler.
>>>
>>> In my opinion, there is another bug. When the collection of a stream
>>> becomes full, its is replaced by another bigger collection. So,
>>> first, the stream uses the collection you passed to the constructor,
>>> then, at a given time, this collection is replaced by a new one. I
>>> don't think it's a normal behavior.
>>
>> Whatever the stream does with the collection, it is encapsulated.
>> Imagine the stream always uses a highly optimized species for its
>> internal job (or a file on your harddisk!). You should not depend
>> any code on the internals of (in this case) stream.
>
>
> I really don't want to depend on the implementation. And in my
> opinion, this is not encapsulated because this is MY String, not a
> String created internally.

Not really. If you pass a boxed object (other than a SmallInteger) the  
recipient can #become: it to anything he/she likes. This is reality.

> What I see is that the String I give to
> the new Stream is modified. Then at a moment, the String does not
> reflect the stream anymore. This doesn't sound coherent to me.

It is not coherent because you passed an explicitly written *constant*  
which, in other languages, is believed to be immutable.

> And if
> you all agree to the current behavior, then a documentation should be
> written: "Don't use the collection after having created a stream on
> it !"

Easier: don't pass constant collections to the streamers :)

Another example, not to blame on any streamer:
  | tmp |
  tmp := 'lowercase'.
  tmp translateToUppercase == tmp

/Klaus



Re: Very strange bug on Streams and probably compiler

Mathieu SUEN
In reply to this post by Klaus D. Witzel

On Feb 14, 2007, at 5:37 PM, Klaus D. Witzel wrote:

> Stef,
>
> why should strings be immutable, and why blame the compiler.

It's not about making strings immutable in general, but when you use one
as a literal it should be immutable.
This shows us that literals that are not immutable cause really nasty,
tedious side effects. (The same goes for #(a b c).)

        Math




Re: Very strange bug on Streams and probably compiler

stephane ducasse
In reply to this post by Bert Freudenberg
You can do an optimization (having only one string in the literal
frame) when it does not impact the semantics of the language.
If strings were immutable, there would be no problem with having
only one, because we could not see the difference.

>>> It is normal.
>>
>> No this is not. You get used to it and accept it.
>>>
>>> You are modifying the 'a test ' literal into 'to test'. This  
>>> modified string gets copied in the second test.
>>>
>>> Lesson: never modify string literals.
>>
>> It shows that the fact that the compiler optimizes the use of  
>> certain literals such as boolean and number is good
>> for immutable objects but is wrong for mutable object such as  
>> strings.
>>
>> Iin the semantics of Smalltalk nothing says that two strings with  
>> the same representation in the same
>> methods are pointing to the same object. I did not check in which  
>> books but the difference between strings and symbols
>> is really that two strings are pointing to two different objects,  
>> while symbols are referring to the same objects (and are immutable).
>
> The sharing is not the primary problem, the mutability is.
>
> - Bert -
>
>
>
>



Re: Very strange bug on Streams and probably compiler

stephane ducasse
In reply to this post by Lukas Renggli
Do you have an example?
Because VW introduced immutable objects, and I would like to educate
my taste on this topic.


On 14 Feb 07, at 17:03, Lukas Renggli wrote:

>> > Lesson: never modify string literals.
>
> Ohh god, how much I wish to have immutability on a per object bases.
>
> Cheers,
> Lukas
>
> --
> Lukas Renggli
> http://www.lukas-renggli.ch
>
>



Re: Very strange bug on Streams and probably compiler

stephane ducasse
In reply to this post by Klaus D. Witzel
> Stef,
>
> why should strings be immutable, and why blame the compiler.

I'm not saying that. I'm saying that since strings are not immutable  
then the compiler should not optimize the strings in a compiled method.

> The same situation is, for example, with Associations. In order to  
> prevent mutation, someone invented ReadOnlyVariableBinding.
>
> Literals have nothing much to do with compiler optimization, see  
> senders of #encodeLiteral:, just with determining the correct  
> bytecode for pushing them onto the stack.
>
> But of course the compiler could emit code for always copying  
> string literals, if you can afford the performance penalty.
>
> As Bert wrote: it's normal :)
>
> [Okay okay other languages have immutable string, but this is  
> Smalltalk.]

This is not my point.

Do you think that, from a language point of view, it is good to say:

OK, two strings are not identical if they are typed in different
methods, but if they are typed in the same method they are identical?

This is why some of my slides stopped working when I switched from
VisualWorks to Squeak. I do not have an old version of VisualWorks,
but it seems that they were consistent in the view of strings they
offered to programmers, and consistency is important.

Stef


Re: Very strange bug on Streams and probably compiler

stephane ducasse
In reply to this post by Lukas Renggli

On 14 Feb 07, at 17:48, Lukas Renggli wrote:

>> As Bert wrote: it's normal :)
>
> I agree, looks completely normal to me.

Is it normal for you that the same strings typed in two different  
methods are different but
if they are typed in the same method they are the same?
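
A minimal sketch of that inconsistency, assuming a Squeak image of this era. Compiler evaluate: is used here only as a convenient way to compile each string separately, standing in for "typed in two different methods"; the variable names are hypothetical:

```smalltalk
| a b |
a := Compiler evaluate: '''test'''.  "compiled on its own, with its own literal"
b := Compiler evaluate: '''test'''.  "a second, separate compilation"
a = b.             "true: equal contents"
a == b.            "expected false: two distinct string objects"
'test' == 'test'   "expected true here: both literals sit in the same compiled code"
```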

I never saw that in the books I read on Smalltalk, but they can all be
wrong and Squeak right.

Stef

>
>> [Okay okay other languages have immutable string, but this is  
>> Smalltalk.]
>
> If we had an immutability bit, that the compiler would set for objects
> in the literal array (and with what we could do a lot of other cool
> stuff), then people would not run into such problems.

Exactly.
Or the compiler could just optimize the immutable objects we have:
#symbol, booleans, integers...

Stef
>
> Cheers,
> Lukas
>
> --
> Lukas Renggli
> http://www.lukas-renggli.ch
>
>

