('a' == 'a') == true ?

37 messages

Johan Brichau-2

('a' == 'a') == true ?

I convinced the teacher who will be taking over my Smalltalk courses at UCLouvain (starting this week) to use Pharo :-)
One of the introductory exercises in these courses shows the difference between '==' and '='. However, in Pharo (&Squeak) the following goes wrong imho:

'a' == 'a' -> true
$a asString == $a asString -> false

It seems that when you evaluate the expression, the (semantically identical) strings are represented as the same literal in the compiled block.
For example, try to evaluate the following code by evaluating each statement in a separate doit. Then do it again as a single block...

|a b|
a := 'a'.
b := 'a'.
(a == b) inspect

Do I make it an issue? Is there already an issue? (did not find one)
Am I wrong?

Johan
_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project

Lukas Renggli

Re: ('a' == 'a') == true ?

> Do I make it an issue?

Yes.

> Is there already an issue? (did not find one)

Yes.

> Am I wrong?

Yes, almost always one should probably use #= instead of #==.

Lukas

--
Lukas Renggli
www.lukas-renggli.ch


Tobias Pape

Re: ('a' == 'a') == true ?

In reply to this post by Johan Brichau-2

On 2010-09-27 at 10:32, Johan Brichau wrote:

> I convinced the teacher who will be taking over my Smalltalk courses at UCLouvain (starting this week) to use Pharo :-)
> One of the introductory exercises in these courses shows the difference between '==' and '='. However, in Pharo (&Squeak) the following goes wrong imho:
>
> 'a' == 'a' -> true
> $a asString == $a asString -> false
>
> It seems that when you evaluate the expression, the (semantically identical) strings are represented as the same literal in the compiled block.
> For example, try to evaluate the following code by evaluating each statement in a separate doit. Then do it again as a single block...
>
> |a b|
> a := 'a'.
> b := 'a'.
> (a == b) inspect
>
>
> Do I make it an issue? Is there already an issue? (did not find one)
> Am I wrong?

I think this is a compiler optimization, as strings created via
literal notation are in general immutable (afaik).
Remember that most operations on strings create copies of the
original string.
It gets even more interesting:

Look at doit1.png.
I've referenced 'abs' four times in the doit; nevertheless, only one
literal is stored. When I execute the second line, I operate
with #at:put: directly on that literal (doit2.png). Hence, the
method does not return 'abs' but 'ads' (as you can see from the 'stack top'
in doit3.png).
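
A minimal sketch of the effect described above, for readers without the screenshots. It assumes all lines are compiled together as a single doit, in an image where equal string literals are shared and literals are writable, as in the Pharo/Squeak versions discussed in this thread. The variable names are made up for illustration. In an image that protects literals against modification, the #at:put: is expected to signal an error instead.

```smalltalk
"Evaluate all lines together as ONE doit (assumption: equal literals are shared)."
| first second |
first := 'abs'.
second := 'abs'.        "may be the very same object as first"
first at: 2 put: $d.    "mutates the literal in place"
second                  "in the images discussed here this reportedly shows 'ads'"
```

Sending #copy to the literal before mutating it avoids the surprise, because the change then applies to a fresh object rather than to the shared literal.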

HTH,
So Long,
-Tobias


Attachments: Doit1.png (54K), Doit2.png (56K), DOit3.png (19K)

Johan Brichau-2

Re: ('a' == 'a') == true ?

In reply to this post by Lukas Renggli

On 27 Sep 2010, at 10:38, Lukas Renggli wrote:

>> Am I wrong?
>
> Yes, almost always one should probably use #= instead of #==.

I will add that to the exercise :-)
The exercise actually makes students aware of the difference between strings and symbols (which should be pointer-equal)

Johan

Igor Stasenko

Re: ('a' == 'a') == true ?

On 27 September 2010 11:54, Johan Brichau <[hidden email]> wrote:

>
> On 27 Sep 2010, at 10:38, Lukas Renggli wrote:
>
>>> Am I wrong?
>>
>> Yes, almost always one should probably use #= instead of #==.
>
> I will add that to the exercise :-)
> The exercise actually makes students aware of the difference between strings and symbols (which should be pointer-equal)
>

I think you can avoid using the word 'equal' when describing a #== comparison.
It can be explained as 'test whether the comparands are the same object or not',
while #= tests whether two objects are equal or not.

Also keep in mind that #== is optimized by the compiler. So, even if you
override it in some class,
it won't behave differently.

And #=, of course, can be implemented in any way you like. One
interesting example:

= anObject
    ^ false

saying 'I am not equal to anything' :)
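
To make the distinction concrete, here is a small sketch using Point. Point is chosen only because `3 @ 4` is a message send that builds a fresh object each time, so literal sharing does not interfere. The variable names are illustrative.

```smalltalk
| p q |
p := 3 @ 4.
q := 3 @ 4.
p = q.     "true: Point>>= compares the coordinates"
p == q.    "false: two separate objects"
p == p     "true: an object is always identical to itself"
```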


--
Best regards,
Igor Stasenko AKA sig.


Johan Brichau-2

Re: ('a' == 'a') == true ?

On 27 Sep 2010, at 11:28, Igor Stasenko wrote:

> I think you can avoid using 'equal' word when describing a #== comparison.
> It can be explained as 'test whether comparands are same object or not'
> while #= is test whether two objects equal or not.

Yes, this is exactly what the exercise is doing.
I want them to be aware that equal _symbols_ are the same objects, but that equal _strings_ are not, which is why I let them evaluate:

a := #foobar.
b := #foobar.
a == b.

a := 'foobar'.
b := 'foobar'.
a == b

The problem is that evaluating the second snippet also yields true in Pharo/Squeak, so I cannot illustrate it using these snippets (which work fine in VisualWorks, btw).

Yes, this is a compiler optimization and, yes, people should use #= instead of #== normally. But imho the optimization yields a wrong semantics, which is why I posted the email.
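
For the exercise itself, one way to sidestep the literal sharing is to avoid comparing two literals directly. The sketch below uses #copy to obtain distinct String objects. It is one possible workaround rather than something proposed in this thread, and it should give the same answers whether evaluated line by line or all at once.

```smalltalk
| a b |
a := #foobar.
b := #foobar.
a == b.                 "true: equal symbols are the same object"

a := 'foobar' copy.
b := 'foobar' copy.
a = b.                  "true: same characters"
a == b                  "false: #copy answers a new String each time"
```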

I have absolutely no clue if it can be changed (I am not familiar with the compiler implementation *at all*), but I would be happy to look over the shoulder of an experienced compiler hacker during the next sprint to learn ;-)

cheers
Johan

Igor Stasenko

Re: ('a' == 'a') == true ?

On 27 September 2010 12:51, Johan Brichau <[hidden email]> wrote:

> I have absolutely no clue if it can be changed (I am not familiar with the compiler implementation *at all*), but I would be happy to look over the shoulder of an experienced compiler hacker during the next sprint to learn ;-)
>

Why waiting for sprint?

Implement

String>>literalEqueal: anObject

^ self == anObject

and then you have
'aaa' == 'aaa' -> false

:)


--
Best regards,
Igor Stasenko AKA sig.


Igor Stasenko

Re: ('a' == 'a') == true ?

On 27 September 2010 13:34, Igor Stasenko <[hidden email]> wrote:

> Why waiting for sprint?
>
> Implement
>
> String>>literalEqueal: anObject
>

Oops, sorry for the typo; the right selector is #literalEqual:

> ^ self == anObject
>
> and then you have
> 'aaa' == 'aaa' -> false
>
> :)
>

--
Best regards,
Igor Stasenko AKA sig.


Göran Krampe

Re: ('a' == 'a') == true ?

Hi all!

This issue is listed as a "newbie trap" on this page, search for "newbie
trap":

http://wiki.squeak.org/squeak/3644

...and there is another cool one just above, which goes like this:

"This one actually is borrowed from Leandro Caniglia. He
challenged me yesterday to figure out why the method SmallInteger>>gcd:
actually works. The challenge can also be expressed like this:

Evaluate this code below and figure out why n=5.

One could ask oneself if the assignment (m := n) is performed after or
before the evaluation of the expression?

If it is done before - then n should become 4 right? (2 + 2)
And if it is done after then it should become 6... (3 + 3)

Gurus: don't spoil it for the newbies ok? Let them chew on it for a
while... :-)"

| m n |
n := 2.
m := 3.
n := m + (m := n)


Johan Brichau-2

Re: ('a' == 'a') == true ?

In reply to this post by Igor Stasenko

On 27 Sep 2010, at 12:34, Igor Stasenko wrote:

> Why waiting for sprint?

I guess I thought the solution was more intricate than that :-)

> Implement
>
> String>>literalEqueal: anObject
>
> ^ self == anObject
>
> and then you have
> 'aaa' == 'aaa' -> false

Thanks for pointing that out to me, Igor!
I created an issue for it (3006) and will see later this week if it is a good idea to change the implementation of #literalEqual: to be less liberal.

Johan

Schwab,Wilhelm K

Re: ('a' == 'a') == true ?

In reply to this post by Igor Stasenko

#== (test for identity/same object) is perhaps most useful when verifying that objects are copying themselves as intended, or that code has or has not made a copy of an object - I lost a few hours this weekend over a glorified version of just that question :( Found it :)

Literals can sometimes be confusing, being identical (courtesy of the compiler) when one might otherwise think they should be distinct but equal objects. As for whether a string and a symbol should have the same address, I am not convinced they would, but a good exercise for your students would be to make a C function that prints on stdout and/or returns the address of something passed to it, and then call it with both 'hello' and 'hello' asSymbol, #hello, etc.

Bill


Levente Uzonyi-2

Re: ('a' == 'a') == true ?

In reply to this post by Johan Brichau-2

On Mon, 27 Sep 2010, Johan Brichau wrote:

If you evaluate it line by line, it will work as you expected. So you can
even show how the compiler optimization works by evaluating it both ways.

Levente


Schwab,Wilhelm K

Re: ('a' == 'a') == true ?

"Works fine" is a value judgment that can cut both ways: one could also argue that VW lacks a useful optimization of literals. Non-literal strings should do what you want.

Bill


Stéphane Ducasse

Re: ('a' == 'a') == true ?

In reply to this post by Johan Brichau-2

On Sep 27, 2010, at 10:32 AM, Johan Brichau wrote:

> I convinced the teacher who will be taking over my Smalltalk courses at UCLouvain (starting this week) to use Pharo :-)

gorgeous!


Sean P. DeNigris

Re: ('a' == 'a') == true ?


In reply to this post by Johan Brichau-2

From Pharo By Example:
Symbols are like Strings, in that they contain a sequence of characters. However, unlike a string, a literal symbol is guaranteed to be globally unique. There is only one Symbol object #Hello but there *may be* multiple String objects with the value 'Hello'.

According to the above:
Strings -> 1+ per value
Symbols -> always one per value

Where did you get that Strings are guaranteed not to be 1 per value?

Cheers,
Sean

Johan Brichau-2

Re: ('a' == 'a') == true ?

On 27 Sep 2010, at 20:01, Sean P. DeNigris wrote:

> Where did you get that Strings are guaranteed not to be 1 per value?

where did I say that?

Johan


Johan Brichau-2

Re: ('a' == 'a') == true ?

In reply to this post by Schwab,Wilhelm K

On 27 Sep 2010, at 14:37, Schwab,Wilhelm K wrote:

> "Works fine" is a value judgment that can cut both ways: one could also argue that VW lacks a useful optimization of literals. Non-literal strings should do what you want.

I wonder how useful the optimization is, actually.
Probably not many (if any) methods will use the same literal multiple times and count on the compiler to optimize them into the same literal.

Inversely, I have never run into that issue myself either, until I was just testing those little snippets for the students. Or... at least, I think I never have :-)

Johan

Levente Uzonyi-2

Re: ('a' == 'a') == true ?

On Mon, 27 Sep 2010, Johan Brichau wrote:

>
> On 27 Sep 2010, at 14:37, Schwab,Wilhelm K wrote:
>
>> "Works fine" is a value judgment that can cut both ways: one could also argue that VW lacks a useful optimization of literals. Non-literal strings should do what you want.
>
> I wonder how useful the optimization is, actually.
> Probably not many (if any) methods will use the same literal multiple times and count on the compiler to optimize them into the same literal.

I think this is from the 80's or 90's where this could save some memory.
According to my calculations in Squeak 4.2 with some extra packages loaded
it saves at least 36591 bytes (object size + 1 slot in the literal array),
which is only 0.79 bytes / method.
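
As a rough cross-check, the two figures above imply the number of methods in that image. This is back-of-the-envelope arithmetic on rounded numbers, so the result is only approximate.

```smalltalk
"36591 bytes saved in total, at about 0.79 bytes per method"
(36591 / 0.79) rounded    "about 46318 methods, give or take the rounding of 0.79"
```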

Levente


Alexandre Bergel

Re: ('a' == 'a') == true ?

> I think this is from the 80's or 90's where this could save some memory. According to my calculations in Squeak 4.2 with some extra packages loaded it saves at least 36591 bytes (object size + 1 slot in the literal array), which is only 0.79 bytes / method.

Wow! Thanks Levente for sharing this with us.

Alexandre

--
_,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
Alexandre Bergel http://www.bergel.eu
^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.


Sean P. DeNigris

Re: ('a' == 'a') == true ?


In reply to this post by Johan Brichau-2

Johan Brichau-2 wrote:

> where did I say that?

I thought that's what you meant by "imho the optimization yields a wrong semantics." If there is no guarantee on whether or not two strings with the same value are the same object, what is wrong with the result you observed?

Cheers,
Sean