Visualworks - Unique strings object?


Visualworks - Unique strings object?

Michel Duf
Hi,

I have some memory issues with a VW 5i image.

For instance, I found that instances of the class ByteString use 12 MB
of memory, and of those 12 MB only 6 MB are unique strings.

Is there a way (as in Java) to tell Smalltalk to "re-use" string objects?
I do not think so, but I would like to be sure.

Is there something else I should do, or is this more likely an
internal issue with the application code?

Regards,
Michel


Re: Visualworks - Unique strings object?

Runar Jordahl
You can use Symbol. Upon creation, a symbol checks whether "the same"
name already exists, and if it does, the existing instance is used.

To create a symbol use the literal #. For example:
#'My String'

You can debug  Symbol intern: 'My String'  to see how a Symbol is
created. Note how class method #findInterned:  searches a table for
existing instances.

Also try the following two statements. Note that in Smalltalk, method
#== tests for same identity.
'My String' == 'My String'   gives false
#'My String' == #'My String'   gives true

Now, a symbol is not a string (or…well, it inherits from String, so in
OO-terms it is a String, but at least the class is different), so you
might need to convert it back to String:
#'My String' asString
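
A minimal workspace sketch of the round trip in the other direction, assuming only the standard selectors #asSymbol, #copy and #printString (the variable names are made up for illustration). Two distinct Strings with equal contents map to one and the same Symbol:

```smalltalk
| s1 s2 |
s1 := 'My ' , 'String'.      "built at run time, a fresh String"
s2 := 'My String' copy.      "another distinct String with equal contents"

Transcript
    show: (s1 = s2) printString; cr;                      "equal contents"
    show: (s1 == s2) printString; cr;                     "two different objects"
    show: (s1 asSymbol == s2 asSymbol) printString; cr.   "one shared Symbol"
```

Symbols are meant to be immutable, so this suits fixed names and keys rather than text that gets modified later.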

Runar


Re: Visualworks - Unique strings object?

Ladislav Lenart
In reply to this post by Michel Duf
As far as I know there is no "compiler switch" for this.

But if those six megs of equal strings are yours (i.e. generated
by your application code), maybe you would be better off using
Symbols instead of Strings.

Ladislav Lenart


Re: Visualworks - Unique strings object?

Reinout Heeck-2
In reply to this post by Michel Duf
Michel,

If you really need all those strings you can use the Flyweight pattern, or
use Symbols instead of Strings.
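
The Flyweight idea can be sketched with a plain Dictionary used as a canonicalization table. This is only an illustration under assumptions: the names pool and canonical are hypothetical, and the shared Strings must be treated as read-only because every holder sees the same object.

```smalltalk
| pool canonical a b |
pool := Dictionary new.    "maps string contents to the one shared instance"

"Answer the shared instance equal to aString, registering aString
 itself the first time its contents are seen."
canonical := [:aString |
    pool at: aString ifAbsent: [pool at: aString put: aString]].

a := canonical value: 'te' , 'st'.
b := canonical value: 'test' copy.

Transcript
    show: (a = b) printString; cr;     "equal contents"
    show: (a == b) printString; cr.    "same object, the copy was not kept"
```

Unlike Symbols, such a pool keeps every registered String alive until the pool itself is released, so it pays off mainly for strings loaded in bulk, for example from a table.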

HTH,
Adriaan.



--
http://vdg38bis.xs4all.nl


Re: Visualworks - Unique strings object?

Ricardo Birmann
In reply to this post by Michel Duf
Michel,

In Smalltalk Symbols are only stored once.

'hey' = 'hey'  --> answers true
'hey' == 'hey'  --> answers false

#hey = #hey --> answers true
#hey == #hey --> answers true

Hope it helps,

Ricardo





Re: Visualworks - Unique strings object?

Malte Zacharias
In reply to this post by Michel Duf

Hi,

Sorry for breaking the thread view of your mail clients with this mail,
but unfortunately I had already deleted the ongoing thread when this came to mind. (And sorry for clogging up your inbox in case you weren't interested ;-) )

If I recall correctly, the question was whether this can be achieved in Smalltalk:

| a b |
a := 'String'.
b := 'String'.
(a = b) = (a == b). "This should answer true but doesn't (just print it)"

Another question came to my mind: would this behaviour be sensible? In my opinion there are two reasons why we wouldn't want Smalltalk to do this.

The first is that there is already a suitable class for texts which exist only once: as you all pointed out, Symbol is readily available. The second reason, consistency, is even more important in my opinion.

Why, you may ask? Let me illustrate. I wrote all the code in Smalltalk syntax, as I am not sure how many of you know Java, and know it well enough. For the Java literate, = should be the same as String's equals() instance method.

I wrote a small Java app, and this is the comparison against Smalltalk:

(Sorry for the ugly code..)

| c1 c2 c all |
c1 := 'te'. c2 := 'st'.
c := c1 , c2.
all := Array
    with: 'test'            "String literal"
    with: 'test'            "Same String literal, different object in ST"
    with: 'te' , 'st'       "Java: ""te"" + ""st"", just added two literals"
    with: c                 "Create two Strings by literals and add them like done before this"
    with: (String fromString: 'test').    "Java: new String(""test"")"

"Print one row per element: the identity test against every element."
all do: [:elem1 |
    all do: [:elem2 |
        Transcript
            show: (elem1 == elem2) printString;    "printString so that show: always receives a String"
            space].
    Transcript cr]

Give it a thought: what would you expect this to yield? Hint: the output is formatted like a table which compares each element against every other (actually only half of these comparisons are needed, as #== is symmetric, but what the heck). All of us Smalltalkers would expect this:

true false false false false
false true false false false
false false true false false
false false false true false
false false false false true

Which is exactly the output generated by Smalltalk, very lovely world :-)

If you want, the code can be copied and run directly as typed above.

However, if you type exactly (well, not exactly) this code in Java you get an interesting result:

true true true false false
true true true false false
true true true false false
false false false true false
false false false false true

Now apparently in Java, with a = "te"; b = "st"; the result of a + b is not the same object as "te" + "st", which is very strange in terms of consistency. Strange as well is that "test" saved in a variable twice is the same object, while a new String created from the String "test" is not. I would call that very weird...

What I had to learn in both languages is that comparing with the == operator is bad (most of the time), but in Java I can actually do that in many cases without getting hurt. I like the ST solution better, which is using Symbols for unique Strings. The Java optimizations may save space and/or time, but from the point of view of OOP I find them very strange and questionable.

I thank everyone for following me so far,

and btw I would be interested to know what leads to your multiple String objects for the same String, if you can say something about the details, maybe we can tell you better whether it's feasible to use Symbols instead?

Greetings,

Malte Zacharias

On Wednesday 23 August 2006 20:36, you wrote:
> You welcome and thanks to let me know if you found something else.
>
> I know JAVA a little and I am always interested to learn it since
> unfortunately I never had the chance working with it.
>
> But anyway I prefer Smalltalk :)
>
> Michel
>
> On 8/23/06, Malte Zacharias <[hidden email]> wrote:
> > I just conducted a test, and it seems you're right, just FYI..
> > I didn't know that yet, thanks for telling :-)
> >
> > > I am not a JAVA expert but by default :
> > > A:= 'hi'
> > > B:= 'hi' means A and B hold the same object.
> > >
> > > It is when you explicitly use the class String that the string object
> > > are not unique.
> > >
> > > Michel
> > >
> > > On 8/23/06, Malte Zacharias <[hidden email]> wrote:
> > > > > Hi,
> > > > > I have some memory issues with a VW 5i image.
> > > > >
> > > > > By instance, I found out that for the class ByteString I have 12
> > > > > megs of memory used and for that 12 megs only 6 megs are unique
> > > > > strings. Is there a way (as JAVA) to tell Smalltalk to "re-use"
> > > > > string objects? I do not think so but I would like to be sure.
> > > >
> > > > How would one tell Java that? Just curious, you know.
> > > >
> > > > Greetings,
> > > > Malte Zacharias



Re: Visualworks - Unique strings object?

Michel Duf
> and btw I would be interested to know what leads to your multiple String
> objects for the same String, if you can say something about the details,
> maybe we can tell you better whether it's feasible to use Symbols instead?
>

So far I do not know. I started working on this Smalltalk code just a
few weeks ago; I think it was mostly written by COBOL programmers.

I found some application strings (maybe loaded from a table) present
not 2 or 3 times but more than 100 times, and sometimes more than
2,000 times!

Strangely, I found 3 instances of one string, and in the code I found 3
test methods using it. Since those methods should not have been
executed, I find it weird that Smalltalk holds 3 objects for it.
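
A rough workspace sketch of how such counts can be gathered, assuming #allInstances and Bag behave as in most Smalltalk images (the variable names and the threshold of 100 are only illustrative). Note that #allInstances walks the whole object memory and can take a while on a large image. As for the three instances: a string literal is created when its method is compiled and stays in that compiled method whether or not the method ever runs, which would be consistent with three methods yielding three objects.

```smalltalk
| counts heavy |
counts := Bag new.

"Group all ByteString instances by contents (Bag compares with #=)."
ByteString allInstances do: [:each | counts add: each].

"Keep only the contents that occur more than 100 times."
heavy := counts asSet select: [:each | (counts occurrencesOf: each) > 100].

"Print one line per heavily duplicated string: count, tab, contents."
heavy do: [:each |
    Transcript
        show: (counts occurrencesOf: each) printString;
        tab;
        show: each printString;
        cr]
```

Inspecting who references a few of the worst offenders is usually the next step in finding the code that creates them.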

Michel




RE: Visualworks - Unique strings object?

Terry Raymond
In reply to this post by Malte Zacharias

Malte,

There are a few issues here.

1. #== is an identity test; you should only use #== if you want to
   know whether two variables refer to the identical object.

2. In VisualWorks you cannot assume that there is only one instance
   of a string literal in the image, so you should not use #== as a test.
   Additionally, you should not use #== to test numbers, even though
   it will work for small integers: (1 + 1) == 2 will be true.

3. If you have code that tests strings for identity, you probably
   should be using an enumerated data pattern. You could use
   Symbols, but you would be better off using your own classes,
   particularly if your code tests an object and then takes an action.
   This can be made more OO by just sending a polymorphic message
   to the object. Then it is up to the different classes to perform the
   specialized operation.

4. If you have several equal instances of a string literal and they
   are not working as enumerated data, then instead of placing the
   literal in several methods you could have one method supply the
   literal, and obtain it by sending a message.
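
Point 4 might look like the following sketch. The class Order, its instance variable status and all three selectors are hypothetical; each `Order>>` line is only a heading that names where the method belongs and is not part of the method source.

```smalltalk
Order>>pendingLabel
    "The only method that contains the literal, so it is compiled
     once instead of once per method that needs it."
    ^'PENDING'

Order>>isPending
    "Compare by value (#=), not by identity, as advised in points 1 and 2."
    ^status = self pendingLabel

Order>>markPending
    "All callers share the String answered by #pendingLabel,
     so none of them should modify it."
    status := self pendingLabel
```

If the value only ever serves as a tag, the same shape works with a Symbol in #pendingLabel, or with a dedicated class per state as suggested in point 3.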

Terry

===========================================================
Terry Raymond       Smalltalk Professional Debug Package
Crafted Smalltalk
80 Lazywood Ln.
Tiverton, RI  02878
(401) 624-4517      [hidden email]
<http://www.craftedsmalltalk.com>
===========================================================

