Max source method length? Max string length? Max change set size?

Casey Ransberger-2

Max source method length? Max string length? Max change set size?

Do these guys have caps? Any difference between Cog and the interpreter VM?

--
Casey Ransberger

Randal L. Schwartz

Re: Max source method length? Max string length? Max change set size?

>>>>> "Casey" == Casey Ransberger <[hidden email]> writes:

Casey> Do these guys have caps? Any difference between Cog and the
Casey> interpreter VM?

I hope the 32K limit for method source is still in place. It's sad to
think anyone would ever want to go beyond that.

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<[hidden email]> <URL:http://www.stonehenge.com/merlyn/>
Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
See http://methodsandmessages.posterous.com/ for Smalltalk discussion

David T. Lewis

Re: Max source method length? Max string length? Max change set size?

In reply to this post by Casey Ransberger-2

On Tue, May 17, 2011 at 05:15:03PM -0700, Casey Ransberger wrote:
>
> Do these guys have caps?

Not that I am aware of.

> Any difference between Cog and the interpreter VM?

No.

Dave

Tony Garnock-Jones-3

Re: Max source method length? Max string length? Max change set size?

In reply to this post by Randal L. Schwartz

On 2011-05-17 10:04 PM, Randal L. Schwartz wrote:
> I hope the 32K limit for method source is still in place. It's sad to
> think anyone would ever want to go beyond that.

Humans might balk at writing such long methods, but compilers frequently
find themselves in a position where writing very long methods can make a
lot of sense.

Tony

Igor Stasenko

Re: Max source method length? Max string length? Max change set size?

On 18 May 2011 04:42, Tony Garnock-Jones <[hidden email]> wrote:
> On 2011-05-17 10:04 PM, Randal L. Schwartz wrote:
>>
>> I hope the 32K limit for method source is still in place. It's sad to
>> think anyone would ever want to go beyond that.
>
> Humans might balk at writing such long methods, but compilers frequently
> find themselves in a position where writing very long methods can make a lot
> of sense.
>

hmm.. can you provide some example?

I could understand why compilers would generate big code, like
unrolling loops to gain better performance.
But if we are talking about the source code of a single method (and keeping
in mind that we're talking about Smalltalk),
then I really cannot find any good situation where a method with big
source code is preferable to a number of short methods.
That is, if we don't count obvious abuses, like encoding binary data into
large literal arrays, which turns a method into dumb data storage,
and really consider that there could be some complex algorithms
which for some reason are better put into a single method body.

So I'd like to hear what those algorithms are and what the gains are
compared to implementing them using a number of shorter methods.

> Tony

--
Best regards,
Igor Stasenko AKA sig.

Tony Garnock-Jones-3

Re: Max source method length? Max string length? Max change set size?

On 2011-05-17 10:54 PM, Igor Stasenko wrote:
> hmm.. can you provide some example?

Module initialisation code is the case I've run into, where many
"constant" objects (including symbols, functions, classes etc.) are
computed, initialised and placed into a dictionary/namespace. A compiler
from Scheme to .NET CLR I worked on a while ago frequently generated
module initialisation methods with more than 64k of *bytecode*, which
worked (a) slowly on one implementation (b) reasonably on another and
(c) not at all on a third, causing it to dump core.

(Another way of saying the same thing: The only three reasonable numbers
are zero, one and infinity.)

Tony

David T. Lewis

Re: Max source method length? Max string length? Max change set size?

In reply to this post by Tony Garnock-Jones-3

On Tue, May 17, 2011 at 10:42:14PM -0400, Tony Garnock-Jones wrote:
> On 2011-05-17 10:04 PM, Randal L. Schwartz wrote:
> >I hope the 32K limit for method source is still in place. It's sad to
> >think anyone would ever want to go beyond that.
>
> Humans might balk at writing such long methods, but compilers frequently
> find themselves in a position where writing very long methods can make a
> lot of sense.

Indeed, although 32K of generated Smalltalk source still sounds somewhat
horrible under any circumstances. But out of curiosity, is there actually
a 32K limit on this? I know there is a limit on number of method temps,
but I cannot think why there would be a limit on the length of the source
string.

There once was a 32M limit on sources and changes files, but that was
eliminated long ago (see ExpandedSourceFileArrayTest class comment).

Dave

Igor Stasenko

Re: Max source method length? Max string length? Max change set size?

In reply to this post by Tony Garnock-Jones-3

On 18 May 2011 04:58, Tony Garnock-Jones <[hidden email]> wrote:

> On 2011-05-17 10:54 PM, Igor Stasenko wrote:
>>
>> hmm.. can you provide some example?
>
> Module initialisation code is the case I've run into, where many "constant"
> objects (including symbols, functions, classes etc.) are computed,
> initialised and placed into a dictionary/namespace. A compiler from Scheme
> to .NET CLR I worked on a while ago frequently generated module
> initialisation methods with more than 64k of *bytecode*, which worked (a)
> slowly on one implementation (b) reasonably on another and (c) not at all on
> a third, causing it to dump core.
>

Aha, so you're talking not about code directly authored by humans but
rather indirectly/automatically generated code,
which, like I said, is a form of abuse because it actually turns
source code into data storage (and of course sometimes
it is hard to invent something better, but that doesn't make it less of an abuse ;).

I would not worry about limits, because it is not a big deal: in your
framework you could always detect whether a method's size surpasses a certain
reasonable limit, then simply split it into a number of smaller
methods and generate a root method to invoke them one by one, in
order to initialize things in a specific order.
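
The splitting idea above can be sketched roughly as follows. This is only an illustration in Python, not code from any Squeak code generator: the function name `split_into_methods`, the `initPart` selector prefix, and the character budget are all hypothetical, and the sketch assumes the generated statements share no temporaries, so they can be moved between methods freely.

```python
def split_into_methods(statements, max_chars, prefix="initPart"):
    """Group generated statements into chunk methods plus a root method.

    Returns a dict mapping a (hypothetical) selector to its source text.
    The budget counts statement characters only, so it is approximate.
    """
    chunks, current, size = [], [], 0
    for stmt in statements:
        # Start a new chunk when adding this statement would exceed the budget.
        if current and size + len(stmt) > max_chars:
            chunks.append(current)
            current, size = [], 0
        current.append(stmt)
        size += len(stmt)
    if current:
        chunks.append(current)

    methods = {}
    for i, chunk in enumerate(chunks, start=1):
        methods[f"{prefix}{i}"] = "\n".join(chunk)
    # The root method invokes each chunk in order, preserving initialization order.
    methods["initialize"] = "\n".join(
        f"self {prefix}{i}." for i in range(1, len(chunks) + 1)
    )
    return methods


methods = split_into_methods(["a := 1.", "b := 2.", "c := 3."], max_chars=14)
print(methods["initialize"])
```

A statement longer than the budget simply ends up in a chunk of its own; the sketch does not try to split inside a statement.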

> (Another way of saying the same thing: The only three reasonable numbers are
> zero, one and infinity.)
>
> Tony
>

--
Best regards,
Igor Stasenko AKA sig.

Casey Ransberger-2

Re: Max source method length? Max string length? Max change set size?

In reply to this post by Randal L. Schwartz

Busted! Working my way around a bootstrapping problem with an abuse of the compiler:) and really just trying to do the quick hack that works. 32K will be more than enough for my purpose. 32K should be enough for anybody, right? :P I was mostly looking for a bound that wasn't completely arbitrary for shuffling some data that I have out of one image and into another.

Thanks to everyone who replied!

--
Casey Ransberger

Bert Freudenberg

Re: Max source method length? Max string length? Max change set size?

In reply to this post by Igor Stasenko

On 18.05.2011, at 05:28, Igor Stasenko wrote:

> Aha, so you're talking not about code directly authored by humans but
> rather indirectly/automatically generated code,
> which, like I said, is a form of abuse because it actually turns
> source code into data storage (and of course sometimes
> it is hard to invent something better, but that doesn't make it less of an abuse ;).
>
> I would not worry about limits, because it is not a big deal: in your
> framework you could always detect whether a method's size surpasses a certain
> reasonable limit, then simply split it into a number of smaller
> methods and generate a root method to invoke them one by one, in
> order to initialize things in a specific order.

Of course you can do that, it's just more work.

I ran into the limits of the max jump distance being 1024 bytecodes, and the number of temps being limited to 64. That meant I had to split into different methods. That meant I had to figure out how to pass the intermediates into the next method. Also, both cases of a conditional must be in a single method, so each branch needs to spill over separately. Etc.

If I had created the code generator from scratch I would have designed it to not run into the limits in the first place. But instead I modified a code generator that used to output C. And a C function has virtually no limits, at least compared to a Squeak method.

I wonder if compiler magic could remove the limits without having to change the VM. Like, if there are more than 63 temps, make the 64th temp be an array to hold the spillover temps. Same for literals, method arguments, block arguments, inst vars. For large jumps, do a series of unconditional 1024 byte jumps. The most severe limit might be stack depth, it might have to reify the stack into an array. The hard limit on CompiledMethod size would be 1 GB, because the PC is a positive SmallInt?
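
The spillover idea for temps could look something like the sketch below. It is written in Python purely for illustration and is not how the Squeak compiler works; `allocate_temps`, `spill_size`, and the constant `MAX_DIRECT` are hypothetical names, with 63 taken from the "more than 63 temps" figure in the paragraph above.

```python
MAX_DIRECT = 63  # assumed: temps that keep an ordinary slot; the next slot holds the spill array


def allocate_temps(names, max_direct=MAX_DIRECT):
    """Map each temp name to ('direct', slot) or ('spill', index in the spill array)."""
    layout = {}
    for i, name in enumerate(names):
        if i < max_direct:
            layout[name] = ("direct", i)
        else:
            # Everything past the direct slots lives in one array held by the last temp.
            layout[name] = ("spill", i - max_direct)
    return layout


def spill_size(names, max_direct=MAX_DIRECT):
    """Size of the spill array; zero means the method pays no penalty at all."""
    return max(0, len(names) - max_direct)


names = [f"t{i}" for i in range(70)]
layout = allocate_temps(names)
print(layout["t62"], layout["t63"], spill_size(names))
```

With 63 temps or fewer the spill array is never created, which matches the goal of no penalty for reasonably sized methods.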

The benefit would be that for reasonably sized methods there would be no penalty at all, but there would not be artificially low limits either when you happen to do something unreasonable :)

I can see how this would work for the interpreter. And since bytecode semantics would be unchanged, it should even work for Cog? Of course, it rarely translates large methods anyway.

- Bert -

marcel.taeumel (old)

Re: Max source method length? Max string length? Max change set size?

In reply to this post by David T. Lewis

Uh-Oh. =D

I just opened one of the more complex designs I created with the Morphic Designer and it had about 25K characters. Hmm.... I could split up the code generation but, besides the 32K limit, I see no other reason to do so. Well, it is just generated code. No one will read it. Actually, there is no intention to ever do so. At least in that case. ;)

Marcel

Bert Freudenberg

Re: Max source method length? Max string length? Max change set size?

On 18.05.2011, at 11:49, Marcel Taeumel wrote:

> Uh-Oh. =D
>
> I just opened one of the more complex designs I created with the Morphic
> Designer and it had about 25K characters. Hmm.... I could split up the code
> generation but, besides the 32K limit, I see no other reason to do so.

The code has no jumps, so that is no problem. But you might exceed the number of literals. How many literals does your big method have? And just out of curiosity, how many byte codes?

- Bert -

Bert Freudenberg

Re: Max source method length? Max string length? Max change set size?

In reply to this post by Randal L. Schwartz

On 18.05.2011, at 04:04, Randal L. Schwartz wrote:

>>>>>> "Casey" == Casey Ransberger <[hidden email]> writes:
>
> Casey> Do these guys have caps? Any difference between Cog and the
> Casey> interpreter VM?
>
> I hope the 32K limit for method source is still in place.

I don't think there ever was a limit on the method source length in Squeak. I just made a method > 100K.

At least up to the max length of a string. Which is I think 1 GB (SmallInteger maxVal, 2^30 - 1).
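
As a quick check of that figure, here is the arithmetic in Python. It assumes a 32-bit image in which a SmallInteger is a 31-bit signed value (one bit of the word used as a tag); that layout is an assumption stated here, chosen because it agrees with the 2^30 - 1 quoted above.

```python
TAGGED_BITS = 31  # assumed: 32-bit word minus one tag bit

# Largest positive value of a signed integer with TAGGED_BITS bits.
max_val = 2 ** (TAGGED_BITS - 1) - 1
print(max_val)  # 1073741823, i.e. one less than 2^30 (just under 1 GiB as a byte count)
```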

- Bert -

Igor Stasenko

Re: Max source method length? Max string length? Max change set size?

On 18 May 2011 12:15, Bert Freudenberg <[hidden email]> wrote:

>
> On 18.05.2011, at 04:04, Randal L. Schwartz wrote:
>
>>>>>>> "Casey" == Casey Ransberger <[hidden email]> writes:
>>
>> Casey> Do these guys have caps? Any difference between Cog and the
>> Casey> interpreter VM?
>>
>> I hope the 32K limit for method source is still in place.
>
> I don't think there ever was a limit on the method source length in Squeak. I just made a method > 100K.
>
> At least up to the max length of a string. Which is I think 1 GB (SmallInteger maxVal, 2^30 - 1).
>

Yes. I had the same expectation. There are no hardcoded limits for
source code, at least I haven't seen any.
But since source code length ~~ bytecode size ~~ number of literals ~~
number of temps,
I wonder whether the original question that started this topic was
asking the right thing (it was about source code limits, not the limits
imposed by the method and context formats).

> - Bert -
>

--
Best regards,
Igor Stasenko AKA sig.

Nicolas Cellier

Re: Max source method length? Max string length? Max change set size?

In reply to this post by Bert Freudenberg

I've implemented some of these changes in VW 20 years ago, because I
was generating code from symbolic expression (computer algebra
system).
But it was easier because BlockClosure are a literal in VW, so you
just have to turn optimisation of long blocks off.
For literals and temps, I created Arrays of literals and temps as
Bert suggested. That means that some message sends were replaced with
perform: operations.
But there is more: even the integer index used to access the
literal/temp Array can be a literal by itself (the limit depends on
the byte code set).
One trick is to generate Arrays of Arrays of Arrays ... all of size
accessible with a literal free integer (BEWARE, at expense of stack
depth when it comes to evaluating).
I chose another way: generate an expression computing literal indices
from byte-code encoded smaller integers. Funny.

For the number of arguments I also passed an array of arguments
(generalisation of temps trick). I have a patch pending in mantis
which can be applied to Squeak.

For stack depth, I don't remember if I ever hit the limit, nor what
this limit is in VW. That's IMO a big problem in current Squeak.

All in all, hacking the Compiler is do-able. But the bad news is that
you will have to hack the Decompiler, and the Debugger... Much harder
in current Squeak architecture (maybe worth a full rewrite in this
case).

Nicolas


Nicolas Cellier

Re: Max source method length? Max string length? Max change set size?

2011/5/18 Nicolas Cellier <[hidden email]>:

> I've implemented some of these changes in VW 20 years ago, because I
> was generating code from symbolic expression (computer algebra
> system).
> But it was easier because BlockClosure are a literal in VW, so you
> just have to turn optimisation of long blocks off.
> For literals and temps, I created Arrays of literals and temps as
> Bert suggested. That means that some message sends were replaced with
> perform: operations.
> But there is more: even the integer index used to access the
> literal/temp Array can be a literal by itself (the limit depends on
> the byte code set).
> One trick is to generate Arrays of Arrays of Arrays ... all of size
> accessible with a literal free integer (BEWARE, at expense of stack
> depth when it comes to evaluating).
> I chose another way: generate an expression computing literal indices
> from byte-code encoded smaller integers. Funny.

Oops, posted too fast...
it was the contrary, my solution was impacting stack depth.
Array of Array of ... should not.
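
A rough Python sketch of the Arrays-of-Arrays trick: every level is kept small enough that each index used to walk it stays below some fanout, standing in for "an integer that can be encoded without a literal". The names `build_nested` and `lookup` and the fanout value are hypothetical; the real bound depends on the bytecode set, as noted above.

```python
def build_nested(items, fanout):
    """Pack items into nested lists where no list is longer than fanout."""
    level = list(items)
    depth = 1
    while len(level) > fanout:
        # Group the current level into chunks of at most `fanout` entries.
        level = [level[i:i + fanout] for i in range(0, len(level), fanout)]
        depth += 1
    return level, depth


def lookup(nested, depth, index, fanout):
    """Fetch item `index` using only per-level indices smaller than fanout."""
    digits = []
    for _ in range(depth):
        digits.append(index % fanout)  # least significant digit first
        index //= fanout
    node = nested
    for d in reversed(digits):  # walk from the outermost list inward
        node = node[d]
    return node


nested, depth = build_nested(range(100), fanout=4)
print(depth, lookup(nested, depth, 57, fanout=4))
```

Each lookup is a fixed chain of small-index accesses, one per level, rather than an expression that computes a large index first.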

Nicolas


Casey Ransberger-2

Re: Max source method length? Max string length? Max change set size?

In reply to this post by Igor Stasenko

You know why I like you, Igor? You just gave me a segue:) Comment inline below...

On May 17, 2011, at 8:28 PM, Igor Stasenko <[hidden email]> wrote:

> Aha, so you're talking not about code directly authored by humans but
> rather indirectly/automatically generated code,
> which, like I said, is a form of abuse because it actually turns
> source code into data storage (and of course sometimes
> it is hard to invent something better, but that doesn't make it less of an abuse ;)

Squeak already does this with window icons, etc. It's not new. As a matter of fact, the whole reason I did this was to get those very same awful blobs the hell out of my image without sacrificing their utility.

It's a total strawman, but it works. That's all I'll say for now, but there will be a tad more news about it soon, so stay tuned;)

Thanks again for all of your replies everyone! The chronological integrity of my world is unbroken now, even if I've done something rather rude and hackish to my object system:)

marcel.taeumel (old)

Re: Max source method length? Max string length? Max change set size?

In reply to this post by Bert Freudenberg

Hi.

166 literals
4286 code bytes (2971 instructions)

Marcel

Bert Freudenberg

Re: Max source method length? Max string length? Max change set size?

On 18.05.2011, at 19:53, Marcel Taeumel wrote:

> Hi.
>
> 166 literals
> 4286 code bytes (2971 instructions)
>
> Marcel

Ah, so you're not that far off from the max number of literals. I guess that's why other UI frameworks use literal arrays instead of separate messages.

Also, you could not wrap the method into a "true ifTrue: []" because it's too large.

- Bert -