Do these guys have caps? Any difference between Cog and the interpreter VM?
-- Casey Ransberger
>>>>> "Casey" == Casey Ransberger <[hidden email]> writes:
Casey> Do these guys have caps? Any difference between Cog and the
Casey> interpreter VM?

I hope the 32K limit for method source is still in place. It's sad to think anyone would ever want to go beyond that.

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<[hidden email]> <URL:http://www.stonehenge.com/merlyn/>
Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
See http://methodsandmessages.posterous.com/ for Smalltalk discussion
In reply to this post by Casey Ransberger-2
On Tue, May 17, 2011 at 05:15:03PM -0700, Casey Ransberger wrote:
> Do these guys have caps?

Not that I am aware of.

> Any difference between Cog and the interpreter VM?

No.

Dave
In reply to this post by Randal L. Schwartz
On 2011-05-17 10:04 PM, Randal L. Schwartz wrote:
> I hope the 32K limit for method source is still in place. It's sad to
> think anyone would ever want to go beyond that.

Humans might balk at writing such long methods, but compilers frequently find themselves in a position where writing very long methods can make a lot of sense.

Tony
On 18 May 2011 04:42, Tony Garnock-Jones <[hidden email]> wrote:
> On 2011-05-17 10:04 PM, Randal L. Schwartz wrote:
>> I hope the 32K limit for method source is still in place. It's sad to
>> think anyone would ever want to go beyond that.
>
> Humans might balk at writing such long methods, but compilers frequently
> find themselves in a position where writing very long methods can make a lot
> of sense.

Hmm, can you provide an example?

I can understand why a compiler would generate big code, for example by unrolling loops to gain performance. But if we are talking about the source code of a single method (keeping in mind that this is Smalltalk), then I really cannot think of a situation where one method with a big source is preferable to a number of short methods.

That is, if we leave aside obvious abuses, such as encoding binary data in large literal arrays, which turns a method into dumb data storage, and really consider complex algorithms that for some reason are better put into a single method body. So I'd like to hear what those algorithms are, and what the gains are compared to implementing them as a number of shorter methods.

--
Best regards,
Igor Stasenko AKA sig.
On 2011-05-17 10:54 PM, Igor Stasenko wrote:
> hmm.. can you provide some example?

Module initialisation code is the case I've run into, where many "constant" objects (including symbols, functions, classes etc.) are computed, initialised and placed into a dictionary/namespace. A compiler from Scheme to .NET CLR I worked on a while ago frequently generated module initialisation methods with more than 64k of *bytecode*, which worked (a) slowly on one implementation, (b) reasonably on another and (c) not at all on a third, causing it to dump core.

(Another way of saying the same thing: The only three reasonable numbers are zero, one and infinity.)

Tony
In reply to this post by Tony Garnock-Jones-3
On Tue, May 17, 2011 at 10:42:14PM -0400, Tony Garnock-Jones wrote:
> On 2011-05-17 10:04 PM, Randal L. Schwartz wrote:
> >I hope the 32K limit for method source is still in place. It's sad to
> >think anyone would ever want to go beyond that.
>
> Humans might balk at writing such long methods, but compilers frequently
> find themselves in a position where writing very long methods can make a
> lot of sense.

Indeed, although 32K of generated Smalltalk source still sounds somewhat horrible under any circumstances.

But out of curiosity, is there actually a 32K limit on this? I know there is a limit on number of method temps, but I cannot think why there would be a limit on the length of the source string. There once was a 32M limit on sources and changes files, but that was eliminated long ago (see ExpandedSourceFileArrayTest class comment).

Dave
In reply to this post by Tony Garnock-Jones-3
On 18 May 2011 04:58, Tony Garnock-Jones <[hidden email]> wrote:
> On 2011-05-17 10:54 PM, Igor Stasenko wrote:
>> hmm.. can you provide some example?
>
> Module initialisation code is the case I've run into, where many "constant"
> objects (including symbols, functions, classes etc.) are computed,
> initialised and placed into a dictionary/namespace. A compiler from Scheme
> to .NET CLR I worked on a while ago frequently generated module
> initialisation methods with more than 64k of *bytecode*, which worked (a)
> slowly on one implementation (b) reasonably on another and (c) not at all on
> a third, causing it to dump core.

Aha, so you're talking not about code directly authored by humans but about indirectly/automatically generated code, which, as I said, is a form of abuse because it turns source code into data storage (and of course sometimes it is hard to invent something better, but that doesn't make it any less of an abuse ;).

I would not worry about the limits, because it is not a big deal: in your framework you can always detect when a method's size exceeds a certain reasonable limit, split it into a number of smaller methods, and then generate a root method that invokes them one by one, so that things are initialized in a specific order.

> (Another way of saying the same thing: The only three reasonable numbers are
> zero, one and infinity.)

--
Best regards,
Igor Stasenko AKA sig.
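The splitting Igor describes could look roughly like the sketch below. It is written in Python as a stand-in for whatever language the code generator uses, and it only emits Smalltalk source text. The names `split_initializer`, `initPart` and `Registry`, and the 60-character budget, are made up for illustration; a real generator would pick its own limit and would also have to deal with values shared between parts.

```python
def split_initializer(statements, max_chars, prefix="initPart"):
    """Group generated statements into part methods of at most max_chars each.

    Returns (root_source, part_sources). A single statement longer than
    max_chars still gets a part of its own, since it cannot be split here.
    """
    chunks, current, size = [], [], 0
    for stmt in statements:
        cost = len(stmt) + 3  # tab, trailing period and newline
        if current and size + cost > max_chars:
            chunks.append(current)
            current, size = [], 0
        current.append(stmt)
        size += cost
    if current:
        chunks.append(current)

    parts = []
    for i, chunk in enumerate(chunks, start=1):
        body = "\n".join(f"\t{s}." for s in chunk)
        parts.append(f"{prefix}{i}\n{body}")

    # The root method calls the parts in order, preserving initialization order.
    root_body = "\n".join(f"\tself {prefix}{i}." for i in range(1, len(chunks) + 1))
    root = f"initialize\n{root_body}"
    return root, parts


# Hypothetical generated statements, 22 characters each.
statements = [
    "Registry at: #a put: 1",
    "Registry at: #b put: 2",
    "Registry at: #c put: 3",
]
root, parts = split_initializer(statements, max_chars=60)
print(root)
for part in parts:
    print(part)
```

With the tiny budget used here the first two statements land in `initPart1` and the third in `initPart2`. This only works this simply when the statements are independent of each other's temporaries.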
In reply to this post by Randal L. Schwartz
Busted! Working my way around a bootstrapping problem with an abuse of the compiler:) and really just trying to do the quick hack that works. 32K will be more than enough for my purpose. 32K should be enough for anybody, right? :P I was mostly looking for a bound that wasn't completely arbitrary for shuffling some data that I have out of one image and into another.
Thanks to everyone who replied!
-- Casey Ransberger
In reply to this post by Igor Stasenko
On 18.05.2011, at 05:28, Igor Stasenko wrote:
> Aha, so you're talking not about code directly authored by humans but
> rather indirectly/automatically generated code,
> which like i said is a form of abuse because it actually turns a
> source code into data storage (and of course sometimes
> it is hard to invent something better, but it doesn't makes it less abuse ;).
>
> I would not bother about limits, because it is not a big deal, in your
> framework you could always detect if method's size surpasses certain
> reasonable limit, then you can simply split it onto number of smaller
> methods and then generate a root method to invoke them one by one in
> order initialize things in specific order.

Of course you can do that, it's just more work.

I ran into the limits of the max jump distance being 1024 bytecodes, and the number of temps being limited to 64. That meant I had to split into different methods. That meant I had to figure out how to pass the intermediates into the next method. Also, both cases of a conditional must be in a single method, so each branch needs to spill over separately. Etc.

If I had created the code generator from scratch I would have designed it to not run into the limits in the first place. But instead I modified a code generator that used to output C. And a C function has virtually no limits, at least compared to a Squeak method.

I wonder if compiler magic could remove the limits without having to change the VM. Like, if there are more than 63 temps, make the 64th temp be an array to hold the spillover temps. Same for literals, method arguments, block arguments, inst vars. For large jumps, do a series of unconditional 1024 byte jumps. The most severe limit might be stack depth, it might have to reify the stack into an array. The hard limit on CompiledMethod size would be 1 GB, because the PC is a positive SmallInt?

The benefit would be that for reasonably sized methods there would be no penalty at all, but there would not be artificially low limits either when you happen to do something unreasonable :)

I can see how this would work for the interpreter. And since bytecode semantics would be unchanged, it should even work for Cog? Of course, it rarely translates large methods anyway.

- Bert -
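The temp-spilling idea could be sketched as a slot allocator like the one below. This is Python modelling what a compiler pass might decide, not actual Squeak Compiler code. The limit of 64 is the figure quoted in this thread; `allocate_temps`, `read_expr` and the temp name `spill` are hypothetical. The sketch spills only once the count exceeds the limit, and then reserves the last real slot for the spill array.

```python
def allocate_temps(names, max_temps=64):
    """Map each temp name to ("direct", slot) or ("spill", index).

    Returns (plan, spill_size). Slots are 0-based; spill indices are
    1-based, as Smalltalk Array indices are.
    """
    if len(names) <= max_temps:
        return {n: ("direct", i) for i, n in enumerate(names)}, 0
    direct = max_temps - 1  # the last real slot holds the spill array
    plan = {n: ("direct", i) for i, n in enumerate(names[:direct])}
    for j, n in enumerate(names[direct:], start=1):
        plan[n] = ("spill", j)
    return plan, len(names) - direct


def read_expr(name, plan, spill_name="spill"):
    """Smalltalk source text that reads the given temp under the plan."""
    kind, where = plan[name]
    return name if kind == "direct" else f"({spill_name} at: {where})"


names = [f"t{i}" for i in range(70)]
plan, spill_size = allocate_temps(names)
print(spill_size)
print(read_expr("t5", plan))
print(read_expr("t69", plan))
```

Methods within the limit come out unchanged, which matches the point that reasonably sized methods should pay no penalty. Writes to spilled temps would need the matching `at:put:` form, which is omitted here.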
In reply to this post by David T. Lewis
Uh-Oh. =D
I just opened one of the more complex designs I created with the Morphic Designer and it had about 25K characters. Hmm.... I could split up the code generation but, besides the 32K limit, I see no other reason to do so.

Well, it is just generated code. No one will read it. Actually, there is no intention to ever do so. At least in that case. ;)

Marcel
On 18.05.2011, at 11:49, Marcel Taeumel wrote:

> Uh-Oh. =D
>
> I just opened one of the more complex designs I created with the Morphic
> Designer and it had about 25K characters. Hmm.... I could split up the code
> generation but, besides the 32K limit, I see no other reason why to do so.

The code has no jumps, so that is no problem. But you might exceed the number of literals. How many literals does your big method have? And just out of curiosity, how many byte codes?

- Bert -
In reply to this post by Randal L. Schwartz
On 18.05.2011, at 04:04, Randal L. Schwartz wrote:

>>>>>> "Casey" == Casey Ransberger <[hidden email]> writes:
>
> Casey> Do these guys have caps? Any difference between Cog and the
> Casey> interpreter VM?
>
> I hope the 32K limit for method source is still in place.

I don't think there ever was a limit on the method source length in Squeak. I just made a method > 100K.

At least up to the max length of a string. Which is I think 1 GB (SmallInteger maxVal, 2^30 - 1).

- Bert -
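The arithmetic behind the 2^30 - 1 figure can be checked with a few lines of Python. The sketch assumes a 32-bit object word with one tag bit, leaving a 31-bit signed value; that layout is an assumption made to reproduce the number Bert quotes, and `small_integer_max` is a made-up helper name.

```python
def small_integer_max(word_bits=32, tag_bits=1):
    """Largest value of a tagged, signed small integer (assumed layout)."""
    value_bits = word_bits - tag_bits  # 31 bits, one of which is the sign
    return 2 ** (value_bits - 1) - 1


max_val = small_integer_max()
print(max_val)            # 1073741823
print(max_val == 2**30 - 1)
```

So a string indexed by such an integer tops out one byte short of 1 GiB.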
On 18 May 2011 12:15, Bert Freudenberg <[hidden email]> wrote:
> On 18.05.2011, at 04:04, Randal L. Schwartz wrote:
>
>>>>>>> "Casey" == Casey Ransberger <[hidden email]> writes:
>>
>> Casey> Do these guys have caps? Any difference between Cog and the
>> Casey> interpreter VM?
>>
>> I hope the 32K limit for method source is still in place.
>
> I don't think there ever was a limit on the method source length in Squeak. I just made a method > 100K.
>
> At least up to the max length of a string. Which is I think 1 GB (SmallInteger maxVal, 2^30 - 1).

Yes. I had the same expectation. There are no hardcoded limits on source code, at least I haven't seen any.

But since source code length ~~ bytecode size ~~ number of literals ~~ number of temps, I wonder whether the original question that started this topic was asked correctly (that is, whether it was really about source code limits, and not about the limits imposed by the method and context formats).

--
Best regards,
Igor Stasenko AKA sig.
In reply to this post by Bert Freudenberg
I implemented some of these changes in VW 20 years ago, because I was generating code from symbolic expressions (computer algebra system). But it was easier because a BlockClosure is a literal in VW, so you just have to turn optimisation of long blocks off.

For literals and temps, I created Arrays of literals and temps as Bert suggested. That means that some message sends were replaced with perform: operations.

But there is more: even the integer index used to access the literal/temp Array can be a literal itself (the limit depends on the byte code set). One trick is to generate Arrays of Arrays of Arrays ... all of a size accessible with a literal-free integer (BEWARE, at the expense of stack depth when it comes to evaluating). I chose another way: generate an expression computing literal indices from smaller, byte-code-encoded integers. Funny.

For the number of arguments I also passed an array of arguments (a generalisation of the temps trick). I have a patch pending in Mantis which can be applied to Squeak.

For stack depth, I don't remember if I ever hit the limit, nor what this limit is in VW. That's IMO a big problem in current Squeak.

All in all, hacking the Compiler is doable. But the bad news is that you will have to hack the Decompiler and the Debugger... Much harder in the current Squeak architecture (maybe worth a full rewrite in this case).

Nicolas
2011/5/18 Nicolas Cellier <[hidden email]>:
> I've implemented some of these changes in VW 20 years ago, because I
> was generating code from symbolic expression (computer algebra
> system).
> But it was easier because BlockClosure are a literal in VW, so you
> just have to turn optimisation of long blocks off.
> For literals, and temps, I created Arrays of literals and temps as
> Bert suggested. That means that some message sends were replace with
> perform: operations.
> But there is more : even the integer index used to access the
> literal/temp Array can be a literal by itself (the limit depends on
> the byte code set).
> One trick is to generate Arrays of Arrays of Arrays ... all of size
> accessible with a literal free integer (BEWARE, at expense of stack
> depth when it comes to evaluating).
> I chose another way: generate an expression computing literal indices
> from byte-code encoded smaller integers. Funny.

Oops, posted too fast... it was the other way round: my solution was the one impacting stack depth. Array of Array of ... should not.

Nicolas
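The "Arrays of Arrays" trick amounts to writing a large index as a sequence of small ones. A minimal Python sketch of that decomposition follows. The fan-out of 32 and the names `index_path`, `access_expr` and `lits` are purely illustrative; which integers can actually be pushed without a literal depends on the byte code set, as noted above.

```python
def index_path(index, fanout, depth):
    """Split a 0-based flat index into 1-based per-level indices, outermost first."""
    path = []
    for _ in range(depth):
        index, digit = divmod(index, fanout)
        path.append(digit + 1)  # Smalltalk Arrays are 1-based
    if index:
        raise ValueError("index does not fit in the given fanout and depth")
    return path[::-1]


def access_expr(root, path):
    """Smalltalk source text for the nested at: sends along the path."""
    expr = root
    for step in path:
        expr = f"({expr} at: {step})"
    return expr


path = index_path(1000, fanout=32, depth=2)
print(path)
print(access_expr("lits", path))
```

Each nested `at:` replaces its receiver on the stack with its result, so the chain keeps only a couple of values on the stack at a time, however deep the nesting goes.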
In reply to this post by Igor Stasenko
You know why I like you, Igor? You just gave me a segue:) Comment inline below...
On May 17, 2011, at 8:28 PM, Igor Stasenko <[hidden email]> wrote:

> Aha, so you're talking not about code directly authored by humans but
> rather indirectly/automatically generated code,
> which like i said is a form of abuse because it actually turns a
> source code into data storage (and of course sometimes
> it is hard to invent something better, but it doesn't makes it less abuse ;)

Squeak already does this with window icons, etc. It's not new. As a matter of fact, the whole reason I did this was to get those very same awful blobs the hell out of my image without sacrificing their utility. It's a total strawman, but it works. That's all I'll say for now, but there will be a tad more news about it soon, so stay tuned;)

Thanks again for all of your replies everyone! The chronological integrity of my world is unbroken now, even if I've done something rather rude and hackish to my object system:)
In reply to this post by Bert Freudenberg
Hi.
166 literals
4286 code bytes (2971 instructions)

Marcel
On 18.05.2011, at 19:53, Marcel Taeumel wrote:

> Hi.
>
> 166 literals
> 4286 code bytes (2971 instructions)
>
> Marcel

Ah, so you're not that far off from the max number of literals. I guess that's why other UI frameworks use literal arrays instead of separate messages.

Also, you could not wrap the method into a "true ifTrue: []" because it's too large.

- Bert -
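The remark about "true ifTrue: []" follows from the jump distance mentioned earlier in the thread: the conditional has to jump over the whole body. A small Python check using Marcel's figure and the 1024-byte distance quoted above (the exact encodable maximum may differ by a byte or so; `fits_in_one_jump` and `min_hops` are made-up names):

```python
import math


def fits_in_one_jump(body_bytes, max_jump=1024):
    """True if a single jump of at most max_jump bytes can skip the body."""
    return body_bytes <= max_jump


def min_hops(body_bytes, max_jump=1024):
    """Rough lower bound on chained jumps needed to cover body_bytes.

    Ignores the bytes the intermediate jump instructions themselves occupy.
    """
    return math.ceil(body_bytes / max_jump)


body = 4286  # code bytes reported for the generated method
print(fits_in_one_jump(body))
print(min_hops(body))
```

At 4286 bytes the body is about four times too long for one jump, so it would take at least five chained hops under the scheme of repeated unconditional jumps suggested earlier.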