test crashing the cog vm

Re: test crashing the cog vm

Igor Stasenko
On 22 March 2011 14:02, Toon Verwaest <[hidden email]> wrote:

> The problem is exactly what you had at hand. Your bytecode WAS valid, but it
> was used in combination with an incompatible class layout. So validation
> here wouldn't solve anything. You always need to validate in a closed world
> to ensure you don't accidentally break everything. Whenever you change
> something, you need to ensure that you revalidate the relevant parts. In
> this case you could for example just have validated that your new class is a
> valid subclass of its superclass; which it was not.
>
> To make the system more secure obviously you would have to check those
> things, but if you add part of the API that circumvents these checks, like
> you were doing by calling "Class new" rather than using the ClassBuilder
> which does do the checks, everything breaks. The only authority that can
> actually ensure that you don't circumvent these checks is the VM.
>
> Obviously you can make sure already in your image that you have enough
> checks everywhere. You just don't have a crashproof mechanism that will
> chain your users down and keep them from shooting themselves in the foot with
> segfaults. If you put it inside of the VM you -can- provide such a
> mechanism, because you don't execute anything unless you know it's safe. And
> as I said, this piece of code could be a piece of prevalidated Smalltalk
> code that's immutable from the rest of the image.
>

Yep, and that is possible only after you introduce an irreversible
immutability mechanism into the VM,
one that marks object(s) as immutable for the rest of their existence.
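
A minimal sketch of what a one-way immutability flag might look like, written in Python rather than in VM code. The names `Freezable`, `freeze` and `FrozenError` are hypothetical and do not come from the thread or from any VM. Note that in Python the guard can still be bypassed with `object.__setattr__`, which is the same weakness the discussion attributes to purely image-level checks; a real mechanism would have to be enforced by the VM itself.

```python
class FrozenError(Exception):
    """Raised when code tries to mutate an object after it was frozen."""


class Freezable:
    """Hypothetical object with a one-way immutability flag."""

    def __init__(self, **fields):
        # Bypass our own __setattr__ while the flag itself is created.
        object.__setattr__(self, "_frozen", False)
        for name, value in fields.items():
            setattr(self, name, value)

    def freeze(self):
        # One-way switch: there is deliberately no unfreeze().
        object.__setattr__(self, "_frozen", True)

    def __setattr__(self, name, value):
        if self._frozen:
            raise FrozenError(f"cannot set {name!r} on a frozen object")
        object.__setattr__(self, name, value)


verifier = Freezable(max_stack=16)
verifier.max_stack = 32          # allowed: not frozen yet
verifier.freeze()
try:
    verifier.max_stack = 64      # rejected: frozen for the rest of its life
except FrozenError as error:
    print(error)
```

The point of omitting `unfreeze()` is that nothing running later, however privileged it believes itself to be, can turn the object back into a mutable one.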

> cheers,
> Toon
>
> On 03/22/2011 02:37 PM, Alexandre Bergel wrote:
>>>
>>> But why could we not have a bytecode validator at the image level that
>>> first makes sure that the bytecode is in sync with the format of the objects?
>>> Why does this have to be done in the VM?
>>
>> I agree with Stef. It is not obvious to me why it has to be done in the
>> VM.
>>
>> Alexandre
>
>
>



--
Best regards,
Igor Stasenko AKA sig.


Re: test crashing the cog vm

abergel
In reply to this post by Toon Verwaest-2
Hi Toon,

Thanks for having taken the time to reply.
I understand your email and Eliot's, but I am left unconvinced.

Alexandre




--
_,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
Alexandre Bergel  http://www.bergel.eu
^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.







Re: test crashing the cog vm

Toon Verwaest-2
In reply to this post by Igor Stasenko
It is much simpler than that! You just need a separate Smalltalk root
object to which only the VM has access. It has its own set of classes
which are unique pointers. This means that the user-level image never
has any pointer to any of the objects inside of this core image. This
means that it can never GET access to it either! So they can't tamper
with the code in the first place, since they can't see it.

The VM just passes code from the user-part of the memory to the
verification part of the memory, since the VM has access to both. As
simple as that.

cheers,
Toon




Re: test crashing the cog vm

Toon Verwaest-2
In reply to this post by abergel
Construct me any scenario just based on the image, and I can show you
how I can avoid it and segfault your system anyway.

Construct me any piece of code that can segfault my approach. I doubt
that you'll find any.

I can't really say more, since I don't know what you are unconvinced
about :)

cheers,
Toon




Re: test crashing the cog vm

abergel
I am not saying that this is feasible with the current VM. But I am doubtful that bytecode verification can only be done in the VM. If something goes wrong, why not raise a primitiveFailed, as 1/0 does?

I have the impression that you give a particular status to the bytecodes. I have the feeling that we should not. I can perform a division, and from time to time it goes wrong (/0). But the VM does not crash. So why should the VM crash if I evaluate the wrong bytecodes? You could have the necessary guards in the VM. But the full algorithm for verification could be in the VM. I am no expert on this and I may well be wrong.

Most virtual machine makers will say that a good debugger and profiler can only be done in the VM. Do you agree with this?

Cheers,
Alexandre



--
_,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
Alexandre Bergel  http://www.bergel.eu
^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.







Re: test crashing the cog vm

Marcus Denker-4
In reply to this post by Toon Verwaest-2

On Mar 22, 2011, at 2:35 PM, Alexandre Bergel wrote:

> I am not saying that this is feasible with the current VM. But I am doubtful that bytecode verification can only be done in the VM. If something goes wrong, why not raise a primitiveFailed, as 1/0 does?
>
Because checking all these things at runtime will make interpretation slow.

> I have the impression that you give a particular status to the bytecodes. I have the feeling that we should not.

Yes, to be interpretable without having to check at runtime.

> I can perform a division, and from time to time it goes wrong (/0). But the VM does not crash. So why should the VM crash if I evaluate the wrong bytecodes?

Because we don't want to check everything, else it is too slow.

> You could have the necessary guards in the VM.

Slow.


        Marcus

--
Marcus Denker  -- http://www.marcusdenker.de
INRIA Lille -- Nord Europe. Team RMoD.



Re: test crashing the cog vm

Toon Verwaest-2
In reply to this post by abergel
On 03/22/2011 03:35 PM, Alexandre Bergel wrote:
> I am not saying that this is feasible with the current VM. But I am doubtful that bytecode verification can only be done in the VM. If something goes wrong, why not raise a primitiveFailed, as 1/0 does?
>
> I have the impression that you give a particular status to the bytecodes. I have the feeling that we should not. I can perform a division, and from time to time it goes wrong (/0). But the VM does not crash. So why should the VM crash if I evaluate the wrong bytecodes? You could have the necessary guards in the VM. But the full algorithm for verification could be in the VM. I am no expert on this and I may well be wrong.
>
> Most virtual machine makers will say that a good debugger and profiler can only be done in the VM. Do you agree with this?
>
> Cheers,
> Alexandre
Well, there is 1 alternative. Check at runtime if what the bytecode
wants to do is safe. And that's what you basically suggest. This imposes
a high runtime overhead however and that by itself makes that an
unwanted approach.

The same is true for debuggers and profilers. Depending on what your
task at hand is, you don't want this to interfere with the speed of the
system. In other cases however you are just fine with knowing where you
spend most of your time based on your model. These are 2 different
things. And I do agree very strongly that it's most often more
interesting to have the high-level view.

If you want the bytecode to run safely but -also- as fast as possible, I
don't see a way around precomputing if it is valid. This is how VMs
optimize things all the time; that's exactly why register-based
bytecodes are faster than stack-based bytecodes. You avoid doing at
runtime what you know at compilation time. Obviously it gives the same
effect in both cases, but you avoid an overhead, at the price of an increased
risk of doing things wrongly.

However, your solution of having a VM that safely executes bytecodes
doesn't bring the verification to the image! It keeps it in the VM, just
in a different location. Now every execution of a bytecode needs to
perform the check, whereas with up-front verification you would only do it
once for the whole method. Even worse! My approach would allow you to write the
verification in Smalltalk code; it's just Smalltalk code running in a
privileged sector of the VM. Your code will be pure C code. And why
would you need Smalltalk-level verification at all if your bytecodes
already do the checks anyway? This seems duplicate work with no gain.
(it basically would just check that your compiler is surely not broken :))
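
The trade-off described above can be sketched with a toy stack machine. This is an illustration only, written in Python under simplifying assumptions: straight-line code without jumps, three made-up opcodes (`push_const`, `push_field`, `add`), and hypothetical function names `verify` and `run_unchecked`. It is not how Cog or any real Smalltalk VM is implemented. The last call also shows the situation that started the thread: bytecode that is valid for one class layout is rejected against an incompatible one.

```python
def verify(code, field_count):
    """Check a straight-line method once, before it is ever run.

    Rejects unknown opcodes, stack underflow, and field indexes that
    do not exist in a receiver with `field_count` fields.
    """
    depth = 0
    for op, arg in code:
        if op == "push_const":
            depth += 1
        elif op == "push_field":
            if not 0 <= arg < field_count:
                raise ValueError(f"field {arg} is outside the layout")
            depth += 1
        elif op == "add":
            if depth < 2:
                raise ValueError("stack underflow in add")
            depth -= 1
        else:
            raise ValueError(f"unknown opcode {op!r}")
    if depth != 1:
        raise ValueError("method must leave exactly one result")


def run_unchecked(code, fields):
    """Interpreter loop with no safety checks; only safe after verify()."""
    stack = []
    for op, arg in code:
        if op == "push_const":
            stack.append(arg)
        elif op == "push_field":
            stack.append(fields[arg])
        else:  # "add", guaranteed by verify()
            right = stack.pop()
            stack.append(stack.pop() + right)
    return stack.pop()


method = [("push_field", 0), ("push_field", 1), ("add", None)]
verify(method, field_count=2)          # paid once per method
print(run_unchecked(method, [3, 4]))   # 7

try:
    verify(method, field_count=1)      # same bytecode, incompatible layout
except ValueError as error:
    print(error)                       # field 1 is outside the layout
```

The alternative discussed in the thread would move the bounds and underflow tests into `run_unchecked`, so they would be paid on every executed instruction instead of once per method.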

cheers,
Toon




Re: test crashing the cog vm

Toon Verwaest-2
In reply to this post by Marcus Denker-4
Exactly! :)




Re: test crashing the cog vm

abergel
In reply to this post by Marcus Denker-4
>> I am not saying that this is feasible with the current VM. But I am doubtful that bytecode verification can only be done in the VM. If something goes wrong, why not raise a primitiveFailed, as 1/0 does?
>>
> Because checking all these things at runtime will make interpretation slow.

But could it not be done at compile time?

I am not an expert in VMs. But something that I have learnt over the years is that if we copy what other people do, then we will just have a pale copy. Innovation begins by doing what others think is impossible. Smalltalk has classes as objects, whereas Java and C++ took a different stance. Smalltalk has the debugger and the profiler in the image; again, Java people have a different opinion. Now, if someone says that bytecode verification can only be done in the VM, I am skeptical.

>> You could have the necessary guards in the VM.
>
> Slow.

I do not know. There are guards for arrays, message sends, primitive calls, ...
And it is not that slow.

Alexandre
--
_,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
Alexandre Bergel  http://www.bergel.eu
^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.







Re: test crashing the cog vm

Toon Verwaest-2
On 03/22/2011 04:52 PM, Alexandre Bergel wrote:
>>> I am not saying that this is feasible with the current VM. But I am doubtful that bytecode verification can only be done in the VM. If something goes wrong, why not raise a primitiveFailed, as 1/0 does?
>>>
>> Because checking all these things at runtime will make interpretation slow.
> But could it not be done at compile time?
>
> I am not an expert in VMs. But something that I have learnt over the years is that if we copy what other people do, then we will just have a pale copy. Innovation begins by doing what others think is impossible. Smalltalk has classes as objects, whereas Java and C++ took a different stance. Smalltalk has the debugger and the profiler in the image; again, Java people have a different opinion. Now, if someone says that bytecode verification can only be done in the VM, I am skeptical.
Ok, I very much understand this point of view. That's exactly why I
started Pinocchio.

I'm currently building my 4th iteration. I started building a Smalltalk
by having AST evaluation (actually I started out with self-evaluating
AST objects that got evaluated by sending messages... :D). Then moved on
to stack-based bytecodes. Then moved on to register-based bytecodes
(where I am now). All of my progress has been made by careful
scrutiny of the problems at hand. I did reevaluate and question all
steps; and I'm a bit sad that I ended up doing it this way... But then I
guess there is some inevitability in it.

For example the register-based bytecodes, I had no idea that what I was
doing were register-based bytecodes until I read the Lua VM description
and noticed that it was exactly the same.
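
To make the stack-based versus register-based distinction concrete, here is a small Python sketch that evaluates `a + b * c` on two toy machines. The instruction formats and the function names `run_stack` and `run_register` are invented for this illustration and are not taken from Pinocchio, Lua or any Smalltalk VM. The sketch only shows that the register form needs fewer dispatched instructions, because operand locations are decided at compile time; it is not a performance measurement.

```python
def run_stack(code, variables):
    """Stack machine: operands are pushed and popped one by one."""
    stack = []
    for op, arg in code:
        if op == "load":
            stack.append(variables[arg])
        elif op == "add":
            right = stack.pop()
            stack.append(stack.pop() + right)
        elif op == "mul":
            right = stack.pop()
            stack.append(stack.pop() * right)
    return stack.pop()


def run_register(code, registers, result):
    """Register machine: each instruction names its operands directly."""
    registers = list(registers)  # work on a copy
    for op, dst, left, right in code:
        if op == "add":
            registers[dst] = registers[left] + registers[right]
        elif op == "mul":
            registers[dst] = registers[left] * registers[right]
    return registers[result]


# a + b * c with a=2, b=3, c=4
stack_code = [("load", "a"), ("load", "b"), ("load", "c"),
              ("mul", None), ("add", None)]
register_code = [("mul", 3, 1, 2),   # r3 = r1 * r2
                 ("add", 3, 0, 3)]   # r3 = r0 + r3

print(run_stack(stack_code, {"a": 2, "b": 3, "c": 4}))   # 14
print(run_register(register_code, [2, 3, 4, 0], 3))      # 14
print(len(stack_code), len(register_code))               # 5 2
```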

However I do think that I learned a lot by doing it. I made the
trade-offs myself and understand them a lot better than I did before.
What I'm defending in this discussion isn't based on estimates but
experience from building it from the ground up.

And this is where I welcome everyone to do the same! It took me 2 years
to get to where I am now, but maybe it goes faster for you ;) It is
definitely a very worthwhile exercise. And I think because of it I'm
finally at the point where I can contribute something...

cheers,
Toon


Re: test crashing the cog vm

abergel
I see your point and I suspected you spoke according to your experience. I am just trying to make people innovate, in my own way.

Another thing. Keeping the VM small will definitely help when moving to a different execution platform. It would be amazing to have Pharo running on a JavaScript VM.

Alexandre



--
_,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
Alexandre Bergel  http://www.bergel.eu
^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.







Re: test crashing the cog vm

Marcus Denker-4
In reply to this post by Marcus Denker-4

On Mar 22, 2011, at 3:53 PM, Alexandre Bergel wrote:

>>> I am not saying that this is feasible with the current VM. But I am doubtful that bytecode verification can only be done in the VM. If something goes wrong, why not raise a primitiveFailed, as 1/0 does?
>>>
>> Because checking all these things at runtime will make interpretation slow.
>
> But could it not be done at compile time?
>
Yes. That is called a "bytecode verifier". Now if that were in the image, you would not gain anything, as what you verify is changeable
(and the verifier itself, too).

> I am not an expert in VMs. But something that I have learnt over the years is that if we copy what other people do, then we will just have a pale copy. Innovation begins by doing what others think is impossible. Smalltalk has classes as objects, whereas Java and C++ took a different stance. Smalltalk has the debugger and the profiler in the image; again, Java people have a different opinion. Now, if someone says that bytecode verification can only be done in the VM, I am skeptical.
>
It can only be done in the VM if what you have is a standard Smalltalk system.

        Marcus

--
Marcus Denker  -- http://www.marcusdenker.de
INRIA Lille -- Nord Europe. Team RMoD.



Re: test crashing the cog vm

Marcus Denker-4
In reply to this post by Toon Verwaest-2

On Mar 22, 2011, at 4:10 PM, Alexandre Bergel wrote:

> I see your point and I suspected you spoke according to your experience. I am just trying to make people innovate, in my own way.
>
But talking is easy. Doing is hard.

> Another thing. Keeping the VM small will definitely help when moving to a different execution platform. It would be amazing to have Pharo running on a JavaScript VM.
>
I don't know... Gilad sounds really depressed:

        http://gbracha.blogspot.com/2011/03/truthiness-is-out-there.html

I did not waste the last years fighting against doing research in Java only to do it now on top of some other crap.

And the things I want to do are, I think, not possible on top of JavaScript anyway.

        Marcus

--
Marcus Denker  -- http://www.marcusdenker.de
INRIA Lille -- Nord Europe. Team RMoD.



Re: test crashing the cog vm

Toon Verwaest-2
In reply to this post by abergel
On 03/22/2011 05:10 PM, Alexandre Bergel wrote:
> I see your point and I suspected you spoke according to your experience. I am just trying to make people innovate, in my own way.
>
> Another thing. Keeping the VM small will definitely help when moving to a different execution platform. It would be amazing to have Pharo running on a JavaScript VM.
That's not even that hard. I even had Pharo running on top of my own
SchemeTalk already, after 2 days of hacking. I only stopped where I
bumped into traits, since I didn't find any bootstrapping script for
traits.

I would basically compile all the Smalltalk code to equivalent
SchemeTalk code (very easy since the models are very similar), and
compile all the natives to function calls that perform what the native
should perform. These functions had names that wouldn't clash with the
Smalltalk names, so you couldn't refer to them directly.

Obviously in the end you would have all the Smalltalk code running with
SchemeTalk syntax, but all the code was virgin Pharo code. Very fun stuff :)
And if you do it this way, it doesn't really depend on your VM anyway,
since I was executing Smalltalk code as Scheme code, not as Smalltalk
bytecodes interpreted by Scheme code. That's a lot faster.
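
As a rough illustration of "translate to the host language instead of interpreting bytecodes", the following Python sketch turns a tiny expression tree into host-language source and compiles it once. Everything here is assumed for the example: the tuple-based tree format, the names `to_source` and `compile_expression`, and the restriction to three binary selectors. It says nothing about how SchemeTalk actually performed the translation, and because it uses `eval` it should only ever be fed trusted input.

```python
ALLOWED_SELECTORS = {"+", "-", "*"}


def to_source(node):
    """Translate a tiny expression tree into host-language source text."""
    kind = node[0]
    if kind == "const":
        return repr(node[1])
    if kind == "var":
        return node[1]
    if kind == "send":  # binary message send, e.g. x + y
        _, receiver, selector, argument = node
        if selector not in ALLOWED_SELECTORS:
            raise ValueError(f"unsupported selector {selector!r}")
        return f"({to_source(receiver)} {selector} {to_source(argument)})"
    raise ValueError(f"unknown node kind {kind!r}")


def compile_expression(node, parameters):
    """Compile the tree once into a host function; no interpreter loop."""
    source = f"lambda {', '.join(parameters)}: {to_source(node)}"
    return eval(source), source


# x + (2 * y)
tree = ("send", ("var", "x"), "+",
        ("send", ("const", 2), "*", ("var", "y")))
function, source = compile_expression(tree, ["x", "y"])
print(source)           # lambda x, y: (x + (2 * y))
print(function(1, 3))   # 7
```

After compilation the host runtime executes the expression directly, so the cost of walking the tree is paid once rather than on every evaluation.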

cheers,
Toon




Re: test crashing the cog vm

abergel
I tried to have a look at SchemeTalk. Apparently the website is not up to date. I reached http://www.iam.unibe.ch/~verwaest/blog/?p=6 from http://scg.unibe.ch/research/schemetalk

Alexandre



--
_,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
Alexandre Bergel  http://www.bergel.eu
^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.







Re: test crashing the cog vm

Toon Verwaest-2
Grmbl. Seems broken indeed. I reinstalled it. Still some work to be done
on coloring :)

If you are interested in the translation part from Smalltalk to
SchemeTalk, it can be found at:
https://www.iam.unibe.ch/scg/svn_repos/Sources/SchemeTalk/smalltalk

The parser is basically a PetitParser-like PEG parser written in
SchemeTalk. The translator compiles the Pharo filed-out sources to
SchemeTalk code and/or loads the generated code. Then there's some stuff
that avoids "bad" definitions that only serve to make Pharo run fast on
the Pharo VM, such as "fixtemps" and whileFalse:, whileTrue:, which don't
convey their proper semantics but rather just recursively end up in the
same method, hoping that the bytecode evaluation will do the proper thing.

Unfortunately I didn't write about the Squeak->SchemeTalk conversion,
since I just considered it a silly exercise ... I guess I could if
someone's interested.

cheers,
Toon




Re: test crashing the cog vm

abergel
Ok

Alexandre





Re: test crashing the cog vm

Stéphane Ducasse
In reply to this post by Toon Verwaest-2
Can you quickly explain register-based bytecode?

Stef

On Mar 22, 2011, at 4:02 PM, Toon Verwaest wrote:

> On 03/22/2011 04:52 PM, Alexandre Bergel wrote:
>>>> I am not saying that this is feasible with the current VM. But I am doubtful that bytecode verification can only be done in the VM. If something goes wrong, why not raise a primitiveFailed, as 1/0 will do?
>>>>
>>> Because checking all these things at runtime will make interpretation slow.
>> But couldn't it be done at compile time?
>>
>> I am not an expert in VMs. But something that I have learnt over all these years is that if we copy what other people do, then we will just have a pale copy. Innovation begins by doing what others think is impossible to do. Smalltalk has classes as objects, whereas Java and C++ took a different stance. Smalltalk has the debugger and the profiler in the image; again, Java people have a different opinion. Now, if someone says that bytecode verification can only be done in the VM, I am skeptical.
> Ok, I very much understand this point of view. That's exactly why I started Pinocchio.
>
> I'm currently building my 4th iteration. I started building a Smalltalk by having AST evaluation (actually I started out with self-evaluating AST objects that got evaluated by sending messages... :D). Then moved on to stack-based bytecodes. Then moved on to register-based bytecodes (where I am now). All of my progress has been made by careful scrutiny of the problems at hand. I did reevaluate and question all steps; and I'm a bit sad that I ended up doing it this way... But then I guess there is some inevitability in it.
>
> For example, with the register-based bytecodes, I had no idea that what I was building were register-based bytecodes until I read the Lua VM description and noticed that it was exactly the same.
>
> However I do think that I learned a lot by doing it. I made the trade-offs myself and understand them a lot better than I did before. What I'm defending in this discussion isn't based on estimates but experience from building it from the ground up.
>
> And this is where I welcome everyone to do the same! It took me 2 years to get to where I am now, but maybe it goes faster for you ;) It is definitely a very worthwhile exercise. And I think because of it I'm finally at the point where I can contribute something...
>
> cheers,
> Toon
>



Re: test crashing the cog vm

Stéphane Ducasse
It depends on what you want to get from a bytecode verifier. Maybe with sealing semantics at the package level (a CompiledMethod that can only be created once...) or something like that, you could get the code loaded, the package verified, and installed without having to make the VM more complex.

Stef

Re: test crashing the cog vm

Toon Verwaest-2
In reply to this post by Stéphane Ducasse
It's a very simple idea really.

Where your bytecode would previously look like

pushTemp: 1
pushTemp: 2
add

now it looks like:

add 1, 2, 3

where 3 is the start of your original stack, for example.

So rather than pushing and popping etc., you just abstractly interpret
the bytecodes and figure out the stack locations where the values would
be pushed. Then you use these locations as arguments to a copy bytecode,
rather than calling "push". This avoids having to maintain a stack when
you already know up front what the target index will be. You know this
up front since your code is static by itself.

The advantage is that you don't need to do stack tricks to keep data
around, since all the operations are simple things such as copy,
loadFromOuterScope, storeInOuterScope, loadFromField, storeInField, ...

So contrary to what you might have thought beforehand, you don't have
"real registers" around; you just consider a stack frame to be a set of
registers, and you operate on them with easy bytecodes. The advantage,
in addition to it being less work and thus slightly faster by itself, is
that it will later on also map better onto the real hardware. The
operations already look like what normal assembly operations would look
like.
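
To make the "stack frame as a set of registers" idea concrete, here is a minimal interpreter loop in Python. It is only a sketch under assumptions: the frame is a plain list, only copy and add are modelled, and the function name run is hypothetical rather than anything from Pinocchio or the Lua VM.

```python
def run(register_code, frame):
    """Execute register bytecodes in place on one frame (a list of slots)."""
    for op, *operands in register_code:
        if op == "copy":
            source, dest = operands
            frame[dest] = frame[source]
        elif op == "add":
            left, right, dest = operands
            # Operands are read straight from their slots; no push or pop.
            frame[dest] = frame[left] + frame[right]
        else:
            raise ValueError(f"unsupported bytecode: {op}")
    return frame


# Slot 0 is unused here, temps live in slots 1 and 2, slot 3 is scratch space.
frame = [None, 4, 5, None]
run([("add", 1, 2, 3)], frame)
print(frame[3])
```

Each instruction names its source and destination slots directly, which is why it already resembles a three-address machine instruction.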

cheers,
Toon


