Performance, Quality and Process [was Array new: SmallInteger maxVal]


Performance, Quality and Process [was Array new: SmallInteger maxVal]

Eliot Miranda-2
 
Hi All,

    I'm not happy with this fix and I'm not happy with the lack of process behind it.  First there has been insufficient discussion of what the right behaviour is.  Second, the fix David has written does lots of computation (shifts) to check a valid size request that could be pushed earlier at initialization time, which would allow e.g. a vmParameterAt:put: to modify the max allocation request size.  Third, there is no review of fixes; we just put them out there.

I'm concerned about performance, code quality and a lack of process for agreeing fixes.  But at the same time I don't want to institute a bureaucracy or slow down the pace of development.  Do others share my concerns?  What suggestions have you?

One problem here is that Cog will introduce a huge raft of changes to the VM and to Slang, and so possibly the whole issue is moot.  We'll face the issues as we try and integrate my Cog VM into the squeakvm trunk.  But it might be worth thinking a little about the issues up front.

David, I know you're technical lead, and I'm not trying to depose or undermine you.  But I do think we can benefit from discussion and review of major changes.  Alas, my Cog work not being generally available yet is going to cause problems down the line.  I need to at least hurry up and get the StackInterpreter released.

On Tue, Oct 20, 2009 at 7:39 PM, David T. Lewis <[hidden email]> wrote:

On Tue, Oct 06, 2009 at 06:58:30PM -0400, David T. Lewis wrote:
>
> On Tue, Oct 06, 2009 at 10:05:02PM +0200, Nicolas Cellier wrote:
> >
> > From http://code.google.com/p/pharo/issues/detail?id=1282
> > Is this known?
>
> Thanks, it's known now :)
>
> Also added to Mantis at http://bugs.squeak.org/view.php?id=7405.
>
> Petr: Yes this is definitely a cool bug.

A fix for this is on Mantis 7405 (http://bugs.squeak.org/view.php?id=7405).

The VMMaker updates are in SqueakSource in VMMaker-dtl.142.

A separate patch is needed for platforms/unix/vm/sqUnixMemory.c (see the
Mantis report, patch also sent to Ian).

I have not tested Windows or Mac OS, but the basic scenario now is that
the VM will limit the size of memory increase requests to a 31-bit integer
so support code can check for overflows. The requests from the VM are
guaranteed (I hope) to be valid 31-bit integers.

Allocation requests of > 950 MB seem to be possible, although I do not
have enough memory on my PC to verify that it actually works.

Limiting allocation requests to more reasonable limits would be the
responsibility of the image.

Dave



Re: Performance, Quality and Process [was Array new: SmallInteger maxVal]

Igor Stasenko

2009/10/21 Eliot Miranda <[hidden email]>:
>
> Hi All,
>     I'm not happy with this fix and I'm not happy with the lack of process behind it.  First there has been insufficient discussion of what the right behaviour is.  Second, the fix David has written does lots of computation (shifts) to check a valid size request that could be pushed earlier at initialization time, which would allow e.g. a vmParameterAt:put: to modify the max allocation request size.  Third, there is no review of fixes; we just put them out there.
> I'm concerned about performance, code quality and a lack of process for agreeing fixes.  But at the same time I don't want to institute a bureaucracy or slow down the pace of development.  Do others share my concerns?  What suggestions have you?
>
> One problem here is that Cog will introduce a huge raft of changes to the VM and to Slang, and so possibly the whole issue is moot.  We'll face the issues as we try and integrate my Cog VM into the squeakvm trunk.  But it might be worth thinking a little about the issues up front.
> David, I know you're technical lead, and I'm not trying to depose or undermine you.  But I do think we can benefit from discussion and review of major changes.  Alas, my Cog work not being generally available yet is going to cause problems down the line.  I need to at least hurry up and get the StackInterpreter released.

Hi Eliot & Dave.

I think that Dave's fix is a quick way to close the security hole.
I mean, it is better to close critical issues quickly and
deliver a 'hot' fix than to have no fix at all.
And surely, it should stay open for further discussion of how to make it
better/cleaner/faster/safer etc.

I hope nothing in this fix is irreversible, i.e. something that can't be
changed in future versions.

> On Tue, Oct 20, 2009 at 7:39 PM, David T. Lewis <[hidden email]> wrote:
>>
>> On Tue, Oct 06, 2009 at 06:58:30PM -0400, David T. Lewis wrote:
>> >
>> > On Tue, Oct 06, 2009 at 10:05:02PM +0200, Nicolas Cellier wrote:
>> > >
>> > > From http://code.google.com/p/pharo/issues/detail?id=1282
>> > > Is this known?
>> >
>> > Thanks, it's known now :)
>> >
>> > Also added to Mantis at http://bugs.squeak.org/view.php?id=7405.
>> >
>> > Petr: Yes this is definitely a cool bug.
>>
>> A fix for this is on Mantis 7405 (http://bugs.squeak.org/view.php?id=7405).
>>
>> The VMMaker updates are in SqueakSource in VMMaker-dtl.142.
>>
>> A separate patch is needed for platforms/unix/vm/sqUnixMemory.c (see the
>> Mantis report, patch also sent to Ian).
>>
>> I have not tested Windows or Mac OS, but the basic scenario now is that
>> the VM will limit the size of memory increase requests to a 31-bit integer
>> so support code can check for overflows. The requests from the VM are
>> guaranteed (I hope) to be valid 31-bit integers.
>>
>> Allocation requests of > 950 MB seem to be possible, although I do not
>> have enough memory on my PC to verify that it actually works.
>>
>> Limiting allocation requests to more reasonable limits would be the
>> responsibility of the image.
>>
>> Dave

--
Best regards,
Igor Stasenko AKA sig.

Re: Performance, Quality and Process [was Array new: SmallInteger maxVal]

David T. Lewis
 
On Wed, Oct 21, 2009 at 10:19:17PM +0300, Igor Stasenko wrote:

>
> 2009/10/21 Eliot Miranda <[hidden email]>:
> >
> > Hi All,
> >     I'm not happy with this fix and I'm not happy with the lack of process behind it.  First there has been insufficient discussion of what the right behaviour is.  Second, the fix David has written does lots of computation (shifts) to check a valid size request that could be pushed earlier at initialization time, which would allow e.g. a vmParameterAt:put: to modify the max allocation request size.  Third, there is no review of fixes; we just put them out there.
> > I'm concerned about performance, code quality and a lack of process for agreeing fixes.  But at the same time I don't want to institute a bureaucracy or slow down the pace of development.  Do others share my concerns?  What suggestions have you?
> >
> > One problem here is that Cog will introduce a huge raft of changes to the VM and to Slang, and so possibly the whole issue is moot.  We'll face the issues as we try and integrate my Cog VM into the squeakvm trunk.  But it might be worth thinking a little about the issues up front.
> > David, I know you're technical lead, and I'm not trying to depose or undermine you.  But I do think we can benefit from discussion and review of major changes.  Alas, my Cog work not being generally available yet is going to cause problems down the line.  I need to at least hurry up and get the StackInterpreter released.
>
> Hi Eliot & Dave.
>
> I think that Dave's fix is a quick way to close the security hole.
> I mean, it is better to close critical issues quickly and
> deliver a 'hot' fix than to have no fix at all.
> And surely, it should stay open for further discussion of how to make it
> better/cleaner/faster/safer etc.
>
> I hope nothing in this fix is irreversible, i.e. something that can't be
> changed in future versions.

Hi Eliot,

No worries, the changes are easily reverted and I definitely welcome
more review and better solutions.

There is a Mantis entry for this as well (http://bugs.squeak.org/view.php?id=7405),
currently in status "testing". The specific changes that I added on
SqueakSource are in a change set on the Mantis report.

Please don't think of me as technical lead; the "VM team leader" is
an editorial and facilitation role and I fully expect to defer to others
in technical matters.

Dave


Re: Array new: SmallInteger maxVal

David T. Lewis
In reply to this post by Eliot Miranda-2
 
(temporarily popping back from the meta discussion)

On Wed, Oct 21, 2009 at 10:25:31AM -0700, Eliot Miranda wrote:
>
   <snip>

> the fix David has written does lots of computation
> (shifts) to check a valid size request that could be pushed earlier at
> initialization time, which would allow e.g. a vmParameterAt:put: to modify
> the max allocation request size.

   <snip>

> I'm concerned about performance, code quality and a lack of process for
> agreeing fixes.

Regarding performance associated with the changes, I was not able to measure
any loss of performance. Actually, my crude test showed a slight improvement,
which I can only attribute to random variation in the results.

Here is an example of one of the informal tests that I tried:

  block := [oc := OrderedCollection new.
  (1 to: 1000000) do: [:e | oc add: (Array new: (e \\ 27) + 1)]].
 
  "Stock VM:"
  Smalltalk garbageCollect.
  before := (1 to: 5) collect: [:e | Time millisecondsToRun: block] ==> #(21393 20582 21511 21101 20761)
 
  "VM with my Array alloc changes:"
  Smalltalk garbageCollect.
  after := (1 to: 5) collect: [:e | Time millisecondsToRun: block] ==> #(21582 20737 20693 20691 20725)
 
  slowdownDueToTheChanges := (after sum - before sum / before sum) asFloat ==> -0.008732961233246

I got similar results for allocating strings, very slightly faster after
the changes. I was happy with "not slower" and left it at that.
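
For anyone who wants to re-check the arithmetic, here is a minimal C sketch
that recomputes the ratio from the ten timings above and also reports the
run-to-run spread. The helper names (sum_ms, relative_change, spread_ms) are
made up for this illustration; nothing here comes from the VM or the image.

```c
#include <assert.h>
#include <stddef.h>

/* The five timings from each run above, in milliseconds. */
const long before_ms[] = {21393, 20582, 21511, 21101, 20761};
const long after_ms[]  = {21582, 20737, 20693, 20691, 20725};

/* Sum an array of millisecond timings. */
long sum_ms(const long *runs, size_t n) {
    long total = 0;
    for (size_t i = 0; i < n; i++) total += runs[i];
    return total;
}

/* Relative change of "after" versus "before"; negative means faster. */
double relative_change(long before_total, long after_total) {
    assert(before_total != 0);
    return (double)(after_total - before_total) / (double)before_total;
}

/* Largest minus smallest timing: a crude measure of run-to-run noise. */
long spread_ms(const long *runs, size_t n) {
    assert(n > 0);
    long lo = runs[0], hi = runs[0];
    for (size_t i = 1; i < n; i++) {
        if (runs[i] < lo) lo = runs[i];
        if (runs[i] > hi) hi = runs[i];
    }
    return hi - lo;
}
```

The totals are 105348 ms and 104428 ms, a difference of 920 ms over five
runs (about 184 ms per run), while the runs within each set already differ
from one another by roughly 900 ms. So the measured difference sits well
inside the noise of this test.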

Can anyone suggest a more suitable benchmark?

Also, I'm running on AMD 64 and I was only guessing that integer shift and
test sign would be a good approach. It might be awful on some hardware, I
don't know.

r.e. vmParameterAt:put: to modify max allocation request size -- good idea.
The changes that I made are strictly intended to protect against a VM crash
or object memory corruption, nothing more. But some mechanism to prevent
people from making unreasonable memory requests is clearly also needed.
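
To make the two approaches concrete, here is a rough C sketch. It is not the
code from VMMaker-dtl.142 or from the Mantis change set. All names
(max_alloc_request, set_max_alloc_request, fits_by_shift, fits_by_limit,
checked_array_bytes) are hypothetical, and it assumes that "31 bit" means the
byte count must fit in a non-negative signed 32-bit int.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cap on one allocation request, in bytes. It is set once,
   at startup, instead of being recomputed for every request. */
static uint32_t max_alloc_request = INT32_MAX;   /* 2^31 - 1 */

/* Hypothetical hook, e.g. what a vmParameterAt:put: handler might call.
   Returns 0 on success, -1 if the new limit itself is out of range. */
int set_max_alloc_request(uint32_t limit) {
    if (limit > (uint32_t)INT32_MAX) return -1;
    max_alloc_request = limit;
    return 0;
}

/* Variant 1: shift and test. Bit 31 set means the value would be
   negative when viewed as a signed 32-bit int. */
int fits_by_shift(uint32_t nbytes) {
    return (nbytes >> 31) == 0;
}

/* Variant 2: a single compare against the precomputed limit. */
int fits_by_limit(uint32_t nbytes) {
    return nbytes <= max_alloc_request;
}

/* Form header + elements in 64 bits so the sum cannot wrap, then check it.
   Returns 1 and stores the byte count in *out, or 0 if the request is
   too large. */
int checked_array_bytes(uint32_t n_elems, uint32_t elem_size,
                        uint32_t header_bytes, uint32_t *out) {
    assert(out != NULL);
    uint64_t total = (uint64_t)n_elems * elem_size + header_bytes;
    if (total > max_alloc_request) return 0;
    *out = (uint32_t)total;
    return 1;
}
```

As an example, take 4-byte slots and an assumed 8-byte header: a request for
1073741823 elements (SmallInteger maxVal on a 32-bit image) needs 4294967300
bytes, which wraps to 4 in 32-bit arithmetic. checked_array_bytes rejects it
because the sum is formed in 64 bits before the comparison.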

Dave


Squeak + Android and Dalvik!? (was Re: Performance, Quality and Process [was Array new: SmallInteger maxVal])

Göran Krampe
In reply to this post by Eliot Miranda-2
 
Hi all!

Eliot Miranda wrote:
> I need to at least hurry up and get the StackInterpreter released.

Which of course tickles me enough to ask again about your status? :)

Let me indulge in some "braindumping" here about Squeak vs Android:

Android is coming on a broad front right now and I am interested in how
Squeak can "fit" into that platform.

Unless you have been living under a rock (or working hard on a JIT we
all want!) the mobile industry is heating up a LOT. Android is really
impressive and is getting thrown on all sorts of interesting hardware,
not only phones. I think that it really will be disruptive in a way that
the iPhone can never be since it is a "high end only" product from a single
hardware company.

The Android SDK/dev stack is interesting, it is a Linux kernel etc, has
the Dalvik VM which is a VM with its own bytecode set that is designed
for phones and "runs Java". It does it by using a cross compiler from
java .class files to so called "dex" files and then runs those. So it is
not a "Java VM (tm)".

It is actually *register based* and I have peeked at its sources. It
seems to have been written with a "brute force" approach by C-coders -
nothing wrong in that of course.

First they wrote a simple interpreter core in a single C function. Then
they split it up using macros etc in order to be able to implement each
bytecode in assembler for each major CPU. The "old" C-only interpreter
is still there and it basically is a "while (!) { FETCH; switch
(bytecode) for-each-bytecode}"-loop. :)
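
For readers who have not seen that style of interpreter, here is a minimal
sketch of such a fetch/switch loop for a toy register machine. The opcodes
and the four-byte instruction encoding are invented for this illustration;
they are not taken from the Dalvik sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy opcodes, made up for this sketch. */
enum { OP_LOADI, OP_ADD, OP_MUL, OP_HALT };

/* Every instruction is four bytes: opcode, a, b, c.
     OP_LOADI a, imm   reg[a] = imm      (imm is byte b)
     OP_ADD   a, b, c  reg[a] = reg[b] + reg[c]
     OP_MUL   a, b, c  reg[a] = reg[b] * reg[c]
     OP_HALT  a        stop and return reg[a]
   Returns -1 on an unknown opcode or if the code runs off the end. */
int32_t run(const uint8_t *code, size_t len) {
    int32_t reg[8] = {0};
    size_t pc = 0;
    assert(code != NULL);
    while (pc + 4 <= len) {
        /* FETCH: register numbers are masked to stay inside reg[]. */
        uint8_t op = code[pc];
        uint8_t a = code[pc + 1] & 7;
        uint8_t b = code[pc + 2];
        uint8_t c = code[pc + 3];
        pc += 4;
        /* DISPATCH: one case per bytecode. */
        switch (op) {
        case OP_LOADI: reg[a] = b; break;
        case OP_ADD:   reg[a] = reg[b & 7] + reg[c & 7]; break;
        case OP_MUL:   reg[a] = reg[b & 7] * reg[c & 7]; break;
        case OP_HALT:  return reg[a];
        default:       return -1;
        }
    }
    return -1;
}
```

Splitting such a loop so that each case body can be replaced by hand-written
assembler is then mostly a matter of moving the case bodies into macros or
separate files.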

There is no JIT. It is not fast. BUT... it is the VM on the Android and
it comes with tons of libraries etc.

Now... there is also an NDK (Native Development Kit) that allows one to
compile C to libraries (but not executables, not sure) that can in turn be
used from java/Dalvik... humm...

So in order to be able to write anything remotely interesting for an
Android one needs to go "through" Dalvik somehow.

My silly first "funny idea" was to somehow marry the Squeak VM with the
Dalvik VM. Merge the sources and somehow make it possible to "drive" the
Dalvik interpreter by feeding it "dex" bytecodes from Squeak. Sure, this
will only work on a "rooted phone" I presume, but anyway, would be cool.

If we could do that I guess one could open a VNC connection into the
Squeak VM (if we can get Sockets to work) running inside the Android
simulator/device and then using Squeak producing dex bytecodes one could
perhaps dynamically "drive" Dalvik?

I am probably talking totally "out of my hat". Would be really
interesting to hear thoughts from Eliot on all this.

Anyway, getting Squeak running on Android in any fashion would be
awesomely interesting :) - it IS coming, all over.

Sidenote: Getting Squeak to run on Maemo (Nokia's Linux based OS for
n900 etc) and derivatives of Maemo is probably tons less work
(because it is not a java centric thing and has much more standard Linux
stuff) - but tons less interesting given the market...

regards, Göran

PS. This article + comments (by Dan Bornstein also, creator of Dalvik)
is a bit interesting.


Re: Array new: SmallInteger maxVal

Henrik Sperre Johansen
In reply to this post by David T. Lewis
 
That's more of a GC-test :) (93% GC, 5% OrderedCollection>>add: on my machine)
I found it's usually a good idea to first do a
TimeProfileBrowser onBlock: testBlock
just to check the time is actually spent doing what you want to measure a
difference in, before switching to millisecondsToRun to get the number
without tally overhead.

Measuring single primitives can be rather hard though, since any overhead
can be a big part of total runtime...
Also, do:, timesRepeat: etc. should be avoided for looping when measuring
performance until the Stack VM is out, since they create additional
BlockContexts (and thus more time spent in GC) that weren't there before
closures.

It's also good to avoid computations other than the one you're testing in
the inner loop, so a better test might be something like:

[1 to: 200 do: [:e | 1 to: 25185 do: [:t | Array new: e]]] timeToRun -  
[1 to: 200 do: [:e | 1 to: 25185 do: [:t | ]]] timeToRun.
Then open a TimeProfileBrowser on the first block and subtract the GC
time listed there.
(The 25185 was 1000000//27 from your test; I replaced 27 with 200 since
the ms runtime with 27 was in the double digits...)

If any of my assumptions are incorrect, I'd like to know :)

Cheers,
Henry

On Oct 22, 2009, at 3:23 15AM, David T. Lewis wrote:

> Regarding performance associated with the changes, I was not able to measure
> any loss of performance. Actually, my crude test showed a slight improvement,
> which I can only attribute to random variation in the results.
>
> Here is an example of one of the informal tests that I tried:
>
>   block := [oc := OrderedCollection new.
>   (1 to: 1000000) do: [:e | oc add: (Array new: (e \\ 27) + 1)]].
>
>   "Stock VM:"
>   Smalltalk garbageCollect.
>   before := (1 to: 5) collect: [:e | Time millisecondsToRun: block] ==> #(21393 20582 21511 21101 20761)
>
>   "VM with my Array alloc changes:"
>   Smalltalk garbageCollect.
>   after := (1 to: 5) collect: [:e | Time millisecondsToRun: block] ==> #(21582 20737 20693 20691 20725)
>
>   slowdownDueToTheChanges := (after sum - before sum / before sum) asFloat ==> -0.008732961233246
>
> I got similar results for allocating strings, very slightly faster after
> the changes. I was happy with "not slower" and left it at that.
>
> Can anyone suggest a more suitable benchmark?
>
> Also, I'm running on AMD 64 and I was only guessing that integer shift and
> test sign would be a good approach. It might be awful on some hardware, I
> don't know.
>
> r.e. vmParameterAt:put: to modify max allocation request size -- good idea.
> The changes that I made are strictly intended to protect against a VM crash
> or object memory corruption, nothing more. But some mechanism to prevent
> people from making unreasonable memory requests is clearly also needed.
>
> Dave


Re: Array new: SmallInteger maxVal

David T. Lewis
 
On Thu, Oct 22, 2009 at 02:47:36PM +0200, Henrik Johansen wrote:
>
> It's also good to avoid computations other than the one you're testing  
> in the inner loop, so a better test might be something like:
>
> [1 to: 200 do: [:e | 1 to: 25185 do: [:t | Array new: e]]] timeToRun -  
> [1 to: 200 do: [:e | 1 to: 25185 do: [:t | ]]] timeToRun.
> Then open a TimeProfileBrowser  on the first block and subtract the GC-
> time listed there.

Thanks Henrik, I'll give that a try.

Dave


Re: Squeak + Android and Dalvik!? (was Re: Performance, Quality and Process [was Array new: SmallInteger maxVal])

Eliot Miranda-2
In reply to this post by Göran Krampe
 


2009/10/22 Göran Krampe <[hidden email]>

> Hi all!
>
> Eliot Miranda wrote:
>> I need to at least hurry up and get the StackInterpreter released.
>
> Which of course tickles me enough to ask again about your status? :)

The V1 JIT has been released to customers, and so is now our only VM, in use in the server and client components.  Performance is around 2.5x to 3x the old VM for real world use (number of clients that can connect to a server).  Certain macro benchmarks are in the 4x to 5x range.  We have been very busy and so the process of getting the VMs released has been put on the back burner.  I need to move them to the front of the stove :)

> Let me indulge in some "braindumping" here about Squeak vs Android:
>
> Android is coming on a broad front right now and I am interested in how Squeak can "fit" into that platform.
>
> Unless you have been living under a rock (or working hard on a JIT we all want!) the mobile industry is heating up a LOT. Android is really impressive and is getting thrown on all sorts of interesting hardware, not only phones. I think that it really will be disruptive in a way that the iPhone can never be since it is a "high end only" product from a single hardware company.
>
> The Android SDK/dev stack is interesting, it is a Linux kernel etc, has the Dalvik VM which is a VM with its own bytecode set that is designed for phones and "runs Java". It does it by using a cross compiler from java .class files to so called "dex" files and then runs those. So it is not a "Java VM (tm)".
>
> It is actually *register based* and I have peeked at its sources. It seems to have been written with a "brute force" approach by C-coders - nothing wrong in that of course.

Much is made of register-based vs stack-based bytecode designs.  IMO, it makes little difference to overall performance since there are straightforward techniques for transforming stack bytecode into register-based code.  e.g. the VisualWorks VM does this, and Ian Piumarta's thesis describes essentially the same technique.  For languages like Smalltalk with extremely high send frequencies there is little scope for caching results in registers.  The important thing is to have a register-based calling convention.

David Simmons' performance work also demonstrates that stack bytecodes are no hindrance to performance.  He focuses much more on inline caching performance and allocation and reclamation speed (GC), and gets excellent speed.

> First they wrote a simple interpreter core in a single C function. Then they split it up using macros etc in order to be able to implement each bytecode in assembler for each major CPU. The "old" C-only interpreter is still there and it basically is a "while (!) { FETCH; switch (bytecode) for-each-bytecode}"-loop. :)
>
> There is no JIT. It is not fast. BUT... it is the VM on the Android and it comes with tons of libraries etc.
>
> Now... there is also an NDK (Native Development Kit) that allows one to compile C to libraries (but not executables, not sure) that can in turn be used from java/Dalvik... humm...

But a JIT can live in a library.  Provided one can allocate executable memory (see my concluding question below).
 
> So in order to be able to write anything remotely interesting for an Android one needs to go "through" Dalvik somehow.
>
> My silly first "funny idea" was to somehow marry the Squeak VM with the Dalvik VM. Merge the sources and somehow make it possible to "drive" the Dalvik interpreter by feeding it "dex" bytecodes from Squeak. Sure, this will only work on a "rooted phone" I presume, but anyway, would be cool.
>
> If we could do that I guess one could open a VNC connection into the Squeak VM (if we can get Sockets to work) running inside the Android simulator/device and then using Squeak producing dex bytecodes one could perhaps dynamically "drive" Dalvik?
>
> I am probably talking totally "out of my hat". Would be really interesting to hear thoughts from Eliot on all this.
>
> Anyway, getting Squeak running on Android in any fashion would be awesomely interesting :) - it IS coming, all over.

The devil is in the details.  I'm pretty sure that Squeak would run horribly above Dalvik (sends, GC, tagged integers are all likely to suck on a VM not designed for Smalltalk).  But if Dalvik is not the entire framework (you speak about libraries above) then running alongside is perhaps a possibility.  e.g. this is how David Simmons' S# runs within .net.
 

> Sidenote: Getting Squeak to run on Maemo (Nokia's Linux based OS for n900 etc) and derivatives of Maemo is probably tons less work (because it is not a java centric thing and has much more standard Linux stuff) - but tons less interesting given the market...
>
> regards, Göran
>
> PS. This article + comments (by Dan Bornstein also, creator of Dalvik) is a bit interesting.

Is there no access other than through the VM?  One of the major pains with the iPhone is the lack of support for JITs.  The mmap function prevents granting execute access on the memory it allocates.  John McIntosh has suggested that Apple might be persuaded to provide a work-around for certain applications (i.e. the Cog JIT) but I think John is merely speculating optimistically (John, am I right or is there a real possibility here?).  It would be great if Android didn't present similar hurdles.



Re: Squeak + Android and Dalvik!? (was Re: Performance, Quality and Process [was Array new: SmallInteger maxVal])

Derek O'Connell-2
In reply to this post by Göran Krampe
 
Göran Krampe wrote:
> Sidenote: Getting Squeak to run on Maemo (Nokia's Linux based OS for
> n900 etc) and derivatives of Maemo is probably tons less work
> (because it is not a java centric thing and has much more standard Linux
> stuff) - but tons less interesting given the market...

The VM compiles fine for Maemo without any changes. Advantage of a
(mostly) open platform :-)

Re: Squeak + Android and Dalvik!? (was Re: Performance, Quality and Process [was Array new: SmallInteger maxVal])

johnmci
In reply to this post by Eliot Miranda-2
 

On 2009-10-22, at 9:26 AM, Eliot Miranda wrote:
>
> Is there no access other than through the VM?  One of the major  
> pains with the iPhone is the lack of support for JITs.  The mmap  
> function prevents granting execute access on the memory it  
> allocates.  John McIntosh has suggested that Apple might be  
> persuaded to provide a work-around for certain applications (i.e.  
> the Cog JIT) but I think John is merely speculating optimistically  
> (John, am I right or is there a real possibility here?).  It would  
> be great if Android didn't present similar hurdles.

Ah well the story is:

The iPhone uses the virtual memory hardware page tags to deal with
read, write, execute. *** Actually I was looking for confirmation of
this but couldn't find it ***
Apps from the store run as non-root from a nosuid partition, so you
can't make them root, well not outside an exploit, but those are
*really rare now*.
Apps from the store cannot dynamically link in executable code;
everything you supply is statically linked.
I note things like Core Data to SQLite do DDL, but that's Apple's
sandbox.

To get a page of memory that is marked executable you need to use mmap
to allocate an executable/read/write page of memory.
The mmap binary is not quite BSD compliant; passing PROT_EXEC won't
work if you are not root.
*** I assume as root PROT_EXEC will work, but I'm not clear on this. An
Apple engineer insisted the PROT_EXEC logic wasn't in the binary, but I'm
doubtful: how does the application loader then get the memory to load
the binary? ***
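
For reference, the request in question looks roughly like the sketch below:
a small probe that asks for one anonymous read/write/execute mapping and
reports whether the OS granted it. The function name
can_map_executable_page is made up for this illustration. On a stock
desktop Linux or BSD the call usually succeeds; the behaviour described
above is that a non-root iPhone app gets a failure instead.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* lets glibc expose MAP_ANONYMOUS */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON   /* older BSD-style spelling */
#endif

/* Returns 1 if the OS hands out an anonymous mapping that is readable,
   writable and executable, 0 if the request is refused. */
int can_map_executable_page(size_t nbytes) {
    assert(nbytes > 0);
    void *p = mmap(NULL, nbytes,
                   PROT_READ | PROT_WRITE | PROT_EXEC,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return 0;
    munmap(p, nbytes);           /* only probing; give the page back */
    return 1;
}
```

A JIT needs exactly this kind of page to write generated machine code into,
which is why a refusal here rules out the usual JIT design.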

I did talk to some people in Apple enterprise security (hint at WWDC,
the people in charge) about this issue;
they have no plans to allow anyone to mmap memory with PROT_EXEC.

They did however think that if the *right* enterprise clients asked, then
*maybe* an enterprise app with the proper certificate could get
PROT_EXEC as a non-root app, but that would require a change to the
operating system.  For the curious, an enterprise can configure phones to
disable various hardware components/features (e.g. no camera), plus of
course distribute apps signed by the enterprise internally to phones
which have the enterprise certificates.

I did suggest to Cincom that they should ask.

Did I mention Apple's security organization's general feelings? The
answer is NO, now what was the question?

Obviously this lowers the level of optimism...  since I don't think we
have an enterprise client (think 100,000 phones) who needs a JIT-based
app on the iPhone.

--
===========================================================================
John M. McIntosh <[hidden email]>   Twitter:  squeaker68882
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
===========================================================================





Re: Squeak + Android and Dalvik!? (was Re: Performance, Quality and Process [was Array new: SmallInteger maxVal])

Eliot Miranda-2
 


On Thu, Oct 22, 2009 at 1:11 PM, John M McIntosh <[hidden email]> wrote:


> On 2009-10-22, at 9:26 AM, Eliot Miranda wrote:
>
>> Is there no access other than through the VM?  One of the major pains with the iPhone is the lack of support for JITs.  The mmap function prevents granting execute access on the memory it allocates.  John McIntosh has suggested that Apple might be persuaded to provide a work-around for certain applications (i.e. the Cog JIT) but I think John is merely speculating optimistically (John, am I right or is there a real possibility here?).  It would be great if Android didn't present similar hurdles.
>
> Ah well the story is:
>
> The iPhone uses the virtual memory hardware page tags to deal with read, write, execute. *** Actually I was looking for confirmation of this but couldn't find it ***
> Apps from the store run as non-root from a nosuid partition, so you can't make them root, well not outside an exploit, but those are *really rare now*.
> Apps from the store cannot dynamically link in executable code; everything you supply is statically linked.
> I note things like Core Data to SQLite do DDL, but that's Apple's sandbox.
>
> To get a page of memory that is marked executable you need to use mmap to allocate an executable/read/write page of memory.
> The mmap binary is not quite BSD compliant; passing PROT_EXEC won't work if you are not root.
> *** I assume as root PROT_EXEC will work, but I'm not clear on this. An Apple engineer insisted the PROT_EXEC logic wasn't in the binary, but I'm doubtful: how
> does the application loader then get the memory to load the binary? ***
>
> I did talk to some people in Apple enterprise security (hint at WWDC, the people in charge) about this issue;
> they have no plans to allow anyone to mmap memory with PROT_EXEC.
>
> They did however think that if the *right* enterprise clients asked, then *maybe* an enterprise app with the proper certificate could get PROT_EXEC as a
> non-root app, but that would require a change to the operating system.  For the curious, an enterprise can configure phones to disable various hardware
> components/features (e.g. no camera), plus of course distribute apps signed by the enterprise internally to phones which have the enterprise certificates.
>
> I did suggest to Cincom that they should ask.
>
> Did I mention Apple's security organization's general feelings? The answer is NO, now what was the question?
>
> Obviously this lowers the level of optimism...  since I don't think we have an enterprise client (think 100,000 phones) who needs a JIT-based app on the iPhone.

It's OK if you're Apple, right?  JavaScript is V8 (a JIT) on the iPhone isn't it?  And if Java is on the iPhone it's probably a JIT too.
 


Re: Squeak + Android and Dalvik!? (was Re: Performance, Quality and Process [was Array new: SmallInteger maxVal])

johnmci
 

On 2009-10-22, at 1:41 PM, Eliot Miranda wrote:

> It's OK if you're Apple, right?  JavaScript is V8 (a JIT) on the
> iPhone isn't it?  And if Java is on the iPhone it's probably a JIT too.

Er yes, well it's your operating system, your hardware, your legal
documents. One can do what one wants, as long as one can
keep the other guy out of your playpen. So who does hand executable
pages to V8? Good question...

Java is NOT on the iPhone. Neither is Flash.  *cough* well, interpreted
Flash; Adobe has some statically compiled version they make or something
now. Grind Flash through some process, makes iPhone app.


Re: Squeak + Android and Dalvik!? (was Re: Performance, Quality and Process [was Array new: SmallInteger maxVal])

Eliot Miranda-2
 


On Thu, Oct 22, 2009 at 2:08 PM, John M McIntosh <[hidden email]> wrote:


> On 2009-10-22, at 1:41 PM, Eliot Miranda wrote:
>
>> It's OK if you're Apple, right?  JavaScript is V8 (a JIT) on the iPhone isn't it?

At least on Safari it is "Nitro".  So I guess Mobile Safari doesn't contain V8 either.

 
>> And if Java is on the iPhone it's probably a JIT too.
>
> Er yes, well it's your operating system, your hardware, your legal documents. One can do what one wants, as long as one can
> keep the other guy out of your playpen. So who does hand executable pages to V8? Good question...
>
> Java is NOT on the iPhone. Neither is Flash.  *cough* well, interpreted Flash; Adobe has some statically compiled
> version they make or something now. Grind Flash through some process, makes iPhone app.



Re: Array new: SmallInteger maxVal

David T. Lewis
In reply to this post by Henrik Sperre Johansen
 
Thanks Henrik,

I took your suggestions and found the following:

- Using your suggested test:
        [1 to: 200 do: [:e | 1 to: 25185 do: [:t | Array new: e]]] timeToRun -
        [1 to: 200 do: [:e | 1 to: 25185 do: [:t | ]]] timeToRun.
  Unfortunately I was not able to get any useful data from a TimeProfileBrowser
  on my system (there was no indication that time was being spent in GC though),
  but overall time to run showed the updated VM (with allocation checks) giving
  12% better performance in primitives than the prior VM without checks (!?!).

- Going back to my original test, and looking at it with a TimeProfileBrowser,
  I saw about 91-95% of the time was spent in primitives under Collection>>add:
  so the time was presumably being spent largely in array allocation. That
  presumably included garbage collection, but it was nonetheless primarily
  exercising #primitiveNewWithArg.

- Comparing just the time spent in primitives, the VM with the new object
  allocation checks was 3.8% better than the VM without those checks. I would
  not attribute much precision to this, but it's still consistent with my
  original smoke test, which showed the VM with checks being slightly (< 1%)
  faster than the prior version without the checks.

I cannot explain why the updates seem to make the VM slightly faster, but
it does seem to be the case on my machine (AMD, 64-bit Linux). My best SWAG
(speculative and probably wrong guess) is that the variable declaration
updates included in the change set may have had the unintended side effect
of eliminating some inefficiencies somewhere.

I suspect that I am making a mistake somewhere. Really, there's just no
way that the added checks should make things go *faster*. Can anyone
else confirm or deny a performance difference between a VM built with
VMMaker-dtl.143 (including the allocation checks) versus a VM built with
VMMaker-dtl.142 or earlier?

Dave

On Thu, Oct 22, 2009 at 02:47:36PM +0200, Henrik Johansen wrote:

>
> That's more of a GC-test :) (93% GC, 5% OrderedCollection>>add: on my machine)
> I found it's usually a good idea to first do a
> TimeProfileBrowser onBlock: testBlock
> just to check the timing is actually spent doing what you want to measure a
> difference in, before switching to millisecondsToRun to get the number
> without tally overhead.
>
> Measuring single primitives can be rather hard though, since any overhead
> can be a big part of total runtime...
> Also, do:, timesRepeat: etc. should be avoided for looping when measuring
> performance until the Stack VM is out, since they create additional
> BlockContexts (and thus more time spent in gc) that weren't there before
> closures.
>
> It's also good to avoid computations other than the one you're testing in
> the inner loop, so a better test might be something like:
>
> [1 to: 200 do: [:e | 1 to: 25185 do: [:t | Array new: e]]] timeToRun -
> [1 to: 200 do: [:e | 1 to: 25185 do: [:t | ]]] timeToRun.
> Then open a TimeProfileBrowser on the first block and subtract the GC-time
> listed there.
> (The 25185 was 1000000//27 from your test, changed 27 with 200 since the ms
> runtime with 27 was in the double digits...)
>
> If any of my assumptions are incorrect, I'd like to know :)
>
> Cheers,
> Henry
>
> On Oct 22, 2009, at 3:23 15AM, David T. Lewis wrote:
>
> >Regarding performance associated with the changes, I was not able to measure
> >any loss of performance. Actually, my crude test showed a slight improvement,
> >which I can only attribute to random variation in the results.
> >
> >Here is an example of one of the informal tests that I tried:
> >
> > block := [oc := OrderedCollection new.
> > (1 to: 1000000) do: [:e | oc add: (Array new: (e \\ 27) + 1)]].
> >
> > "Stock VM:"
> > Smalltalk garbageCollect.
> > before := (1 to: 5) collect: [:e | Time millisecondsToRun: block]
> > ==> #(21393 20582 21511 21101 20761)
> >
> > "VM with my Array alloc changes:"
> > Smalltalk garbageCollect.
> > after := (1 to: 5) collect: [:e | Time millisecondsToRun: block]
> > ==> #(21582 20737 20693 20691 20725)
> >
> > slowdownDueToTheChanges := (after sum - before sum / before sum) asFloat
> > ==> -0.008732961233246
> >
> >I got similar results for allocating strings, very slightly faster after
> >the changes. I was happy with "not slower" and left it at that.
> >
> >Can anyone suggest a more suitable benchmark?
> >
> >Also, I'm running on AMD 64 and I was only guessing that integer shift and
> >test sign would be a good approach. It might be awful on some hardware, I
> >don't know.
> >
> >r.e. vmParameterAt:put: to modify max allocation request size -- good idea.
> >The changes that I made are strictly intended to protect against a VM crash
> >or object memory corruption, nothing more. But some mechanism to prevent
> >people from making unreasonable memory requests is clearly also needed.
> >
> >Dave
> >
> >

Re: Squeak + Android and Dalvik!? (was Re: Performance, Quality and Process [was Array new: SmallInteger maxVal])

Göran Krampe
In reply to this post by Eliot Miranda-2
 
Hi Eliot and all!

Eliot Miranda wrote:
> The V1 JIT has been released to customers, and so is now our only VM, in use
> in the server and client components.  Performance is around 2.5x to 3x the
> old VM for real world use (number of clients that can connect to a server).
>  Certain macro benchmarks are in the 4x to 5x range.  We have been very busy
> and so the process of getting the VMs released has been put on the back
> burner.  I need to move them to the front of the stove :)

Sounds great! Yeah, we want it!!!! :)

>> There is no JIT. It is not fast. BUT... it is the VM on the Android and it
>> comes with tons of libraries etc.
>>
>> Now... there is also an NDK (Native SDK) that allows one to compile C to
>> libraries (but not executables, not sure) that can in turn be used from
>> java/Dalvik... humm...
>
> But a JIT can live in a library.  Provided one can allocate executable
> memory (see my concluding question below).

Yes, AFAICT mono JIT works on Android.

Btw, this is a good set of introductory technical slides about the
Android runtime env:

http://jazoon.com/download/presentations/8801.pdf

It explains Dalvik etc. fairly well, and is from June this year.

This blog entry may be even more interesting:

http://blogs.sun.com/jrose/entry/with_android_and_dalvik_at

>> So in order to be able to write anything remotely interesting for an
>> Android one needs to go "through" Dalvik somehow.
>
> The devil is in the details.  I'm pretty sure that Squeak would run horribly
> above Dalvik (sends, GC, tagged integers are all likely to suck on a VM not
> designed for Smalltalk).  But if Dalvik is not the entire framework (you
> speak about libraries above) then running alongside is perhaps a
> possibility.  e.g. this is how David Simmons' S# runs within .net.

I did not mean "above" Dalvik as in "implemented on top of Dalvik". I
was more toying with the idea of being able to "feed" Dalvik with DEX
code from a Squeak VM running alongside it. Perhaps this is what you
also describe above.

Thing is - we should not kid ourselves - the libraries on Android are
java libraries (although converted to DEX form) and Android is getting
pushed onto new CPUs "day by day". For example "The Droid" (Motorola's
latest Android phone) uses a new CPU etc. So keeping up with the CPU
level assembler is not something "we" would want to do.

So somehow standing on the shoulders of Dalvik could be an approach?

Dalvik will get some form of JIT soon most people seem to think.

> Is there no access other than through the VM?  One of the major pains with
> the iPhone is the lack of support for JITs.  The mmap function prevents
> granting execute access on the memory it allocates.  John McIntosh has
> suggested that Apple might be persuaded to provide a work-around for certain
> applications (i.e. the Cog JIT) but I think John is merely speculating
> optimistically (John, am I right or is there a real possibility here?).  It
> would be great if Android didn't present similar hurdles.

That sure does NOT seem to be a hurdle. And yes, there is plenty of
access other than through Dalvik. I just don't think that you will be
able to do much of interest without being able to reuse/use the Java
libraries.

regards, Göran

PS. Remember that I brought this up a year from now when Android will be
all over the place...


Re: Squeak + Android and Dalvik!? (was Re: Performance, Quality and Process [was Array new: SmallInteger maxVal])

Göran Krampe
 
Hi!

Btw, this is the resumé of Dan Bornstein (Dalvik tech lead), which among
other things lists www.erights.org!!! Cool.

http://www.milk.com/home/danfuzz/resume/

regards, Göran


Re: Squeak + Android and Dalvik!? (was Re: Performance, Quality and Process [was Array new: SmallInteger maxVal])

Eliot Miranda-2
In reply to this post by johnmci
 
Seems like influential people are starting to apply pressure:
Graham's article doesn't directly target interpreters and JITs, but it's in the right direction.

On Thu, Oct 22, 2009 at 12:11 PM, John M McIntosh <[hidden email]> wrote:


On 2009-10-22, at 9:26 AM, Eliot Miranda wrote:

Is there no access other than through the VM?  One of the major pains with the iPhone is the lack of support for JITs.  The mmap function prevents granting execute access on the memory it allocates.  John McIntosh has suggested that Apple might be persuaded to provide a work-around for certain applications (i.e. the Cog JIT) but I think John is merely speculating optimistically (John, am I right or is there a real possibility here?).  It would be great if Android didn't present similar hurdles.

Ah well the story is:

The iPhone uses the virtual memory hardware page tags to deal with read, write, execute. *** Actually I was looking for confirmation of this but couldn't find it ***
Apps from the store run as non-root from a nosuid partition, so you can't make them root, well not outside an exploit but those are *really rare now*.
Apps from the store cannot dynamically link in executable code, everything you supply is static linked.
I note things like Core Data to SQLite do DDL, but that's Apple's sandbox.

To get a page of memory that is marked executable you need to use mmap to allocate an executable/read/write page of memory.
The mmap binary is not quite BSD compliant; passing PROT_EXEC won't work if you are not root.
*** I assume as root PROT_EXEC will work, but I'm not clear on this. An Apple engineer insisted the PROT_EXEC logic wasn't in the binary, but I'm doubtful:
how does the application loader then get the memory to load the binary? ***
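
As a rough illustration of the mmap behaviour described above, the following C sketch asks for an anonymous read/write/execute mapping and reports whether the kernel granted it. The helper names try_map and can_allocate_executable_page are hypothetical and not taken from any VM source. On a platform that refuses PROT_EXEC to unprivileged processes the request is expected to fail with errno set; which error code that would be is not stated in this thread.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* ask glibc to expose MAP_ANONYMOUS in strict modes */
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Some BSD-derived headers only provide the older spelling MAP_ANON. */
#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif

/* Hypothetical helper: map `len` bytes of anonymous private memory with
   protection `prot`. Returns the mapping and sets *err to 0 on success;
   returns NULL and stores errno in *err on failure. */
static void *try_map(size_t len, int prot, int *err)
{
    assert(len > 0); /* a zero-length mmap request is invalid */
    void *p = mmap(NULL, len, prot, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        *err = errno;
        return NULL;
    }
    *err = 0;
    return p;
}

/* Hypothetical probe: returns 1 if a read/write/execute mapping of `len`
   bytes is granted, 0 otherwise. The probe mapping is released again. */
static int can_allocate_executable_page(size_t len, int *err)
{
    void *p = try_map(len, PROT_READ | PROT_WRITE | PROT_EXEC, err);
    if (p == NULL)
        return 0;
    munmap(p, len);
    return 1;
}
```

A JIT could run such a probe once at startup, for example can_allocate_executable_page(4096, &err) with 4096 as an assumed page size, and fall back to the plain interpreter when it returns 0.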

I did talk to some people in Apple enterprise security (hint at WWDC, the people in charge) about this issue,
they have no plans to allow anyone to mmap memory with PROT_EXEC.

They did however think if the *right* enterprise clients asked, then *maybe* an enterprise app with the proper certificate could get to PROT_EXEC as a
non-root app, but that would require a change to the operating system.  For the curious, an enterprise can configure phones to disable various hardware
components/features (i.e. no camera), plus of course distribute apps signed by the enterprise internally for phones which have the enterprise certificates.

I did suggest to Cincom that they should ask.

Did I mention the general feeling in Apple's security organization? The answer is NO, now what was the question?

Obviously this lowers the optimistic level...  Since I don't think we have an enterprise client (think 100,000 phones) who needs a JIT based app on the iPhone.

--
===========================================================================
John M. McIntosh <[hidden email]>   Twitter:  squeaker68882

Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
===========================================================================