The Trunk: System-codefrau.1205.mcz


The Trunk: System-codefrau.1205.mcz

commits-2
Vanessa Freudenberg uploaded a new version of System to project The Trunk:
http://source.squeak.org/trunk/System-codefrau.1205.mcz

==================== Summary ====================

Name: System-codefrau.1205
Author: codefrau
Time: 20 December 2020, 10:23:10.790782 pm
UUID: f94486f3-3743-4300-a495-c2a89089e122
Ancestors: System-dtl.1204

Update platformName for SqueakJS 1.0

=============== Diff against System-dtl.1204 ===============

Item was changed:
  ----- Method: SmalltalkImage>>isLowerPerformance (in category 'system attributes') -----
  isLowerPerformance
  "Some operations - TestCases for example - need an idea of the typical performance of the system on which they are being performed. For now we will simply assert that running on an ARM cpu or as a SqueakJS instance is enough of a discriminator. Options for the future might also involve whether the vm is a full Cog or Sisata system, even actually measuring the performance at some point to be sure"
  ^ (self platformSubtype beginsWith: 'arm') "Raspberry PI for example"
+ or: [self platformName = 'JS'] "SqueakJS"!
- or: [self platformName = 'Web'] "SqueakJS"!



Re: The Trunk: System-codefrau.1205.mcz

Eliot Miranda-2
Hi Vanessa, Hi Fabio, Hi David, Hi All,

> On Dec 20, 2020, at 10:24 PM, [hidden email] wrote:
>
> Item was changed:
>  ----- Method: SmalltalkImage>>isLowerPerformance (in category 'system attributes') -----
>  isLowerPerformance
>      "Some operations - TestCases for example - need an idea of the typical performance of the system on which they are being performed. For now we will simply assert that running on an ARM cpu or as a SqueakJS instance is enough of a discriminator. Options for the future might also involve whether the vm is a full Cog or Sisata system, even actually measuring the performance at some point to be sure"
>      ^ (self platformSubtype beginsWith: 'arm') "Raspberry PI for example"
> +        or: [self platformName = 'JS'] "SqueakJS"!
> -        or: [self platformName = 'Web'] "SqueakJS"!

This is interesting.  The method is so crude, but potentially we have a much more rational basis upon which to derive its result.  I would expect the effective performance to be the product of processor speed (mips), core execution engine architecture and object representation.

Mips varies hugely across the range from e.g. Raspberry Pi 2, 3, 4 to various Intel (i5, i7, i9 etc.) and Apple Silicon.  The range here is about one order of magnitude.

Execution architecture varies from pure context interpreter (the BTTF VM), Stack Interpreter, SqueakJS interpreter, SqueakJS generation one JIT, SqueakJS subsequent generation JITs (temps in JS vars, sends mapped to JS calls), Cog JIT, Sista JIT.

Very crudely Spur = 2 x v3 (actually about 1.7 and varies according to workload).

Of the execution architectures Sista JIT is for practical purposes not yet implemented, a prototype, but may offer 2x to 4x of Cog.  Of the SqueakJS JITs I think that the send mapping isn’t implemented (am I right?).  But is the temp var mapping implemented? If so what difference does it make?  Context to Stack is about 1.5. Stack to Cog is about 6.

So the notion is that if we can come up with crude numbers that rank the execution architectures and a measure of mips, we can compute a meaningful numeric estimate of likely Smalltalk execution speed and answer true from isLowerPerformance if this number falls below a specific threshold.  What we have now based on platformName is simply wrong, e.g. a Raspberry Pi 4 is way faster than a Pi 3.

One thing I did for VisualWorks is estimate processor mips by timing the first invocation of the allInstances primitive and dividing by the number of objects. Basically the heuristic is that mips is roughly (inversely) proportional to how much time per object the first allInstances invocation spends.  There is (almost) always an allInstances invocation at startup in VisualWorks (to clear font handles IIRC), and there may be in a Squeak image.  Alternatives are measuring how long it takes to load and/or swizzle the image on load divided by the heap size.  Basically we have the opportunity to introspect at startup, cheaply measuring the time some costly primitive takes to run, and this result can be cached, accessed via a primitive or vmParameter and perhaps updated as execution proceeds.

Does this sound like overkill? If not, what should we choose as our mips measurer?  We want something that all VMs have to do somewhat similarly fairly early on system startup and we can correlate with stopwatches and macro benchmarks like the time taken for the Compiler package to recompile itself, etc.
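
As a rough illustration of the ranking idea (a workspace sketch only: the dictionary names are hypothetical, and the factors are just the crude ratios quoted in this message, not measurements):

```smalltalk
"Hypothetical sketch: relative speed factors, with the pure context
 interpreter on a v3 image as the baseline of 1. Fractions keep the
 arithmetic exact."
| engineFactor objectFormatFactor relativeSpeed |
engineFactor := Dictionary new.
engineFactor
	at: #context put: 1;
	at: #stack put: 3/2;            "Context to Stack is about 1.5"
	at: #cog put: (3/2) * 6.        "Stack to Cog is about 6"
objectFormatFactor := Dictionary new.
objectFormatFactor
	at: #v3 put: 1;
	at: #spur put: 17/10.           "Spur is about 1.7 x v3"
"Architecture-only estimate for a Cog VM running a Spur image."
relativeSpeed := (engineFactor at: #cog) * (objectFormatFactor at: #spur).
```

The result would still have to be multiplied by some measured mips figure before comparing it against a threshold; choosing that measure is the open question here.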

Eliot
_,,,^..^,,,_ (phone)


Re: The Trunk: System-codefrau.1205.mcz

David T. Lewis
On Tue, Dec 22, 2020 at 01:08:28AM -0800, Eliot Miranda wrote:

> Does this sound like overkill? If not, what should we choose as our mips measurer?  We want something that all VMs have to do somewhat similarly fairly early on system startup and we can correlate with stopwatches and macro benchmarks like the time taken for the Compiler package to recompile itself, etc.
>

Not overkill at all, I think it's a good idea. There are currently only two
use cases for isLowerPerformance, and in both cases it looks like something
that could be computed on demand when first referenced after image startup.

Dave



Re: The Trunk: System-codefrau.1205.mcz

David T. Lewis
On Tue, Dec 22, 2020 at 12:55:32PM -0500, David T. Lewis wrote:

> Not overkill at all, I think it's a good idea. There are currently only two
> use cases for isLowerPerformance, and in both cases it looks like something
> that could be computed on demand when first referenced after image startup.
>

I put System-dtl.1207 in the inbox as a possible solution. Treat it as
a bikeshed in need of a good coat of paint.

Dave



Re: [Vm-dev] The Trunk: System-codefrau.1205.mcz

codefrau
In reply to this post by Eliot Miranda-2
On Tue, Dec 22, 2020 at 1:08 AM Eliot Miranda <[hidden email]> wrote:
 
> Of the execution architectures Sista JIT is for practical purposes not yet implemented, a prototype, but may offer 2x to 4x of Cog.  Of the Squeak JS JITs I think that the send mapping isn’t implemented (am I right?).  But is the temp var mapping implemented? If so what difference does it make?  Context to Stack is about 1.5. Stack to Cog is about 6.

None of that has been implemented in SqueakJS. The current JIT only gets rid of the generic bytecode decoding, plus it inlines small-int arithmetic.

However, that still gives an 8x increase in bytecode speed, which causes the send speed as measured by tinyBenchmarks to go up by 3.5x too. It also feels significantly faster with the JIT enabled.


> Does this sound like overkill? If not, what should we choose as our mips measurer?  We want something that all VMs have to do somewhat similarly fairly early on system startup and we can correlate with stopwatches and macro benchmarks like the time taken for the Compiler package to recompile itself, etc.

I like measuring overall performance without adding any extra work.

Like, DateAndTime is pretty early in the startup list. It could remember the time its startup was invoked. Another class that comes later could set a LowPerformance flag if it took longer than x ms since DateAndTime was initialized.

I just tried that with ProcessorScheduler (see attachment). On Safari and a 5.3 image I get ImageStartMS = 133 ms, on Chrome 250 ms. On a fast VM I get 5 ms. So maybe if that takes longer than say 50 ms it could be considered low performance?
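
The idea can be sketched in a workspace like this. It is only an illustration under assumptions, not the attached change set: the variable names are made up, and in the image the two clock reads would live in two different startUp methods rather than in one snippet.

```smalltalk
"Hypothetical sketch: time the gap between an early and a later point
 in the startup list, then compare against a threshold."
| startUsecs elapsedMs isLowPerformance |
"Would be recorded by an early startup entry such as DateAndTime."
startUsecs := Time utcMicrosecondClock.
"... the remaining startup list entries would run here ..."
"Would be evaluated by a later startup entry."
elapsedMs := (Time utcMicrosecondClock - startUsecs) // 1000.
isLowPerformance := elapsedMs > 50.   "50 ms is the threshold floated above"
```

No benchmark runs here; the measurement piggybacks on work the image does at startup anyway.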

Vanessa
 



Attachment: startup-codefrau.1.cs (4K)

Re: [Vm-dev] The Trunk: System-codefrau.1205.mcz

codefrau
(dropping vm-dev)

On Tue, Dec 22, 2020 at 1:57 PM Eliot Miranda <[hidden email]> wrote:
> On Tue, Dec 22, 2020 at 1:44 PM Vanessa Freudenberg <[hidden email]> wrote:
>> I like measuring all-over performance, and not adding any extra work.
>>
>> Like, DateAndTime is pretty early in the startup list. It could remember the time its startup was invoked. Another class that comes later could set a LowPerformance flag if it took longer than x ms since DateAndTime was initialized.
>>
>> I just tried that with ProcessorScheduler (see attachment). On Safari and a 5.3 image I get ImageStartMS = 133 ms, on Chrome 250 ms. On a fast VM I get 5 ms. So maybe if that takes longer than say 50 ms it could be considered low performance?
>
> Works for me. I would record and provide an accessor for ImageStartUsecs (a class variable in SmalltalkImage, using microseconds :-) ).  Then one can either use isLowerPerformance or the actual time for a more "nuanced" view.

Good idea. It also should use class vars not globals, etc. 

I didn't mean to use this as is, just to do a quick proof of concept. And I didn't see Dave's take come through yet ... or are announcements from squeaksource broken?

Vanessa
 



Re: [Vm-dev] The Trunk: System-codefrau.1205.mcz

David T. Lewis
On Tue, Dec 22, 2020 at 02:07:30PM -0800, Vanessa Freudenberg wrote:

> I didn't mean to use this as is, just to do a quick proof of concept. And I
> didn't see Dave's take come through yet ... or are announcements from
> squeaksource broken?
>
> Vanessa

I'm not sure if the inbox announcement came out on email (I did not see
it either), but my take on this was in System-dtl.1207.mcz in the inbox.
It uses a value computed once per session on first reference, so it should
have no effect on startup time.

The actual logic to determine a "slow platform" is this:

  (Time millisecondsToRun:[ 25 benchFib ]) > 200
      or: [ (Time millisecondsToRun: [ 64 benchmark ]) > 200 ]

Perhaps someone with a Raspberry Pi can check and see if this is
reasonable? It does indicate that SqueakJS is "slow" when running
in Chrome on my Linux box, and the compiled VMs are "fast".
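
The "computed once per session on first reference" part can be sketched as follows. This is an illustration only, not the code in System-dtl.1207: cachedIsSlow and isSlow are hypothetical names, and in the image the cache would more likely be a class variable that is cleared on startup.

```smalltalk
"Hypothetical sketch: run the benchmarks lazily and remember the answer."
| cachedIsSlow isSlow |
cachedIsSlow := nil.   "nil means: not yet measured in this session"
isSlow := [cachedIsSlow ifNil: [
	cachedIsSlow := (Time millisecondsToRun: [25 benchFib]) > 200
		or: [(Time millisecondsToRun: [64 benchmark]) > 200]]].
isSlow value.   "first reference: runs the benchmarks and caches the Boolean"
isSlow value.   "later references: answers the cached Boolean without re-running"
```

With this shape the benchmark cost is paid only by the first sender, never during image startup itself.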

Dave



Re: [Vm-dev] The Trunk: System-codefrau.1205.mcz

codefrau
Hi Dave,

I haven’t actually tried it but it seems on a slow platform this would delay the first-time startup by however long those benchmarks take, right?

isLowerPerformance is used to determine if the full wizard is shown or a cheaper one.

Vanessa

On Thu 31. Dec 2020 at 09:47, David T. Lewis <[hidden email]> wrote:
> The actual logic to determine a "slow platform" is this:
>
>   (Time millisecondsToRun:[ 25 benchFib ]) > 200
>       or: [ (Time millisecondsToRun: [ 64 benchmark ]) > 200 ]


Re: [Vm-dev] The Trunk: System-codefrau.1205.mcz

timrowledge
On a Pi 4 -
>
>  (Time millisecondsToRun:[ 25 benchFib ]) > 200
>       or: [ (Time millisecondsToRun: [ 64 benchmark ]) > 200 ]

19mS & 143mS

I don't see an issue with possibly slowing down the first startup to test performance as long as
a) it isn't a multi-second delay
b) it says what it is doing
   "Checking system performance in order to provide advice on suitable setup options"
would be apropos.

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Useful random insult:- Not all his dogs are barking.




Re: [Vm-dev] The Trunk: System-codefrau.1205.mcz

Eliot Miranda-2


> On Jan 1, 2021, at 9:43 PM, tim Rowledge <[hidden email]> wrote:
>
> On a Pi 4 -
>>
>> (Time millisecondsToRun:[ 25 benchFib ]) > 200
>>      or: [ (Time millisecondsToRun: [ 64 benchmark ]) > 200 ]
>
> 19mS & 143mS
>
> I don't see an issue with possibly slowing down the first startup to test performance as long as
> a) it isn't a multi-second delay
> b) it says what it is doing
>   "Checking system performance in order to provide advice on suitable setup options"
> would be apropos.

I’m sorry but it’s completely unacceptable.  19 ms is an age.  Simple Unix commands are done in that time.  50 of those and a second has elapsed.  Imagine one is using a Smalltalk script with the Unix find command and one cannot do better than 50 applications per second.

We know that we can derive a meaningful performance figure from existing work.  Why are we even considering adding unnecessary overhead?



Re: [Vm-dev] The Trunk: System-codefrau.1205.mcz

Levente Uzonyi
Perhaps it's sufficient to measure the uptime and estimate the machine's
performance based on that value. On slower machines it should take longer
to get to the same point in the startup process.

By uptime I mean what's shown in the VM stats:

| vmStartTime uptime |
vmStartTime := Smalltalk vmParameterAt: 20.
uptime := vmStartTime ~= 0 "utcMicrosecondClock at startup in later Spur VMs"
  ifTrue: [Time utcMicrosecondClock - vmStartTime + 500 // 1000]
  ifFalse: [Time eventMillisecondClock].

which may be simplified to:

| uptime |
uptime := Time utcMicrosecondClock - (Smalltalk vmParameterAt: 20) + 500 // 1000.


Levente



Re: [Vm-dev] The Trunk: System-codefrau.1205.mcz

fniephaus
In reply to this post by codefrau
On Sat, 2 Jan 2021 at 6:28 am, Vanessa Freudenberg <[hidden email]> wrote:
> Hi Dave,
>
> I haven’t actually tried it but it seems on a slow platform this would delay the first-time startup by however long those benchmarks take, right?
>
> isLowerPerformance is used to determine if the full wizard is shown or a cheaper one.

The wizard unfortunately takes a lot of time snapshotting a preview of the world on startup. Marcel said that this can be avoided by lazily initializing the preview (it's not even shown in the initial overlay). This would also eliminate the noticeable delay on startup when running on OpenSmalltalkVM.

Fabio



Re: [Vm-dev] The Trunk: System-codefrau.1205.mcz

David T. Lewis
In reply to this post by codefrau
Hi Vanessa,

On Fri, Jan 01, 2021 at 09:28:35PM -0800, Vanessa Freudenberg wrote:
> Hi Dave,
>
> I haven't actually tried it but it seems on a slow platform this would
> delay the first-time startup by however long those benchmarks take, right?
>

No, it would run at most one time per session, and then only if someone
sends #isLowerPerformance. I think that means it would affect the run time
of the first unit test that someone runs after starting the image, but
otherwise no impact. Certainly it would not be noticeable to humans.

I like Levente's suggestion also.

Dave



Re: [Vm-dev] The Trunk: System-codefrau.1205.mcz

David T. Lewis
In reply to this post by Levente Uzonyi
On Sat, Jan 02, 2021 at 01:00:03PM +0100, Levente Uzonyi wrote:

> >> On Jan 1, 2021, at 9:43 PM, tim Rowledge <[hidden email]> wrote:
> >>
> >> On a Pi 4 -
> >>>
> >>> (Time millisecondsToRun:[ 25 benchFib ]) > 200
> >>>     or: [ (Time millisecondsToRun: [ 64 benchmark ]) > 200 ]
> >>
> >> 19mS & 143mS
> >>

That certainly does not sound "slow" to me :-)   Would you want the
Pi 4 to be treated as slow, or is this a case of the Pi ARM platform
actually moving into the "fast" category?


> Perhaps it's sufficient to measure the uptime and estimate the machine's
> performance based on that value. On slower machines it should take longer
> to get to the same point in the startup process.
>
> By uptime I mean what's shown in the VM stats:
>
> | vmStartTime uptime |
> vmStartTime := Smalltalk vmParameterAt: 20.
> uptime := vmStartTime ~= 0 "utcMicrosecondClock at startup in later Spur VMs"
>   ifTrue: [Time utcMicrosecondClock - vmStartTime + 500 // 1000]
>   ifFalse: [Time eventMillisecondClock].
>
> which may be simplified to:
>
> | uptime |
> uptime := Time utcMicrosecondClock - (Smalltalk vmParameterAt: 20) + 500 // 1000.
>
>
> Levente

This is a good simple approach that might give a better overall measure
of "slow" in the case of Raspberry Pi. However, I don't think that VM
parameter 20 is implemented on SqueakJS.

Dave



Re: [Vm-dev] The Trunk: System-codefrau.1205.mcz

timrowledge
In reply to this post by Eliot Miranda-2


> On 2021-01-02, at 12:20 AM, Eliot Miranda <[hidden email]> wrote:
>
>
> I’m sorry but it’s completely unacceptable.  

This is for deciding what options to have the "welcome to your new image" wizard suggest, right? In *that context* a single frame cycle delay spent being helpful to a new user seems very reasonable.

Now, in an image set up to be used within some scripty-command-line-doit, absolutely not. I'm fairly sure nobody was suggesting that. If there is a problem where the wizard is getting invoked at an annoying time (for example, your scripts for building a vmmaker image) then there must be a decent way to make it skip the whole thing. Might it be appropriate to say that if any commandline input is provided then the wizard stage is skipped completely?


tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Useful Latin Phrases:- Vescere bracis meis. = Eat my shorts.




Re: [Vm-dev] The Trunk: System-codefrau.1205.mcz

Eliot Miranda-2
In reply to this post by Levente Uzonyi


> On Jan 2, 2021, at 4:00 AM, Levente Uzonyi <[hidden email]> wrote:
>
> Perhaps it's sufficient to measure the uptime and estimate the machine's performance based on that value. On slower machines it should take longer to get to the same point in the startup process.
>
> By uptime I mean what's shown in the VM stats:
>
> | vmStartTime uptime |
> vmStartTime := Smalltalk vmParameterAt: 20.
> uptime := vmStartTime ~= 0 "utcMicrosecondClock at startup in later Spur VMs"
>    ifTrue: [Time utcMicrosecondClock - vmStartTime + 500 // 1000]
>    ifFalse: [Time eventMillisecondClock].
>
> which may be simplified to:
>
> | uptime |
> uptime := Time utcMicrosecondClock - (Smalltalk vmParameterAt: 20) + 500 // 1000.

Exactly. +1.  And further:

uptimeUsecs
    ^ Time utcMicrosecondClock - (Smalltalk vmParameterAt: 20)

>
>
> Levente
>
>> On Sat, 2 Jan 2021, Eliot Miranda wrote:
>>
>>
>>
>>>> On Jan 1, 2021, at 9:43 PM, tim Rowledge <[hidden email]> wrote:
>>> On a Pi 4 -
>>>> (Time millisecondsToRun:[ 25 benchFib ]) > 200
>>>>     or: [ (Time millisecondsToRun: [ 64 benchmark ]) > 200 ]
>>> 19mS & 143mS
>>> I don't see an issue with possibly slowing down the first startup to test performance as long as
>>> a) it isn't a multi-second delay
>>> b) it says what it is doing
>>>  "Checking system performance in order to provide advice on suitable setup options"
>>> would be apropos.
>>
>> I’m sorry but it’s completely unacceptable.  19ms is an age.  Simple Unix commands are done in that time.  50 of those and a second has elapsed.  Imagine one is using a Smalltalk script with the Unix find command and one can not do better than 50 applications per second.
>>
>> We know that we can derive a meaningful performance figure from existing work.  Why are we even considering adding unnecessary overhead?
>>
>>> tim
>>> --
>>> tim Rowledge; [hidden email]; http://www.rowledge.org/tim
>>> Useful random insult:- Not all his dogs are barking.
>


Re: [Vm-dev] The Trunk: System-codefrau.1205.mcz

Eliot Miranda-2
In reply to this post by David T. Lewis


> On Jan 2, 2021, at 8:41 AM, David T. Lewis <[hidden email]> wrote:
>
> Hi Vanessa,
>
>> On Fri, Jan 01, 2021 at 09:28:35PM -0800, Vanessa Freudenberg wrote:
>> Hi Dave,
>>
>> I haven't actually tried it but it seems on a slow platform this would
>> delay the first-time startup by however long those benchmarks take, right?
>>
>
> No, it would run at most one time per session, and then only if someone
> sends #isLowerPerformance. I think that means it would affect the run time
> of the first unit test that someone runs after starting the image, but
> otherwise no impact. Certainly it would not be noticeable to humans.

David, put this delay in every startup and use Squeak as a scripting language and apply the script to every file in your home directory and yes, it would be noticeable to humans.

>
> I like Levente's suggestion also.
>
> Dave
>
>


Re: [Vm-dev] The Trunk: System-codefrau.1205.mcz

Eliot Miranda-2
In reply to this post by timrowledge
Tim,

> On Jan 2, 2021, at 10:29 AM, tim Rowledge <[hidden email]> wrote:
>
> 
>
>> On 2021-01-02, at 12:20 AM, Eliot Miranda <[hidden email]> wrote:
>>
>>
>> I’m sorry but it’s completely unacceptable.  
>
> This is for deciding what options to have the welcome to your new image wizard suggest, right? In *that context* a single frame cycle delay spent being helpful to a new user seems very reasonable.
>
> Now, in an image set up to be used within some scripty-command-line-doit, absolutely not. I'm fairly sure nobody was suggesting that. If there is a problem where the wizard is getting invoked at an annoying time (for example, your scripts for building a vmmaker image) then there must be a decent way to make it skip the whole thing. Might it be appropriate to say that if any commandline input is provided then the wizard stage is skipped completely?

We’re talking at cross purposes.  The Wizard is but one client of isSlowMachine (or whatever the selector is).  If it’s a generally useful facility (and it’s been there for 40 years so maybe it is) then it is present in *every* context.  And in many contexts adding 20ms to startup, let alone 150, is too slow.

At DarkPlace at the turn of the C or so we got start time down to around 80ms after it being in the 400ms range of the machines of the time.  Machines are much faster now.

In a typical OpenSmalltalk-vm repository clone there are about 10,000 .c files (not .h, .cc, .m etc; just .c files).

On my 2.9GHz MBP doing
     find . -name '*.c' >/dev/null
takes 1 second.  Doing
     find . -name '*.c' -exec grep NOTTHERE \{} \; >/dev/null
takes 28 seconds.

Adding 20ms to the startup time of grep would add 200 seconds, a 700% overhead.  20ms is unacceptable if it’s unnecessary.
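
The arithmetic behind those figures, as a workspace sketch. The variable names are illustrative only; the numbers are the ones given above (about 10,000 files, 20ms added per grep invocation, 28 seconds baseline).

```smalltalk
| fileCount addedMsPerExec baselineSeconds addedSeconds overheadPercent |
fileCount := 10000.        "approximate number of .c files"
addedMsPerExec := 20.      "extra startup cost per grep invocation"
baselineSeconds := 28.     "measured time for find -exec grep"
addedSeconds := fileCount * addedMsPerExec / 1000.        "200"
overheadPercent := addedSeconds * 100 / baselineSeconds.  "5000/7, about 714"
overheadPercent roundTo: 1
```

So the 700% figure is 200 added seconds relative to the 28 second baseline, rounded down.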

>
>
> tim
> --
> tim Rowledge; [hidden email]; http://www.rowledge.org/tim
> Useful Latin Phrases:- Vescere bracis meis. = Eat my shorts.
>
>
>


Re: [Vm-dev] The Trunk: System-codefrau.1205.mcz

David T. Lewis
In reply to this post by Eliot Miranda-2
Hi Eliot,

On Sat, Jan 02, 2021 at 11:10:45AM -0800, Eliot Miranda wrote:

>
>
> > On Jan 2, 2021, at 8:41 AM, David T. Lewis <[hidden email]> wrote:
> >
> > Hi Vanessa,
> >
> >> On Fri, Jan 01, 2021 at 09:28:35PM -0800, Vanessa Freudenberg wrote:
> >> Hi Dave,
> >>
> >> I haven't actually tried it but it seems on a slow platform this would
> >> delay the first-time startup by however long those benchmarks take, right?
> >>
> >
> > No, it would run at most one time per session, and then only if someone
> > sends #isLowerPerformance. I think that means it would affect the run time
> > of the first unit test that someone runs after starting the image, but
> > otherwise no impact. Certainly it would not be noticeable to humans.
>
> David, put this delay in every startup and use Squeak as a scripting
> language and apply the script to every file in your home directory and
> yes, it would be noticeable to humans.
>

It does *not* do that. Apparently I am not doing a good job of explaining.

Dave


> >
> > I like Levente's suggestion also.
> >
> > Dave
> >
> >
>


Re: [Vm-dev] The Trunk: System-codefrau.1205.mcz

David T. Lewis
On Sat, Jan 02, 2021 at 02:56:49PM -0500, David T. Lewis wrote:

> Hi Eliot,
>
> On Sat, Jan 02, 2021 at 11:10:45AM -0800, Eliot Miranda wrote:
> >
> >
> > > On Jan 2, 2021, at 8:41 AM, David T. Lewis <[hidden email]> wrote:
> > >
> > > Hi Vanessa,
> > >
> > >> On Fri, Jan 01, 2021 at 09:28:35PM -0800, Vanessa Freudenberg wrote:
> > >> Hi Dave,
> > >>
> > >> I haven't actually tried it but it seems on a slow platform this would
> > >> delay the first-time startup by however long those benchmarks take, right?
> > >>
> > >
> > > No, it would run at most one time per session, and then only if someone
> > > sends #isLowerPerformance. I think that means it would affect the run time
> > > of the first unit test that someone runs after starting the image, but
> > > otherwise no impact. Certainly it would not be noticeable to humans.
> >
> > David, put this delay in every startup and use Squeak as a scripting
> > language and apply the script to every file in your home directory and
> > yes, it would be noticeable to humans.
> >
>
> It does *not* do that. Apparently I am not doing a good job of explaining.
>
>

Regarding startup time, I'm afraid that my explanation did not come across
well in email, and I apologize for not being clear.

Let me try once more by just quoting the code. At startup time, there is
a new class var SlowPlatform that is set to nil. Otherwise nothing new
is done during image startup processing:

SmalltalkImage>>startUp: resuming
        resuming ifTrue:
                [LastStats := nil.
                SlowPlatform := nil.
                SystemChangeNotifier uniqueInstance notify: Smalltalk ofAllSystemChangesUsing: #event:]

Later on, the class var is lazy initialized if and only if someone needs
to check for isLowerPerformance:

SmalltalkImage>>isLowerPerformance
        "Some operations - TestCases for example - need an idea of the typical performance
        of the system on which they are being performed."
        ^ SlowPlatform
                ifNil: [ SlowPlatform := (Time millisecondsToRun:[ 25 benchFib ]) > 200
                        or: [ (Time millisecondsToRun: [ 64 benchmark ]) > 200 ]]

That's all I meant to suggest.
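
A small workspace sketch of the same lazy-initialization pattern, using hypothetical temporaries (cache, runs, isSlow) in place of the class variable, to illustrate that the benchmark would be paid at most once no matter how often the check is made:

```smalltalk
| cache runs isSlow |
cache := nil.   "stands in for the SlowPlatform class variable"
runs := 0.      "counts how often the benchmark actually executes"
isSlow := [cache ifNil: [
	runs := runs + 1.
	cache := (Time millisecondsToRun: [25 benchFib]) > 200]].
isSlow value.
isSlow value.
runs   "1: the second call answers the cached Boolean"
```

Because false is not nil, a "fast" result is cached just like a "slow" one, so a fast machine does not rerun the benchmark either.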

Sorry for the confusion,

Dave

