One of the key parts of Delay scheduling is setting the resumption time. There are two places this could be done:

a. in Delay>>#schedule, running (for example) at userSchedulingPriority=40
b. in DelayMicrosecondScheduler>>#handleTimerEvent:, running at timingPriority=80

When we were using the millisecond clock for delay scheduling, it made sense to use (b.), since the clock might(?) roll over between when the resumption time was set and when #handleTimerEvent: expires delays. This should not be a problem now that we are using the microsecond clock, so I wondered what the latency penalty might be between (a.) and (b.). I ran a little experiment that indicates the cost/latency of switching threads, and was curious whether anyone can comment on the validity of the experiment and the interpretation of the results.

I made the following three changes...

    DelayScheduler subclass: #DelayMicrosecondScheduler
        instanceVariableNames: 'latencyStart countSinceLastZero'
        classVariableNames: 'LatencyCounts'
        poolDictionaries: ''
        category: 'Kernel-Processes'

    DelayMicrosecondScheduler>>#schedule: aDelay
        latencyStart := Time primUTCMicrosecondsClock.  "This is position (a.)"
        aDelay schedulerBeingWaitedOn ifTrue:
            [^self error: 'This Delay has already been scheduled.'].
        accessProtect critical: [
            scheduledDelay := aDelay.
            timingSemaphore signal.  "#handleTimerEvent: sets scheduledDelay := nil" ].
    DelayMicrosecondScheduler>>#handleTimerEvent: microsecondNowTick
        | microsecondNextTick |
        "Process any schedule requests"
        "This is position (b.)"
        scheduledDelay ifNotNil: [ | latency |
            latency := Time primUTCMicrosecondsClock - latencyStart.
            LatencyCounts ifNil: [ LatencyCounts := Bag new ].
            LatencyCounts add: latency.
            latency = 0
                ifTrue: [ countSinceLastZero := 1 + (countSinceLastZero ifNil: [0]) ]
                ifFalse: [
                    Transcript
                        crShow: 'zero latency count ', countSinceLastZero printString;
                        show: ' before latency ', latency printString.
                    countSinceLastZero := 0 ].
            "Schedule the given delay."
            scheduledDelay
                scheduler: self
                resumptionTime: microsecondNowTick + (1000 * scheduledDelay millisecondDelayDuration).
            self scheduleDelay: scheduledDelay.
            scheduledDelay := nil ].
        <snip> rest of method unchanged

Then I opened the Transcript and in a Playground evaluated...

    Delay delaySchedulerClass: DelayMicrosecondScheduler.

The Transcript results are shown below
with some comments inserted.
Now I guess the latency is affected by garbage collection. But one thing I was curious about is why the latencies were quantised in multiples of 1000. Another interesting thing is that the vast majority of the latencies were zero, which was a nice surprise, but can that be true? Or is it a consequence of the quantisation rounding down? It seems the idleness of the image affected how often a non-zero latency occurred. After I left the house for a while, the count of zero latencies was very high, but a few non-zero ones still occurred. It would make sense that there was less GC while idle. What is a good snippet of code to stress the GC? (A sketch of one possibility follows the listing below.) I presume the latency might increase.

    zero latency count 2273 before latency 1000
    zero latency count 943 before latency 1000
    zero latency count 3666 before latency 1000
    zero latency count 1643 before latency 1000
    zero latency count 27 before latency 1000
    "Left house for 20 minutes"
    zero latency count 12022 before latency 1000
    zero latency count 15195 before latency 1000
    zero latency count 41998 before latency 1000
    "Returned from outing"
    zero latency count 128 before latency 1000
    zero latency count 116 before latency 1000
    zero latency count 555 before latency 1000
    zero latency count 2377 before latency 1000
    zero latency count 5423 before latency 1000
    zero latency count 3178 before latency 1000
    zero latency count 47 before latency 1000
    zero latency count 2276 before latency 1000
    "Left house to go shopping"
    zero latency count 6708 before latency 3000
    zero latency count 4896 before latency 1000
    zero latency count 433 before latency 1000
    zero latency count 7106 before latency 1000
    zero latency count 2195 before latency 1000
    zero latency count 12397 before latency 1000
    zero latency count 4815 before latency 2000
    zero latency count 3480 before latency 1000
    zero latency count 5690 before latency 1000
    zero latency count 1670 before latency 1000
    zero latency count 320 before latency 1000
    zero latency count 1641 before latency 2000
    zero latency count 290 before latency 1000
    zero latency count 3835 before latency 1000
    zero latency count 3818 before latency 1000
    zero latency count 1132 before latency 1000
    "Returned from shopping."
    zero latency count 2045 before latency 1000
    zero latency count 671 before latency 1000
    zero latency count 167 before latency 1000
    zero latency count 147 before latency 1000

From the System Browser, inspecting the class variable LatencyCounts showed...

    latency (usec)   count
    0                171329
    3000             1
    2000             3
    1000             49
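For stressing the GC, something along these lines would be one way to do it, churning through lots of short-lived objects so the scavenger runs frequently; this is only a sketch, and the allocation sizes and repeat counts are arbitrary:

    "Allocate and immediately drop many small objects while the Transcript is watched.
     Evaluate in a Playground; timeToRun reports how long the churn took."
    [ 1000000 timesRepeat: [ Array new: 100 ] ] timeToRun.

    "Optionally force some full collections as well."
    10 timesRepeat: [ Smalltalk garbageCollect ].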
On 5 May 2018 at 22:10, Ben Coman <[hidden email]> wrote:
P.S. after running a few more hours,
LatencyCounts shows...

    latency (usec)   count
    0                1102691
    1000             219
    2000             12
    3000             29
    4000             21
    5000             5
    49000            1
    75000            1
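A frequency table like this can also be printed directly from the Bag; a minimal sketch, assuming the LatencyCounts class variable is still populated:

    "Print each distinct latency value with its number of occurrences, smallest first."
    LatencyCounts asSet asSortedCollection do: [ :usec |
        Transcript crShow: usec printString, '  ',
            (LatencyCounts occurrencesOf: usec) printString ].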
Hi Ben,
It's likely that the VM code is using the cheap microsecond clock which is updated by the heartbeat, not the expensive one that interrogates the OS at the instant. If not, it's the OS. I'll check soon.
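A quick way to sanity-check which clock the image actually sees is to sample the primitive in a tight loop and collect the step sizes it advances by; a minimal sketch (the sample count of 1000 is arbitrary):

    "Collect the non-zero increments of the microsecond clock.
     A heartbeat-updated clock shows up as large, regular jumps (e.g. ~1000 usec);
     a clock read from the OS on each call advances in much smaller steps."
    | previous steps |
    steps := Bag new.
    previous := Time primUTCMicrosecondsClock.
    [ steps size < 1000 ] whileTrue: [
        | now |
        now := Time primUTCMicrosecondsClock.
        now > previous ifTrue: [
            steps add: now - previous.
            previous := now ] ].
    steps inspect.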
On 6 May 2018 at 04:20, Eliot Miranda <[hidden email]> wrote:
I went looking and found an obvious source of quantisation: the multiplication by MicrosecondsPerMillisecond in the Win32 currentUTCMicroseconds(), line 109 here. Running the same experiment (for 20 mins) on Linux produced a latency distribution in line with what I was originally expecting...

    usec   count
    1      135
    2      992
    3      3250
    4      21161
    5      17309
    6      5341
    7      1230
    8      313
    9      79
    10     44
    11     46
    12     50
    13     36
    14     58
    15     73
    16     38
    17     12
    18     7
    19     5
    20     1
    21     3
    23     2
    24     4
    25     4
    26     5
    27     4
    29     3
    30     6
    31     5
    32     6
    33     5
    34     2
    35     2
    36     5
    37     4
    39     4
    40     2
    41     2
    44     3
    46     2
    47     1
    48     4
    49     3
    51     2
    53     1
    54     1
    56     2
    58     1
    59     1
    61     1
    66     2
    70     1
    82     1
    86     1
    184    1
    544    1
    576    1
    1212   1
    1402   1
    2016   1

    self average.                       "4.8 microseconds"
    self stdev.                         "12.9 microseconds"
    self average + (5 * self stdev).    "69.1 microseconds"

So there are nine outliers beyond five sigma.

@Clement, would you expect your incremental GC to have much impact on those outliers? Or is its target pause ~1000 usecs anyway?

cheers -ben
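The same summary numbers can be recomputed straight from the collected samples; a minimal sketch, assuming LatencyCounts still holds the Bag of measurements and that the #average and #stdev extensions used above are present:

    "Expand the Bag into one entry per sample, then summarise and count 5-sigma outliers."
    | samples mean sigma |
    samples := LatencyCounts asOrderedCollection.
    mean := samples average.
    sigma := samples stdev.
    Transcript
        crShow: 'mean ', mean asFloat printString, ' usec';
        crShow: 'stdev ', sigma asFloat printString, ' usec';
        crShow: 'outliers beyond 5 sigma: ',
            (samples count: [ :usec | usec > (mean + (5 * sigma)) ]) printString.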
Hi Ben,