Hi.
I found case where Spur more slow than Cog. Not much but. prototype := BorderedMorph new. result := [prototype veryDeepCopy ] benchFor: 2 seconds. In my computer I got: Cog: a BenchmarkResult(16,982 iterations in 2 seconds 1 millisecond. 8,487 per second) Spur: a BenchmarkResult(12,507 iterations in 2 seconds 33 milliseconds. 6,152 per second). Is it expected behavior for such case? |
Hi Denis,
On Fri, Dec 18, 2015 at 9:36 AM, Denis Kudriashov <[hidden email]> wrote:
Well, it might be indicative of a soluble problem. One thing that Spur is slower at is primitive failure; it catches primitive failure in the VM as potentially caused by primitives encountering forwarders. So many primitive failure involves a slow scan of the primitives arguments to some depth to find out if there are any forwarders which should be followed, and if so the primitive retried. So that means some idioms that use primitive failure normally will go slower. One of these is instVarAt: which used to be written instVarAt: index "Primitive. Answer a fixed variable in an object. The numbering of the variables corresponds to the named instance variables. Fail if the index is not an Integer or is not the index of a fixed variable. Essential. See Object documentation whatIsAPrimitive." <primitive: 73> "Access beyond fixed variables." ^self basicAt: index - self class instSize and used in some copying code to access indexable fields. Chris Muller found this in some of his Magma code and so we added a pair of new primitives, primitiveSlotAt and primitiveSlotAtPut that access any slot within the sequence of named followed by indexable inst vars, avoiding the primitive failure. So in Squeak Spur instVarAt: now reads: instVarAt: index "Primitive. Answer a fixed variable in an object. The numbering of the variables corresponds to the named instance variables, followed by the indexed instance variables. Fail if the index is not an Integer or is not the index of a fixed variable. Essential. See Object documentation whatIsAPrimitive." <primitive: 173 error: ec> self primitiveFailed I suspect something like this is happening with the veryDeepCopy code. You might want to use the profiler on both versions and compare. You might find an issue. HTH
_,,,^..^,,,_ best, Eliot |
Hello, In the Smalltalk profiler it shows more time spent in primitive. It could be also, as it is an older version of the spur vm currently used in Pharo, that this problem is already fixed. I can't know the problem for sure as I can't run the VM profiler in Pharo. 2015-12-19 4:18 GMT+01:00 Eliot Miranda <[hidden email]>:
|
In reply to this post by Eliot Miranda-2
Hi 2015-12-19 4:18 GMT+01:00 Eliot Miranda <[hidden email]>:
Same code used in Pharo too. I try same test on latest squeak images: Cog: '10,100 per second. 98.8 microseconds per run.' Spur: '7,610 per second. 131 microseconds per run.' |
I could reproduce with Squeak 5 and Squeak 4.6 so I was able to use the VM profiler. I used: | prototype | prototype := BorderedMorph new. 10000 timesRepeat: [prototype veryDeepCopy] The profiler ran for ~0.8 second. Maybe you can have more accurate result with longer bench but it takes already a while to display results. In Spur: 1044 samples in the VM (1057 samples in the entire program) 98.77% of total 754 samples in generated vm code 72.22% of entire vm (71.33% of total) 290 samples in vanilla vm code 27.78% of entire vm (27.44% of total) % of generated vm code (% of total) (samples) (cumulative) 37.67% (26.87%) Dictionary>>associationsDo: (284) (37.67%) 30.37% (21.67%) Object>>at: (229) (68.04%) 3.18% ( 2.27%) WeakKeyDictionary>>associationsDo: (24) (71.22%) 2.12% ( 1.51%) MethodDictionary>>scanFor: (16) (73.34%) 1.59% ( 1.14%) SmallInteger>>\\ (12) (74.93%) % of vanilla vm code (% of total) (samples) (cumulative) 58.62% (16.08%) scavengeReferentsOf (170) (58.62%) 24.48% ( 6.72%) instantiateClassindexableSize (71) (83.10%) 2.76% ( 0.76%) copyAndForward (8) (85.86%) 2.07% ( 0.57%) doScavenge (6) (87.93%) 2.07% ( 0.57%) lookupOrdinaryNoMNUEtcInClass (6) (90.00%) 13 samples in the rest 1.23% of total % of rest (% of total) (samples) (cumulative) 53.85% ( 0.66%) SCDynamicStoreCreate (7) (53.85%) 46.15% ( 0.57%) _platform_memmove$VARIANT$sse42 (6) (100.0%) 46.15% ( 0.57%) OSAtomicAdd32Barrier (6) (146.15%) **GCs** full 0 totalling 0ms (0% elapsed time) scavenges 258 totalling 151ms (18.97% elapsed time), avg 0.585ms In SqueakV3: 1189 samples in the VM (1196 samples in the entire program) 99.41% of total 998 samples in generated vm code 83.94% of entire vm (83.44% of total) 191 samples in vanilla vm code 16.06% of entire vm (15.97% of total) % of generated vm code (% of total) (samples) (cumulative) 43.19% (36.04%) Object>>at: (431) (43.19%) 35.07% (29.26%) Dictionary>>associationsDo: (350) (78.26%) 1.90% ( 1.59%) WeakKeyDictionary>>associationsDo: (19) (80.16%) 1.30% ( 1.09%) Object>>veryDeepCopyWith: (13) (81.46%) 1.20% ( 1.00%) BlockClosure>>value (12) (82.67%) % of vanilla vm code (% of total) (samples) (cumulative) 30.37% ( 4.85%) instantiateClassindexableSize (58) (30.37%) 13.61% ( 2.17%) incrementalGC (26) (43.98%) 9.42% ( 1.51%) markAndTrace (18) (53.40%) 8.38% ( 1.34%) marryFrameSP (16) (61.78%) 6.81% ( 1.09%) eeInstantiateSmallClassnumSlots (13) (68.59%) 5.24% ( 0.84%) createClosureNumArgsnumCopiedstartpc (10) (73.82%) 4.71% ( 0.75%) mapPointersInObjectsFromto (9) (78.53%) **GCs** full 0 totalling 0ms (0% elapsed time) incr 270 totalling 46ms (5.094% elapsed time), avg 0.17ms Conclusion ? It feels that the main issue is with the scavenges that take overall more time in Spur in this case. Especially, the method scavengeReferentsOf seems to be guilty and may be worth looking at. There are other details, like now a little bit of execution time seems to be used to allocate new memory segments and associationsDo: looks slower, likely because of == and forwarders, but this is not as important. What do you think Eliot ? 2015-12-19 11:06 GMT+01:00 Denis Kudriashov <[hidden email]>:
|
In reply to this post by Denis Kudriashov
Hi. With latest VM these benches are similar. New compaction solves the problem I guess 2015-12-18 18:36 GMT+01:00 Denis Kudriashov <[hidden email]>:
|
Free forum by Nabble | Edit this page |