Hi folks. I wasn't aware of the class SparseLargeTable but it seems it has two important instances "SparseLargeTable allInstances size -> 2"
They are the class side variables #DecimalProperty and #GeneralCategory in Unicode. Actually, (Unicode classPool at: 'GeneralCategory') size -> 917632 (Unicode classPool at: 'DecimalProperty') size -> 917632 So...collections of almost 1 million elements. Now a couple of questions: 1) Is this normal/expected? In my calculus both arrays are like 3mb of memory 2) If I don't use UTF and I send an encoding as parameter in the VM (different from UTF), do I need this class side variables? Thanks and forgive my ignorance in this field...I just get the alarm while doing some memory analysis for a paper. Mariano _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
> 1) Is this normal/expected? In my calculus both arrays are like 3mb of
> memory Yes, these are lookup tables for character classification like #isLetter and #isLowercase. Now what bothers me more is that the character classification in Pharo is dead slow. Other Smalltalk implementations have much more space and time efficient implementations. > 2) If I don't use UTF and I send an encoding as parameter in the VM > (different from UTF), do I need this class side variables? AFAIK the encoding just defines how characters are passed into the VM. This does not affect how characters are represented inside. Lukas -- Lukas Renggli www.lukas-renggli.ch _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Mariano Martinez Peck
On 21.09.2010 19:10, Mariano Martinez Peck wrote:
> Hi folks. I wasn't aware of the class SparseLargeTable but it seems it > has two important instances "SparseLargeTable allInstances size -> 2" > > They are the class side variables #DecimalProperty and > #GeneralCategory in Unicode. > > Actually, > > (Unicode classPool at: 'GeneralCategory') size -> 917632 > > (Unicode classPool at: 'DecimalProperty') size -> 917632 > > So...collections of almost 1 million elements. > > Now a couple of questions: > > 1) Is this normal/expected? In my calculus both arrays are like 3mb > of memory If my math is correct (crosses fingers), DecimalProperty is ~15KB, GeneralCategory ~70KB > > 2) If I don't use UTF and I send an encoding as parameter in the VM > (different from UTF), do I need this class side variables? What Lukas said. Cheers, Henry _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
On Tue, Sep 21, 2010 at 7:55 PM, Henrik Sperre Johansen <[hidden email]> wrote:
So...there must be a problem with my code, I was doing something like SpaceTally does: spaceForUsedInstancesOf: aClass withInstanceCount: instCount "Answer the number of bytes consumed by all instances of the given class, including their object headers." | isCompact instVarBytes bytesPerElement headerBytes total | instCount = 0 ifTrue: [^ 0]. isCompact := aClass indexIfCompact > 0. instVarBytes := aClass instSize * 4. aClass isVariable ifTrue: [ bytesPerElement := aClass isBytes ifTrue: [1] ifFalse: [4]. total := 0. aClass allInstancesDo: [:inst | | contentBytes | contentBytes := instVarBytes + (inst size * bytesPerElement). headerBytes := contentBytes > 255 ifTrue: [12] ifFalse: [isCompact ifTrue: [4] ifFalse: [8]]. (UnusedObjectDiscoverer current primitiveGetUsedBit: inst) ifTrue: [total := total + headerBytes + contentBytes]]. ^ total] ifFalse: [ headerBytes := instVarBytes > 255 ifTrue: [12] ifFalse: [isCompact ifTrue: [4] ifFalse: [8]]. ^ instCount * (headerBytes + instVarBytes)]. SparseLargeTable instanceCount ->>> 2 SpaceTally new spaceForUsedInstancesOf: SparseLargeTable withInstanceCount: 2 ->>> 7341112 thanks
_______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
There you go. Use basicSize, size is overridden by f.ex. HashedCollections and LargeSparseTables. Cheers, Henry _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
Here's the code I used to confirm Henrik's math:
table := Unicode classPool at: 'GeneralCategory'. count := table basicSize. 1 to: table basicSize do: [ :i | (table basicAt: i) isNil ifFalse: [ count := count + (table basicAt: i) basicSize ] ]. (count / 1024) asFloat -> 66.8759765625 + the headers of each subtable. Cheers, Adrian On Sep 21, 2010, at 20:16 , Henrik Sperre Johansen wrote: > On 21.09.2010 20:07, Mariano Martinez Peck wrote: >> >> >> On Tue, Sep 21, 2010 at 7:55 PM, Henrik Sperre Johansen <[hidden email] <mailto:[hidden email]>> wrote: >> >> On 21.09.2010 19:10, Mariano Martinez Peck wrote: >> >> Hi folks. I wasn't aware of the class SparseLargeTable but it >> seems it has two important instances "SparseLargeTable >> allInstances size -> 2" >> >> They are the class side variables #DecimalProperty and >> #GeneralCategory in Unicode. >> >> Actually, >> >> (Unicode classPool at: 'GeneralCategory') size -> 917632 >> >> (Unicode classPool at: 'DecimalProperty') size -> 917632 >> >> So...collections of almost 1 million elements. >> >> Now a couple of questions: >> >> 1) Is this normal/expected? In my calculus both arrays are >> like 3mb of memory >> >> Look at the implementation, there's a reason they're called sparse. >> If my math is correct (crosses fingers), DecimalProperty is ~15KB, >> GeneralCategory ~70KB >> >> >> So...there must be a problem with my code, I was doing something like SpaceTally does: >> >> >> spaceForUsedInstancesOf: aClass withInstanceCount: instCount >> "Answer the number of bytes consumed by all instances of the given class, including their object headers." >> >> | isCompact instVarBytes bytesPerElement headerBytes total | >> instCount = 0 ifTrue: [^ 0]. >> isCompact := aClass indexIfCompact > 0. >> instVarBytes := aClass instSize * 4. >> aClass isVariable >> ifTrue: [ >> bytesPerElement := aClass isBytes ifTrue: [1] ifFalse: [4]. >> total := 0. >> aClass allInstancesDo: [:inst | | contentBytes | >> contentBytes := instVarBytes + (inst *size* * bytesPerElement). > There you go. Use basicSize, size is overridden by f.ex. HashedCollections and LargeSparseTables. > > Cheers, > Henry > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Henrik Sperre Johansen
2010/9/21 Henrik Sperre Johansen <[hidden email]>
:) Thanks Henry. Now SpaceTally new spaceForInstancesOf: SparseLargeTable withInstanceCount: 2 ->> 18748 So....I guess I should commit the fix for SpaceTally >> spaceForInstancesOf: aClass withInstanceCount: instCount shouldn't I ? In addition, I modified SpaceTally >> printSpaceAnalysis: threshold on: aStream from this: printSpaceAnalysis: threshold on: aStream "SpaceTally new printSpaceAnalysis: 1 on:(FileStream forceNewFileNamed: 'STspace.text')" "sd-This method should be rewrote to be more coherent within the rest of the class ie using preAllocate and spaceForInstanceOf:" "If threshold > 0, then only those classes with more than that number of instances will be shown, and they will be sorted by total instance space. If threshold = 0, then all classes will appear, sorted by name." | totalCodeSpace totalInstCount totalInstSpace n totalPercent | Smalltalk garbageCollect. totalCodeSpace := totalInstCount := totalInstSpace := n := 0. results := OrderedCollection new: Smalltalk classNames size. 'Taking statistics...' displayProgressAt: Sensor cursorPoint from: 0 to: Smalltalk classNames size during: [:bar | (Smalltalk globals allClasses) do: [:cl | | codeSpace instCount instSpace eltSize | codeSpace := cl spaceUsed. bar value: (n := n+1). Smalltalk garbageCollectMost. instCount := cl instanceCount. instSpace := (cl indexIfCompact > 0 ifTrue: [4] ifFalse: [8])*instCount. "Object headers" cl isVariable ifTrue: [eltSize := cl isBytes ifTrue: [1] ifFalse: [4]. cl allInstancesDo: [:x | instSpace := instSpace + (x basicSize*eltSize)]] ifFalse: [instSpace := instSpace + (cl instSize*instCount*4)]. results add: (SpaceTallyItem analyzedClassName: cl name codeSize: codeSpace instanceCount: instCount spaceForInstances: instSpace). totalCodeSpace := totalCodeSpace + codeSpace. totalInstCount := totalInstCount + instCount. totalInstSpace := totalInstSpace + instSpace]]. totalPercent := 0.0. "aStream timeStamp." aStream nextPutAll: ('Class' padded: #right to: 30 with: $ ); nextPutAll: ('code space' padded: #left to: 12 with: $ ); nextPutAll: ('# instances' padded: #left to: 12 with: $ ); nextPutAll: ('inst space' padded: #left to: 12 with: $ ); nextPutAll: ('percent' padded: #left to: 8 with: $ ); cr. threshold > 0 ifTrue: [ "If inst count threshold > 0, then sort by space" results := (results select: [:s | s instanceCount >= threshold or: [s spaceForInstances > (totalInstSpace // 500)]]) asSortedCollection: [:s :s2 | s spaceForInstances > s2 spaceForInstances]]. results do: [:s | | percent | aStream nextPutAll: (s analyzedClassName padded: #right to: 30 with: $ ); nextPutAll: (s codeSize printString padded: #left to: 12 with: $ ); nextPutAll: (s instanceCount printString padded: #left to: 12 with: $ ); nextPutAll: (s spaceForInstances printString padded: #left to: 14 with: $ ). percent := s spaceForInstances*100.0/totalInstSpace roundTo: 0.1. totalPercent := totalPercent + percent. percent >= 0.1 ifTrue: [ aStream nextPutAll: (percent printString padded: #left to: 8 with: $ )]. aStream cr]. aStream cr; nextPutAll: ('Total' padded: #right to: 30 with: $ ); nextPutAll: (totalCodeSpace printString padded: #left to: 12 with: $ ); nextPutAll: (totalInstCount printString padded: #left to: 12 with: $ ); nextPutAll: (totalInstSpace printString padded: #left to: 14 with: $ ); nextPutAll: ((totalPercent roundTo: 0.1) printString padded: #left to: 8 with: $ ). To this: printSpaceAnalysis: threshold on: aStream "SpaceTally new printSpaceAnalysis: 1 on:(FileStream forceNewFileNamed: 'STspace.text')" "sd-This method should be rewrote to be more coherent within the rest of the class ie using preAllocate and spaceForInstanceOf:" "If threshold > 0, then only those classes with more than that number of instances will be shown, and they will be sorted by total instance space. If threshold = 0, then all classes will appear, sorted by name." | totalCodeSpace totalInstCount totalInstSpace n totalPercent | Smalltalk garbageCollect. totalCodeSpace := totalInstCount := totalInstSpace := n := 0. results := OrderedCollection new: Smalltalk classNames size. 'Taking statistics...' displayProgressAt: Sensor cursorPoint from: 0 to: Smalltalk classNames size during: [:bar | (Smalltalk globals allClasses) do: [:cl | | codeSpace instCount instSpace eltSize | codeSpace := cl spaceUsed. bar value: (n := n+1). Smalltalk garbageCollectMost. instCount := cl instanceCount. instSpace := self spaceForInstancesOf: cl withInstanceCount: instCount. results add: (SpaceTallyItem analyzedClassName: cl name codeSize: codeSpace instanceCount: instCount spaceForInstances: instSpace). totalCodeSpace := totalCodeSpace + codeSpace. totalInstCount := totalInstCount + instCount. totalInstSpace := totalInstSpace + instSpace]]. totalPercent := 0.0. "aStream timeStamp." aStream nextPutAll: ('Class' padded: #right to: 30 with: $ ); nextPutAll: ('code space' padded: #left to: 12 with: $ ); nextPutAll: ('# instances' padded: #left to: 12 with: $ ); nextPutAll: ('inst space' padded: #left to: 12 with: $ ); nextPutAll: ('percent' padded: #left to: 8 with: $ ); cr. threshold > 0 ifTrue: [ "If inst count threshold > 0, then sort by space" results := (results select: [:s | s instanceCount >= threshold or: [s spaceForInstances > (totalInstSpace // 500)]]) asSortedCollection: [:s :s2 | s spaceForInstances > s2 spaceForInstances]]. results do: [:s | | percent | aStream nextPutAll: (s analyzedClassName padded: #right to: 30 with: $ ); nextPutAll: (s codeSize printString padded: #left to: 12 with: $ ); nextPutAll: (s instanceCount printString padded: #left to: 12 with: $ ); nextPutAll: (s spaceForInstances printString padded: #left to: 14 with: $ ). percent := s spaceForInstances*100.0/totalInstSpace roundTo: 0.1. totalPercent := totalPercent + percent. percent >= 0.1 ifTrue: [ aStream nextPutAll: (percent printString padded: #left to: 8 with: $ )]. aStream cr]. aStream cr; nextPutAll: ('Total' padded: #right to: 30 with: $ ); nextPutAll: (totalCodeSpace printString padded: #left to: 12 with: $ ); nextPutAll: (totalInstCount printString padded: #left to: 12 with: $ ); nextPutAll: (totalInstSpace printString padded: #left to: 14 with: $ ); nextPutAll: ((totalPercent roundTo: 0.1) printString padded: #left to: 8 with: $ ). So....you agree with both changes? Thanks! Mariano
_______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Adrian Lienhard
On 21.09.2010 20:22, Adrian Lienhard wrote:
> Here's the code I used to confirm Henrik's math: > > table := Unicode classPool at: 'GeneralCategory'. > count := table basicSize. > 1 to: table basicSize do: [ :i | > (table basicAt: i) isNil ifFalse: [ > count := count + (table basicAt: i) basicSize ] ]. > (count / 1024) asFloat > > -> 66.8759765625 > > + the headers of each subtable. > > Cheers, > Adrian count := table basicSize*4 (4 bytes for a slot when class isBytes not) that's basically what I did. ~ due to not counting headers :) Cheers, Henry _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Mariano Martinez Peck
On 21.09.2010 20:28, Mariano Martinez Peck wrote:
Sound good to me! Personally I'd also like some more refactoring, like rewriting spaceForInstancesOf: withInstanceCount into #spaceForInstancesOfClass: aClass, delegating to: #spaceFor: amount instancesOfClass: aFixedClass (private category?) #spaceForInstancesOfVariableClass: aVariableClass (private category?) That way you also move ugly instance counting out of the loop. Also, while you're at it, what about factoring out contentBytes > 255 ifTrue: [12] ifFalse: [isCompact ifTrue: [4] ifFalse: [8]. into, say, #headerBytesOfClass: aClass ? And really, wtf? An instance of a compact, variable class with more than 255 bytes in it will have a 12-byte header, while those with less will have a 4-byte header? Plus, setting a class with more than 63 instance variables will never have an effect, its instances will always have 12-byte headers? Cheers, Henry _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
Sound good to me!
I was thinking more or less the same. Even more, if we integrate Object >> sizeInMemory why not adding also: Class >> spaceForInstances | totalSize | totalSize := 0. self allInstancesDo: [ :inst | totalSize := totalSize + inst sizeInMemory. ]. ^ totalSize ? Then we can also replace spaceForInstancesOf: withInstanceCount with spaceForInstances Example: printSpaceAnalysis: threshold on: aStream "SpaceTally new printSpaceAnalysis: 1 on:(FileStream forceNewFileNamed: 'STspace.text')" "sd-This method should be rewrote to be more coherent within the rest of the class ie using preAllocate and spaceForInstanceOf:" "If threshold > 0, then only those classes with more than that number of instances will be shown, and they will be sorted by total instance space. If threshold = 0, then all classes will appear, sorted by name." | totalCodeSpace totalInstCount totalInstSpace n totalPercent | Smalltalk garbageCollect. totalCodeSpace := totalInstCount := totalInstSpace := n := 0. results := OrderedCollection new: Smalltalk classNames size. 'Taking statistics...' displayProgressAt: Sensor cursorPoint from: 0 to: Smalltalk classNames size during: [:bar | (Smalltalk globals allClasses) do: [:cl | | codeSpace instCount instSpace eltSize | codeSpace := cl spaceUsed. bar value: (n := n+1). Smalltalk garbageCollectMost. instCount := cl instanceCount. instSpace := cl spaceForInstances. results add: (SpaceTallyItem analyzedClassName: cl name codeSize: codeSpace instanceCount: instCount spaceForInstances: instSpace). totalCodeSpace := totalCodeSpace + codeSpace. totalInstCount := totalInstCount + instCount. totalInstSpace := totalInstSpace + instSpace]]. totalPercent := 0.0. "aStream timeStamp." aStream nextPutAll: ('Class' padded: #right to: 30 with: $ ); nextPutAll: ('code space' padded: #left to: 12 with: $ ); nextPutAll: ('# instances' padded: #left to: 12 with: $ ); nextPutAll: ('inst space' padded: #left to: 12 with: $ ); nextPutAll: ('percent' padded: #left to: 8 with: $ ); cr. threshold > 0 ifTrue: [ "If inst count threshold > 0, then sort by space" results := (results select: [:s | s instanceCount >= threshold or: [s spaceForInstances > (totalInstSpace // 500)]]) asSortedCollection: [:s :s2 | s spaceForInstances > s2 spaceForInstances]]. results do: [:s | | percent | aStream nextPutAll: (s analyzedClassName padded: #right to: 30 with: $ ); nextPutAll: (s codeSize printString padded: #left to: 12 with: $ ); nextPutAll: (s instanceCount printString padded: #left to: 12 with: $ ); nextPutAll: (s spaceForInstances printString padded: #left to: 14 with: $ ). percent := s spaceForInstances*100.0/totalInstSpace roundTo: 0.1. totalPercent := totalPercent + percent. percent >= 0.1 ifTrue: [ aStream nextPutAll: (percent printString padded: #left to: 8 with: $ )]. aStream cr]. aStream cr; nextPutAll: ('Total' padded: #right to: 30 with: $ ); nextPutAll: (totalCodeSpace printString padded: #left to: 12 with: $ ); nextPutAll: (totalInstCount printString padded: #left to: 12 with: $ ); nextPutAll: (totalInstSpace printString padded: #left to: 14 with: $ ); nextPutAll: ((totalPercent roundTo: 0.1) printString padded: #left to: 8 with: $ ). agree?? thanks
_______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Mariano Martinez Peck
2010/9/21 Mariano Martinez Peck <[hidden email]>
You must use basicSize not size. This is a sparse table. It has a virtual size. size answers that virtual size, not the actual size. If you use basicSize you'll also need to traverse su structure to correctly compute the size of the entire sparse array including its pages.
HTH Eliot
_______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
Free forum by Nabble | Edit this page |