Christoph Thiede uploaded a new version of ShoutCore to project The Inbox:
http://source.squeak.org/inbox/ShoutCore-ct.78.mcz ==================== Summary ==================== Name: ShoutCore-ct.78 Author: ct Time: 12 January 2020, 3:00:21.215555 am UUID: c04a2f92-7a12-1743-a4d7-5d555c8ee634 Ancestors: ShoutCore-mt.77 Optimize context-dependent styling of multiple texts in the same context. Reading the context's method from the sources file can be *really* slow (up to several hundred milliseconds, depending on the size of your image). Thus we don't need to request the sources file again each time we handle a subsequent styling request for the same context. Measurements from a fresh Trunk image: | styler | styler := SHTextStylerST80 new. styler context: (thisContext stack at: 4). "to get a not-so-recently modified method" [styler styledTextFor: 'foo' asText] bench Before: 36 milliseconds/run After: 161 microseconds/run Cache is fast: [| styler | styler := SHTextStylerST80 new. styler context: (thisContext stack at: 4). "to get a not-so-recently modified method" styler styledTextFor: 'foo' asText] bench Before: 276 milliseconds/run After: 276 milliseconds/run "sic!" =============== Diff against ShoutCore-mt.77 =============== Item was changed: Object subclass: #SHParserST80 + instanceVariableNames: 'source parseAMethod classOrMetaClass workspace environment context arguments sourcePosition currentToken currentTokenFirst temporaries instanceVariables errorBlock currentTokenSourcePosition bracketDepth ranges allowUnderscoreAssignments allowUnderscoreSelectors allowBlockArgumentAssignment currentTokenType contextMethodCache' - instanceVariableNames: 'classOrMetaClass source workspace arguments sourcePosition currentToken currentTokenFirst temporaries instanceVariables errorBlock currentTokenSourcePosition bracketDepth ranges environment allowUnderscoreAssignments allowUnderscoreSelectors allowBlockArgumentAssignment parseAMethod currentTokenType context' classVariableNames: '' poolDictionaries: '' category: 'ShoutCore-Parsing'! !SHParserST80 commentStamp: 'ul 7/30/2019 00:31' prior: 0! I am a Smalltalk method / expression parser. Rather than creating an Abstract Syntax Tree, I create a sequence of SHRanges (in my 'ranges' instance variable), which represent the tokens within the String I am parsing. I am used by a SHTextStylerST80 to parse method source strings. I am able to parse incomplete / incorrect methods, and so can be used to parse methods that are being edited. Instance Variables allowBlockArgumentAssignment: <Boolean> allowUnderscoreAssignments: <Boolean> allowUnderscoreSelectors: <Boolean> arguments: <OrderedCollection<OrderedCollection<String>|nil> bracketDepth: <Integer> classOrMetaClass: <Class|nil> currentToken: <String|nil> currentTokenFirst: <Character> currentTokenSourcePosition: <Integer|nil> currentTokenType: <Symbol|nil> environment: <Environment> errorBlock: <Block> instanceVariables: <Array> parseAMethod: <Boolean> ranges: <OrderedCollection<SHRange>> source: <String> sourcePosition: <Integer> temporaries: <OrderedCollection<OrderedCollection<String>|nil> workspace: <Workspace|nil> context: <Context|nil> allowBlockArgumentAssignment The value cached at the beginning of parsing of Scanner allowBlockArgumentAssignment. allowUnderscoreAssignments The value cached at the beginning of parsing of Scanner allowUnderscoreAsAssignment. allowUnderscoreSelectors The value cached at the beginning of parsing of Scanner prefAllowUnderscoreSelectors. arguments This OrderedCollection has an element for each scope encapsulating the current scope. The current scope's arguments are stored in the last element. The first element holds the outermost scope's arguments. Each element is nil when the corresponding scope doesn't have any arguments, and the element is an OrderedCollection with the names of the arguments declared at the given scope when there's at least one. The size of this variable is the same as the size of temporaries. bracketDepth Stores the number of unclosed brackets "(" and parentheses "[" before the current sourcePosition. classOrMetaClass The Class or MetaClass instance, class and pool variables should be looked up during parsing or nil when not parsing code in the context of a class (e.g. when parsing code written in a Workspace). Having this set doesn't mean a method is being parsed. currentToken The token being analyzed for which the next range should be created for. currentTokenFirst The first character of currentToken cached for quick access or a space character when there are no more tokens to parse. Being always a Character helps avoiding extra checks. currentTokenSourcePosition The position of source the current token starts at or nil when there are no more tokens to process. currentTokenType The type of the current token calculated lazily by #currentTokenType. When it has been calculated, Its value is one of #keyword, #assignment, #ansiAssignment, #binary, #name, #other and occasionally #invalid. environment The Environment globals and classes should be looked up at during parsing when classOrMetaClass is nil. Its value is Smalltalk globals by default. errorBlock A block used to quickly stop parsing in case of an unrecoverable parse error. instanceVariables An Array with the instance variable names of classOrMetaClass or an empty Array when classOrMetaClass is nil. parseAMethod A way to tell the parser to parse source as a code snippet instead of a method. Mainly used by inspectors. ranges The SHRanges parsed by the parser. source The source code as a String to be parsed. sourcePosition souce is treated as a stream by the parser. This variable stores the stream position. temporaries This OrderedCollection has an element for each scope encapsulating the current scope. The current scope's temporaries are stored in the last element. The first element holds the outermost scope's temporaries. Each element is nil when the corresponding scope doesn't have any temporary variables, and the element is an OrderedCollection with the names of the temporaries declared at the given scope when there's at least one. The size of this variable is the same as the size of arguments. workspace The Workspace in whose context variables should be looked up during parsing or nil when not parsing code in a workspace. context The Context in which variables should be looked up during parsing or nil when not parsing within a context. Example (explore it): ranges := SHParserST80 new classOrMetaClass: Object; source: 'testMethod ^self'; parse; ranges Benchmark (print it): SHParserST80 benchmark! Item was changed: ----- Method: SHParserST80>>initializeVariablesFromContext (in category 'parse support') ----- initializeVariablesFromContext + | contextMethod contextSource contextSourcePcIndex contextSourceParser | + contextMethod := context method. + (contextMethodCache ifNotNil: #key) = contextMethod "for optimization" + ifFalse: [contextMethodCache := contextMethod -> contextMethod getSource]. + contextSource := contextMethodCache value. + - | contextSourcePcIndex contextSourceParser | contextSourcePcIndex := (context debuggerMap rangeForPC: (context isDead ifTrue: [context endPC] ifFalse: [context pc]) in: context method contextIsActiveContext: true "... to really use the context's pc.") start. contextSourceParser := self class new classOrMetaClass: context method methodClass; + environment: context receiver environment; + source: (contextSource take: contextSourcePcIndex); - environment: self environment; - source: (context method getSource first: contextSourcePcIndex); yourself. contextSourceParser parse. arguments := contextSourceParser activeArguments. temporaries := contextSourceParser activeTemporaries.! |
Hi Christoph,
On Sun, 12 Jan 2020, [hidden email] wrote: > Christoph Thiede uploaded a new version of ShoutCore to project The Inbox: > http://source.squeak.org/inbox/ShoutCore-ct.78.mcz > > ==================== Summary ==================== > > Name: ShoutCore-ct.78 > Author: ct > Time: 12 January 2020, 3:00:21.215555 am > UUID: c04a2f92-7a12-1743-a4d7-5d555c8ee634 > Ancestors: ShoutCore-mt.77 > > Optimize context-dependent styling of multiple texts in the same context. > > Reading the context's method from the sources file can be *really* slow (up to several hundred milliseconds, depending on the size of your image). Thus we don't need to request the sources file again each time we handle a subsequent styling request for the same context. Reading the source of a method - especially if you read the source of the same method multiple times, which is what you're trying to fix - should not take "several hundred milliseconds", no matter how large your image is. Besides the obvious things, the runtime depends on a few things like: - the amount of free RAM there is, - how well your OS caches files, - whether the method's source has been cached or not, - the speed of the drive the source file is stored on, - whether you have too many open files and are close to the maxExternalSemaphores limit (which should have been bumped to 8k long ago), but, in general, it should be in the submillisecond range. Here's an example reading the source of a rather large method (11kB): [ (MethodFinder >> #initialize) getSource ] bench. '11,800 per second. 84.5 microseconds per run. 5.95762 % GC time.' > > Measurements from a fresh Trunk image: > > | styler | > styler := SHTextStylerST80 new. > styler context: (thisContext stack at: 4). "to get a not-so-recently modified method" > [styler styledTextFor: 'foo' asText] bench > > Before: 36 milliseconds/run > After: 161 microseconds/run Here's the result of the benchmark executed on my machine without your changes: '4,700 per second. 213 microseconds per run. 1.04 % GC time.' So, perhaps there's some other thing causing the slowdown, e.g. one of the reasons I mentioned above or some anti-virus software. > > Cache is fast: > > [| styler | > styler := SHTextStylerST80 new. > styler context: (thisContext stack at: 4). "to get a not-so-recently modified method" > styler styledTextFor: 'foo' asText] bench > > Before: 276 milliseconds/run > After: 276 milliseconds/run "sic!" > > =============== Diff against ShoutCore-mt.77 =============== > > Item was changed: > Object subclass: #SHParserST80 > + instanceVariableNames: 'source parseAMethod classOrMetaClass workspace environment context arguments sourcePosition currentToken currentTokenFirst temporaries instanceVariables errorBlock currentTokenSourcePosition bracketDepth ranges allowUnderscoreAssignments allowUnderscoreSelectors allowBlockArgumentAssignment currentTokenType contextMethodCache' > - instanceVariableNames: 'classOrMetaClass source workspace arguments sourcePosition currentToken currentTokenFirst temporaries instanceVariables errorBlock currentTokenSourcePosition bracketDepth ranges environment allowUnderscoreAssignments allowUnderscoreSelectors allowBlockArgumentAssignment parseAMethod currentTokenType context' The new instance variable should be documented in the class comment. > classVariableNames: '' > poolDictionaries: '' > category: 'ShoutCore-Parsing'! > > !SHParserST80 commentStamp: 'ul 7/30/2019 00:31' prior: 0! > I am a Smalltalk method / expression parser. > > Rather than creating an Abstract Syntax Tree, I create a sequence of SHRanges (in my 'ranges' instance variable), which represent the tokens within the String I am parsing. > > I am used by a SHTextStylerST80 to parse method source strings. > I am able to parse incomplete / incorrect methods, and so can be used to parse methods that are being edited. > > Instance Variables > allowBlockArgumentAssignment: <Boolean> > allowUnderscoreAssignments: <Boolean> > allowUnderscoreSelectors: <Boolean> > arguments: <OrderedCollection<OrderedCollection<String>|nil> > bracketDepth: <Integer> > classOrMetaClass: <Class|nil> > currentToken: <String|nil> > currentTokenFirst: <Character> > currentTokenSourcePosition: <Integer|nil> > currentTokenType: <Symbol|nil> > environment: <Environment> > errorBlock: <Block> > instanceVariables: <Array> > parseAMethod: <Boolean> > ranges: <OrderedCollection<SHRange>> > source: <String> > sourcePosition: <Integer> > temporaries: <OrderedCollection<OrderedCollection<String>|nil> > workspace: <Workspace|nil> > context: <Context|nil> > > allowBlockArgumentAssignment > The value cached at the beginning of parsing of Scanner allowBlockArgumentAssignment. > > allowUnderscoreAssignments > The value cached at the beginning of parsing of Scanner allowUnderscoreAsAssignment. > > allowUnderscoreSelectors > The value cached at the beginning of parsing of Scanner prefAllowUnderscoreSelectors. > > arguments > This OrderedCollection has an element for each scope encapsulating the current scope. > The current scope's arguments are stored in the last element. The first element holds the outermost scope's arguments. > Each element is nil when the corresponding scope doesn't have any arguments, and the element is an OrderedCollection with the names of the arguments declared at the given scope when there's at least one. > The size of this variable is the same as the size of temporaries. > > bracketDepth > Stores the number of unclosed brackets "(" and parentheses "[" before the current sourcePosition. > > classOrMetaClass > The Class or MetaClass instance, class and pool variables should be looked up during parsing or nil when not parsing code in the context of a class (e.g. when parsing code written in a Workspace). Having this set doesn't mean a method is being parsed. > > currentToken > The token being analyzed for which the next range should be created for. > > currentTokenFirst > The first character of currentToken cached for quick access or a space character when there are no more tokens to parse. > Being always a Character helps avoiding extra checks. > > currentTokenSourcePosition > The position of source the current token starts at or nil when there are no more tokens to process. > > currentTokenType > The type of the current token calculated lazily by #currentTokenType. When it has been calculated, Its value is one of #keyword, #assignment, #ansiAssignment, #binary, #name, #other and occasionally #invalid. > > environment > The Environment globals and classes should be looked up at during parsing when classOrMetaClass is nil. Its value is Smalltalk globals by default. > > errorBlock > A block used to quickly stop parsing in case of an unrecoverable parse error. > > instanceVariables > An Array with the instance variable names of classOrMetaClass or an empty Array when classOrMetaClass is nil. > > parseAMethod > A way to tell the parser to parse source as a code snippet instead of a method. Mainly used by inspectors. > > ranges > The SHRanges parsed by the parser. > > source > The source code as a String to be parsed. > > sourcePosition > souce is treated as a stream by the parser. This variable stores the stream position. > > temporaries > This OrderedCollection has an element for each scope encapsulating the current scope. > The current scope's temporaries are stored in the last element. The first element holds the outermost scope's temporaries. > Each element is nil when the corresponding scope doesn't have any temporary variables, and the element is an OrderedCollection with the names of the temporaries declared at the given scope when there's at least one. > The size of this variable is the same as the size of arguments. > > workspace > The Workspace in whose context variables should be looked up during parsing or nil when not parsing code in a workspace. > > context > The Context in which variables should be looked up during parsing or nil when not parsing within a context. > > Example (explore it): > > ranges := SHParserST80 new > classOrMetaClass: Object; > source: 'testMethod ^self'; > parse; > ranges > > Benchmark (print it): > > SHParserST80 benchmark! > > Item was changed: > ----- Method: SHParserST80>>initializeVariablesFromContext (in category 'parse support') ----- > initializeVariablesFromContext > > + | contextMethod contextSource contextSourcePcIndex contextSourceParser | > + contextMethod := context method. > + (contextMethodCache ifNotNil: #key) = contextMethod "for optimization" I know I sound like a broken record, but if you care about performance, don't use symbols in place of blocks, especially as the arguments of methods like #ifNotNil:, which would optimize the blocks away. Also, using a symbol with #ifNotNil: may not legible to the average Smalltalker. I'm sure some would expect [ x ifNotNil: #y ] to yield #y when x is not nil. > + ifFalse: [contextMethodCache := contextMethod -> contextMethod getSource]. > + contextSource := contextMethodCache value. > + > - | contextSourcePcIndex contextSourceParser | > contextSourcePcIndex := (context debuggerMap > rangeForPC: (context isDead ifTrue: [context endPC] ifFalse: [context pc]) > in: context method > contextIsActiveContext: true "... to really use the context's pc.") > start. > contextSourceParser := self class new > classOrMetaClass: context method methodClass; > + environment: context receiver environment; > + source: (contextSource take: contextSourcePcIndex); We should stop passing Text obejcts as source to SHParserST80. It only works on Strings. Eliot added a guard recently because of the introduction of texts, and I wanted to fixed that, but it seems I never got to do that. CompiledMethod >> #getSource returns a Text, so the best place to convert to String is right after sending #getSource. That would also save us a few CPU cycles. While #take: works, I would keep using #first: here. We know that the receiver is String (or at least it should be) and we need to get the first N characters. We can also assume that contextSourcePcIndex is within the range of the source. If it's not, then it's very likely a bug that would be hidden by #take:. Levente > - environment: self environment; > - source: (context method getSource first: contextSourcePcIndex); > yourself. > contextSourceParser parse. > arguments := contextSourceParser activeArguments. > temporaries := contextSourceParser activeTemporaries.! |
> On 2020-01-12, at 10:04 AM, Levente Uzonyi <[hidden email]> wrote: > > I know I sound like a broken record, but if you care about performance, don't use symbols in place of blocks, especially as the arguments of methods like #ifNotNil:, which would optimize the blocks away. > Also, using a symbol with #ifNotNil: may not legible to the average Smalltalker. I'm sure some would expect [ x ifNotNil: #y ] to yield #y when x is not nil. + VeryLargeNumber Concision ~~ Clarity tim -- tim Rowledge; [hidden email]; http://www.rowledge.org/tim Debugger: A tool that substitutes afterthought for forethought. |
In reply to this post by Levente Uzonyi
Hi Levente,
many thanks for your review!
> Reading the source of a method - especially if you read the source of the same method multiple times, which is what you're trying to fix - should not take "several hundred milliseconds",
no matter how large your image is.
That's indeed interesting.
> Besides the obvious things, the runtime depends on a few things like:
> - the amount of free RAM there is,
> - how well your OS caches files,
> - whether the method's source has been cached or not,
> - the speed of the drive the source file is stored on,
> - whether you have too many open files and are close to the maxExternalSemaphores limit (which should have been bumped to 8k long ago),
but, in general, it should be in the submillisecond range.
80% - 90% of 16 RAM.
Windows 10, SSD ... Setup should be rather state of the start.
No additional antivirus software. Disabling OneDrive, which synces image & changes file in the background, accelerated the issue by around 30%, but it's still *much* slower than on your system.
I tested it on a fresh Trunk image, so there shouldn't be a lot of files opened.
> but, in general, it should be in the submillisecond range.
> Here's an example reading the source of a rather large method (11kB):
'11.4 per second. 87.7 milliseconds per run. 0.01214 % GC time.'
> > | styler | > > styler := SHTextStylerST80 new.
> > styler context: (thisContext stack at: 4). "to get a not-so-recently modified method"
> > [styler styledTextFor: 'foo' asText] bench
> Here's the result of the benchmark executed on my machine without your changes:
> '4,700 per second. 213 microseconds per run. 1.04 % GC time.'
How fast is the example if you do load my example?
In general, do you think my proposal makes any sense for "normal" systems? > I know I sound like a broken record, but if you care about performance, don't use symbols in place of blocks, especially as the arguments of methods
like #ifNotNil:, which would optimize the blocks away.
Optimization is a good point (although I have the impression that this might be rather premature optimization). Also, we could optimize symbols in the same way as blocks, couldn't we? But I will try to keep it in mind, also in terms of readability :)
> While #take: works, I would keep using #first: here. We know that the receiver is String (or at least it should be) and we need to get the first N characters.
We can also assume that contextSourcePcIndex is within the range of the source. If it's not, then it's very likely a bug that would be hidden by #take:.
Actually, I got a "subscript out of bounds" a few times, so there might be indeed a bug somewhere. Do we really want Shout to be the bug-checking instance here? We also could raise a warning if the source is too short, without irritating all kind of users
with a non-working toolset (because this would likely crash the debugger as well). I don't know what's the typical approach in Squeak to balance robustness vs error detection ...
Best,
Christoph
Von: Squeak-dev <[hidden email]> im Auftrag von Levente Uzonyi <[hidden email]>
Gesendet: Sonntag, 12. Januar 2020 19:04:35 An: [hidden email] Betreff: Re: [squeak-dev] The Inbox: ShoutCore-ct.78.mcz Hi Christoph,
On Sun, 12 Jan 2020, [hidden email] wrote: > Christoph Thiede uploaded a new version of ShoutCore to project The Inbox: > http://source.squeak.org/inbox/ShoutCore-ct.78.mcz > > ==================== Summary ==================== > > Name: ShoutCore-ct.78 > Author: ct > Time: 12 January 2020, 3:00:21.215555 am > UUID: c04a2f92-7a12-1743-a4d7-5d555c8ee634 > Ancestors: ShoutCore-mt.77 > > Optimize context-dependent styling of multiple texts in the same context. > > Reading the context's method from the sources file can be *really* slow (up to several hundred milliseconds, depending on the size of your image). Thus we don't need to request the sources file again each time we handle a subsequent styling request for the same context. Reading the source of a method - especially if you read the source of the same method multiple times, which is what you're trying to fix - should not take "several hundred milliseconds", no matter how large your image is. Besides the obvious things, the runtime depends on a few things like: - the amount of free RAM there is, - how well your OS caches files, - whether the method's source has been cached or not, - the speed of the drive the source file is stored on, - whether you have too many open files and are close to the maxExternalSemaphores limit (which should have been bumped to 8k long ago), but, in general, it should be in the submillisecond range. Here's an example reading the source of a rather large method (11kB): [ (MethodFinder >> #initialize) getSource ] bench. '11,800 per second. 84.5 microseconds per run. 5.95762 % GC time.' > > Measurements from a fresh Trunk image: > > | styler | > styler := SHTextStylerST80 new. > styler context: (thisContext stack at: 4). "to get a not-so-recently modified method" > [styler styledTextFor: 'foo' asText] bench > > Before: 36 milliseconds/run > After: 161 microseconds/run Here's the result of the benchmark executed on my machine without your changes: '4,700 per second. 213 microseconds per run. 1.04 % GC time.' So, perhaps there's some other thing causing the slowdown, e.g. one of the reasons I mentioned above or some anti-virus software. > > Cache is fast: > > [| styler | > styler := SHTextStylerST80 new. > styler context: (thisContext stack at: 4). "to get a not-so-recently modified method" > styler styledTextFor: 'foo' asText] bench > > Before: 276 milliseconds/run > After: 276 milliseconds/run "sic!" > > =============== Diff against ShoutCore-mt.77 =============== > > Item was changed: > Object subclass: #SHParserST80 > + instanceVariableNames: 'source parseAMethod classOrMetaClass workspace environment context arguments sourcePosition currentToken currentTokenFirst temporaries instanceVariables errorBlock currentTokenSourcePosition bracketDepth ranges allowUnderscoreAssignments allowUnderscoreSelectors allowBlockArgumentAssignment currentTokenType contextMethodCache' > - instanceVariableNames: 'classOrMetaClass source workspace arguments sourcePosition currentToken currentTokenFirst temporaries instanceVariables errorBlock currentTokenSourcePosition bracketDepth ranges environment allowUnderscoreAssignments allowUnderscoreSelectors allowBlockArgumentAssignment parseAMethod currentTokenType context' The new instance variable should be documented in the class comment. > classVariableNames: '' > poolDictionaries: '' > category: 'ShoutCore-Parsing'! > > !SHParserST80 commentStamp: 'ul 7/30/2019 00:31' prior: 0! > I am a Smalltalk method / expression parser. > > Rather than creating an Abstract Syntax Tree, I create a sequence of SHRanges (in my 'ranges' instance variable), which represent the tokens within the String I am parsing. > > I am used by a SHTextStylerST80 to parse method source strings. > I am able to parse incomplete / incorrect methods, and so can be used to parse methods that are being edited. > > Instance Variables > allowBlockArgumentAssignment: <Boolean> > allowUnderscoreAssignments: <Boolean> > allowUnderscoreSelectors: <Boolean> > arguments: <OrderedCollection<OrderedCollection<String>|nil> > bracketDepth: <Integer> > classOrMetaClass: <Class|nil> > currentToken: <String|nil> > currentTokenFirst: <Character> > currentTokenSourcePosition: <Integer|nil> > currentTokenType: <Symbol|nil> > environment: <Environment> > errorBlock: <Block> > instanceVariables: <Array> > parseAMethod: <Boolean> > ranges: <OrderedCollection<SHRange>> > source: <String> > sourcePosition: <Integer> > temporaries: <OrderedCollection<OrderedCollection<String>|nil> > workspace: <Workspace|nil> > context: <Context|nil> > > allowBlockArgumentAssignment > The value cached at the beginning of parsing of Scanner allowBlockArgumentAssignment. > > allowUnderscoreAssignments > The value cached at the beginning of parsing of Scanner allowUnderscoreAsAssignment. > > allowUnderscoreSelectors > The value cached at the beginning of parsing of Scanner prefAllowUnderscoreSelectors. > > arguments > This OrderedCollection has an element for each scope encapsulating the current scope. > The current scope's arguments are stored in the last element. The first element holds the outermost scope's arguments. > Each element is nil when the corresponding scope doesn't have any arguments, and the element is an OrderedCollection with the names of the arguments declared at the given scope when there's at least one. > The size of this variable is the same as the size of temporaries. > > bracketDepth > Stores the number of unclosed brackets "(" and parentheses "[" before the current sourcePosition. > > classOrMetaClass > The Class or MetaClass instance, class and pool variables should be looked up during parsing or nil when not parsing code in the context of a class (e.g. when parsing code written in a Workspace). Having this set doesn't mean a method is being parsed. > > currentToken > The token being analyzed for which the next range should be created for. > > currentTokenFirst > The first character of currentToken cached for quick access or a space character when there are no more tokens to parse. > Being always a Character helps avoiding extra checks. > > currentTokenSourcePosition > The position of source the current token starts at or nil when there are no more tokens to process. > > currentTokenType > The type of the current token calculated lazily by #currentTokenType. When it has been calculated, Its value is one of #keyword, #assignment, #ansiAssignment, #binary, #name, #other and occasionally #invalid. > > environment > The Environment globals and classes should be looked up at during parsing when classOrMetaClass is nil. Its value is Smalltalk globals by default. > > errorBlock > A block used to quickly stop parsing in case of an unrecoverable parse error. > > instanceVariables > An Array with the instance variable names of classOrMetaClass or an empty Array when classOrMetaClass is nil. > > parseAMethod > A way to tell the parser to parse source as a code snippet instead of a method. Mainly used by inspectors. > > ranges > The SHRanges parsed by the parser. > > source > The source code as a String to be parsed. > > sourcePosition > souce is treated as a stream by the parser. This variable stores the stream position. > > temporaries > This OrderedCollection has an element for each scope encapsulating the current scope. > The current scope's temporaries are stored in the last element. The first element holds the outermost scope's temporaries. > Each element is nil when the corresponding scope doesn't have any temporary variables, and the element is an OrderedCollection with the names of the temporaries declared at the given scope when there's at least one. > The size of this variable is the same as the size of arguments. > > workspace > The Workspace in whose context variables should be looked up during parsing or nil when not parsing code in a workspace. > > context > The Context in which variables should be looked up during parsing or nil when not parsing within a context. > > Example (explore it): > > ranges := SHParserST80 new > classOrMetaClass: Object; > source: 'testMethod ^self'; > parse; > ranges > > Benchmark (print it): > > SHParserST80 benchmark! > > Item was changed: > ----- Method: SHParserST80>>initializeVariablesFromContext (in category 'parse support') ----- > initializeVariablesFromContext > > + | contextMethod contextSource contextSourcePcIndex contextSourceParser | > + contextMethod := context method. > + (contextMethodCache ifNotNil: #key) = contextMethod "for optimization" I know I sound like a broken record, but if you care about performance, don't use symbols in place of blocks, especially as the arguments of methods like #ifNotNil:, which would optimize the blocks away. Also, using a symbol with #ifNotNil: may not legible to the average Smalltalker. I'm sure some would expect [ x ifNotNil: #y ] to yield #y when x is not nil. > + ifFalse: [contextMethodCache := contextMethod -> contextMethod getSource]. > + contextSource := contextMethodCache value. > + > - | contextSourcePcIndex contextSourceParser | > contextSourcePcIndex := (context debuggerMap > rangeForPC: (context isDead ifTrue: [context endPC] ifFalse: [context pc]) > in: context method > contextIsActiveContext: true "... to really use the context's pc.") > start. > contextSourceParser := self class new > classOrMetaClass: context method methodClass; > + environment: context receiver environment; > + source: (contextSource take: contextSourcePcIndex); We should stop passing Text obejcts as source to SHParserST80. It only works on Strings. Eliot added a guard recently because of the introduction of texts, and I wanted to fixed that, but it seems I never got to do that. CompiledMethod >> #getSource returns a Text, so the best place to convert to String is right after sending #getSource. That would also save us a few CPU cycles. While #take: works, I would keep using #first: here. We know that the receiver is String (or at least it should be) and we need to get the first N characters. We can also assume that contextSourcePcIndex is within the range of the source. If it's not, then it's very likely a bug that would be hidden by #take:. Levente > - environment: self environment; > - source: (context method getSource first: contextSourcePcIndex); > yourself. > contextSourceParser parse. > arguments := contextSourceParser activeArguments. > temporaries := contextSourceParser activeTemporaries.!
Carpe Squeak!
|
Hi Christoph,
On Mon, 13 Jan 2020, Thiede, Christoph wrote: > > Hi Levente, > > > many thanks for your review! > > > > Reading the source of a method - especially if you read the source of the same method multiple times, which is what you're trying to fix > - should not take "several hundred milliseconds", no matter how large your image is. > > That's indeed interesting. > > > Besides the obvious things, the runtime depends on a few things like: > > - the amount of free RAM there is, > > - how well your OS caches files, > > - whether the method's source has been cached or not, > > - the speed of the drive the source file is stored on, > > - whether you have too many open files and are close to the maxExternalSemaphores limit (which should have been bumped to 8k long ago), > but, in general, it should be in the submillisecond range. > > 80% - 90% of 16 RAM. > Windows 10, SSD ... Setup should be rather state of the start. > No additional antivirus software. Disabling OneDrive, which synces image & changes file in the background, accelerated the issue by around 30%, > but it's still *much* slower than on your system. > I tested it on a fresh Trunk image, so there shouldn't be a lot of files opened. > > > > but, in general, it should be in the submillisecond range. > > Here's an example reading the source of a rather large method (11kB): > > [ (MethodFinder >> #initialize) getSource ] bench. > > '11.4 per second. 87.7 milliseconds per run. 0.01214 % GC time.' > > > Is this a typical workload? > > > [IMAGE] > time I used Squeak on Windows was a while ago, but there was no issue like that. It's worth investigating why that takes so long. Perhaps its a VM issue. > > > > | styler | > > > > styler := SHTextStylerST80 new. > > > styler context: (thisContext stack at: 4). "to get a not-so-recently modified method" > > > [styler styledTextFor: 'foo' asText] bench > > Here's the result of the benchmark executed on my machine without your changes: > > '4,700 per second. 213 microseconds per run. 1.04 % GC time.' > > How fast is the example if you do load my example? > > In general, do you think my proposal makes any sense for "normal" systems? On "normal" systems, not really. Based on my measurements and assuming a 100x slowdown, I expect it to save 4ms per run on SqueakJS (from 20ms to 16ms). It sounds like premature optimization, so I'd at least add a comment about why it was added. And I would try to find the cause of why it takes so long on your machine to open a file. Here's a small benchmark I just came up with: [ | filename | filename := UUID new asString36. FileStream newFileNamed: filename do: [ :file | ]. [ FileStream fileNamed: filename do: [ :file | ] ] bench ] ensure: [ FileDirectory default deleteFileNamed: filename ]. '119,000 per second. 8.43 microseconds per run. 4.27914 % GC time.' > > > > I know I sound like a broken record, but if you care about performance, don't use symbols in place of blocks, especially as the arguments > of methods like #ifNotNil:, which would optimize the blocks away. > Optimization is a good point (although I have the impression that this might be rather premature optimization). Also, we could optimize symbols in Premature optimization involves two things: - added complexity - optimizing for a possible, but currently not existant or unlikely use case Neither seem to apply here. You could argue that it's an unnecessary optimization, as it won't give measurable difference. I simply consider it good practice to write things an optimal way if it is straightforward, like in that case. All in all, it costs nothing to write a block there. > the same way as blocks, couldn't we? But I will try to keep it in mind, also in terms of readability :) We probably could but why would we? :) > > > While #take: works, I would keep using #first: here. We know that the receiver is String (or at least it should be) and we need to get the > first N characters. We can also assume that contextSourcePcIndex is within the range of the source. If it's not, then it's very likely a bug that > would be hidden by #take:. > Actually, I got a "subscript out of bounds" a few times, so there might be indeed a bug somewhere. Do we really want Shout to be the bug-checking > instance here? We also could raise a warning if the source is too short, without irritating all kind of users with a non-working toolset (because > this would likely crash the debugger as well). I don't know what's the typical approach in Squeak to balance robustness vs error detection ... #take: clearly just makes things work at the cost of hiding the bug. We should really know why the source is shorter than the index returned by DebuggerMethodMap. I've never seen the bug myself, but it's something worth debugging. Levente > > Best, > Christoph |
Just for fun I tried that on a Pi3+ with SSD and got
a) a halt because the first [ is in the wrong place, should be | filename | [filename := UUID new asString36. FileStream newFileNamed: filename do: [ :file | ]. [ FileStream fileNamed: filename do: [ :file | ] ] bench ] ensure: [ FileDirectory default deleteFileNamed: filename ]. b) '9,710 per second. 103 microseconds per run. 1.9 % GC time.' so 1/12th the performance. tim -- tim Rowledge; [hidden email]; http://www.rowledge.org/tim "#define QUESTION ((bb) || !(bb)) - Shakespeare." |
Wow, 210 microseconds per run here ... '4,750 per second. 210 microseconds per run. 0.77984 % GC time.' Any idea where I could start investigating this delay?
For comparison, this is what I get from the shell:
$ time for i in {1..1000}; do cat erw7cadzaljrexm3vjka1kl5t; done
real 0m10.353s
user 0m0.875s
sys 0m7.313s
So effectively about 8 milli(!)seconds per run ... But you cannot compare this, can you?
Best, Christoph Von: Squeak-dev <[hidden email]> im Auftrag von tim Rowledge <[hidden email]>
Gesendet: Dienstag, 14. Januar 2020 00:38:07 An: The general-purpose Squeak developers list Betreff: Re: [squeak-dev] The Inbox: ShoutCore-ct.78.mcz Just for fun I tried that on a Pi3+ with SSD and got
a) a halt because the first [ is in the wrong place, should be | filename | [filename := UUID new asString36. FileStream newFileNamed: filename do: [ :file | ]. [ FileStream fileNamed: filename do: [ :file | ] ] bench ] ensure: [ FileDirectory default deleteFileNamed: filename ]. b) '9,710 per second. 103 microseconds per run. 1.9 % GC time.' so 1/12th the performance. tim -- tim Rowledge; [hidden email]; http://www.rowledge.org/tim "#define QUESTION ((bb) || !(bb)) - Shakespeare."
Carpe Squeak!
|
On Tue, 14 Jan 2020, Thiede, Christoph wrote:
> > Wow, 210 microseconds per run here ... '4,750 per second. 210 microseconds per run. 0.77984 % GC time.' > > Any idea where I could start investigating this delay? > > > For comparison, this is what I get from the shell: > > > $ time for i in {1..1000}; do cat erw7cadzaljrexm3vjka1kl5t; done > > real 0m10.353s > user 0m0.875s > sys 0m7.313s > > So effectively about 8 milli(!)seconds per run ... But you cannot compare this, can you? comparison here's the same on my machine: $ touch foo $ time for i in {1..1000}; do cat foo; done real 0m0,839s user 0m0,594s sys 0m0,304s Levente > > > Best, > > Christoph > > _________________________________________________________________________________________________________________________________________________________________________________________________________________________________ > Von: Squeak-dev <[hidden email]> im Auftrag von tim Rowledge <[hidden email]> > Gesendet: Dienstag, 14. Januar 2020 00:38:07 > An: The general-purpose Squeak developers list > Betreff: Re: [squeak-dev] The Inbox: ShoutCore-ct.78.mcz > Just for fun I tried that on a Pi3+ with SSD and got > a) a halt because the first [ is in the wrong place, should be > | filename | > [filename := UUID new asString36. > FileStream newFileNamed: filename do: [ :file | ]. > [ FileStream fileNamed: filename do: [ :file | ] ] bench ] > ensure: [ FileDirectory default deleteFileNamed: filename ]. > > b) '9,710 per second. 103 microseconds per run. 1.9 % GC time.' so 1/12th the performance. > > tim > -- > tim Rowledge; [hidden email]; http://www.rowledge.org/tim > "#define QUESTION ((bb) || !(bb)) - Shakespeare." > > > > > |
> Well, it's not the same as just opening and closing a file. But for comparison here's the same on my machine: >
> $ touch foo
> $ time for i in {1..1000}; do cat foo; done
>
> real 0m0,839s
> user 0m0,594s
> sys 0m0,304s
But to be fair, I was testing on Windows Subsystem for Linux, which may be slower ...
If you have any other ideas how to trace this delay, let me know and I will try to test them :)
Best,
Christoph
Von: Squeak-dev <[hidden email]> im Auftrag von Levente Uzonyi <[hidden email]>
Gesendet: Dienstag, 14. Januar 2020 14:00:09 An: The general-purpose Squeak developers list Betreff: Re: [squeak-dev] The Inbox: ShoutCore-ct.78.mcz On Tue, 14 Jan 2020, Thiede, Christoph wrote:
> > Wow, 210 microseconds per run here ... '4,750 per second. 210 microseconds per run. 0.77984 % GC time.' > > Any idea where I could start investigating this delay? > > > For comparison, this is what I get from the shell: > > > $ time for i in {1..1000}; do cat erw7cadzaljrexm3vjka1kl5t; done > > real 0m10.353s > user 0m0.875s > sys 0m7.313s > > So effectively about 8 milli(!)seconds per run ... But you cannot compare this, can you? Well, it's not the same as just opening and closing a file. But for comparison here's the same on my machine: $ touch foo $ time for i in {1..1000}; do cat foo; done real 0m0,839s user 0m0,594s sys 0m0,304s Levente > > > Best, > > Christoph > > _________________________________________________________________________________________________________________________________________________________________________________________________________________________________ > Von: Squeak-dev <[hidden email]> im Auftrag von tim Rowledge <[hidden email]> > Gesendet: Dienstag, 14. Januar 2020 00:38:07 > An: The general-purpose Squeak developers list > Betreff: Re: [squeak-dev] The Inbox: ShoutCore-ct.78.mcz > Just for fun I tried that on a Pi3+ with SSD and got > a) a halt because the first [ is in the wrong place, should be > | filename | > [filename := UUID new asString36. > FileStream newFileNamed: filename do: [ :file | ]. > [ FileStream fileNamed: filename do: [ :file | ] ] bench ] > ensure: [ FileDirectory default deleteFileNamed: filename ]. > > b) '9,710 per second. 103 microseconds per run. 1.9 % GC time.' so 1/12th the performance. > > tim > -- > tim Rowledge; [hidden email]; http://www.rowledge.org/tim > "#define QUESTION ((bb) || !(bb)) - Shakespeare." > > > > >
Carpe Squeak!
|
Free forum by Nabble | Edit this page |