LargeInteger parsing (?) broken between 5.2b and update 18615


timrowledge
Here's a strange bug that appears to have arrived between 5.2b and 18615.

In a scratch copy of a recent image, add this code to IntegerTest -
testParseInteger

        #(1 0 -1 2 -2 1073741823 -1073741824 10737418231073741823 "-10737418231073741823")  do: [ :each |
                self parse: each asString with: #parseIntegerLength: shouldGive: each ]

and compile it. It should be fine. Don't worry about the #parse:with:shouldGive: - it's actually from PostGresV3 tests.

Now uncomment the last number in the list and try to compile it. Locks up on my Pi - so hard that cmd-. does nothing.

Works fine in a 5.2b image.
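
For anyone reproducing this, it may help to confirm first which of the test values fall outside SmallInteger range in the image at hand. The workspace sketch below is not part of the original test. It builds the 20-digit value arithmetically, on the assumption that keeping LargeInteger literals out of the snippet avoids the very compile that locks up; the variable names are made up for illustration.

```smalltalk
"Sketch only: list each test value together with the name of its class."
| big values |
"Same value as the literal 10737418231073741823, built from small literals."
big := 1073741823 * 100000 * 100000 + 1073741823.
values := { 1. 0. -1. 2. -2. 1073741823. -1073741824. big. big negated }.
values collect: [:each | each -> each class name]
```

On 32-bit Spur, SmallInteger maxVal is 1073741823, so 1073741823 and -1073741824 sit at the edges of the small range there. The two 20-digit values are LargeIntegers on 32-bit and 64-bit images alike.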

Gronk?

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Useful Latin Phrases:- Noli me vocare, ego te vocabo = Don't call me, I'll call you.



Re: LargeInteger parsing (?) broken between 5.2b and update 18615

marcel.taeumel
Maybe related to this? http://forum.world.st/The-Trunk-Kernel-nice-1224-mcz-td5098877.html

Best,
Marcel

On 20.07.2019 at 07:20:57, tim Rowledge <[hidden email]> wrote:

Here's a strange bug that appears to have arrived between 5.2b and 18615.

In a scratch copy of a recent image, add this code to IntegerTest -
testParseInteger

#(1 0 -1 2 -2 1073741823 -1073741824 10737418231073741823 "-10737418231073741823") do: [ :each |
self parse: each asString with: #parseIntegerLength: shouldGive: each ]

and compile it. Should be fine. don't worry about the #parse:with:shouldGive: - it's actually from PostGresV3 tests.

Now uncomment the last number in the list and try to compile it. Locks up on my Pi - so hard that cmd-. does nothing.

Works fine in a 5.2b image.

Gronk?

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Useful Latin Phrases:- Noli me vocare, ego te vocabo = Don't call me, I'll call you.





Re: LargeInteger parsing (?) broken between 5.2b and update 18615

timrowledge


> On 2019-07-20, at 1:32 AM, Marcel Taeumel <[hidden email]> wrote:
>
> Maybe related to this? http://forum.world.st/The-Trunk-Kernel-nice-1224-mcz-td5098877.html

Perhaps?

More info as I descend into the depths of code I'd prefer never to have to think about -
All seems ok down to Parser>parseCue:noPattern:ifFail: where I can get down to the
[methNode := self method: noPattern context: cue]
    on: ReparseAfterSourceEditing
    do: {.....
The debugger steps to the do: block as usual, and when I step over that in order to try the method:context: it simply locks up everything. Which is a bit of a surprise. That ReparseAfterSourceEditing isn't anything to do with recent Shout changes, is it?

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Strange OpCodes: SLTMDL: Shift Left, Test Mask and Dim the Lights



Re: LargeInteger parsing (?) broken between 5.2b and update 18615

Levente Uzonyi
On Sat, 20 Jul 2019, tim Rowledge wrote:

>
>
>> On 2019-07-20, at 1:32 AM, Marcel Taeumel <[hidden email]> wrote:
>>
>> Maybe related to this? http://forum.world.st/The-Trunk-Kernel-nice-1224-mcz-td5098877.html
>
> Perhaps?
>
> More info as I descend into the depths of code I'd prefer never to have to think about -
> All seems ok down to Parser>parseCue:noPattern:ifFail: where I can get down to the
> [methNode := self method: noPattern context: cue]
>    on: ReparseAfterSourceEditing
>    do: {.....
> The debugger steps to the do: block as usual, and when I step over that in order to try the method:context: it simply locks up everything. Which is a bit of a surprise. That ReparseAfterSourceEditing isn't anything to do with recent Shout changes, is it?

No, that has nothing to do with Shout. Shout has its own parser, and it
was last modified in April.

Also, you have to click on Through to be able to debug #method:context:.
Over will just execute that block without the debugger entering it.

Btw, I remember running into an error related to LargeInteger parsing, but
updating my image fixed it. What's the version of Kernel in your image?

Levente

>
> tim
> --
> tim Rowledge; [hidden email]; http://www.rowledge.org/tim
> Strange OpCodes: SLTMDL: Shift Left, Test Mask and Dim the Lights

Re: LargeInteger parsing (?) broken between 5.2b and update 18615

timrowledge


> On 2019-07-20, at 10:29 AM, Levente Uzonyi <[hidden email]> wrote:
>
> Also, you have to click on Through to be able to debug #method:context:. Over will just execute that block without the debugger entering it.

D'oh. Serves me right for committing Debugging Before Caffeination. Now I get down to Parser>>method:context: and hit the problem in the 'self statements:#() innerBlock: doit' line. Quit, restart, redo...

>
> Btw, I remember running into an error related to LargeInteger parsing, but updating my image fixed it. What's the version of Kernel in your image?

1246 in my current-working image but 1240 in the vanilla update #18615 image I also tested.
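
A quick way to answer the "which Kernel version" question from a workspace might look like the following. This is a sketch that assumes the standard Monticello classes (MCWorkingCopy, MCPackage) and the ancestry accessors behave as they do in a stock trunk image; check them in your own image before relying on it.

```smalltalk
"Print-it to see the ancestor version name(s) of the loaded Kernel package.
 Assumes Monticello is present and the package is registered as 'Kernel'."
(MCWorkingCopy forPackage: (MCPackage named: 'Kernel')) ancestry ancestorString
```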

I'll dig deeper as time permits but the one really clear problem here is that there is no response to cmd-., a horrible thing that we must always fight to prevent. I know it can happen when the VM is having a hard time trying to write out a debug stack trace etc but I don't think I've ever known that to exceed several minutes, so my guess is this is not one of those cases.


tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Strange OpCodes: RLB: Ruin Logic Board



Re: LargeInteger parsing (?) broken between 5.2b and update 18615

Levente Uzonyi
On Sun, 21 Jul 2019, tim Rowledge wrote:

>
>
>> On 2019-07-20, at 10:29 AM, Levente Uzonyi <[hidden email]> wrote:
>>
>> Also, you have to click on Through to be able to debug #method:context:. Over will just execute that block without the debugger entering it.
>
> D'oh. Serves me right for committing Debugging Before Caffeination. Now I get down to Parser>>method:context: and hit the problem in the 'self statements:#() innerBlock: doit' line. Quit, restart, redo...
>
>>
>> Btw, I remember running into an error related to LargeInteger parsing, but updating my image fixed it. What's the version of Kernel in your image?
>
> 1246 in my current-working image but 1240 in the vanilla update #18615 image I also tested.
>
> I'll dig deeper as time permits but the one really clear problem here is that there is no response to cmd-., a horrible thing that we must always fight to prevent. I know it can happen when the VM is having a hard time trying to write out a debug stack trace etc but I don't think I've ever known that to exceed several minutes, so my guess is this is not one of those cases.

Did you try to send SIGUSR1 to the vm's process?

Levente

>
>
> tim
> --
> tim Rowledge; [hidden email]; http://www.rowledge.org/tim
> Strange OpCodes: RLB: Ruin Logic Board

Re: LargeInteger parsing (?) broken between 5.2b and update 18615

timrowledge
After patiently stepping through pretty much the entire compilation of the method, it actually begins to look as if the recent accept-edit stuff might be causing this. Which really is a bit puzzling.

Once the compile had apparently completed ok (I really thought for a few moments that it might be an interaction of large integer hashing and hashed collections growing, but it worked ok) I let the debugger proceed and got a notifier somewhere relating to acceptText... something, before a gazillion dNU notifiers splatted all across the display. The system was sorta-kinda running, very, very slowly but I did eventually manage to get the exit dialogue and quit cleanly. Which is very different to the prior attempts.

After that, another attempt where I uncommented the last integer and chopped it down to a SmallInteger value just worked - so clearly the LargeInteger had something to do with things... but the acceptText stuff ... no problem this time. Restore the last number to a large int and ... lockup.


> On 2019-07-21, at 4:10 PM, Levente Uzonyi <[hidden email]> wrote:
>
> On Sun, 21 Jul 2019, tim Rowledge wrote:
>
>>
>>
>>> On 2019-07-20, at 10:29 AM, Levente Uzonyi <[hidden email]> wrote:
>>> Also, you have to click on Through to be able to debug #method:context:. Over will just execute that block without the debugger entering it.
>>
>> D'oh. Serves me right for committing Debugging Before Caffeination. Now I get down to Parser>>method:context: and hit the problem in the 'self statements:#() innerBlock: doit' line. Quit, restart, redo...
>>
>>> Btw, I remember running into an error related to LargeInteger parsing, but updating my image fixed it. What's the version of Kernel in your image?
>>
>> 1246 in my current-working image but 1240 in the vanilla update #18615 image I also tested.
>>
>> I'll dig deeper as time permits but the one really clear problem here is that there is no response to cmd-., a horrible thing that we must always fight to prevent. I know it can happen when the VM is having a hard time trying to write out a debug stack trace etc but I don't think I've ever known that to exceed several minutes, so my guess is this is not one of those cases.
>
> Did you try to send SIGUSR1 to the vm's process?

Not previously; since I normally double-click on the image to start it I typically don't think of that. Just tried it this time though and, after using pgrep to find the pid, did kill pid -10 - which is the wrong way round (it should be kill -10 pid), duh, and it killed it anyway. Sigh.

So now I'm even more puzzled. It's the largeint, but it isn't, but the acceptText.. blew up but it doesn't...

Such fun!

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Strange OpCodes: DUL: Delete Utility Library



Re: LargeInteger parsing (?) broken between 5.2b and update 18615

timrowledge
Well, this is interesting. After updating my Pi VM (to the squeak.cog.spur_linux32ARMv6_201907192337.tar.gz version from the squeak.org linked bintray stuff) and getting no change - I hoped perhaps we'd updated the largeint plugin or something like that - I copied the image across to my iMac and tried the problematic compile there; no problem.

The Mac vm is a plain 5.2b release vm. Both are 32 bit (because the Pi does not yet do 64 bit). Happy happy; VM debugging time. There's a pleasure I haven't had in a while...

Anyone wanting to check things out can obtain a copy of the image and recent crash.dmp etc on request.

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Programmer: One who is too lacking in people skills to be a software engineer.



Re: LargeInteger parsing (?) broken between 5.2b and update 18615

David T. Lewis
On Sun, Jul 21, 2019 at 09:15:41PM -0700, tim Rowledge wrote:
> Well, this is interesting. After updating my Pi VM (to the squeak.cog.spur_linux32ARMv6_201907192337.tar.gz version from the squeak.org linked bintray stuff) and getting no change - I hoped perhaps we'd updated the largeint plugin or something like that - I copied the image across to my iMac and tried the problematic compile there; no problem.
>
> The Mac vm is a plain 5.2b release vm. Both are 32 bit (because the Pi does not yet do 64 bit). Happy happy; VM debugging time. There's a pleasure I haven't had in a while...
>
> Anyone wanting to check things out can obtain a copy of the image and recent crash.dmp etc on request.
>

I just tried your test on my Ubuntu Intel machine with a locally compiled VM,
and I see no problem.

Suggestion: Recompile your VM with optimization turned off ( -O0 compiler flag)
and see if the problem goes away.


For reference, my VM is:


  Virtual Machine
  ---------------
  /usr/local/lib/squeak/5.0-201903061925/squeak
  Croquet Closure Cog[Spur] VM [CoInterpreterPrimitives VMMaker.oscog-eem.2523]
  Unix built on Jun 20 2019 12:05:41 Compiler: 5.4.0 20160609
  platform sources revision VM: 201903061925 lewis@lewis-Gazelle-Pro:squeak/git/opensmalltalk-vm Date: Wed Mar 6 11:25:10 2019 CommitHash: 2ede003 Plugins: 201903061925 lewis@lewis-Gazelle-Pro:squeak/git/opensmalltalk-vm
  CoInterpreter VMMaker.oscog-eem.2523 uuid: 1d88061f-236d-40fd-a739-30fe691d0606 Jun 20 2019
  StackToRegisterMappingCogit VMMaker.oscog-eem.2521 uuid: 4f1618e4-2a0c-4ba8-be03-b8670286ba00 Jun 20 2019


Dave
 

Re: LargeInteger parsing (?) broken between 5.2b and update 18615

timrowledge
Oh my, for extra fun it turns out that whilst the individual method can compile ok on my iMac, loading the entire postgresv3 core test package blows it away. Interestingly, all the crash.dmps I have, both new and old VMs on both Pi & iMac seem to have hash related stuff at the focus.

On the mac -
==================
Segmentation fault Mon Jul 22 10:04:23 2019


VM: 201701281910 https://github.com/OpenSmalltalk/opensmalltalk-vm.git $ Date: Sat Jan 28 11:10:24 2017 -0800 $
Plugins: 201701281910 https://github.com/OpenSmalltalk/opensmalltalk-vm.git $

C stack backtrace & registers:
        eax 0x001d24e8 ebx 0x000fac1a ecx 0x0636d997 edx 0x1ae871c5
        edi 0x1ae871c5 esi 0x1ae871c5 ebp 0xbfeefb10 esp 0xbfeefb04
        eip 0x0636d99c
0   ???                                 0x0636d99c 0x0 + 104257948
1   Squeak                              0x0015ec93 reportStackState + 706
2   Squeak                              0x0015efeb sigsegv + 113
3   libsystem_platform.dylib            0xa756a02b _sigtramp + 43
4   ???                                 0xffffffff 0x0 + 4294967295


Smalltalk stack dump:
0xbfeefb10 M MCClassDefinition>hash 0x66745d0: a(n) MCClassDefinition
0xbfeefb38 M Set>scanFor: 0x6694f50: a(n) Set
0xbfeefb58 M Set>add: 0x6694f50: a(n) Set
0xbfeefb7c I MCDependencySorter>addRequirement:for: 0x6683dd8: a(n) MCDependencySorter
0xbfeefba0 M [] in MCDependencySorter>addRequirements:for: 0x6683dd8: a(n) MCDependencySorter
0xbfeefbc4 M Array(SequenceableCollection)>do: 0x6694dd0: a(n) Array
0xbfeefbe8 I MCDependencySorter>addRequirements:for: 0x6683dd8: a(n) MCDependencySorter
0xbfeefc0c M MCDependencySorter>add: 0x6683dd8: a(n) MCDependencySorter
0xbfeefc28 M [] in MCDependencySorter>addAll: 0x6683dd8: a(n) MCDependencySorter
0xbfeefc4c M Array(SequenceableCollection)>do: 0x6683e98: a(n) Array
0xbfeefc70 I MCDependencySorter>addAll: 0x6683dd8: a(n) MCDependencySorter
0xbfeefc94 I MCDependencySorter class>items: 0x6b7ca28: a(n) MCDependencySorter class
0xbfeefcbc I MCDependencySorter class>sortItems: 0x6b7ca28: a(n) MCDependencySorter class
0xbfef1a7c I MCStWriter>writeDefinitions: 0x6683d98: a(n) MCStWriter
0xbfef1aa0 M [] in MCMczWriter>serializeDefinitions: 0x667b600: a(n) MCMczWriter
0xbfef1ac4 M String class(SequenceableCollection class)>new:streamContents: 0x6b71640: a(n) String class
0xbfef1ae4 M String class(SequenceableCollection class)>streamContents: 0x6b71640: a(n) String class

On the Pi -
===============
SIGUSR1 Sun Jul 21 21:04:28 2019


/home/pi/Squeak/sqcogspurlinuxhtRPi/lib/squeak/5.0-201907192337/squeak
Squeak VM version: 5.0-201907192337  Sat Jul 20 02:36:48 UTC 2019 gcc 4.9.2 [Production Spur VM]
Built from: CoInterpreter VMMaker.oscog-eem.2530 uuid: 4d90ede0-0700-4d15-8173-2aaf2360b7d1 Jul 20 2019
With: StackToRegisterMappingCogit VMMaker.oscog-eem.2530 uuid: 4d90ede0-0700-4d15-8173-2aaf2360b7d1 Jul 20 2019
Revision: VM: 201907192337 https://github.com/OpenSmalltalk/opensmalltalk-vm.git Date: Fri Jul 19 16:37:03 2019 CommitHash: 31a8d08 Plugins: 201907192337 https://github.com/OpenSmalltalk/opensmalltalk-vm.git
Build host: Linux travis-job-bd015c84-202a-4098-ac72-50be56578934 4.4.0-104-generic #127~14.04.1-Ubuntu SMP Mon Dec 11 12:44:15 UTC 2017 armv7l GNU/Linux
plugin path: /home/pi/Squeak/sqcogspurlinuxhtRPi/lib/squeak/5.0-201907192337 [default: /home/pi/Squeak/sqcogspurlinuxhtRPi/lib/squeak/5.0-201907192337/]


C stack backtrace & registers:
         r0 0x00601168 r1 0x00235ae0 r2 0x00501950 r3 0x7e9b478c
         r4 0x002128e0 r5 0x00601168 r6 0x04354b25 r7 0x00000000
         r8 0x002168d0 r9 0x00000000 r10 0x002128a4 fp 0x7e9b47c4
         ip 0x002169f0 sp 0x7e9b47ac lr 0x006010fc pc 0x006010fc
*[0x0]
/home/pi/Squeak/sqcogspurlinuxhtRPi/lib/squeak/5.0-201907192337/squeak[0x1e4508]


All Smalltalk process stacks (active first):
Process  0x3b16430 priority 40
0x7e9b47c4 M Array(SequenceableCollection)>hash 0x7b4dc0: a(n) Array
0x7e9b47ec M PluggableDictionary>scanFor: 0x7b1a58: a(n) PluggableDictionary
0x7e9b480c M PluggableDictionary(Dictionary)>at:ifAbsent: 0x7b1a58: a(n) PluggableDictionary
0x7e9b482c M PluggableDictionary(Dictionary)>at:ifAbsentPut: 0x7b1a58: a(n) PluggableDictionary
0x7e9b4854 I EncoderForV3PlusClosures(Encoder)>name:key:class:type:set: 0x7b18d8: a(n) EncoderForV3PlusClosures
0x7e9b4888 I EncoderForV3PlusClosures(Encoder)>encodeLiteral: 0x7b18d8: a(n) EncoderForV3PlusClosures
0x7e9b48ac I Parser>primaryExpression 0x7b1650: a(n) Parser
0x7e9b48cc I Parser>expression 0x7b1650: a(n) Parser
0x7e9b48f8 I Parser>statements:innerBlock:blockNode: 0x7b1650: a(n) Parser
0x7e9b4924 I Parser>statements:innerBlock: 0x7b1650: a(n) Parser
0x7e9b4964 I Parser>method:context: 0x7b1650: a(n) Parser
0x7e9b498c M [] in Parser>parseCue:noPattern:ifFail: 0x7b1650: a(n) Parser
0x7e9b49a8 M BlockClosure>on:do: 0x7b4ee0: a(n) BlockClosure
0x7e9b276c I Parser>parseCue:noPattern:ifFail: 0x7b1650: a(n) Parser
0x7e9b2798 I Compiler>translateNoPattern:ifFail: 0x7b1588: a(n) Compiler
0x7e9b27c0 I Compiler>compileCue:noPattern:ifFail: 0x7b1588: a(n) Compiler
0x7e9b27ec I Compiler>compile:ifFail: 0x7b1588: a(n) Compiler
0x7e9b2824 I PG3SocketReadStreamTest class(ClassDescription)>compile:environment:classified:withStamp:notifying:logSource: 0x3ba26c8: a(n) PG3SocketReadStreamTest class
0x7e9b285c I PG3SocketReadStreamTest class(ClassDescription)>compile:classified:withStamp:notifying:logSource: 0x3ba26c8: a(n) PG3SocketReadStreamTest
Segmentation fault Sun Jul 21 21:04:28 2019


/home/pi/Squeak/sqcogspurlinuxhtRPi/lib/squeak/5.0-201907192337/squeak
Squeak VM version: 5.0-201907192337  Sat Jul 20 02:36:48 UTC 2019 gcc 4.9.2 [Production Spur VM]
Built from: CoInterpreter VMMaker.oscog-eem.2530 uuid: 4d90ede0-0700-4d15-8173-2aaf2360b7d1 Jul 20 2019
With: StackToRegisterMappingCogit VMMaker.oscog-eem.2530 uuid: 4d90ede0-0700-4d15-8173-2aaf2360b7d1 Jul 20 2019
Revision: VM: 201907192337 https://github.com/OpenSmalltalk/opensmalltalk-vm.git Date: Fri Jul 19 16:37:03 2019 CommitHash: 31a8d08 Plugins: 201907192337 https://github.com/OpenSmalltalk/opensmalltalk-vm.git
Build host: Linux travis-job-bd015c84-202a-4098-ac72-50be56578934 4.4.0-104-generic #127~14.04.1-Ubuntu SMP Mon Dec 11 12:44:15 UTC 2017 armv7l GNU/Linux
plugin path: /home/pi/Squeak/sqcogspurlinuxhtRPi/lib/squeak/5.0-201907192337 [default: /home/pi/Squeak/sqcogspurlinuxhtRPi/lib/squeak/5.0-201907192337/]


C stack backtrace & registers:
         r0 0x7e9b2890 r1 0x00000000 r2 0x002128e0 r3 0x00640000
         r4 0x7e9b2890 r5 0x00000000 r6 0x002128e0 r7 0x002358c0
         r8 0x00000401 r9 0x00000001 r10 0x04354b25 fp 0x00601168
         ip 0x0000000a sp 0x7e9b3238 lr 0x00076758 pc 0x000766e0
*[0x0]
/home/pi/Squeak/sqcogspurlinuxhtRPi/lib/squeak/5.0-201907192337/squeak[0x1e4508]

Most recent primitives
value:
utcMicrosecondClock
primPosixMicrosecondClockWithOffset
primPosixMicrosecondClockWithOffset
*
**PrimitiveFailure**
digitMultiply:neg:
+
\\
//
\\
quo:
at:
at:
basicNew

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
"Daddy, what does FORMATTING DRIVE C mean?"



Re: LargeInteger parsing (?) broken between 5.2b and update 18615

timrowledge
OK, so after the fun of doing a binary split, I can now assert that loading Kernel-eem.1198.mcz is what breaks the compilation of that particular method. That's the change where Eliot altered the hashing to make things be the same between 32 & 64 bit systems. On a 32-bit Pi we clearly get a value beyond SmallInteger range with the new version of LargePositiveInteger>>hash (LPI>>hash). Reverting the change to LPI>>hash makes it possible to compile the problematic method.
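
The binary split itself can be sketched as a plain bisection over version numbers. Everything here is illustrative: the block name firstBad, the lower bound 1190, and the stand-in predicate are assumptions. In practice the "is this version bad?" check means loading that Kernel version into a fresh image and trying the compile by hand.

```smalltalk
"Sketch of a bisection: find the lowest version number for which isBad
 answers true, assuming every version from that point on is also bad."
| firstBad |
firstBad := [:low :high :isBad | | lo hi mid |
	lo := low.
	hi := high.
	[lo < hi] whileTrue: [
		mid := lo + hi // 2.
		(isBad value: mid)
			ifTrue: [hi := mid]
			ifFalse: [lo := mid + 1]].
	lo].
"Stand-in predicate: pretend everything from 1198 onwards fails."
firstBad value: 1190 value: 1246 value: [:version | version >= 1198]
```

With the stand-in predicate this evaluates to 1198 after six probes, which is why a binary split is bearable even when each probe is a manual load-and-compile cycle.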

I'm actually having a bit of a hard time working out how it can cause a complete lock up of the image though. Is it some issue with a 32bit VM not liking non-SmallInt hash values? Maybe some related prim that only accepts machine/vm word sized values?

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Justify my text?  I'm sorry but it has no excuse.



Re: LargeInteger parsing (?) broken between 5.2b and update 18615

David T. Lewis
On Mon, Jul 22, 2019 at 03:16:02PM -0700, tim Rowledge wrote:
> OK, so after fun doing a binary split, I can now assert that loading
> Kernel-eem.1198.mcz is what breaks the compilation of that particular
> method. That's the change where Eliot altered the hashing to make things
> be the same between 32 & 64 bit systems. On a 32bit Pi we clearly get
> a value beyond SmallInteger range with the new version of LPI>>hash.
> Reverting the change to LPI>>hash makes it possible to compile the
> problematic method.
>

I can't think of any reason that it would be a problem, but note that
on 32 bit Spur images:

  1073741824 hash class ==> LargePositiveInteger

And on 64 bit Spur images:

  1073741824 hash class ==> SmallInteger

So the hash values are the same on 32 bit and 64 bit images,
but the classes of the hash values can be different.
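
The same check can be written without a large literal, which may be the safer form to evaluate in an affected image. This is a sketch under that assumption; the temporary n is just an illustrative name.

```smalltalk
"Sketch: compare the class of a value with the class of its hash.
 1 bitShift: 30 is 1073741824, one past SmallInteger maxVal on 32-bit Spur."
| n |
n := 1 bitShift: 30.
{ n class name. n hash class name. SmallInteger maxVal }
```

The third element shows which SmallInteger range the image is using, which makes results pasted from different machines easier to compare.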

> I'm actually having a bit of a hard time working out how it can cause
> a complete lock up of the image though. Is it some issue with a 32bit
> VM not liking non-SmallInt hash values? Maybe some related prim that
> only accepts machine/vm word sized values?
>

I don't know the answer (and I am fairly clueless with regard to hashing),
but is there some general expectation that the hash value for an object
should fall within the range of small integers?

Prior to Kernel-eem.1198, I think it was true that all integers had
hash values that were small integers, and that now we have a certain
range of integers which, on 32-bit images, have hash values that are
LargePositiveInteger. I don't know why that would be a problem, but
maybe it is?
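
As an illustration of why a large hash value need not be a problem in principle: hashed collections generally reduce the hash to a slot index with a modulo, and that arithmetic is defined for LargeIntegers too. The sketch below is not the actual scanFor: source, and capacity is a made-up table size.

```smalltalk
"Illustration only: reduce a hash to a 1-based slot index with \\."
| capacity smallHash largeHash |
capacity := 32.
smallHash := 12345.
largeHash := 1 bitShift: 30.	"a LargePositiveInteger on 32-bit Spur"
{ smallHash \\ capacity + 1. largeHash \\ capacity + 1 }
```

Both results fall between 1 and capacity. The difference is that the second computation goes through LargeInteger arithmetic rather than the SmallInteger fast path, which may or may not be related to the \\ and digitMultiply:neg: entries in the "Most recent primitives" list of the Pi crash dump.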

Dave