Hi all, while running a debug squeak.cog.spur macos x64, I noticed repeated logs in console: > (((usqInt)sourceIndex)) < endOfSource 2132 This corresponds to failing assert: in BitBltSimulation slang srcLongAt: idx <inline: #always> self assert: idx asUnsignedInteger < endOfSource. ^self long32At: idx Maybe it's a well known bug, but it's easy to trigger if you are after it: Open Squeak5.1-16548-64bit.image (I took that of nsboot while testing failing Newspeak bootstrap https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/Cog/tests/newspeakBootstrap.sh) Open TestRunner and run all tests... |
To trigger the failure, just do this: BitmapStreamTests new testShortIntegerArrayReadRefStream2. The source code of BitBlt hack in ShortIntegerArray>>restoreEndianness sounds correct. The assertion failure occurs when destX < sourceX - presumably a case when hDir = 1 (see checkSourceOverlap). I used LLDB rather than simulation to find it fast, but now it would be easier to just run VM simulation Here are the details: 1) I tried LLDB conditional break, but it's awfully slow 2) I modified generated source code and compiled the hardcoded condition: 2131 /* pick up next word */ if( ((usqInt)sourceIndex) >= endOfSource ) 2132 assert((((usqInt)sourceIndex)) < endOfSource); 3) launch lldb, set the breaskpoint (lldb) br se -f BitBltPlugin.c -l 2132 (lldb) run ../../image/trunk6-64.image 3) I ran the SUnit tests... Process 83691 launched: '/Users/nicolas/Smalltalk/OpenSmalltalk/opensmalltalk-vm/build.macos64x64/squeak.cog.spur/SqueakDebug.app/Contents/MacOS/Squeak' (x86_64) Process 83691 stopped * thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1 frame #0: 0x0000000100186871 Squeak`copyLoop at BitBltPlugin.c:2132 2129 destMask = mask1; 2130 2131 /* pick up next word */ if( ((usqInt)sourceIndex) >= endOfSource ) -> 2132 assert((((usqInt)sourceIndex)) < endOfSource); 2133 thisWord = long32At(sourceIndex); 2134 sourceIndex += hInc; 2135 Target 0: (Squeak) stopped. (lldb) bt * thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1 * frame #0: 0x0000000100186871 Squeak`copyLoop at BitBltPlugin.c:2132 frame #1: 0x0000000100182b81 Squeak`copyBitsLockedAndClipped at BitBltPlugin.c:1595 frame #2: 0x000000010017fba6 Squeak`copyBits at BitBltPlugin.c:1257 frame #3: 0x000000010017fea3 Squeak`primitiveCopyBits at BitBltPlugin.c:5115 frame #4: 0x000000010000cd2d Squeak`interpret at gcc3x-cointerp.c:6192 frame #5: 0x0000000100022196 Squeak`enterSmalltalkExecutiveImplementation at gcc3x-cointerp.c:16192 frame #6: 0x0000000100001657 Squeak`interpret at gcc3x-cointerp.c:2761 frame #7: 0x000000010015e477 Squeak`-[sqSqueakMainApplication runSqueak](self=0x00000001018091d0, _cmd="runSqueak") at sqSqueakMainApplication.m:201 frame #8: 0x00007fff41cd87b8 Foundation`__NSFirePerformWithOrder + 360 frame #9: 0x00007fff3fb4df57 CoreFoundation`__CFRUNLOOP_IS_CALLING_OUT_TO_AN_OBSERVER_CALLBACK_FUNCTION__ + 23 frame #10: 0x00007fff3fb4de7f CoreFoundation`__CFRunLoopDoObservers + 527 frame #11: 0x00007fff3fb303f8 CoreFoundation`__CFRunLoopRun + 1240 frame #12: 0x00007fff3fb2fc93 CoreFoundation`CFRunLoopRunSpecific + 483 frame #13: 0x00007fff3ee1ad96 HIToolbox`RunCurrentEventLoopInMode + 286 frame #14: 0x00007fff3ee1aa0f HIToolbox`ReceiveNextEventCommon + 366 frame #15: 0x00007fff3ee1a884 HIToolbox`_BlockUntilNextEventMatchingListInModeWithFilter + 64 frame #16: 0x00007fff3d0caa73 AppKit`_DPSNextEvent + 2085 frame #17: 0x00007fff3d860e34 AppKit`-[NSApplication(NSEvent) _nextEventMatchingEventMask:untilDate:inMode:dequeue:] + 3044 frame #18: 0x00007fff3d0bf885 AppKit`-[NSApplication run] + 764 frame #19: 0x00007fff3d08ea72 AppKit`NSApplicationMain + 804 frame #20: 0x0000000100158086 Squeak`main(argc=2, argv=0x00007ffeefbff910, envp=0x00007ffeefbff928) at main.m:74 frame #21: 0x00007fff67a5d015 libdyld.dylib`start + 1 (lldb) call printCallStack() 0x7ffeefbd21e8 I ShortIntegerArray>restoreEndianness 0x1083e30e8: a(n) ShortIntegerArray 0x7ffeefbd2248 I ReadStream(PositionableStream)>nextWordsInto: 0x108377710: a(n) ReadStream 0x7ffeefbd2298 I ShortIntegerArray class(ArrayedCollection class)>newFromStream: 0x1089b2bf0: a(n) ShortIntegerArray class 0x7ffeefbd22f0 M ReferenceStream(DataStream)>readWordLike 0x108377738: a(n) ReferenceStream 0x7ffeefbd2360 I ReferenceStream(DataStream)>next 0x108377738: a(n) ReferenceStream 0x7ffeefbd23c8 I ReferenceStream>next 0x108377738: a(n) ReferenceStream 0x7ffeefbd2420 I BitmapStreamTests>testShortIntegerArrayReadRefStream2 0x1083775e0: a(n) BitmapStreamTests snip... (lldb) print skew (sqInt) $25 = -24 (lldb) print preload (sqInt) $26 = 1 (lldb) print hDir (sqInt) $14 = 1 It seems to me that it's got something to do with 64bits shifter... No time to simulate the VM now, if someone want to take it, it's a good exercize... Le mar. 3 sept. 2019 à 23:44, Nicolas Cellier <[hidden email]> a écrit :
|
> On 2019-09-04, at 12:48 AM, Nicolas Cellier <[hidden email]> wrote: > > > It seems to me that it's got something to do with 64bits shifter... > No time to simulate the VM now, if someone want to take it, it's a good exercize... > It's highly likely to be a side-effect of the prefetch stuff; we've had 'fun' with this for decades. IIRC there was some real fun on Mac OS years ago if the bitmap memory happened to end just on a page boundary - or something of that sort. Basically if the mode involves reading the destination bitmap then we end up with the last pixel trying to read a word *after* the end of the bitmap. Obviously the clipping etc values are supposed to result in those bits (generally a header for the next object in memory of course) getting ignored. It can be amusing if the code is subtly wrong and dodgy bits get into your Forms. tim -- tim Rowledge; [hidden email]; http://www.rowledge.org/tim Computer possessed? Try DEVICE=C:\EXOR.SYS |
Le mer. 4 sept. 2019 à 18:57, tim Rowledge <[hidden email]> a écrit :
yes, that happens very rarely, but that might happen and SEGV Basically if the mode involves reading the destination bitmap then we end up with the last pixel trying to read a word *after* the end of the bitmap. Obviously the clipping etc values are supposed to result in those bits (generally a header for the next object in memory of course) getting ignored. It can be amusing if the code is subtly wrong and dodgy bits get into your Forms. Do we ever use the extra word that we read? I don't think so, we don't generate funny artefacts. But we could try and test that now that we know how to trigger it (statistically). I think that Eliot spent some time to debug and eliminate most overrun cases in 2018 (every BitBltSimulation is stamped eem 2018). But it's a hell of a low level code mixing a bunch of states with a bunch of branches with a very imperative programming style... I don't know if the code is provable by a machine, but for a human brain, it's far more above the average capabilities ;) Maybe it's write only kind, even for Dan himself! -- |
> On 2019-09-04, at 11:00 AM, Nicolas Cellier <[hidden email]> wrote: > > tim > Do we ever use the extra word that we read? I don't think so, we don't generate funny artefacts. > But we could try and test that now that we know how to trigger it (statistically). We certainly shouldn't use that extra word. Unless something has gone horribly wrong, the only time this is an issue is on the last fetch of the last row of a BLT. So we *could* consider splitting out the last row and only for that last row include a check for the last word, and only for that word include a 'stick 0 in'. Since this is an assert for the debug, perhaps the smart thing to do is make the assert test more thoughtful in some manner. The typical case simply doesn't matter, since all we do is load a wasted word. Where things do need some care is at the end of object memory and page boundaries etc. tim -- tim Rowledge; [hidden email]; http://www.rowledge.org/tim No single raindrop believes it is to blame for the flood |
Le mer. 4 sept. 2019 à 20:16, tim Rowledge <[hidden email]> a écrit :
Yes, possibly, but then one also has to check the first row whenever vDir = -1 (checkSourceOverlap). I said a hell of a stateful branchful code, but you know it even better than I ;) Since this is an assert for the debug, perhaps the smart thing to do is make the assert test more thoughtful in some manner. The typical case simply doesn't matter, since all we do is load a wasted word. Where things do need some care is at the end of object memory and page boundaries etc. |
Hehe, I think I fixed it in VMMaker.oscog-nice.2563 The ultimate tool for understanding and debugging this was pen and paper! I don't know how young techies will do ;) Le mer. 4 sept. 2019 à 21:24, Nicolas Cellier <[hidden email]> a écrit :
|
Thats a great result Nicolas, after its been outstanding for so long.
+ bbW <= startBits - bbW < startBits + preload := bbW > startBits and: [(m1 bitAnd: mask1) ~= mask1]. "i.e. there are some missing bits" - preload := (m1 bitAnd: mask1) ~= mask1. "i.e. there are some missing bits" I nominate you for the Charles Steinmetz Chalk Mark Medal... https://playingintheworldgame.com/2014/07/19/the-chalk-mark/ cheers -ben On Sat, 14 Sep 2019 at 03:03, Nicolas Cellier <[hidden email]> wrote:
|
In reply to this post by Nicolas Cellier
On Fri, Sep 13, 2019 at 12:03 PM Nicolas Cellier <[hidden email]> wrote:
:-) Thank you!!!!!!!
_,,,^..^,,,_ best, Eliot |
In reply to this post by Ben Coman
+1 Best, Karl On Sat, Sep 14, 2019 at 2:00 AM Ben Coman <[hidden email]> wrote:
|
In reply to this post by Nicolas Cellier
On Fri, Sep 13, 2019 at 9:03 PM Nicolas Cellier <[hidden email]> wrote:
Great, I ported your fixes to GraalSqueak which contains a Java version of BitBlt. Glad that I could remove a few workarounds to avoid some IndexOutOfBounds errors, but even better that those segfaults are gone. Well done! Fabio
|
The sole problem is whether a single chalk mark will cover all the possible cases of failures... Let's cross fingers. Le lun. 16 sept. 2019 à 18:38, Fabio Niephaus <[hidden email]> a écrit :
|
In reply to this post by fniephaus
Hi Fabio, The only GraalSqueak repository I could find (which is also the one mentioned in the papers) is at https://github.com/hpi-swa/graalsqueak, but it has not been touched in the last two years. Is there a different public repository or are you only working on it in private? Thank you, Florin On Mon, Sep 16, 2019 at 11:38 AM Fabio Niephaus <[hidden email]> wrote:
|
Hi Florin, Thanks for your interest! Yes, we are still developing GraalSqueak in a private repository at the moment, but we are currently working towards open sourcing it for the MPLR paper presentation [1] mid October. Best, Fabio On Wed, Sep 25, 2019 at 5:34 AM Florin Mateoc <[hidden email]> wrote:
|
Free forum by Nabble | Edit this page |