Invalid access to memory location during GC

Invalid access to memory location during GC

Chris Hayes-3
Something in my image went horribly, horribly wrong and I'm now getting an
"Invalid access to memory location" error every time a GC occurs (an attempt
to read some area of memory is failing).

Is there any way to recover from this situation without starting from a new
image?

Here's the stack that leads up to the fault:

ProcessorScheduler>>gpFault:
[] in ProcessorScheduler>>vmi:list:no:with:
BlockClosure>>ifCurtailed:
ProcessorScheduler>>vmi:list:no:with:
SmallInteger(Object)>>primitiveFailed
SmallInteger(MemoryManager)>>primCollectGarbage:
MemoryManager>>collectGarbage
MemoryManager>>aboutToIdle
InputState>>idleWinV5
InputState>>idleLoop

Thanks,

Chris Hayes



Re: Invalid access to memory location during GC

Blair McGlashan
"Chris Hayes" <hayes@*zapthis*.creative-computing-inc.com> wrote in message
news:%VpN8.28606$[hidden email]...
> Something in my image went horribly, horribly wrong and I'm now getting an
> "Invalid access to memory location" every time GC occurs (some area of
> memory is trying to be read).
>
> Is there any way to recover from this situation without starting from a
> new image?

No, almost certainly not (sorry). Although one glimmer of hope is that it is
very difficult to save down an image in this form because a full GC is run
before the snapshot, so normally the save fails before anything is written.
Also I think it unlikely that the image loader could successfully load a
corrupt image that the GC is unable to traverse. Therefore you might be able
to do something with 'prestart.st'. Place this file in the image directory,
then attempt whatever cleanup you can think of in there. It is a chunk
format file loaded quite early during startup - after the #primaryStartup
stage, but before initializing the windowing system, etc. You could put an
expression in it such as:

    SessionManager inputState windows become: LookupTable new.

(which will discard all the windows saved down with the image).

BTW: These sorts of problems are normally caused by external library calls
writing off the end of an object. Chris Uppal found a case of this very
recently in the implementation of the CRTLibrary>>sprintfWith:with: method,
part of the base system. It is virtually impossible to corrupt memory from
within Smalltalk itself, although by taking the address of an object and
using one of the ExternalAddress primitives one can write in an unbounded
way to memory if one so desires. Therefore the second most likely suspect is
a mis-sized external structure. We almost always use the ActiveX Component
Wizard to build external structures automatically from type libraries these
days, even if it means writing the IDL (or bodging it up from C header
files), since it removes the possibility of errors due to calculating the
wrong size or offset for fields. It was surprising how many such errors we
found in the structures we had originally built by hand when we re-generated
them using the wizard.
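
To illustrate the kind of size and offset error described above, here is a minimal C sketch that compares a hand-written field layout against what the compiler actually produces. Everything in it is hypothetical: `struct Sample`, `FieldSpec` and `count_layout_errors` are names invented for the example and have nothing to do with Dolphin or the ActiveX Component Wizard. It only shows why hand-calculated offsets tend to go wrong once padding is involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical structure, standing in for one declared in a C header. */
struct Sample {
    char   tag;     /* compilers usually insert padding after this */
    int    count;
    double value;
    short  flags;   /* usually followed by tail padding */
};

/* One field of a layout description: a name and a byte offset. */
struct FieldSpec {
    const char *name;
    size_t      offset;
};

#define SAMPLE_FIELD_COUNT 4

/* The layout as the compiler sees it, padding included. */
static const struct FieldSpec sample_actual[SAMPLE_FIELD_COUNT] = {
    { "tag",   offsetof(struct Sample, tag)   },
    { "count", offsetof(struct Sample, count) },
    { "value", offsetof(struct Sample, value) },
    { "flags", offsetof(struct Sample, flags) },
};

/* Compare a hand-written description against the compiler's layout.
   Returns the number of disagreements: wrong offsets, unknown field
   names, and a wrong total size each count as one. */
static int count_layout_errors(const struct FieldSpec *claimed, size_t n,
                               size_t claimed_size)
{
    int errors = 0;
    assert(claimed != NULL);

    for (size_t i = 0; i < n; i++) {
        int found = 0;
        for (size_t j = 0; j < SAMPLE_FIELD_COUNT; j++) {
            if (strcmp(claimed[i].name, sample_actual[j].name) != 0)
                continue;
            found = 1;
            if (claimed[i].offset != sample_actual[j].offset) {
                printf("%s: claimed offset %zu, actual %zu\n",
                       claimed[i].name, claimed[i].offset,
                       sample_actual[j].offset);
                errors++;
            }
        }
        if (!found) {
            printf("%s: no such field\n", claimed[i].name);
            errors++;
        }
    }
    if (claimed_size != sizeof(struct Sample)) {
        printf("size: claimed %zu, actual %zu\n",
               claimed_size, sizeof(struct Sample));
        errors++;
    }
    return errors;
}
```

A description that simply adds up the field sizes (tag at 0, count at 1, and so on) would be flagged on most compilers, because the real offsets include alignment padding. The exact numbers depend on the platform and compiler, so none are quoted here.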

Regards

Blair



Re: Invalid access to memory location during GC

Chris Hayes-3
"Blair McGlashan" <[hidden email]> wrote in message
news:ae5cik$43iqs$[hidden email]...
>
> No, almost certainly not (sorry). Although one glimmer of hope is that it
> is very difficult to save down an image in this form because a full GC is
> run before the snapshot, so normally the save fails before anything is
> written. Also I think it unlikely that the image loader could successfully
> load a corrupt image that the GC is unable to traverse. Therefore you might
> be able to do something with 'prestart.st'. Place this file in the image
> directory, then attempt whatever cleanup you can think of in there. It is a
> chunk format file loaded quite early during startup - after the
> #primaryStartup stage, but before initializing the windowing system, etc.
> You could put an expression in it such as:
>

Thanks for the response, Blair.  I wasn't making much headway so I went
ahead and started with a clean image.  Also thanks to Ian and his
ChunkBrowser for making recovery much less painful than it could have been.
I really need to start saving my packages more frequently :-).

Regards,

Chris



Re: Invalid access to memory location during GC

rush
"Blair McGlashan" <[hidden email]> wrote in message
news:ae5cik$43iqs$[hidden email]...
> BTW: These sorts of problems are normally caused by external library calls
> writing off the end of an object. Chris Uppal found a case of this very
> recently in the implementation of the CRTLibrary>>sprintfWith:with: method,
> part of the base system. It is virtually impossible to corrupt memory from
> within Smalltalk itself, although by taking the address of an object and
> using one of the ExternalAddress primitives one can write in an unbounded
> way to memory if one so desires. Therefore the second most likely suspect
> is a mis-sized external structure. We almost always use the ActiveX
> Component Wizard to build external structures automatically from type
> libraries these days, even if it means writing the IDL (or bodging it up
> from C header files), since it removes the possibility of errors due to
> calculating the wrong size or offset for fields. It was surprising how many
> such errors we found in the structures we had originally built by hand when
> we re-generated them using the wizard.

Blair,

have you ever contemplated creating a kind of debug VM, where objects
would carry some kind of checksum, so that it would be easier to spot when an
object has been overwritten? When an external call finishes, the VM could
then check neighbouring objects for validity. I know it would be dog slow,
but maybe it could help to nail some hard problems more easily. Also,
building such a VM could be anything from a nightmare to a more or less easy
job, depending on how many places the object layout is hardwired in, and on
how uniform object write access is.
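
A rough C sketch of the idea, under heavy assumptions: a toy arena in which each object is preceded by a small header whose check word is derived from its size, plus a scan that could be run after an external call returns. None of this reflects how the Dolphin VM lays out objects; `Arena`, `ObjHeader`, `arena_alloc` and `arena_first_corrupt` are made-up names, and the check covers only the header, not the object body, so ordinary writes to an object do not invalidate it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ARENA_BYTES  256
#define HEADER_MAGIC 0xA5C3F00Du   /* arbitrary constant for the check word */

/* Hypothetical per-object header for the toy arena. */
struct ObjHeader {
    uint32_t size;    /* body size in bytes */
    uint32_t check;   /* size ^ HEADER_MAGIC */
};

struct Arena {
    unsigned char bytes[ARENA_BYTES];
    size_t        used;
};

/* Allocate a zeroed body of `size` bytes, preceded by a checked header.
   Returns NULL when the arena is full. */
static void *arena_alloc(struct Arena *a, uint32_t size)
{
    struct ObjHeader h;
    assert(a != NULL);
    if (size > ARENA_BYTES || a->used + sizeof h + size > ARENA_BYTES)
        return NULL;

    h.size  = size;
    h.check = size ^ HEADER_MAGIC;
    /* memcpy avoids alignment and aliasing problems in the byte array. */
    memcpy(a->bytes + a->used, &h, sizeof h);

    void *body = a->bytes + a->used + sizeof h;
    memset(body, 0, size);
    a->used += sizeof h + size;
    return body;
}

/* Walk every object and validate its header.
   Returns the byte offset of the first bad header, or -1 if all are intact. */
static long arena_first_corrupt(const struct Arena *a)
{
    size_t off = 0;
    assert(a != NULL);

    while (off < a->used) {
        struct ObjHeader h;
        if (a->used - off < sizeof h)
            return (long)off;
        memcpy(&h, a->bytes + off, sizeof h);
        if (h.check != (h.size ^ HEADER_MAGIC) ||
            h.size > a->used - off - sizeof h)
            return (long)off;
        off += sizeof h + h.size;
    }
    return -1;
}
```

Calling `arena_first_corrupt` after each external call would catch a write that runs off the end of one object into the header of the next, and the returned offset points at the victim, so the culprit is most likely the object just before it. It would not catch writes that land entirely inside a neighbouring body, and scanning the whole arena on every call is exactly the "dog slow" cost mentioned above.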

Davorin Rusevljan