More on deployed exe problem

Bill Schwab
Andy and Blair,

(I'm trying this again due to an error - sorry if it posts twice)

Below is an error log for a condition that I've seen a couple of times now.
It looks like an initial error (which might be secondary to some other
problem) followed by some kind of recursive disaster.  While I haven't again
gone through the process of rebuilding my image from packages, that will
probably fix it.

At this point, I suspect that some kind of triggering event occurs in the
development image, after which deployed executables fail as below.  It's
difficult to get detailed information after it occurs because the machine
claims to be starved for resources, and basically nothing else runs until I
reboot.  Sometimes the failed app writes a log, sometimes not - well, it's
probably more a matter of whether the file has time and/or memory available
to flush to disk.  Crash dumps have been intermittent, and usually empty.

It's still possible that my computer is flaky, but, given the same failure
mode as before and no other obvious problems, I am now suspecting a problem
in Dolphin or my image.  This seems to be happening in primary startup,
before any of my home-grown startup code runs, so I doubt that's the
problem, though I can work on converting it.

Any suggestions for a bang-on-the-box fix?

Have a good one,

Bill



5:04:10 PM, Monday, January 01, 2001: Unhandled exception in
ScheduleWizardSession - a MessageNotUnderstood('VMLibrary does not
understand #asInteger'):
 'VMLibrary does not understand #asInteger'

VMLibrary(Object)>>doesNotUnderstand:
SmallInteger(VMLibrary)>>handleFromInteger:
SmallInteger(Integer)>>asExternalHandle
RPCLibrary(ExternalLibrary)>>handle:
RPCLibrary class(ExternalLibrary class)>>clear
[] in ExternalLibrary class>>onStartup
OrderedCollection>>do:
ExternalLibrary class>>onStartup
ScheduleWizardSession(SessionManager)>>openLibraries
ScheduleWizardSession(SessionManager)>>basicPrimaryStartup
ScheduleWizardSession(SessionManager)>>primaryStartup
ScheduleWizardSession(RuntimeSessionManager)>>primaryStartup
[] in ScheduleWizardSession(SessionManager)>>onStartup:
BlockClosure>>ensure:
ScheduleWizardSession(SessionManager)>>onStartup:
ProcessorScheduler>>onStartup:
[] in ProcessorScheduler>>vmi:list:no:with:
BlockClosure>>ifCurtailed:
ProcessorScheduler>>vmi:list:no:with:
ImageStripper>>{unbound}snapshot:
ImageStripper>>{unbound}saveExecutable:withStub:
MessageSend>>value
InputState>>loopWhile:
InputState>>mainLoop
[] in InputState>>forkMain
ExceptionHandler(ExceptionHandlerAbstract)>>markAndTry
[] in ExceptionHandler(ExceptionHandlerAbstract)>>try:
BlockClosure>>ensure:
ExceptionHandler(ExceptionHandlerAbstract)>>try:
BlockClosure>>on:do:
[] in BlockClosure>>newProcess


5:04:10 PM, Monday, January 01, 2001: Unhandled exception in
ScheduleWizardSession - a GPFault(5:04:10 PM, Monday, January 01, 2001:
Unhandled exception in ScheduleWizardSession - a GPFault(5:04:10 PM, Monday,
January 01, 2001: Unhandled exception in ScheduleWizardSession - a
GPFault(5:04:10 PM, Monday, January 01, 2001: Unhandled exception in
ScheduleWizardSession - a GPFault(5:04:10 PM, Monday, January 01, 2001:
Unhandled exception in ScheduleWizardSession - a GPFault(5:04:10 PM, Monday,
January 01, 2001: Unhandled exception in ScheduleWizardSession - a
GPFault(5:04:10 PM, Monday, January 01, 2001: Unhandled exception in
ScheduleWizardSession - a GPFault(5:04:10 PM, Monday, January 01, 2001:
Unhandled exception in ScheduleWizardSession - a GPFault(5:04:10 PM, Monday,
January 01, 2001: Unhandled exception in ScheduleWizardSession - a
GPFault(5:04:10 PM, Monday, January 01, 2001: Unhandled exception in
ScheduleWizardSession - a GPFault(5:04:10 PM,

--
Wilhelm K. Schwab, Ph.D.
[hidden email]



Re: More on deployed exe problem

Blair McGlashan
Bill

You wrote in message news:92r2bi$icg$[hidden email]...
> ...
> Below is an error log for a condition that I've seen a couple of times
> now.
>....
> Any suggestions for a bang-on-the-box fix?
> ....
>  5:04:10 PM, Monday, January 01, 2001: Unhandled exception in
> ScheduleWizardSession - a MessageNotUnderstood('VMLibrary does not
> understand #asInteger'):
>  'VMLibrary does not understand #asInteger'
>
> VMLibrary(Object)>>doesNotUnderstand:
> SmallInteger(VMLibrary)>>handleFromInteger:
> SmallInteger(Integer)>>asExternalHandle
> RPCLibrary(ExternalLibrary)>>handle:
> RPCLibrary class(ExternalLibrary class)>>clear
> [] in ExternalLibrary class>>onStartup
> OrderedCollection>>do:
> ExternalLibrary class>>onStartup
> ScheduleWizardSession(SessionManager)>>openLibraries

I think I may have found and fixed this problem this very evening. Please
try the attached patch, although I would caution that it has had very
limited testing.

What I found to be happening is that a recent patch to the external library
startup code had introduced a sequencing problem. On image startup all
default library instances are cleared down (they used to be discarded
entirely pre 4.0), which involves nulling the handles and clearing cached
proc addresses, with this being done by passing zero to the #handle: setter
method. Unfortunately that method sends #asExternalHandle to its argument
before storing it down, and the implementation of that method in Integer
uses a VMLibrary exported function to rapidly convert a 32-bit Integer to a
byte object. What happens then is that the old VMLibrary proc address is
called, because it has not yet been cleared and so is still cached from the
previous image run, and a GPF results. Typically one gets away with calling
the old VM proc address, as the VM DLL is almost always loaded at the same
address, but this is not absolutely guaranteed.
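
The sequence described above can be sketched roughly as follows. This is a reconstruction from the description and from the stack trace in the original log, not the actual pre-patch source: all three method bodies are assumptions, and the listing is for reading only, not for filing in.

```smalltalk
"Reconstruction only: the real pre-patch source is not shown in this thread."

"ExternalLibrary>>handle: (assumed pre-patch form: always converts its argument)"
handle: aHandle
	handle := aHandle asExternalHandle

"Integer>>asExternalHandle (assumed form, inferred from the stack trace)"
asExternalHandle
	^VMLibrary default handleFromInteger: self

"ExternalLibrary class>>clear (assumed pre-patch form: passes zero, not nil)"
clear
	default notNil ifTrue: [default handle: 0].
	self clearMethodDictionary: self methodDictionary
```

If the code did look like this, then clearing any library before VMLibrary itself would route through VMLibrary's still-cached function address from the previous run, which matches the failure in the log.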

I hope this solves your immediate problem. Please let me know if it doesn't,
or if you experience any side effects.

Regards

Blair


begin 666 ExternalLibrary_clear.st
M(45X=&5R;F%L3&EB<F%R>2!M971H;V1S1F]R(0T*#0IH86YD;&4Z(&%(86YD
M;&4-"@DB4')I=F%T92 M(%-E="!T:&4@:&%N9&QE(&]F('1H92!E>'1E<FYA
M;"!L:6)R87)Y('=H:6-H('1H92!R96-E:79E<B!R97!R97-E;G1S+@T*"4%N
M<W=E<B!T:&4@<F5C96EV97(N(@T*#0H):&%N9&QE(#H](&%(86YD;&4@:7-.
M:6P@:69&86QS93H@6V%(86YD;&4@87-%>'1E<FYA;$AA;F1L95TA("$-"B%%
M>'1E<FYA;$QI8G)A<GD@8V%T96=O<FEE<T9O<CH@(VAA;F1L93HA86-C97-S
M:6YG(7!R:79A=&4A("$-"@T*(45X=&5R;F%L3&EB<F%R>2!C;&%S<R!M971H
M;V1S1F]R(0T*#0IC;&5A<@T*"2)0<FEV871E("T@0VQE87(@9&]W;B!C86-H
M960@97AT97)N86P@9G5N8W1I;VX@861D<F5S<V5S(&9R;VT@<')E=FEO=7,@
M<G5N<RX@#0H)5&AE(&1E9F%U;'0@:6YS=&%N8V5S('=I;&P@8F4@;&%Z:6QY
M(')E+6]P96YE9"!B96-A=7-E('1H96ER(&AA;F1L97,@=VEL;"!B92!N=6QL
M#0H);VX@:6UA9V4@<F4M<W1A<G0@*$5X=&5R;F%L2&%N9&QE<R!A<F4@875T
M;VUA=&EC86QL>2!N=6QL960@8GD@=&AE(%9-(&]N(&EM86=E(&QO860I+@T*
M"5-I;6EL87)Y(&9U;F-T:6]N(&%D9')E<W-E<R!W:6QL(&)E(&QA>FEL>2!R
M97%U97)I960@87,@<F5Q=6ER960N(@T*#0H)9&5F875L="!N;W1.:6P@:694
M<G5E.B!;9&5F875L="!H86YD;&4Z(&YI;%TN#0H)<V5L9B!C;&5A<DUE=&AO
M9$1I8W1I;VYA<GDZ('-E;&8@;65T:&]D1&EC=&EO;F%R>2XA("$-"B%%>'1E
M<FYA;$QI8G)A<GD@8VQA<W,@8V%T96=O<FEE<T9O<CH@(V-L96%R(6EN:71I
686QI>FEN9R%P<FEV871E(2 A#0H-"@``
`
end



Re: More on deployed exe problem

Steve Waring-2
Hi Blair,

Could I have been seeing this problem with the WDK? I have never been able
to nail down an example that reproduces the problem, however it looks very
similar.

This only occurs when running as an applet under win95/98, and only when
using binary packages. In w2k I have not seen it. I have seen this in two of
my packages, both have loose methods added to ExternalLibrary subclasses.

The error message in IE is;
Invalid access to memory location. Reading 0x77EB7BCA, IP 0x77EB7BCA ()

The stack dump consistently shows;
[0x087E068C: 402]-->a KernelLibrary
[0x087E0688: 401]-->71238456
[0x087E0684: 400]-->VMLibrary>>crashDump:
[0x087E0680: 399]-->71238468
[0x087E067C: 398]-->13
[0x087E0678: 397]-->71238444
[0x087E0674: 396]-->a DWORDArray
[0x087E0670: 395]-->'Invalid access to memory location. Reading 0x77EB7BCA,
IP 0x77EB7BCA ()'
[0x087E066C: 394]-->a VMLibrary
[0x087E0668: 393]-->71238442
[0x087E0664: 392]-->PluginSessionManager>>logError:

And if I look back to where the problem starts I see something like;

{0x088D0400: cf 0x088D03ED, sp 0x088D0410, bp 0x08834128, ip 20,
ProcessorScheduler>>vmi:list:no:with:}
{0x088D03EC: cf 0x088D03D1, sp 0x088D03FC, bp 0x088D03E8, ip 1,
SmallInteger(Object)>>doesNotUnderstand:}
{0x088D03D0: cf 0x088D03B9, sp 0x088D03E0, bp 0x088D03C4, ip 3,
SmallInteger(GDILibrary)>>polyBezier:lppt:cPoints:}
{0x088D03B8: cf 0x088D03A5, sp 0x088D03BC, bp 0x088346D8, ip 41,
Canvas>>polyBezier:}

I have been avoiding this by just testing for win95/98 and disabling the
features that rely on these methods.


If this is the same problem, and there is a fix to the base image, how does
that get applied to the plug-in? Is there a facility in place to "live
update" plug-ins? Do you have a secret code to replace methods with methods
in a binary package :) ?

Thanks
Steve



Re: More on deployed exe problem

Blair McGlashan
Steve

You wrote in message news:92rcnk$7q6qp$[hidden email]...
> ...
> Could I have been seeing this problem with the WDK? I have never been able
> to nail down an example that reproduces the problem, however it looks very
> similar.

No, I don't think so, at least in this case. This is a problem that occurs
very early on in startup processing, which doesn't match up with the
symptoms you are seeing (basically the plugin wouldn't work at all).

>
> This only occurs when running as an applet under win95/98, and only when
> using binary packages. In w2k I have not seen it. I have seen this in two
> of my packages, both have loose methods added to ExternalLibrary subclasses.
> ...
> And if I look back to where the problem starts I see something like;
>
> {0x088D0400: cf 0x088D03ED, sp 0x088D0410, bp 0x08834128, ip 20,
> ProcessorScheduler>>vmi:list:no:with:}
> {0x088D03EC: cf 0x088D03D1, sp 0x088D03FC, bp 0x088D03E8, ip 1,
> SmallInteger(Object)>>doesNotUnderstand:}
> {0x088D03D0: cf 0x088D03B9, sp 0x088D03E0, bp 0x088D03C4, ip 3,
> SmallInteger(GDILibrary)>>polyBezier:lppt:cPoints:}
> {0x088D03B8: cf 0x088D03A5, sp 0x088D03BC, bp 0x088346D8, ip 41,
> Canvas>>polyBezier:}
>

Hmmm, that obviously needs a little further investigation. If you could send
me the full crash dump I'll see if I can deduce anything further.

> If this is the same problem, and there is a fix to the base image, how
> does that get applied to the plug-in?

By releasing a new plugin. Not ideal perhaps, but there is currently no
"official" way to patch a plugin image.

>....Is there a facility in place to "live
> update" plug-ins? Do you have a secret code to replace methods with
> methods in a binary package :) ?

No, not at present. I imagine it would be possible by invoking the compiler
directly though.
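
A rough sketch of what that might look like, assuming the compiler is actually available in the target image (it may well not be in a stripped plug-in image) and assuming the standard `Behavior>>compile:` message is present. The selector `#exampleAnswer` is hypothetical and used only for illustration.

```smalltalk
"Hypothetical example: compile a replacement method from source text.
 #exampleAnswer is a made-up selector; substitute the real class and method."
Object compile: 'exampleAnswer
	"Illustrative method body only."
	^42'.
```

This is untested against a plug-in image, and a method installed this way would not persist beyond the running session unless the image is saved.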

Regards

Blair



Re: More on deployed exe problem

Bill Schwab-2
In reply to this post by Blair McGlashan
Blair,

> What I found to be happening is that a recent patch to the external
> library startup code had introduced a sequencing problem. On image startup
> all default library instances are cleared down (they used to be discarded
> entirely pre 4.0), which involves nulling the handles and clearing cached
> proc addresses, with this being done by passing zero to the #handle:
> setter method. Unfortunately that method sends #asExternalHandle to its
> argument before storing it down, and the implementation of that method in
> Integer uses a VMLibrary exported function to rapidly convert a 32-bit
> Integer to a byte object. What happens then is that the old VMLibrary proc
> address is called, because it has not yet been cleared and so is still
> cached from the previous image run, and a GPF results.

That's a sneaky one: congratulations on finding it!


> Typically one gets away with calling
> the old VM proc address, as the VM DLL is almost always loaded at the same
> address, but this is not absolutely guaranteed.

This could explain the problems I was seeing, both on one machine between
development and deployed images, and (even more easily) the image that did
well on one machine and not the other.


> I hope this solves your immediate problem. Please let me know if it
> doesn't, or if you experience any side effects.

There are times when life would be easier if I didn't have to show up at
work :)  The interesting test will be to apply the patch to the most recent
image I was using at home, and deploy an executable from it.  If the exe
runs, it'll be pretty good evidence that you fixed the problem.  For now,
I'll make sure I have a good backup and give it a try here in my office.

Thanks!!

Bill

--
Wilhelm K. Schwab, Ph.D.
[hidden email]



Re: More on deployed exe problem

Bill Schwab-2
Hi Blair,

It's been a slightly interesting day :)   I'm gradually working my way
around to having an image that works.  As I had feared might be the case, my
image from yesterday would not load on my office computer.  Being "stuck"
here by various circumstances, it seemed reasonable to build an image and
get something done.  As it turns out, it was a useful exercise.

I was reminded of the following, which I don't believe I have previously
mentioned:

!AXTypeInfoAnalyzer class methodsFor!

onTypeInfo: piTypeInfo
 "Answer a sub-instance of the receiver of an appropriate class to wrap
 the <ITypeInfo>, piTypeInfo, referencing the correct unique
 <AXTypeLibraryAnalyzer> instance that represents its containing type
 library."

 | contain answer lib |
 #wksDangerous. "Added #asParameter"
 contain := piTypeInfo asParameter libraryAndIndex.
 lib := contain key.
 answer := (AXTypeLibraryAnalyzer onTypeLib: contain key)
    typeAnalyzerAt: contain value.
 lib free.
 ^answer
! !
!AXTypeInfoAnalyzer class categoriesFor: #onTypeInfo:!instance creation!public! !

I found this necessary (well, it fixed a problem I was having) at one point.


With respect to installations and patch levels, _something_ isn't right in
Gainesville.  It could simply be that I didn't go through the right steps,
which would be understandable given that I had to restore some backups and
return to what I was doing.  However, one detail that I wanted to run down
was whether or not I still need my overlapped connect patches; I thought you
had addressed that, but, didn't see it in my image.  Upon inspecting the
contents of PL1, I see that the overlapped connect fixes are present.  So,
one of the following seems likely: (1) I blew it and need to start over; (2)
SE patches aren't applied to a PRO install; (3) my image is confused about
its level; (4) there might be a couple of versions of PL1 floating around
(maybe one from the updated MSI and one from the live update??); (5) did I
mention that I might have made a mistake somewhere<g>.

I think it's time to double check that your recent patch is in my new image,
make a backup, and try to build it again paying close attention to PL1.

Have a good one,

Bill

--
Wilhelm K. Schwab, Ph.D.
[hidden email]



Re: More on deployed exe problem

Steve Waring-2
> With respect to installations and patch levels, _something_ isn't right in
> Gainesville.  It could simply be that I didn't go through the right steps,

Bill,

Do you remember a problem with the PersonalMoney.pac download? My first
attempt at installing pl1 only installed patches before the
PersonalMoney.pac  download, and not after it. My image showed PL1, and
except for a download timeout, which in hindsight I incorrectly restarted,
there was no indication I had not fully installed the patch.

Blair posted that evaluating;
SessionManager current productDetails at: 5 put: 0

resets the patch level, and Live update will re-install the patch.
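
Before resetting, it may be worth checking what the image currently reports. A minimal sketch, assuming (as the expression above implies) that element 5 of the productDetails array holds the patch level; evaluate each line in a workspace:

```smalltalk
"Display the patch level the image believes it is at
 (assumes index 5 is the patch level, as in the expression above)."
SessionManager current productDetails at: 5.

"Reset it so that Live Update will re-install the patch."
SessionManager current productDetails at: 5 put: 0.
```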

Steve



Re: More on deployed exe problem

Bill Schwab-2
Steve,

> Do you remember a problem with the PersonalMoney.pac download? My first
> attempt at installing pl1 only installed patches before the
> PersonalMoney.pac  download, and not after it. My image showed PL1, and
> except for a download timeout, which in hindsight I incorrectly restarted,
> there was no indication I had not fully installed the patch.

Thanks for the reminder.  Not long ago, I read this, and some other items in
the archive.  It got me further along but...


> Blair posted that evaluating;
> SessionManager current productDetails at: 5 put: 0
>
> resets the patch level, and Live update will re-install the patch.

I challenge<g> you to get the overlapped connect method to file in from the
patch.  Copy/paste/accept did the trick, but, it's obviously far from ideal.
BTW, the companion method that grabs and stores the error _did_ file in
though - weird.

One more comment re personal money: I'm not really able to explain how I got
the package.  The patch didn't want to work in the default image; when I
saved it to my working location, it "worked", though I don't know whether it
was simply an older copy of the package that made it happy.  Of course, if
that were the case, then I'd have to explain where it came from =:0

Anybody else having a problem with the connect method?

Have a good one,

Bill

--
Wilhelm K. Schwab, Ph.D.
[hidden email]



Re: More on deployed exe problem

Bill Schwab
In reply to this post by Bill Schwab
Hi Blair,

It looks like it works!  Applying the patch to the offending image allowed
it to deploy a working executable.  I probably should get a backup of the
patched image and transport it to my office to see if it will then be able
to load; I'm running against the clock right now though.

BTW, it seems that my commercial ISP will not show me attachments in this
group.  Is that common?  Because of this, I had to resort to my memory of
which methods you had in your patch, recorded below.  I filed them out of
the rebuilt/patched image from yesterday and then into the troubled image
which would load only at home.

I have an account with one of the free news providers which lets me see
attachments, though it appears that neither will allow me to post
attachments.

Thanks for the patch!

Have a good one,

Bill

--
Wilhelm K. Schwab, Ph.D.
[hidden email]


!ExternalLibrary methodsFor!

handle: aHandle
 "Private - Set the handle of the external library which the receiver
 represents.
 Answer the receiver."

 handle := aHandle isNil ifFalse: [aHandle asExternalHandle]! !
!ExternalLibrary categoriesFor: #handle:!accessing!private! !

!ExternalLibrary class methodsFor!

clear
 "Private - Clear down cached external function addresses from previous
 runs.
 The default instances will be lazily re-opened because their handles will
 be null on image re-start (ExternalHandles are automatically nulled by the
 VM on image load).
 Similarly function addresses will be lazily requeried as required."

 default notNil ifTrue: [default handle: nil].
 self clearMethodDictionary: self methodDictionary.! !
!ExternalLibrary class categoriesFor: #clear!initializing!private! !



Patch worked for me

Daryl Richter
In reply to this post by Blair McGlashan
My 4.0 deployment images were all GPFing.  Applying this patch fixed it for
me.  So far no adverse effects observed.

--
Regards,
Daryl

<< Sun, Zoom, Spark! >>

"Blair McGlashan" <[hidden email]> wrote in message
news:fq846.43493$[hidden email]...

> Bill
>
> You wrote in message news:92r2bi$icg$[hidden email]...
> > ...
> > Below is an error log for a condition that I've seen a couple of times
> > now.
> >....
> > Any suggestions for a bang-on-the-box fix?
> > ....
> >  5:04:10 PM, Monday, January 01, 2001: Unhandled exception in
> > ScheduleWizardSession - a MessageNotUnderstood('VMLibrary does not
> > understand #asInteger'):
> >  'VMLibrary does not understand #asInteger'
> >
> > VMLibrary(Object)>>doesNotUnderstand:
> > SmallInteger(VMLibrary)>>handleFromInteger:
> > SmallInteger(Integer)>>asExternalHandle
> > RPCLibrary(ExternalLibrary)>>handle:
> > RPCLibrary class(ExternalLibrary class)>>clear
> > [] in ExternalLibrary class>>onStartup
> > OrderedCollection>>do:
> > ExternalLibrary class>>onStartup
> > ScheduleWizardSession(SessionManager)>>openLibraries
>
> I think I may have found and fixed this problem this very evening. Please
> try the attached patch, although I would caution that it has had very
> limited testing.
>
> What I found to be happening is that a recent patch to the external
> library startup code had introduced a sequencing problem. On image startup
> all default library instances are cleared down (they used to be discarded
> entirely pre 4.0), which involves nulling the handles and clearing cached
> proc addresses, with this being done by passing zero to the #handle:
> setter method. Unfortunately that method sends #asExternalHandle to its
> argument before storing it down, and the implementation of that method in
> Integer uses a VMLibrary exported function to rapidly convert a 32-bit
> Integer to a byte object. What happens then is that the old VMLibrary proc
> address is called, because it has not yet been cleared and so is still
> cached from the previous image run, and a GPF results. Typically one gets
> away with calling the old VM proc address, as the VM DLL is almost always
> loaded at the same address, but this is not absolutely guaranteed.
>
> I hope this solves your immediate problem. Please let me know if it
> doesn't, or if you experience any side effects.
>
> Regards
>
> Blair
>
>
>