Mantis #7695

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

Mantis #7695

John McKeon
Hello list,
I thought I would take a stab at making a contribution to this software I so love.
So I posted a file out to fix this issue and would ask if someone could review it.

Thanks


Reply | Threaded
Open this post in threaded view
|

Re: Mantis #7695

Chris Muller-3
*** (Question for Bert and Josh Gargus, below).

OMG Thank You John!  I cannot review your solution but I tested it.  I
have been suffering this major problem for a long time now.  It's
awful, actually -- just working, deep in thought and suddenly -- the
image locking up seemingly for no reason.  I did eventually realize it
was due to a bug in my alarm code, but only after lost time and work.
I'm sure this was not an easy find / fix, thank you!

Maui uses alarms so I used that to test it.  I purposely broke my
alarm-code to have a DNU in it.  Without your fix, the image locks up.
 With your fix, I actually get TWO debuggers telling me the problem --
which is MUCH better than locking up.

I see all you did was fork valuing the alarm..?  Interesting.

Even more interesting, I went back to Andreas' original version of the
method (from year 2000).  When I revert to his version, I get no lock
up and just one debugger as I would expect.  It's perfect.

And this suddenly made me realize that I think Josh's change
introduced even another problem which I've been seeing and unable to
track down -- Maui messages inexplicably firing TWICE.  With your fix
having two debuggers appear vs. one with Andreas' original  method, it
sure implicates this method for THAT problem too!

*** Question for Bert and Josh:

This was introduced in Morphic-jcg.301 and Morphic-jcg.302.  The
comment for 301 says it was inspired by Berts suggestion in the
"Future Sends" thread.  I found that in the archives at around the
same time as this change, but I do not understand the discussion, nor
this change.  Could you guys please revisit this and see if you can
remember what this is about so we can fix it right?  As it is, it
really hurts.

I'd be happy to help test if necessary.

Thanks,
  Chris



On Tue, Sep 11, 2012 at 7:13 PM, John McKeon <[hidden email]> wrote:

> Hello list,
> I thought I would take a stab at making a contribution to this software I so
> love.
> So I posted a file out to fix this issue and would ask if someone could
> review it.
>
> Thanks
> John McKeon
>
> --
> jmck.seasidehosting.st
>
>
>

Reply | Threaded
Open this post in threaded view
|

Re: Mantis #7695

Frank Shearar-3
On 12 September 2012 02:23, Chris Muller <[hidden email]> wrote:

> *** (Question for Bert and Josh Gargus, below).
>
> OMG Thank You John!  I cannot review your solution but I tested it.  I
> have been suffering this major problem for a long time now.  It's
> awful, actually -- just working, deep in thought and suddenly -- the
> image locking up seemingly for no reason.  I did eventually realize it
> was due to a bug in my alarm code, but only after lost time and work.
> I'm sure this was not an easy find / fix, thank you!
>
> Maui uses alarms so I used that to test it.  I purposely broke my
> alarm-code to have a DNU in it.  Without your fix, the image locks up.
>  With your fix, I actually get TWO debuggers telling me the problem --
> which is MUCH better than locking up.
>
> I see all you did was fork valuing the alarm..?  Interesting.

But that makes all the difference in the world: without the fork, any
errors raised will unwind the stack past WorldState >>
#triggerAlarmsBefore:, letting "userland" fails eat into "kernel"
parts of the code.

Without being an expert in this _at all_, the lock's there because
using futures means multiple Processes adding alarms. (I would
actually prefer to see futures using something similar to #fork rather
than using Morphic, and would have submitted something, if I weren't
doing umpteen other things. My bad!)

frank

> Even more interesting, I went back to Andreas' original version of the
> method (from year 2000).  When I revert to his version, I get no lock
> up and just one debugger as I would expect.  It's perfect.
>
> And this suddenly made me realize that I think Josh's change
> introduced even another problem which I've been seeing and unable to
> track down -- Maui messages inexplicably firing TWICE.  With your fix
> having two debuggers appear vs. one with Andreas' original  method, it
> sure implicates this method for THAT problem too!
>
> *** Question for Bert and Josh:
>
> This was introduced in Morphic-jcg.301 and Morphic-jcg.302.  The
> comment for 301 says it was inspired by Berts suggestion in the
> "Future Sends" thread.  I found that in the archives at around the
> same time as this change, but I do not understand the discussion, nor
> this change.  Could you guys please revisit this and see if you can
> remember what this is about so we can fix it right?  As it is, it
> really hurts.
>
> I'd be happy to help test if necessary.
>
> Thanks,
>   Chris
>
>
>
> On Tue, Sep 11, 2012 at 7:13 PM, John McKeon <[hidden email]> wrote:
>> Hello list,
>> I thought I would take a stab at making a contribution to this software I so
>> love.
>> So I posted a file out to fix this issue and would ask if someone could
>> review it.
>>
>> Thanks
>> John McKeon
>>
>> --
>> jmck.seasidehosting.st
>>
>>
>>
>

Reply | Threaded
Open this post in threaded view
|

Re: Mantis #7695

Bert Freudenberg
In reply to this post by Chris Muller-3

On 2012-09-12, at 03:23, Chris Muller <[hidden email]> wrote:

> *** (Question for Bert and Josh Gargus, below).
>
> OMG Thank You John!  I cannot review your solution but I tested it.  I
> have been suffering this major problem for a long time now.  It's
> awful, actually -- just working, deep in thought and suddenly -- the
> image locking up seemingly for no reason.  I did eventually realize it
> was due to a bug in my alarm code, but only after lost time and work.
> I'm sure this was not an easy find / fix, thank you!
>
> Maui uses alarms so I used that to test it.  I purposely broke my
> alarm-code to have a DNU in it.  Without your fix, the image locks up.
> With your fix, I actually get TWO debuggers telling me the problem --
> which is MUCH better than locking up.
>
> I see all you did was fork valuing the alarm..?  Interesting.

Interesting, though completely wrong ;)

Alarms are supposed to be evaluated in the UI process. You can't just fork them.

But at least John found the right place to fix it :)

I just pushed a fix. It simply takes evaluating the alarm out of the critical section. Now the error shows just fine.

Don't know how to reproduce your double-firing problem.

- Bert -

> Even more interesting, I went back to Andreas' original version of the
> method (from year 2000).  When I revert to his version, I get no lock
> up and just one debugger as I would expect.  It's perfect.
>
> And this suddenly made me realize that I think Josh's change
> introduced even another problem which I've been seeing and unable to
> track down -- Maui messages inexplicably firing TWICE.  With your fix
> having two debuggers appear vs. one with Andreas' original  method, it
> sure implicates this method for THAT problem too!
>
> *** Question for Bert and Josh:
>
> This was introduced in Morphic-jcg.301 and Morphic-jcg.302.  The
> comment for 301 says it was inspired by Berts suggestion in the
> "Future Sends" thread.  I found that in the archives at around the
> same time as this change, but I do not understand the discussion, nor
> this change.  Could you guys please revisit this and see if you can
> remember what this is about so we can fix it right?  As it is, it
> really hurts.
>
> I'd be happy to help test if necessary.
>
> Thanks,
>  Chris
>
>
>
> On Tue, Sep 11, 2012 at 7:13 PM, John McKeon <[hidden email]> wrote:
>> Hello list,
>> I thought I would take a stab at making a contribution to this software I so
>> love.
>> So I posted a file out to fix this issue and would ask if someone could
>> review it.
>>
>> Thanks
>> John McKeon
>>
>> --
>> jmck.seasidehosting.st
>>
>>
>>
>


Reply | Threaded
Open this post in threaded view
|

Re: Mantis #7695

Chris Muller-3
> Interesting, though completely wrong ;)

It couldn't be "completely" wrong since locking the image is a lot
more wrong than not.

> I just pushed a fix. It simply takes evaluating the alarm out of the critical section. Now the error shows just fine.
>
> Don't know how to reproduce your double-firing problem.

The double-firing doesn't seem to happen with your fix.  Thanks!  I'm
really glad to be able to put this lurking spectre bed.

Reply | Threaded
Open this post in threaded view
|

Re: Mantis #7695

Hannes Hirzel
On 9/14/12, Chris Muller <[hidden email]> wrote:
>> Interesting, though completely wrong ;)

I assume the comment was influenced Bert's German style communication
habits. In their culture this is perfectly doable, even polite,  to be
short and straight. However in this context I would have preferred as
well some more elaboration...

But Bert put into perspective by the smiley to cater for a more
international audience.

>From a business point of view this was an excellent job of you all.
Fixing a bug nobody could have done alone. This problem has been
around for many years!

I hope to see more cooperation like this in the upcoming weeks as we
move towards the release of 4.4 in October. We want it  to really be
stable. (This type of comment however would more belong to Frank)



> It couldn't be "completely" wrong since locking the image is a lot
> more wrong than not.
>

Sure. Identifying the method were the corrective action has to take
place is narrowing down the problem from ten thousands of lines to
just 10..20.

>> I just pushed a fix. It simply takes evaluating the alarm out of the
>> critical section. Now the error shows just fine.
>>
>> Don't know how to reproduce your double-firing problem.
>
> The double-firing doesn't seem to happen with your fix.  Thanks!  I'm
> really glad to be able to put this lurking spectre bed.
>
Yessss.

--Hannes

Reply | Threaded
Open this post in threaded view
|

Re: Mantis #7695

Frank Shearar-3
On 14 September 2012 12:00, H. Hirzel <[hidden email]> wrote:

> On 9/14/12, Chris Muller <[hidden email]> wrote:
>>> Interesting, though completely wrong ;)
>
> I assume the comment was influenced Bert's German style communication
> habits. In their culture this is perfectly doable, even polite,  to be
> short and straight. However in this context I would have preferred as
> well some more elaboration...
>
> But Bert put into perspective by the smiley to cater for a more
> international audience.
>
> >From a business point of view this was an excellent job of you all.
> Fixing a bug nobody could have done alone. This problem has been
> around for many years!
>
> I hope to see more cooperation like this in the upcoming weeks as we
> move towards the release of 4.4 in October. We want it  to really be
> stable. (This type of comment however would more belong to Frank)

*cough*. What Hannes said!

frank

Reply | Threaded
Open this post in threaded view
|

Re: Mantis #7695

Bert Freudenberg
In reply to this post by Chris Muller-3
On 2012-09-14, at 03:18, Chris Muller <[hidden email]> wrote:

>> Interesting, though completely wrong ;)
>
> It couldn't be "completely" wrong since locking the image is a lot
> more wrong than not.

And a broken clock is right twice a day ;)

It changed the semantics of the whole system instead of only fixing the problem. We would have traded a clear-cut lockup for hard-to-debug problems caused by code running in a different process.

But I agree with Hannes that this was a great collaboration. John identified the problem and came up with a potential fix. That made it easy for me to write a correct version.

>> I just pushed a fix. It simply takes evaluating the alarm out of the critical section. Now the error shows just fine.
>>
>> Don't know how to reproduce your double-firing problem.
>
> The double-firing doesn't seem to happen with your fix.  Thanks!  I'm
> really glad to be able to put this lurking spectre bed.

Curious. My fix should not have any detectable effect, except for the error case. Hmm, maybe if an alarm callback schedules another alarm? That might be different.

- Bert -