error when updating Squeak4.4-12327 to trunk

Re: error when updating Squeak4.4-12327 to trunk

Nicolas Cellier
2013/3/12 Eliot Miranda <[hidden email]>:

> On Sun, Mar 10, 2013 at 12:10 PM, Nicolas Cellier
> <[hidden email]> wrote:
>> OK, see the VM thread, I now think that the problem does not come from
>> COG, but from ClassBuilder, which in some cases fails to clean a cache
>> (primitive 116).
>> The problem does not show up in the interpreter VM thanks to primitive 119
>> (this primitive does not unlink sends in Cogit).
>
> it does unlink sends, but only for that selector.  But is it really
> the case that it is a missing cache flush or is it a bug in Cog with
> its cache flushing?  I realised the way to test this is to try the
> Stack VM and see if it crashes or not.  I just tried that but now
> neither Cog nor the Stack VM crash although both fail the load with an
> MNU of #do: to UndefinedObject in Environment>>bindingOf:ifAbsent:.
> So how do I get the system back to a state where I can reproduce the
> Cog crash to compare the Stack and Cog VMs with each other?
>
> (Apologies for being unresponsive; I've just moved into a new
> apartment and only got my internet connection yesterday afternoon; at
> least it's fast (for the States) :) ).
>
>

Well, primitive 119 does indeed seem to clean the cache.
I was confused because there are two implementations of primitive 119:

primitiveFlushCacheSelective (Interpreter)
primitiveFlushCacheBySelector (StackInterpreter)

It's really a drag to carry all that dead code around when you want to
analyze something quickly :(

So my first correction (avoid using MethodDictionary new in
ClassBuilder) was probably useless.

What happened is that while all the new Parser methods are being recompiled,
the old Parser compiled methods are still in use, and are thus re-added to
the cache.
So my second attempt (clean the cache again just before mutation in
ClassBuilder) did the trick.
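
To make the failure mode concrete, here is a toy model in Python. It is only a sketch under loud assumptions: it is not Squeak, ClassBuilder, or VM code, and every name in it (ToyVM, send, flush_cache, recompile) is hypothetical. It only illustrates how a lookup cache that was flushed before recompilation can be refilled with old methods while the recompilation runs, and why a second flush just before the swap avoids the stale hit.

```python
# Toy model of a global method lookup cache (illustrative only; not VM code).
# All names here (ToyVM, send, flush_cache, recompile) are hypothetical.

class ToyVM:
    def __init__(self):
        self.method_dicts = {}   # class name -> {selector: method}
        self.cache = {}          # (class name, selector) -> method

    def install(self, cls, selector, method):
        """Install a method without touching the cache."""
        self.method_dicts.setdefault(cls, {})[selector] = method

    def send(self, cls, selector):
        """Look up a method, filling the cache on a miss."""
        key = (cls, selector)
        if key not in self.cache:
            self.cache[key] = self.method_dicts[cls][selector]
        return self.cache[key]

    def flush_cache(self):
        self.cache.clear()


def recompile(vm, flush_before_swap):
    """Rebuild Parser's methods in a fresh dictionary, then swap it in."""
    vm.flush_cache()  # first flush, before recompilation starts
    new_dict = {}
    for selector in list(vm.method_dicts["Parser"]):
        # Compiling uses the *old* Parser, which re-adds its method to the cache.
        vm.send("Parser", "parse")
        new_dict[selector] = "new " + selector
    if flush_before_swap:
        vm.flush_cache()  # second flush, just before the mutation
    vm.method_dicts["Parser"] = new_dict
    return vm.send("Parser", "parse")


def fresh_vm():
    vm = ToyVM()
    vm.install("Parser", "parse", "old parse")
    vm.install("Parser", "scan", "old scan")
    return vm


stale = recompile(fresh_vm(), flush_before_swap=False)
fixed = recompile(fresh_vm(), flush_before_swap=True)
print(stale)  # old parse
print(fixed)  # new parse
```

In the model, a single flush before recompilation is not enough, because the recompilation itself sends to the old methods. Whether the real cache behaves the same way in every VM is exactly what is still being checked in this thread.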

As for going back in the update process: taking an updated trunk image, browsing
the update configurations, and loading them in backward order seems to
work.
Or go the other way around: from an older 4.4 image, apply all updates up
to nice.221. I think Bert posted a script to automate that.

Nicolas

>> I have attempted a ClassBuilder fix and posted new updates from
>> nice-222 to cwp-227.
>>
>> Can I please ask for our testers' contribution once again?
>>
>> Nicolas
>>
>> 2013/3/8 Nicolas Cellier <[hidden email]>:
>>> 2013/3/8 Bert Freudenberg <[hidden email]>:
>>>>
>>>> On 2013-03-08, at 10:55, Frank Shearar <[hidden email]> wrote:
>>>>
>>>>> On 7 March 2013 23:25, Frank Shearar <[hidden email]> wrote:
>>>>>> On 7 March 2013 23:11, Bert Freudenberg <[hidden email]> wrote:
>>>>>>> On 2013-03-07, at 23:42, Frank Shearar <[hidden email]> wrote:
>>>>>>>
>>>>>>>>>> On 6 March 2013 15:59, Ken G. Brown <[hidden email]> wrote:
>>>>>>>>>>> Running on COG 2397, and after updating fresh Squeak4.4-12327 Release to
>>>>>>>>>>> 12332, updating to Trunk  fails at first attempt in the same place, then by
>>>>>>>>>>> abandoning and trying the update again, it apparently completes to 12511.
>>>>>>>>>>>
>>>>>>>>>>>  Ken G. Brown
>>>>>>>>>>>
>>>>>>>>>>>> With COG 2678, pretty well the same. First attempt it timed out during
>>>>>>>>>>>> the same update-nice-223, then trying again from what had already been
>>>>>>>>>>>> loaded, got the following during the same update, during compiling
>>>>>>>>>>>> SMLoader-fbs-78 as before:
>>>>>>>>
>>>>>>>> What I find strange about all this is that we take a 4.4-12327 image
>>>>>>>> and whatever the latest Cog is and update it all the way without any
>>>>>>>> problems quite a few times a day on the CI server.
>>>>>>>>
>>>>>>>> frank
>>>>>>>
>>>>>>> Looks like it's an intermittent problem, unfortunately:
>>>>>>>
>>>>>>> I just updated the new all-in-one-cog to latest trunk, no problem. This is a 4.4-12327 image with Cog VM 2697.
>>>>>>>
>>>>>>> I then tried what Ken described: update the fresh image first from the squeak44 stream, then switch to trunk, then update again.
>>>>>>>
>>>>>>> BOOM. Cog crash. Didn't save the log unfortunately.
>>>>>>>
>>>>>>> Tried again. Update, switch to trunk, update again. No crash. What?!
>>>>>>>
>>>>>>> Once more. Update, switch to trunk, update. Crash! See below.
>>>>>>>
>>>>>>> Tried yet again, with switching to trunk immediately in a fresh image. Crashes, too, same place.
>>>>>>>
>>>>>>> So it does crash, just not always. But it's been more than 50% in my case.
>>>>>>
>>>>>> Ah, interesting. The CI jobs, naturally, don't update from squeak44;
>>>>>> they switch to trunk and update just like that. Which I would have
>>>>>> thought would make no difference...
>>>>>
>>>>> Actually, I lie. Here's an example of the CI jobs hitting the same
>>>>> issue: http://build.squeak.org/job/SqueakTrunk/204/console And further
>>>>> if you look at http://build.squeak.org/job/SqueakTrunk/ and choose to
>>>>> see the failing tests you'll see times (say around build #184) where
>>>>> the test failure count is unusually low. And
>>>>> http://build.squeak.org/job/SqueakTrunk/buildTimeTrend shows grey
>>>>> streaks where builds die.
>>>>
>>>> Curious that it still runs the tests at all if the update failed ...
>>>>
>>>> So Cog crashes, but has someone tried to replicate this on an interpreter?
>>>>
>>>> - Bert -
>>>>
>>>
>>> I think that the problem comes from COG which tries to use an obsolete
>>> method sent AFTER the recompilation of Parser which is not the
>>> expected behavior.
>>> I have triggered this kind of strange behavior, which does not happen on
>>> an Interpreter VM; see the thread opened by Jeff Gonis '[Vm-dev] Cog
>>> VM Crash on Windows'
>>> For me, it must be related to a cache that is not cleaned up, I don't know why.
>>>
>>> Nicolas
>>
>
>
>
> --
> best,
> Eliot
>


Re: error when updating Squeak4.4-12327 to trunk

Frank Shearar-3
In reply to this post by Eliot Miranda-2
On 11 March 2013 23:12, Eliot Miranda <[hidden email]> wrote:

> On Sun, Mar 10, 2013 at 12:10 PM, Nicolas Cellier
> <[hidden email]> wrote:
>> OK, see the VM thread, I now think that the problem does not come from
>> COG, but from ClassBuilder, which in some cases fails to clean a cache
>> (primitive 116).
>> The problem does not show up in the interpreter VM thanks to primitive 119
>> (this primitive does not unlink sends in Cogit).
>
> it does unlink sends, but only for that selector.  But is it really
> the case that it is a missing cache flush or is it a bug in Cog with
> its cache flushing?  I realised the way to test this is to try the
> Stack VM and see if it crashes or not.  I just tried that but now
> neither Cog nor the Stack VM crash although both fail the load with an
> MNU of #do: to UndefinedObject in Environment>>bindingOf:ifAbsent:.
> So how do I get the system back to a state where I can reproduce the
> Cog crash to compare the Stack and Cog VMs with each other?
>
> (Apologies for being unresponsive; I've just moved into a new
> apartment and only got my internet connection yesterday afternoon; at
> least it's fast (for the States) :) ).

I'm reasonably sure that this guy -
http://build.squeak.org/job/SqueakTrunk/208/ - has images that are
pre-latest-Environments code. That's #12519 at any rate.

frank



Re: error when updating Squeak4.4-12327 to trunk

Nicolas Cellier
2013/3/12 Frank Shearar <[hidden email]>:

> On 11 March 2013 23:12, Eliot Miranda <[hidden email]> wrote:
>> On Sun, Mar 10, 2013 at 12:10 PM, Nicolas Cellier
>> <[hidden email]> wrote:
>>> OK, see the VM thread, I now think that the problem does not come from
>>> COG, but from ClassBuilder, which in some cases fails to clean a cache
>>> (primitive 116).
>>> The problem does not show up in the interpreter VM thanks to primitive 119
>>> (this primitive does not unlink sends in Cogit).
>>
>> it does unlink sends, but only for that selector.  But is it really
>> the case that it is a missing cache flush or is it a bug in Cog with
>> its cache flushing?  I realised the way to test this is to try the
>> Stack VM and see if it crashes or not.  I just tried that but now
>> neither Cog nor the Stack VM crash although both fail the load with an
>> MNU of #do: to UndefinedObject in Environment>>bindingOf:ifAbsent:.
>> So how do I get the system back to a state where I can reproduce the
>> Cog crash to compare the Stack and Cog VMs with each other?
>>
>> (Apologies for being unresponsive; I've just moved into a new
>> apartment and only got my internet connection yesterday afternoon; at
>> least it's fast (for the States) :) ).
>
> I'm reasonably sure that this guy -
> http://build.squeak.org/job/SqueakTrunk/208/ - has images that are
> pre-latest-Environments code. That's #12519 at any rate.
>
> frank
>

You make me feel like I'm coming directly from the stone age ;)

Nicolas



Re: error when updating Squeak4.4-12327 to trunk

Eliot Miranda-2
In reply to this post by Frank Shearar-3
On Mon, Mar 11, 2013 at 4:26 PM, Frank Shearar <[hidden email]> wrote:

> On 11 March 2013 23:12, Eliot Miranda <[hidden email]> wrote:
>> On Sun, Mar 10, 2013 at 12:10 PM, Nicolas Cellier
>> <[hidden email]> wrote:
>>> OK, see the VM thread, I now think that the problem does not come from
>>> COG, but from ClassBuilder, which in some cases fails to clean a cache
>>> (primitive 116).
>>> The problem does not show up in the interpreter VM thanks to primitive 119
>>> (this primitive does not unlink sends in Cogit).
>>
>> it does unlink sends, but only for that selector.  But is it really
>> the case that it is a missing cache flush or is it a bug in Cog with
>> its cache flushing?  I realised the way to test this is to try the
>> Stack VM and see if it crashes or not.  I just tried that but now
>> neither Cog nor the Stack VM crash although both fail the load with an
>> MNU of #do: to UndefinedObject in Environment>>bindingOf:ifAbsent:.
>> So how do I get the system back to a state where I can reproduce the
>> Cog crash to compare the Stack and Cog VMs with each other?
>>
>> (Apologies for being unresponsive; I've just moved into a new
>> apartment and only got my internet connection yesterday afternoon; at
>> least it's fast (for the States) :) ).
>
> I'm reasonably sure that this guy -
> http://build.squeak.org/job/SqueakTrunk/208/ - has images that are
> pre-latest-Environments code. That's #12519 at any rate.

That's not the problem.  I have an image that crashed last week.  I
need a way of not applying all the updates.  Bert's version worked to
exclude some updates but others are creeping in. Time is limited so I
hoped for a quick fix :(




--
best,
Eliot


Re: error when updating Squeak4.4-12327 to trunk

Frank Shearar-3
On 11 March 2013 23:30, Eliot Miranda <[hidden email]> wrote:

> On Mon, Mar 11, 2013 at 4:26 PM, Frank Shearar <[hidden email]> wrote:
>> On 11 March 2013 23:12, Eliot Miranda <[hidden email]> wrote:
>>> On Sun, Mar 10, 2013 at 12:10 PM, Nicolas Cellier
>>> <[hidden email]> wrote:
>>>> OK, see the VM thread, I now think that the problem does not come from
>>>> COG, but from ClassBuilder, which in some cases fails to clean a cache
>>>> (primitive 116).
>>>> The problem does not show up in the interpreter VM thanks to primitive 119
>>>> (this primitive does not unlink sends in Cogit).
>>>
>>> it does unlink sends, but only for that selector.  But is it really
>>> the case that it is a missing cache flush or is it a bug in Cog with
>>> its cache flushing?  I realised the way to test this is to try the
>>> Stack VM and see if it crashes or not.  I just tried that but now
>>> neither Cog nor the Stack VM crash although both fail the load with an
>>> MNU of #do: to UndefinedObject in Environment>>bindingOf:ifAbsent:.
>>> So how do I get the system back to a state where I can reproduce the
>>> Cog crash to compare the Stack and Cog VMs with each other?
>>>
>>> (Apologies for being unresponsive; I've just moved into a new
>>> apartment and only got my internet connection yesterday afternoon; at
>>> least it's fast (for the States) :) ).
>>
>> I'm reasonably sure that this guy -
>> http://build.squeak.org/job/SqueakTrunk/208/ - has images that are
>> pre-latest-Environments code. That's #12519 at any rate.
>
> That's not the problem.  I have an image that crashed last week.  I
> need a way of not applying all the updates.  Bert's version worked to
> exclude some updates but others are creeping in. Time is limited so I
> hoped for a quick fix :(

Sorry, Eliot. Didn't mean to add to the noise. It's just that we found
a second breakage in the updates (or a second serious bug that
recently appeared in an update, rather), so I assumed your mail was
related to this _new_ thing and not Nicolas' Parser thing.

frank



Re: error when updating Squeak4.4-12327 to trunk

Nicolas Cellier
In reply to this post by Eliot Miranda-2
2013/3/12 Eliot Miranda <[hidden email]>:

> On Mon, Mar 11, 2013 at 4:26 PM, Frank Shearar <[hidden email]> wrote:
>> On 11 March 2013 23:12, Eliot Miranda <[hidden email]> wrote:
>>> On Sun, Mar 10, 2013 at 12:10 PM, Nicolas Cellier
>>> <[hidden email]> wrote:
>>>> OK, see the VM thread, I now think that the problem does not come from
>>>> COG, but from ClassBuilder, which in some cases fails to clean a cache
>>>> (primitive 116).
>>>> The problem does not show up in the interpreter VM thanks to primitive 119
>>>> (this primitive does not unlink sends in Cogit).
>>>
>>> it does unlink sends, but only for that selector.  But is it really
>>> the case that it is a missing cache flush or is it a bug in Cog with
>>> its cache flushing?  I realised the way to test this is to try the
>>> Stack VM and see if it crashes or not.  I just tried that but now
>>> neither Cog nor the Stack VM crash although both fail the load with an
>>> MNU of #do: to UndefinedObject in Environment>>bindingOf:ifAbsent:.
>>> So how do I get the system back to a state where I can reproduce the
>>> Cog crash to compare the Stack and Cog VMs with each other?
>>>
>>> (Apologies for being unresponsive; I've just moved into a new
>>> apartment and only got my internet connection yesterday afternoon; at
>>> least it's fast (for the States) :) ).
>>
>> I'm reasonably sure that this guy -
>> http://build.squeak.org/job/SqueakTrunk/208/ - has images that are
>> pre-latest-Environments code. That's #12519 at any rate.
>
> That's not the problem.  I have an image that crashed last week.  I
> need a way of not applying all the updates.  Bert's version worked to
> exclude some updates but others are creeping in. Time is limited so I
> hoped for a quick fix :(

We could craft a special mixture of packages in an update map and put
it in the inbox, for example.
But which packages exactly?
What do you want to test?

Nicolas

>>
>> frank
>>
>>>> I have attempted a ClassBuilder fix and posted new updates from
>>>> nice-222 to cwp-227.
>>>>
>>>> Can I please ask our testers contribution once again?
>>>>
>>>> Nicolas
>>>>
>>>> 2013/3/8 Nicolas Cellier <[hidden email]>:
>>>>> 2013/3/8 Bert Freudenberg <[hidden email]>:
>>>>>>
>>>>>> On 2013-03-08, at 10:55, Frank Shearar <[hidden email]> wrote:
>>>>>>
>>>>>>> On 7 March 2013 23:25, Frank Shearar <[hidden email]> wrote:
>>>>>>>> On 7 March 2013 23:11, Bert Freudenberg <[hidden email]> wrote:
>>>>>>>>> On 2013-03-07, at 23:42, Frank Shearar <[hidden email]> wrote:
>>>>>>>>>
>>>>>>>>>>>> On 6 March 2013 15:59, Ken G. Brown <[hidden email]> wrote:
>>>>>>>>>>>>> Running on COG 2397, and after updating fresh Squeak4.4-12327 Release to
>>>>>>>>>>>>> 12332, updating to Trunk  fails at first attempt in the same place, then by
>>>>>>>>>>>>> abandoning and trying the update again, it apparently completes to 12511.
>>>>>>>>>>>>>
>>>>>>>>>>>>>  Ken G. Brown
>>>>>>>>>>>>>
>>>>>>>>>>>>>> With COG 2678, pretty well the same. First attempt it timed out during
>>>>>>>>>>>>>> the same update-nice-223, then trying again from what had already been
>>>>>>>>>>>>>> loaded, got the following during the same update, during compiling
>>>>>>>>>>>>>> SMLoader-fbs-78 as before:
>>>>>>>>>>
>>>>>>>>>> What I find strange about all this is that we take a 4.4-12327 image
>>>>>>>>>> and whatever the latest Cog is and update it all the way without any
>>>>>>>>>> probems quite a few times a day on the CI server.
>>>>>>>>>>
>>>>>>>>>> frank
>>>>>>>>>
>>>>>>>>> Looks like it's an intermittent problem, unfortunately:
>>>>>>>>>
>>>>>>>>> I just updated the new all-in-one-cog to latest trunk, no problem. This is a 4.4-12327 image with Cog VM 2697.
>>>>>>>>>
>>>>>>>>> I then tried what Ken described: update the fresh image first from the squeak44 stream, then switch to trunk, then update again.
>>>>>>>>>
>>>>>>>>> BOOM. Cog crash. Didn't save the log unfortunately.
>>>>>>>>>
>>>>>>>>> Tried again. Update, switch to trunk, update again. No crash. What?!
>>>>>>>>>
>>>>>>>>> Once more. Update, switch to trunk, update. Crash! See below.
>>>>>>>>>
>>>>>>>>> Tried yet again, with switching to trunk immediately in a fresh image. Crashes, too, same place.
>>>>>>>>>
>>>>>>>>> So it does crash, just not always. But it's been more than 50% in my case.
>>>>>>>>
>>>>>>>> Ah, interesting. The CI jobs, naturally, don't update from squeak44;
>>>>>>>> they switch to trunk and update just like that. Which I would have
>>>>>>>> thought would make no difference...
>>>>>>>
>>>>>>> Actually, I lie. Here's an example of the CI jobs hitting the same
>>>>>>> issue: http://build.squeak.org/job/SqueakTrunk/204/console And further
>>>>>>> if you look at http://build.squeak.org/job/SqueakTrunk/ and choose to
>>>>>>> see the failing tests you'll see times (say around build #184) where
>>>>>>> the test failure count is unusually low. And
>>>>>>> http://build.squeak.org/job/SqueakTrunk/buildTimeTrend shows grey
>>>>>>> streaks where builds die.
>>>>>>
>>>>>> Curious that it still runs the tests at all if the update failed ...
>>>>>>
>>>>>> So Cog crashes, but has someone tried to replicate this on an interpreter?
>>>>>>
>>>>>> - Bert -
>>>>>>
>>>>>
>>>>> I think that the problem comes from COG, which tries to use an obsolete
>>>>> method sent AFTER the recompilation of Parser, which is not the
>>>>> expected behavior.
>>>>> I have triggered this kind of strange behavior, which does not happen on
>>>>> an Interpreter VM, see the thread opened by Jeff Gonis '[Vm-dev] Cog
>>>>> VM Crash on Windows'
>>>>> For me, it must be related to a cache that is not cleaned-up, I don't know why.
>>>>>
>>>>> Nicolas
>>>>
>>>
>>>
>>>
>>> --
>>> best,
>>> Eliot
>>>
>>
>
>
>
> --
> best,
> Eliot
>

Re: error when updating Squeak4.4-12327 to trunk

Eliot Miranda-2


On Mon, Mar 11, 2013 at 4:40 PM, Nicolas Cellier <[hidden email]> wrote:
> 2013/3/12 Eliot Miranda <[hidden email]>:
>> On Mon, Mar 11, 2013 at 4:26 PM, Frank Shearar <[hidden email]> wrote:
>>> On 11 March 2013 23:12, Eliot Miranda <[hidden email]> wrote:
>>>> On Sun, Mar 10, 2013 at 12:10 PM, Nicolas Cellier
>>>> <[hidden email]> wrote:
>>>>> OK, see the VM thread, I now think that the problem does not come from
>>>>> COG, but from ClassBuilder, which in some cases fails to clean a cache
>>>>> (primitive 116).
>>>>> The problem does not show up in the interpreter VM thanks to primitive 119
>>>>> (this primitive does not unlink sends in cogit).
>>>>
>>>> it does unlink sends, but only for that selector.  But is it really
>>>> the case that it is a missing cache flush or is it a bug in Cog with
>>>> its cache flushing?  I realised the way to test this is to try the
>>>> Stack VM and see if it crashes or not.  I just tried that but now
>>>> neither Cog nor the Stack VM crash although both fail the load with an
>>>> MNU of #do: to UndefinedObject in Environment>>bindingOf:ifAbsent:.
>>>> So how do I get the system back to a state where I can reproduce the
>>>> Cog crash to compare the Stack and Cog VMs with each other?
>>>>
>>>> (Apologies for being unresponsive; I've just moved into a new
>>>> apartment and only got my internet connection yesterday afternoon; at
>>>> least it's fast (for the states) :) ).
>>>
>>> I'm reasonably sure that this guy -
>>> http://build.squeak.org/job/SqueakTrunk/208/ - has images that are
>>> pre-latest-Environments code. That's #12519 at any rate.
>>
>> That's not the problem.  I have an image that crashed last week.  I
>> need a way of not applying all the updates.  Bert's version worked to
>> exclude some updates but others are creeping in. Time is limited so I
>> hoped for a quick fix :(
>
> We could craft a special mixture of packages in an update map and put
> it in the inbox, for example.
> But which packages exactly?
> What do you want to test?

The state that causes Cog to hard crash by reading off the end of the Parser instance because the old version of Parser>>parse:cue:noPattern:ifFail: on the stack uses the pre-reshape inst var offsets.


See Bert's message:  Filtering out updates > 222 used to ensure the crash.

On Tue, Feb 26, 2013 at 11:44 AM, Bert Freudenberg <[hidden email]> wrote:

* I downloaded Squeak4.4-12327.zip from http://ftp.squeak.org/current_stable/
* in preferences, change Update URL to trunk
* load updates
* Cog crashes (last log entry is Compiler-nice.256)
* Interpreter does not crash

This is on Mac with current Cog 4.0.2692.

Phhh, is there any way to load a specific update?  What's the update number/id?  Of course I'm too late to this party to debug the VM crash :(

Before updating, in MCMcmUpdater class>>updateListFor: insert this before the return:

updateList := updateList reject: [:ea | ea key > 222].
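
To see what that filter does without touching the updater, here is a minimal workspace sketch. It assumes, as the `ea key > 222` test implies, that each element of updateList is an Association whose key is the update number; the three .mcm file names are made up for illustration and are not real update maps.

```smalltalk
"Stand-alone illustration only; the file names below are hypothetical."
| updateList |
updateList := {
	221 -> 'update-aaa.221.mcm'.
	222 -> 'update-bbb.222.mcm'.
	223 -> 'update-ccc.223.mcm' }.

"Same expression as in the message above: drop every update numbered above 222."
updateList := updateList reject: [:ea | ea key > 222].

"Print-it on the next line shows the update numbers that remain."
updateList collect: [:ea | ea key]
```

With the filter in place, the update stops at 222, which is the point where the crash used to be reproducible.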

>
> Nicolas
>
>>>
>>> frank
>>>
>>>>> I have attempted a ClassBuilder fix and posted new updates from
>>>>> nice-222 to cwp-227.
>>>>>
>>>>> Can I please ask our testers contribution once again?
>>>>>
>>>>> Nicolas
>>>>>
>>>>> 2013/3/8 Nicolas Cellier <[hidden email]>:
>>>>>> 2013/3/8 Bert Freudenberg <[hidden email]>:
>>>>>>>
>>>>>>> On 2013-03-08, at 10:55, Frank Shearar <[hidden email]> wrote:
>>>>>>>
>>>>>>>> On 7 March 2013 23:25, Frank Shearar <[hidden email]> wrote:
>>>>>>>>> On 7 March 2013 23:11, Bert Freudenberg <[hidden email]> wrote:
>>>>>>>>>> On 2013-03-07, at 23:42, Frank Shearar <[hidden email]> wrote:
>>>>>>>>>>
>>>>>>>>>>>>> On 6 March 2013 15:59, Ken G. Brown <[hidden email]> wrote:
>>>>>>>>>>>>>> Running on COG 2397, and after updating fresh Squeak4.4-12327 Release to
>>>>>>>>>>>>>> 12332, updating to Trunk  fails at first attempt in the same place, then by
>>>>>>>>>>>>>> abandoning and trying the update again, it apparently completes to 12511.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  Ken G. Brown
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> With COG 2678, pretty well the same. First attempt it timed out during
>>>>>>>>>>>>>>> the same update-nice-223, then trying again from what had already been
>>>>>>>>>>>>>>> loaded, got the following during the same update, during compiling
>>>>>>>>>>>>>>> SMLoader-fbs-78 as before:
>>>>>>>>>>>
>>>>>>>>>>> What I find strange about all this is that we take a 4.4-12327 image
>>>>>>>>>>> and whatever the latest Cog is and update it all the way without any
>>>>>>>>>>> problems quite a few times a day on the CI server.
>>>>>>>>>>>
>>>>>>>>>>> frank
>>>>>>>>>>
>>>>>>>>>> Looks like it's an intermittent problem, unfortunately:
>>>>>>>>>>
>>>>>>>>>> I just updated the new all-in-one-cog to latest trunk, no problem. This is a 4.4-12327 image with Cog VM 2697.
>>>>>>>>>>
>>>>>>>>>> I then tried what Ken described: update the fresh image first from the squeak44 stream, then switch to trunk, then update again.
>>>>>>>>>>
>>>>>>>>>> BOOM. Cog crash. Didn't save the log unfortunately.
>>>>>>>>>>
>>>>>>>>>> Tried again. Update, switch to trunk, update again. No crash. What?!
>>>>>>>>>>
>>>>>>>>>> Once more. Update, switch to trunk, update. Crash! See below.
>>>>>>>>>>
>>>>>>>>>> Tried yet again, with switching to trunk immediately in a fresh image. Crashes, too, same place.
>>>>>>>>>>
>>>>>>>>>> So it does crash, just not always. But it's been more than 50% in my case.
>>>>>>>>>
>>>>>>>>> Ah, interesting. The CI jobs, naturally, don't update from squeak44;
>>>>>>>>> they switch to trunk and update just like that. Which I would have
>>>>>>>>> thought would make no difference...
>>>>>>>>
>>>>>>>> Actually, I lie. Here's an example of the CI jobs hitting the same
>>>>>>>> issue: http://build.squeak.org/job/SqueakTrunk/204/console And further
>>>>>>>> if you look at http://build.squeak.org/job/SqueakTrunk/ and choose to
>>>>>>>> see the failing tests you'll see times (say around build #184) where
>>>>>>>> the test failure count is unusually low. And
>>>>>>>> http://build.squeak.org/job/SqueakTrunk/buildTimeTrend shows grey
>>>>>>>> streaks where builds die.
>>>>>>>
>>>>>>> Curious that it still runs the tests at all if the update failed ...
>>>>>>>
>>>>>>> So Cog crashes, but has someone tried to replicate this on an interpreter?
>>>>>>>
>>>>>>> - Bert -
>>>>>>>
>>>>>>
>>>>>> I think that the problem comes from COG, which tries to use an obsolete
>>>>>> method sent AFTER the recompilation of Parser, which is not the
>>>>>> expected behavior.
>>>>>> I have triggered this kind of strange behavior, which does not happen on
>>>>>> an Interpreter VM, see the thread opened by Jeff Gonis '[Vm-dev] Cog
>>>>>> VM Crash on Windows'
>>>>>> For me, it must be related to a cache that is not cleaned-up, I don't know why.
>>>>>>
>>>>>> Nicolas
>>>>>
>>>>
>>>>
>>>>

>>>> --
>>>> best,
>>>> Eliot
>>>>
>>>
>>
>>
>>
>> --
>> best,
>> Eliot
>>
>



--
best,
Eliot

