image not opening.

image not opening.

Stephane Ducasse-3
Hi,

I have a Jenkins job that produces an image that does not start
when I try to open it locally.

Now if I take the image and load the "same" configuration, I get a working image.

I have no idea how to debug that.

Stef

Re: image not opening.

CyrilFerlicot
On 10/03/2017 15:27, Stephane Ducasse wrote:

> Hi,
>
> I have a Jenkins job that produces an image that does not start
> when I try to open it locally.
>
> Now if I take the image and load the "same" configuration, I get a
> working image.
>
> I have no idea how to debug that.
>
> Stef
Hi,

I have the same problem, but only on the Pharo 6 builds. Sometimes it
also crashes while launching the tests.

--
Cyril Ferlicot

http://www.synectique.eu

2 rue Jacques Prévert 01,
59650 Villeneuve d'ascq France



Re: image not opening.

stepharong
In reply to this post by Stephane Ducasse-3
Guillermo and I found the problem, and it may be a problem with the latest VM.

The image's starting window size is really small: a few pixels by a few pixels.
So maybe the image metadata is systematically corrupted.


Re: image not opening.

CyrilFerlicot

On Fri, Mar 10, 2017 at 18:42, stepharong <[hidden email]> wrote:
> Guillermo and I found the problem, and it may be a problem with the latest VM.
>
> The image's starting window size is really small: a few pixels by a few pixels.
> So maybe the image metadata is systematically corrupted.


In case it helps, here is the build:

https://ci.inria.fr/pharo-contribution/job/OOnoz/

It is happening in almost all Pharo 6 builds.
--
Cheers
Cyril Ferlicot

Re: image not opening.

Ben Coman
Maybe related?
http://forum.world.st/BUG-A-problem-with-callbacks-that-shows-up-in-64bits-but-is-on-32bits-too-td4938152.html

On Sat, Mar 11, 2017 at 2:22 AM, Cyril Ferlicot
<[hidden email]> wrote:

> If it can help: here is the build:
>
> https://ci.inria.fr/pharo-contribution/job/OOnoz/
>
> It is happening in almost all Pharo 6 builds
> --
> Cheers
> Cyril Ferlicot


Re: image not opening.

stepharong
Guillermo suspected a problem with the interpretation of the image
metadata.


Re: image not opening.

Guillermo Polito
Yes, to be more precise:
* The problem is that the image somehow got its display size changed to a very small one.
* That happened while building an image on the CI in headless mode.
* If the same image is built locally in a VM launched in non-headless mode, the issue cannot be reproduced.

So my conjecture is that
- either the image is changing the display size while loading Pillar (less likely, and not reproducible locally),
- or the latest stable VM messes up the display size in the image header when running headless (this has yet to be reproduced).

On Sun, Mar 12, 2017 at 10:52 AM, stepharong <[hidden email]> wrote:
> Guillermo suspected a problem with the interpretation of the image metadata.


Re: image not opening.

Eliot Miranda-2


On Sun, Mar 12, 2017 at 3:04 AM, Guillermo Polito <[hidden email]> wrote:
> Yes, to be more precise:
> * The problem is that the image somehow got its display size changed to a very small one.
> * That happened while building an image on the CI in headless mode.
> * If the same image is built locally in a VM launched in non-headless mode, the issue cannot be reproduced.
>
> So my conjecture is that
> - either the image is changing the display size while loading Pillar (less likely, and not reproducible locally),
> - or the latest stable VM messes up the display size in the image header when running headless (this has yet to be reproduced).

I'm pretty confident this has to do with bugs in the Athens surface code, which assumes that callbacks can be made from within the existing copyBits and warpBits primitives.  They can't do this safely because a GC (scavenge) can happen during a callback, which then causes chaos when the copyBits primitive tries to access objects that have been moved out from under its feet.

I've done work to fix callbacks so that when there is a failure it is the copyBits primitive that fails, instead of apparently the callback return primitive.  One of the apparent effects of this fix is to stop the screen opening up too small; another is getting the background colour right, and yet another is eliminating bogus pixels in the VGTigerDemo demo.  But more work is required to fix the copyBits and warpBits primitives.  There are a few approaches one might take:

a) fixing the primitive so that it saves and restores oops around the callbacks using the external oop table.  That's a pain but possible.

b) fixing the primitive so that it pins the objects it needs before ever invoking a callback

c) fixing the primitive so that it uses the scavenge and fullGC counters in the VM to detect whether a GC occurred during one of the callbacks and, if so, fails.  The primitive would then simply be retried.

d) ?

I like c) as it's very lightweight, but it has issues.  It is fine to use for callbacks made *before* copyBits and warpBits move any bits (the lockSurface and querySurface functions).  But it's potentially erroneous after the unlockSurface primitive.  For example, a primitive which does an xor with the screen can't simply be retried, as the first, failing pass would have updated the destination bits but not displayed them via unlockSurface.  But I think it could be arranged that no objects are accessed after unlockSurface, which should naturally be the last call in the primitive (or do I mean showSurface?).  So the approach would be to check for GCs occurring during querySurface and lockSurface, failing if so, and then to cache any and all state needed by unlockSurface and showSurface in local variables.  This way no object state is accessed to make the unlockSurface and showSurface calls, and no bits are moved before the querySurface and lockSurface calls.

If we used a failure code such as #'object may move' then the primitives could answer this when a GC during callbacks is detected and then the primitive could be retried only when required.
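
To make approach c) concrete, here is a minimal toy simulation in Python. It is not VM code and does not use any real VM or SurfacePlugin API: ToyVM, ToySurface, xor_bits_primitive and run_with_retry are hypothetical names invented for illustration, and unlocking the surface on the failure path is an assumption of this sketch. It only shows the ordering described above: take a snapshot of the GC counters, make the query/lock callbacks, fail with 'object may move' before any bits are touched if the counters changed, and otherwise move the bits and unlock using only locally cached state.

```python
from dataclasses import dataclass

OBJECT_MAY_MOVE = "object may move"  # failure code suggested in the post


@dataclass
class ToyVM:
    """Hypothetical stand-in for the VM; it only counts GCs."""
    scavenges: int = 0
    full_gcs: int = 0
    pending_gcs: int = 0  # number of GCs to simulate in upcoming callbacks

    def callback(self):
        # A callback runs image code, so a scavenge may happen here.
        if self.pending_gcs > 0:
            self.pending_gcs -= 1
            self.scavenges += 1

    def gc_stamp(self):
        return (self.scavenges, self.full_gcs)


@dataclass
class ToySurface:
    """Hypothetical surface: a byte buffer plus a lock flag."""
    bits: bytearray
    locked: bool = False


def query_surface(vm, surface):
    vm.callback()  # may trigger a GC
    return len(surface.bits)


def lock_surface(vm, surface):
    vm.callback()  # may trigger a GC
    surface.locked = True
    return surface.bits


def unlock_surface(surface):
    surface.locked = False


def xor_bits_primitive(vm, surface, mask):
    """Return None on success, or OBJECT_MAY_MOVE if a GC was detected."""
    stamp = vm.gc_stamp()
    size = query_surface(vm, surface)
    bits = lock_surface(vm, surface)
    if vm.gc_stamp() != stamp:
        # No bits have been moved yet, so failing here is safe to retry.
        # Releasing the lock on this path is an assumption of the sketch.
        unlock_surface(surface)
        return OBJECT_MAY_MOVE
    # From here on, use only the locals cached above (size, bits).
    for i in range(size):
        bits[i] ^= mask
    unlock_surface(surface)  # last call; no further callbacks follow
    return None


def run_with_retry(vm, surface, mask, max_attempts=3):
    """Retry the primitive only when it reports that objects may have moved."""
    for attempt in range(1, max_attempts + 1):
        if xor_bits_primitive(vm, surface, mask) is None:
            return attempt
    raise RuntimeError("primitive kept failing with 'object may move'")
```

For instance, with ToyVM(pending_gcs=1) the first attempt fails before any bits are changed and the second one succeeds, so the xor is applied exactly once. That is the property the xor example in the post depends on.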




--
_,,,^..^,,,_
best, Eliot

Re: image not opening.

stepharong
Thanks, Eliot, for your analysis.

Stef


