argh, tests are failing!


Re: argh, tests are failing!

Guillermo Polito
Two other things:

- If you change the priority of the delivery process to match the priority of the process running the tests (i.e., 40), you still need to tell the scheduler to give the other process a chance to run. You can do that by yielding:

 Processor yield

- About timeouts: not long ago Denis implemented an automatic timeout for tests. If a test takes longer than a specified time limit, it is timed out and fails by default. Check:

TestCase >> defaultTimeLimit
	^ self class defaultTimeLimit

TestCase class >> defaultTimeLimit
	^ DefaultTimeLimit ifNil: [ DefaultTimeLimit := 1 minutes ]

So you may want to rely on that mechanism for timeouts instead of hardcoding semaphore timeouts in each test.
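As a minimal sketch of how a test class could pick its own limit, one option is to override the instance-side defaultTimeLimit shown above. The class name, package name, and the 10-second value below are placeholders chosen for illustration, not something taken from EventRecorderTests.

```smalltalk
"Hypothetical test class: the name and package are placeholders."
TestCase subclass: #DeliveryTimeLimitExampleTest
	instanceVariableNames: ''
	classVariableNames: ''
	package: 'Example-Tests'.

"Instance-side override, written in the same notation as the methods above.
Tests in this class would use this limit instead of the class-wide default."
DeliveryTimeLimitExampleTest >> defaultTimeLimit
	^ 10 seconds
```

With an override like this, the individual tests can wait on their semaphores without a hardcoded timeout and leave it to the test time limit to fail a test that hangs.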

Moreover, the mechanism Denis created automatically kills any processes created during a test run, to ensure the system is left reasonably "clean". You may want to take that into account as well.
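Coming back to the first point (same priority plus yielding), here is a small Playground sketch. The block is only a stand-in for the real delivery work, and the variable names are made up for the example.

```smalltalk
| delivered |
delivered := false.

"Stand-in for the delivery process, forked at the priority of the
active process (the test runner in a real test)."
[ delivered := true ] forkAt: Processor activePriority.

"The forked process is runnable but does not preempt a process of the
same priority. Yielding gives other runnable processes at this
priority a turn before the active process continues."
Processor yield.

"Inspect the flag after the yield."
delivered
```

Without the yield, the active process simply keeps running and the forked block waits in the queue until the active process blocks or finishes.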

On Tue, Sep 12, 2017 at 9:46 AM, Guillermo Polito <[hidden email]> wrote:
But the thing is that the processes you create for delivery run at priority 30. This means they may not run any time soon (not even within those 200 ms) if processes with higher priorities are scheduled.

So that test is not really a unit test at all: it depends a lot on the running environment. Possible solutions:
 - change the priority of the delivery process for test purposes
 - or, for testing purposes, don't create a new process and just execute synchronously

If you really want to test that you are creating and running a separate process, then you should also explicitly let the processes you created run. That is, if the active process running the tests (usually priority 40) does not suspend itself, no process of priority 30 will be able to run. Ways to suspend the active process and let lower-priority ones run are:
 - calling suspend (Processor activeProcess suspend), but this is dangerous because somebody has to resume it afterwards from a separate process
 - using a delay
 - some I/O, like sockets or async files
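The delay option can be sketched in a Playground as follows. The block stands in for the real delivery work, and the 50 ms delay is an arbitrary value picked for the example, so this remains dependent on what else is scheduled.

```smalltalk
| delivered |
delivered := false.

"Stand-in for the delivery process, forked at priority 30."
[ delivered := true ] forkAt: Processor userBackgroundPriority.

"Waiting on a delay suspends the active process, so lower-priority
processes can run if nothing more urgent is scheduled meanwhile."
(Delay forMilliseconds: 50) wait.

"Inspect the flag after the delay."
delivered
```

Waiting on a semaphore that the forked process signals suspends the active process in the same way, which is presumably why the existing semaphore-based test passes on a lightly loaded machine.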

On Mon, Sep 11, 2017 at 7:40 PM, Juraj Kubelka <[hidden email]> wrote:
Hi Guillermo,

I have not found a better solution. Waiting without a timeout threshold is not nice and makes it difficult to run all tests.
If you have a better idea of how to sync and test two processes, I would appreciate it.
Otherwise I will use a higher timeout.

Cheers,
Juraj

On 11-09-2017, at 06:55, Guillermo Polito <[hidden email]> wrote:

Hi Juraj,

I think that it may really depend on the machine and the state of the system. Slower slave machines might not really cope with that timeout...

The question is how we could make such a test more robust.

On Sun, Sep 10, 2017 at 6:41 PM, Juraj Kubelka <[hidden email]> wrote:
Hi,

I have checked the EventRecorderTests and they work on my computer. The only reason they might fail in other cases is that there is an assert for 'semaphore waitTimeoutMSecs: 200'. I can use a higher timeout if necessary. The timeout of 200 has worked for a couple of years, right? There might be another issue.

Cheers,
Juraj

On 10-09-2017, at 10:13, Guillermo Polito <[hidden email]> wrote:

Hi all,

For a couple of builds now, the following tests have been failing consistently:



Green builds are the only hard metric to say if the build is healthy or not.

- We should not integrate anything until the build is green again...

Also, Pablo and I spent a lot of time getting a green build on all platforms... I'd like to spend my time on other fun stuff than the CI :/

--
   
Guille Polito

Research Engineer
French National Center for Scientific Research - http://www.cnrs.fr


Phone: +33 06 52 70 66 13




--
Guille Polito

Research Engineer
French National Center for Scientific Research - http://www.cnrs.fr

Web: http://guillep.github.io
Phone: +33 06 52 70 66 13


Re: argh, tests are failing!

Stephane Ducasse-3
Nice! It should be added to the doc :)

On Tue, Sep 12, 2017 at 9:51 AM, Guillermo Polito <[hidden email]> wrote:
[...]