SUnit Time out


SUnit Time out

stephane ducasse
Hi guys

In Squeak, Andreas introduced the idea of a test timeout.
Do you think this is interesting?

Stef

SUnit
-----
All test cases now have an associated timeout after which the test is considered failed. The purpose of the timeout is to catch issues like infinite loops, unexpected user input etc. in automated test environments. Timeouts can be set on an individual test basis using the <timeout: seconds> tag or for an entire test case by implementing the #defaultTimeout method.
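
As an illustration, declaring a timeout might look something like this (just a sketch: the class and method names are made up, and the exact unit expected by #defaultTimeout should be checked against the actual change set):

MyComponentTest >> testLongRunningOperation
    <timeout: 30>   "allow this particular test up to 30 seconds"
    self assert: self runLongOperation notNil

MyComponentTest >> defaultTimeout
    "Timeout, in seconds, applied to every test of this case
     that does not carry its own <timeout:> tag."
    ^ 10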

Re: SUnit Time out

Michael Roberts-2
Yes, I think it's a good idea. I'm not sure what granularity is required, though.

mike


Re: SUnit Time out

Schwab,Wilhelm K
Stef,

Time to expose some of my ignorance (don't worry, I have plenty more waiting where I found this): what is the tag concept? That sounds very Tweak-ish, and I am a real believer in doing things "with the language, not TO the language" whenever possible. That is not to say that frameworks are bad; in fact, it means that frameworks are good, while language extensions are anywhere from suspect to evil.

I have some code that I am still porting, but the basic idea is to be able to write:

[
   "code that might not complete"
] tryForSeconds: 10 onTimeOut: [
].

With a robust capability to do such things, it is probably not necessary (or even appropriate) for TestCase to enforce timeouts.  The timeout block can simply raise an exception or assert false, and there is no need to disable timeouts where they do not belong.
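
For instance, a test might then read (a sketch only; tryForSeconds:onTimeOut: is the protocol I am describing, not something already in the image, and the helper selector is made up):

testSlowOperation
    [ self exerciseSlowOperation ]
        tryForSeconds: 10
        onTimeOut: [ self assert: false description: 'operation timed out' ]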

Bill


Re: SUnit Time out

Stéphane Ducasse


On May 29, 2010, at 4:43 PM, Schwab,Wilhelm K wrote:

> Stef,
>
> Time to expose some of my ignorance (don't worry, I have plenty more waiting where I found this): what is the tag concept?

probably a pragma

>  That sounds very Tweak-ish, and I am a real believer in doing things "with the language, not TO the language" whenever possible.  That is not to say that frameworks are bad; in fact, it means that frameworks are good, language extensions are anywhere from suspect to evil.
>
> I have some code that I am still porting, but the basic idea is to be able to write
>
> [
>   "code that might not complete"
>
> ] tryForSeconds:10 onTimeOut:[
>
> ].
>
> With a robust capability to do such things, it is probably not necessary (or even appropriate) for TestCase to enforce timeouts.  The timeout block can simply raise an exception or assert false, and there is no need to disable timeouts where they do not belong.

Yes, this could be a nice extension. Now, what is nice with the tag (I imagine, I did not check) is that it is orthogonal to the code: it is just an indication for the test runner, and it is not intrusive.


Re: SUnit Time out

Lukas Renggli
In reply to this post by Schwab,Wilhelm K
We already have #should:notTakeMoreThan: and friends in TestCase. The
complete TestCase can be protected by overriding #runCase, the
individual test by wrapping the code of the test method.
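
For example (a sketch; the class and helper selectors are invented, and I am assuming #should:notTakeMoreThan: takes a Duration):

MyComponentTest >> testQuickOperation
    self should: [ self exerciseQuickOperation ] notTakeMoreThan: 1 second

MyComponentTest >> runCase
    "Protect every test of this case with the same limit."
    self should: [ super runCase ] notTakeMoreThan: 30 seconds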

Lukas


Re: SUnit Time out

Stéphane Ducasse
For me the point is that this seems an interesting feature to have once we have a test server: that way you make sure that you do not have tests with infinite recursion. Now it is also true that on a test server you do not really care whether your tests take 30 s or 2 min.

Stef


Re: SUnit Time out

Michael Roberts-2
Exactly... my thought is about running these automatically and headless. You don't necessarily need the timeout built into the test framework itself; you can just have another process that monitors the run and kills it after a timeout. This is why I commented on the granularity. The feature Andreas describes would be used more comprehensively than this, but I'm not so fussed by that.
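
For illustration, a minimal in-image watchdog could look something like this (a sketch only; the test class and the ten-minute limit are made up, and nothing like this exists in SUnit itself):

| worker |
worker := [ MyPackageTest suite run ] newProcess.   "run the suite in its own process"
worker resume.
[ (Delay forSeconds: 600) wait.
  worker isTerminated
      ifFalse: [ worker terminate ] ] fork.   "kill it if it is still going after 10 minutes"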

cheers,
Mike


Re: SUnit Time out

Chris Muller-3
In reply to this post by stephane ducasse
(Copying squeak-dev too).

I'm not sold on the whole test timeout thing.  When I run tests, I
want to know the answer to the question, "is the software working?"

Putting a timeout on tests trades a slower, but definitive, "yes" or
"no" for a supposedly-faster "maybe".  But is getting a "maybe" back
really faster?  I've just incurred the cost of running a test suite,
but left without my answer.  I get a "maybe", what am I supposed to do
next?  Find a faster machine?  Hack into the code to fiddle with a
timeout pragma?  That's not faster..

But, the reason given for the change was not for running tests
interactively (the 99% case), rather, all tests from the beginning of
time are now saddled with a timeout for the 1% case:

  "The purpose of the timeout is to catch issues like infinite loops,
unexpected user input etc. in automated test environments."

If tests are supposed to be quick (and deterministic) anyway, wouldn't
an infinite loop or user-input be caught the first time the test was
run (interactively)?  Seriously, when we make software changes, we
run the tests interactively first, and then the purpose of night-time
automated test environment is to catch regressions on the merged
code..

In that case, the high-level test-controller which spits out the
results could and should be responsible for handling "unexpected user
input" and/or putting in a timeout, not each and every last test
method..

IMO, we want short tests, so let's just write them to be short.  If
they're too long, then the encouragement to shorten them comes from
our own impatience of running them interactively.  Running them in
batch at night requires no patience, because we're sleeping, and
besides, the batch processor should take responsibility for handling
those rare scenarios at a higher-level..

Regards,
  Chris



Re: SUnit Time out

Stéphane Ducasse

On May 30, 2010, at 8:52 PM, Chris Muller wrote:

> (Copying squeak-dev too).
>
> I'm not sold on the whole test timeout thing.  When I run tests, I
> want to know the answer to the question, "is the software working?"
>
> Putting a timeout on tests trades a slower, but definitive, "yes" or
> "no" for a supposedly-faster "maybe".  But is getting a "maybe" back
> really faster?  I've just incurred the cost of running a test suite,
> but left without my answer.  I get a "maybe", what am I supposed to do
> next?  Find a faster machine?  Hack into the code to fiddle with a
> timeout pragma?  That's not faster..

Thanks, this is a really good point.

> But, the reason given for the change was not for running tests
> interactively (the 99% case), rather, all tests from the beginning of
> time are now saddled with a timeout for the 1% case:
>
>  "The purpose of the timeout is to catch issues like infinite loops,
> unexpected user input etc. in automated test environments."
>
> If tests are supposed to be quick (and deterministic) anyway, wouldn't
> an infinite loop or user-input be caught the first time the test was
> run (interactively)?  Seriously, when we make software changes, we
> run the tests interactively first, and then the purpose of night-time
> automated test environment is to catch regressions on the merged
> code..

Yes, this is what I was also implying in my previous mail.
If we have a test server, a timeout does not really help,
and I wonder about the infinite-loop case, because it may be really rare.

> In that case, the high-level test-controller which spits out the
> results could and should be responsible for handling "unexpected user
> input" and/or putting in a timeout, not each and every last test
> method..
>
> IMO, we want short tests, so let's just write them to be short.  If
> they're too long, then the encouragement to shorten them comes from
> our own impatience of running them interactively.  Running them in
> batch at night requires no patience, because we're sleeping, and
> besides, the batch processor should take responsibility for handling
> those rare scenarios at a higher-level..

I agree.
Thanks for sharing your thoughts.
So the issue is done. :)



Re: SUnit Time out

Nicolas Cellier
2010/5/30 Stéphane Ducasse <[hidden email]>:

> [...]
> Yes, this is what I was also implying in my previous mail.
> If we have a test server, a timeout does not really help,
> and I wonder about the infinite-loop case, because it may be really rare.
>

My opinion differs here. Every test should run in a short time frame, with a few exceptions. So it seems reasonable to have to specify only a default timeout for your architecture, plus specific timeouts for a few specific tests (or test classes).

Your main argument is that manual tuning will always be better than
automated default behaviour, and we can only agree on that one.

But there are two pragmatic cases you don't take into account:
- the case where community-supplied test cases do not comply with these rather implicit requirements, and the image integrator does not have time to dig into each case and do the fine tuning for the rest of the community;
- the case where automated tests are used for exploratory package testing.

In the first case, the integrator just puts in a threshold that will reject some tests; it is then up to the rest of the community to invest more time in solving the problem.

Maybe Andreas was also addressing the case of network-in-the-loop tests. The timeout can then be seen as a quick hack for bypassing the low-level timeouts and retry counts (which are not always easily accessible in some APIs...).

Concerning the occurrence of infinite loops, some are produced by incompatible packages. So when you automate the exploration of package compatibility, a timeout might help, because I doubt you will have explored each case interactively. Of course, you can always put a timeout at an upper level, in your bash script or something, but it would not be particularly fine grained, would it?

One typical case I often bump into is classes defining printOn: by sending storeOn: and vice versa, and likewise for printString and printOn:. If the core happens to change between two releases and you have a subclass defined in your package, the probability of running into one of these infinite loops increases.
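
For example, the pattern looks like this (class names invented; the point is that the two halves live in different packages):

"In your package:"
MyValue >> printOn: aStream
    "Reuse the store format for printing, relying on the core's storeOn:."
    self storeOn: aStream

"In a later core release, suppose the superclass starts doing the reverse:"
CoreValue >> storeOn: aStream
    self printOn: aStream

Any test that prints a MyValue then recurses until the timeout (or the stack) gives up.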

Nicolas


Re: SUnit Time out

Nicolas Cellier
2010/5/31 Nicolas Cellier <[hidden email]>:

> [...]
> In the first case, the integrator just puts in a threshold that will reject some tests;
> it is then up to the rest of the community to invest more time in solving the problem.
>

In other words, the timeout has the advantage of turning a rather implicit requirement into an explicit one. It is then up to test producers to fine-tune their tests with respect to this requirement, or to use the available hooks for long tests, rather than letting the integrator guess.

Nicolas


Re: SUnit Time out

Nicolas Cellier
Oh, and I see one more advantage: only quick tests would be run interactively by default from your TestRunner. We could automate the separation between long tests and quick tests based on the timeout, if the information is discoverable (either from method annotations or a class-side query, I don't care which).
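
For instance, something along these lines (a sketch; #timeoutForTest stands for however the per-test timeout would be discovered, and the 5-second threshold is arbitrary):

| suite quickSuite |
suite := MyComponentTest suite.
quickSuite := TestSuite new.
suite tests do: [ :each |
    each timeoutForTest <= 5 ifTrue: [ quickSuite addTest: each ] ].
quickSuite run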

Of course, as a user I would be interested in distinguishing failures, errors, and timeouts, so I would want the user interface to change.

Nicolas


Re: SUnit Time out

Stéphane Ducasse
In reply to this post by Nicolas Cellier
OK, thanks.
Now maybe the test runner should be adapted. We will see. For now I just remove the item from the urgent list, and if the need emerges we know that it is there.

Stef


Re: SUnit Time out

Stéphane Ducasse
In reply to this post by Nicolas Cellier
> In other words, the timeout has the advantage of turning a rather
> implicit requirement into an explicit one.
> It is then up to test producers to fine-tune their tests with respect to this
> requirement, or to use the available hooks for long tests, rather
> than letting the integrator guess.

Yes, and for that we can probably use: "We already have #should:notTakeMoreThan: and friends in TestCase. The complete TestCase can be protected by overriding #runCase, the individual test by wrapping the code of the test method."


Re: SUnit Time out

Stéphane Ducasse
In reply to this post by Nicolas Cellier

On May 31, 2010, at 12:29 PM, Nicolas Cellier wrote:

> Oh, and I see one more advantage: Only quick tests are run
> interactively by default from your TestRunner.
> We could automate the separation between long tests and quick tests
> based on timeout if the information is discoverable (either in method
> annotations or class side query, I don't care).

But do we have to tag that with a timeout? Jorge already sorted tests into unit = fast and integration = slow. In interactive mode we could run only the fast ones.

I think we already have a lot and we do not use it enough. So we will learn and see.
Stef

Re: SUnit Time out

Chris Muller-3
In reply to this post by Nicolas Cellier
> In other words, the timeout has the advantage of turning a rather
> implicit requirement into an explicit one.

I can almost understand it for this, especially in the context of development within an open source community. Unfortunately, it neuters the deterministic property. Did it time out because it was stuck in a loop, or was the integrator watching a video while the tests were running, causing a couple of them to time out? That's an obfuscation that is not quick and easy to see through.

And why will someone, someday, have to work their way through this sort of opaque obfuscation? Merely because we imposed a design-time constraint on ourselves, like working in a static language.

The result of working through it will most likely be a new package version that merely bumps the timeout, hurling us backward toward the present, because the test suite is allowed to run longer and longer..

> And it's then up to test producers to fine tune their tests wrt this
> requirement or use the available hooks in case of long tests, rather
> than letting the integrator guess.

It's very possible this is another case of me just not seeing or understanding the key point. Exploratory package testing? Integrator doesn't have time to "fine tune"? I must admit, my feelings about it are more visceral than logical in nature.

However, this timeout breaks all long-running legacy tests. I intend to make the default timeout a global preference in Squeak. From there I will wait and see quietly..
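
Something like this, perhaps (purely a sketch of the idea; the class variable and selectors are invented and do not reflect the current code):

TestCase class >> defaultTimeout
    "Image-wide default, in seconds, for tests without their own <timeout:> tag."
    ^ DefaultTimeout ifNil: [ 60 ]

TestCase class >> defaultTimeout: seconds
    "Let an image or a build script raise the limit globally, e.g. for legacy suites."
    DefaultTimeout := seconds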

Regards,
  Chris


On Mon, May 31, 2010 at 5:16 AM, Nicolas Cellier
<[hidden email]> wrote:

> 2010/5/31 Nicolas Cellier <[hidden email]>:
>> 2010/5/30 Stéphane Ducasse <[hidden email]>:
>>>
>>> On May 30, 2010, at 8:52 PM, Chris Muller wrote:
>>>
>>>> (Copying squeak-dev too).
>>>>
>>>> I'm not sold on the whole test timeout thing.  When I run tests, I
>>>> want to know the answer to the question, "is the software working?"
>>>>
>>>> Putting a timeout on tests trades a slower, but definitive, "yes" or
>>>> "no" for a supposedly-faster "maybe".  But is getting a "maybe" back
>>>> really faster?  I've just incurred the cost of running a test suite,
>>>> but left without my answer.  I get a "maybe", what am I supposed to do
>>>> next?  Find a faster machine?  Hack into the code to fiddle with a
>>>> timeout pragma?  That's not faster..
>>>
>>> Thanks this is a really good point.
>>>
>>>> But, the reason given for the change was not for running tests
>>>> interactively (the 99% case), rather, all tests form the beginning of
>>>> time are now saddled with a timeout for the 1% case:
>>>>
>>>>  "The purpose of the timeout is to catch issues like infinite loops,
>>>> unexpected user input etc. in automated test environments."
>>>>
>>>> If tests are supposed to be quick (and deterministic) anyway, wouldn't
>>>> an infinite loop or user-input be caught the first time the test was
>>>> run (interactively)?  Seriously, when you make software changes, we
>>>> run the tests interactively first, and then the purpose of night-time
>>>> automated test environment is to catch regressions on the merged
>>>> code..
>>>
>>> Yes this is what I was also implying in my previous mail.
>>> If we have a test server this does not really help to have a time out
>>> and I wonder the case of infinite loop because this may be really rare.
>>>
>>
>> My opinion differs here. Every test should run in a short time frame
>> but a few exceptions. So it seems reasonnable to just have to specify
>> a default timeout on your architecture and some specific timeouts for
>> a few specific tests (or test classes).
>>
>> Your main argument is that manual tuning will always be better than
>> automated default behaviour, and we can only agree on that one.
>>
>> But there are two pragmatic cases you don't take into account:
>> - the case when community supplied test cases do not comply with these
>> rather implicit requirements, and image integrator does not have time
>> to dig into each case and do the fine tuning for the rest of
>> community.
>> - the case when automated tests are used for exploratory package testing.
>>
>> In the first case, integrator just put a threshold that will reject
>> some tests, up to rest of community to inject more time in solving the
>> problem.
>>
>
> In other words, the timeout has the advantage to turn a rather
> implicit requirement into an explicit requirement.
> And it's then up to test producers to fine tune their tests wrt this
> requirement or use the available hooks in case of long tests, rather
> than letting the integrator guess.
>
> Nicolas
>
>> Maybe Andreas was also addressing case of network-in-the-loop tests.
>> Thus it can be seen as a quick hack for by-passing the low level
>> timeout and number of retries (which are not always accessible that
>> easily in some APIs...).
>>
>> Concerning infinite loops occurence, some are produced by uncompatible
>> packages. So when you automate compatible package exploration, it
>> might help because I doubt you will have explored each case
>> interactively. Of course, you can always put a timeout at upper level
>> in your bash or something, but it would not be particularly fine
>> grained, would it ?
>>
>> One typical case I often bump into is those classes defining printOn:
>> by sending storeOn: et vice et versa, same for printString and
>> printOn:. If core happens to change between two releases, and you have
>> a subclass defined in your package, probability to run into one of
>> these infinite loops increases.
>>
>> Nicolas
>>
>>>> In that case, the high-level test-controller which spits out the
>>>> results could and should be responsible for handling "unexpected user
>>>> input" and/or putting in a timeout, not each and every last test
>>>> method..
>>>>
>>>> IMO, we want short tests, so let's just write them to be short.  If
>>>> they're too long, then the encouragement to shorten them comes from
>>>> our own impatience of running them interactively.  Running them in
>>>> batch at night requires no patience, because we're sleeping, and
>>>> besides, the batch processor should take responsibility for handling
>>>> those rare scenarios at a higher-level..
>>>
>>> I agree.
>>> Thanks for sharing your thoughts.
>>> So the issue is done. :)
>>>
>>>
>>>>
>>>> Regards,
>>>>  Chris
>>>>
>>>>
>>>> On Sat, May 29, 2010 at 2:53 AM, stephane ducasse
>>>> <[hidden email]> wrote:
>>>>> Hi guys
>>>>>
>>>>> in Squeak andreas introduced the idea of test time out
>>>>> Do you think that this is interesting?
>>>>>
>>>>> Stef
>>>>>
>>>>> SUnit
>>>>> -----
>>>>> All test cases now have an associated timeout after which the test is considered failed. The purpose of the timeout is to catch issues like infinite loops, unexpected user input etc. in automated test environments. Timeouts can be set on an individual test basis using the <timeout: seconds> tag or for an entire test case by implementing the #defaultTimeout method.
>>>>> _______________________________________________
>>>>> Pharo-project mailing list
>>>>> [hidden email]
>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>>>
>>>>
>>>> _______________________________________________
>>>> Pharo-project mailing list
>>>> [hidden email]
>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>
>>>
>>> _______________________________________________
>>> Pharo-project mailing list
>>> [hidden email]
>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>
>>
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project

Re: SUnit Time out

Andreas.Raab
In reply to this post by Chris Muller-3
Hi Chris -

Let me comment on this from a more general point of view first, before
going into the specifics. I've spent the last five years building a
distributed system and during this time I've learned a couple of things
about the value of timeouts :-) One thing that I've come to understand
is that *no* operation is unbounded. We may leisurely talk about "just
wait until it's done" but the reality is that regardless of what the
operation is we never actually wait forever. At some point we *will*
give up no matter what you may think. This is THE fundamental point
here. Everything else is basically haggling about what the right timeout is.

For the right timeout the second fundamental thing to understand is that
if there's a question of whether the operation "maybe" completed, then
your timeout is too short. Period. The timeout's value is not to
indicate that "maybe" the operation completed, it is there to say
unequivocally that something caused it to not complete and that it DID fail.

Obviously, introducing timeouts will create some initial false
positives. But it may be interesting to be a bit more precise about what
we're talking about. To do this I instrumented TestRunner to measure the
time it takes to run each test and then ran all the tests in 4.2 to see
where that leads us. As you might expect, the distribution is extremely
uneven. Out of 2681 tests run, 2588 execute in < 500 msecs (approx. 1800
execute with no measurable time); 2630 execute in less than one second,
leaving a total of 51 that take more than a second, and only three tests
actually take longer than 5 seconds; they are all tagged as such.
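
For reference, the instrumentation amounts to little more than wrapping
each case and logging its duration; a minimal sketch (the helper selector
below is made up, it is not part of SUnit):

   TestCase >> timedRunCase
      "Sketch: run this test case and log how long it took."
      | millis |
      millis := Time millisecondsToRun: [ self runCase ].
      Transcript show: self printString , ' took ' , millis printString , ' ms'; cr.
      ^ millis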

As you can see the vast majority of tests have a "safety margin" of 10x
or more between the time the test usually takes and its timeout value.
Generally speaking, this margin is sufficient to compensate for "other"
effects that might rightfully delay the completion of the test in time.
If you have tests that commonly vary by 10x I'd be interested in finding
out more about what makes them so unpredictable.

So if your question is "are my timeouts too tight", one thing we could do
is to introduce the 10x as a more or less general guideline for
executing tests, and perhaps add a transcript notifier if a test ever comes
closer than 1/3rd of its specified timeout value (i.e., indicating that
something in the nature of the test has changed that should be reflected
in its timeout). This would give you ample warning that you need to
adjust your test even if it isn't (yet) failing on the timeout.
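
A rough sketch of what such a notifier could look like, reusing the helper
above and assuming an accessor (called #timeoutForTest here, name assumed)
that answers the effective timeout in seconds:

   TestCase >> checkTimeoutMargin: millis
      "Sketch: warn when less than 1/3 of the timeout is left as margin.
      #timeoutForTest (answering seconds) is an assumed accessor."
      | timeoutMillis |
      timeoutMillis := self timeoutForTest * 1000.
      timeoutMillis - millis < (timeoutMillis / 3) ifTrue: [
         Transcript show: self printString , ' came within 1/3 of its timeout'; cr ]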

That said, a couple of concrete comments to your post:

On 5/30/2010 11:52 AM, Chris Muller wrote:
> (Copying squeak-dev too).
>
> I'm not sold on the whole test timeout thing.  When I run tests, I
> want to know the answer to the question, "is the software working?"

Correct.

> Putting a timeout on tests trades a slower, but definitive, "yes" or
> "no" for a supposedly-faster "maybe".  But is getting a "maybe" back
> really faster?  I've just incurred the cost of running a test suite,
> but left without my answer.  I get a "maybe", what am I supposed to do
> next?  Find a faster machine?  Hack into the code to fiddle with a
> timeout pragma?  That's not faster..

See above. If you're thinking "maybe", then the timeout is too short.

> But, the reason given for the change was not for running tests
> interactively (the 99% case), rather, all tests from the beginning of
> time are now saddled with a timeout for the 1% case:

As the data shows, this is already the case. It may be interesting to
note that so far there were a total of 5 (five) places that had to be
adjusted in Squeak. One was a general place (the default timeout for the
decompiler tests) and four were individual methods. Considering that
computers usually don't become slower over time, it seems unlikely that
further adjustments will be necessary here. So the bottom line is that
the changes required aren't exactly excessive.
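
For anyone who has not used the two hooks yet, such adjustments look
roughly like this (class, selectors and values below are made up for
illustration; #buildHugeDataSet stands for whatever the test exercises):

   MyLongRunningTest >> defaultTimeout
      "Raise the default timeout, in seconds, for every test in this class."
      ^ 60

   MyLongRunningTest >> testHugeDataSet
      "Raise the timeout for this single test only."
      <timeout: 30>
      self assert: self buildHugeDataSet notEmpty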

>    "The purpose of the timeout is to catch issues like infinite loops,
> unexpected user input etc. in automated test environments."
>
> If tests are supposed to be quick (and deterministic) anyway, wouldn't
> an infinite loop or user-input be caught the first time the test was
> run (interactively)?  Seriously, when we make software changes, we
> run the tests interactively first, and then the purpose of the night-time
> automated test environment is to catch regressions in the merged
> code.

These changes are largely intended for automated integration testing. I
am hoping to automate the tests for community supported packages to a
point where there will be no user in front of the system. Even if there
were, it's not clear whether that person could fix the issue immediately;
if the problem at hand can't be fixed right away, the entire process is
stuck, the tests will never run to completion, and no useful result is
produced.

So the idea here is not that unit tests are *only* to catch regressions
in previously manually tested (combinations of) code. The idea is to
catch interaction and integration bugs and to be able to produce a result
even if there is no user to watch the particular combination of packages
being loaded together in this particular form.

Perhaps that is our problem here? It seems to me that you're taking a
view that says unit tests are exclusively for regression testing and
consequently there is no way a previously successful test would suddenly
become unsuccessful in a way that makes it time out ... but you know,
having written this sentence, it makes no sense to me. If we knew
beforehand that tests fail only in particular known ways we wouldn't
have to run them to begin with. The whole idea of running the tests is to
catch *unexpected* situations, and as a consequence there is value in
capturing these situations instead of hanging and producing no useful
result.

> In that case, the high-level test-controller which spits out the
> results could and should be responsible for handling "unexpected user
> input" and/or putting in a timeout, not each and every last test
> method..

Do you have such a "high-level test-controller"? Or do you mean a human
being spending their time watching the tests run to completion? If the
former, I'm curious as to how it would differ from what I did. If the
latter, are you volunteering? ;-)

> IMO, we want short tests, so let's just write them to be short.  If
> they're too long, then the encouragement to shorten them comes from
> our own impatience of running them interactively.  Running them in
> batch at night requires no patience, because we're sleeping, and
> besides, the batch processor should take responsibility for handling
> those rare scenarios at a higher-level..

The goal for the timeouts is *not* to cause you to write shorter tests.
If you're looking at it this way you're looking at it from the wrong
angle. Up your timeout to whatever you feel is sensible to have trust in
the results of the tests. As I said earlier, I'm quite happy to discuss
the default timeout; it's simply that with some 95% coverage on a 10x
safety margin it feels to me that we're playing it safe enough for the
remaining cases to have explicit timeouts.

Cheers,
   - Andreas



_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project

Re: [squeak-dev] Re: SUnit Time out

Chris Muller-3
Thanks for clarifying your goals w.r.t. introducing the timeout.  I
think that's important because, as I've said, legacy tests that live
in external packages are affected.

I read your whole note a few times, and one part in particular stuck
out to me as a potentially useful use-case for test-case timeout:

> These changes are largely intended for automated integration testing. I am
> hoping to automate the tests for community supported packages to a point
> where there will be no user in front of the system.

If, by this, you mean you want to simply have a headless Squeak
image running something like:

  [ true ] whileTrue:
    [ self loadLatestPackageCombinations.
    self runTestSuite.
    self mailResultsToSqueakDev ]

THEN, that brings us down to only haggling about the default timeout,
although I still would prefer to handle the timeout at a higher level..

If, however, this isn't the goal, then I still don't seem to have
grasped what I sense is some key point.. or my own concerns
weren't properly understood.  If so, let me try one more time.  :)

> done" but the reality is that regardless of what the operation is we never
> actually wait forever. At some point we *will* give up no matter what you
> may think. This is THE fundamental point here. Everything else is basically
> haggling about what the right timeout is.

Of course we would "give up" after an unreasonable amount of time.  In
either case, there is something to interrogate, either a live looping
test-runner machine, or a static report of test results with one or
more that say, "timed out".

In the former case, we have a bevy of useful information (e.g., which
test is it trying to run?  How much memory is the test image using
right now?  Can I Alt+. interrupt it and get even more information?)

In the latter case, there is no choice but to start at square 1:  Try
to recreate the problem.  (What if it works?)

Personally, I would always prefer to deal with the former case rather than the latter..

> For the right timeout the second fundamental thing to understand is that if
> there's a question of whether the operation "maybe" completed, then your
> timeout is too short. Period. The timeout's value is not to indicate that
> "maybe" the operation completed, it is there to say unequivocally that
> something caused it to not complete and that it DID fail.

I didn't understand this.  There is no question about "maybe
completed".  We know if a test times out then it _didn't_ complete.
The "maybe" I referred to was about the core question:  whether the
underlying software being tested can be used or not.  "Maybe" it
could, then again, maybe it shouldn't.  It sounds like we agree, a
timeout would *have* to be regarded as a failure.

> Obviously, introducing timeouts will create some initial false positives.

You mean false negatives?  If we are saying that we must treat a
timeout as failure, and failure is "negative", then a timeout would be a
false negative or a true negative....?

> But it may be interesting to be a bit more precise on what we're talking
> about. To do this I instrumented TestRunner to measure the time it takes to
> run each test and then ran all the tests in 4.2 to see where that leads us.
> As you might expect, the distribution is extremely uneven. Out of 2681 tests
> run  2588 execute in < 500 msecs (approx. 1800 execute with no measurable
> time);  2630 execute in less than one second, leaving a total of 51 that
> take more than a second and only three tests actually take longer than 5
> seconds and they are all tagged as such.

That's fine for the 4.2 tests, but there are hundreds of tests in
external packages.  With a mere 5-second default, many will need to be
updated with a pragma.  But then we're talking about a branch in the
package because that won't be backward compatible with 3.9, will it?

> As you can see the vast majority of tests have a "safety margin" of 10x or
> more between the time the test usually takes and its timeout value.
> Generally speaking, this margin is sufficient to compensate for "other"
> effects that might rightfully delay the completion of the test in time.

I can see that jacking up the timeout may tend to reduce the number of
false negatives (at the expense of potentially longer wait times!),
but when they do occur, we have no useful information whatsoever.  Not even
certainty whether the underlying software is usable or not, because it
could be a false negative.

> If
> you have tests that commonly vary by 10x I'd be interested in finding out
> more about what makes them so unpredictable.

Well, again, it's not just about randomness in the tests but also
about external factors; CPU speed, current system load, etc.

> So if your question is "are my timeouts too tight" one thing we could do is
> to introduce the 10x as a more or less general guideline for executing
> tests,

Ok, with that kind of margin, the message I'm getting from you is that
it isn't about making a human have to wait.  We just want to make sure
we "get some kind of report"?

>> But, the reason given for the change was not for running tests
>> interactively (the 99% case), rather, all tests from the beginning of
>> time are now saddled with a timeout for the 1% case:
>
> As the data shows, this is already the case. It may be interesting to note
> that so far there were a total of 5 (five) places that had to be adjusted in
> Squeak.

I'm not worried about the built-in tests; recall I acknowledged that I
can "almost understand" a forced timeout in the context of an
open-source project where people are all contributing their portions
and no one else wants to be "held up" because of one person's tests
looping.

My concern is more about the impact to legacy external packages..

>  One was a general place (the default timeout for the decompiler
> tests) and four were individual methods. Considering that computers usually
> don't become slower over time, it seems unlikely that further adjustments
> will be necessary here.

Well, they do..  It's not just a function of time, but who's running
it, and on which machine.  We all have different machines.  Maybe
someone wants to test on an iPhone that might be considerably slower
than the original desktop on which the timeout was specified...

> So the bottom line is that the changes required
> aren't exactly excessive.

That depends on how many test methods I have, whether I also want them to
run in 3.9, and whether, to have a Community Supported Package be
included, I have to put in a pragma..
(unless I'm mistaken about pragmas working in 3.9).

Bottom line:  Today Magma runs on 3.9 - 4.2 + Pharo.  Some of Magma's
tests necessarily take several minutes.

Question:  Can Magma be a CSP and still retain this wide compatibility?

> These changes are largely intended for automated integration testing. I am
> hoping to automate the tests for community supported packages to a point
> where there will be no user in front of the system.
>
> Even if there were, it's
> not clear whether that person can fix the issue immediately or whether the
> entire process is stuck because someone can momentarily not fix the problem
> at hand and the tests will never run to completion and produce any useful
> result.

Who is "that person" and what is their role?

> begin with. The whole idea of running the tests to catch *unexpected*
> situations and as a consequence there is value of capturing these situations
> instead of hanging and producing no useful result.

To me, "timed out" is what is not useful.  To find a hanging machine
that can be interrogated is much more useful.

>> In that case, the high-level test-controller which spits out the
>> results could and should be responsible for handling "unexpected user
>> input" and/or putting in a timeout, not each and every last test
>> method..
>
> Do you have such a "high-level test-controller"? Or do you mean a human
> being spending their time watching the tests run to completion? If the
> former, I'm curious as to how it would differ from what I did. If the
> latter, are you volunteering? ;-)

I meant the former.  It differs from what you did in that it preserves
legacy compatibility, and the legacy deterministic property of testing.
To handle an automated test server, I would handle the on-timeout: from
a much higher place, and therefore it would not be for individual
tests, but for the whole suite.  Information about the last running
test would be sufficient for me, especially considering all
of the other disadvantages I've mentioned for fine-grained timeouts..
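
Just to illustrate, using the tryForSeconds:onTimeOut: block protocol Bill
sketched earlier in the thread (not an existing API), such a whole-suite
guard might look like this, with MyPackageTest standing for any TestCase
subclass:

   | suite result |
   suite := MyPackageTest suite.   "hypothetical test class"
   [ result := suite run ]
      tryForSeconds: 3600
      onTimeOut: [ Transcript show: 'Suite timed out; image left alive for interrogation.'; cr ]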

 - Chris

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project

Re: SUnit Time out

hernan.wilkinson
In reply to this post by Stéphane Ducasse
I don't think it is necessary... I mean, if you are doing TDD, you write the test and you run it immediately; if it takes too much time to run (that is, more than 2 seconds :-)) then you press ctrl+. and problem solved...
I'm saying this because #should:notTakeMoreThan: has some interesting implementation details, like running the test in another process and synchronizing them, etc... so running all the tests would create a process for each test, etc., taking more time unnecessarily in most cases... unless the implementation takes care of only doing this special "feature" when needed (for example, when you have that pragma). But then, if you have to write the pragma, why not just send the message #should:notTakeMoreThan: and problem solved?
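
For context, the kind of machinery being referred to looks roughly like
the following simplified sketch (not the actual SUnit implementation; it
assumes Semaphore>>waitTimeoutSeconds: answers true when the wait timed
out):

   runBlock: aBlock timeoutAfterSeconds: seconds
      "Sketch: evaluate aBlock in a forked process and fail the test if it
      has not finished within the given number of seconds."
      | done worker |
      done := Semaphore new.
      worker := [ aBlock value. done signal ] fork.
      (done waitTimeoutSeconds: seconds) ifTrue: [
         worker terminate.
         TestFailure signal: 'test did not finish within ' , seconds printString , ' seconds' ]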

On Sat, May 29, 2010 at 3:42 PM, Stéphane Ducasse <[hidden email]> wrote:
For me the point is that I have the impression that this is an interesting feature to have
when we have a test server. That way you make sure that you do not have tests with infinite recursion.
Now it is also true that on a test server you do not really care if your tests take 30s or 2 min.

Stef

> We already have #should:notTakeMoreThan: and friends in TestCase. The
> complete TestCase can be protected by overriding #runCase, the
> individual test by wrapping the code of the test method.
>
> Lukas
>
> On 29 May 2010 16:43, Schwab,Wilhelm K <[hidden email]> wrote:
>> Stef,
>>
>> Time to expose some of my ignorance (don't worry, I have plenty more waiting where I found this): what is the tag concept?  That sounds very Tweak-ish, and I am a real believer in doing things "with the language, not TO the language" whenever possible.  That is not to say that frameworks are bad; in fact, it means that frameworks are good, language extensions are anywhere from suspect to evil.
>>
>> I have some code that I am still porting, but the basic idea is to be able to write
>>
>> [
>>   "code that might not complete"
>>
>> ] tryForSeconds:10 onTimeOut:[
>>
>> ].
>>
>> With a robust capability to do such things, it is probably not necessary (or even appropriate) for TestCase to enforce timeouts.  The timeout block can simply raise an exception or assert false, and there is no need to disable timeouts where they do not belong.
>>
>> Bill
>>
>> ________________________________________
>> From: [hidden email] [[hidden email]] On Behalf Of Michael Roberts [[hidden email]]
>> Sent: Saturday, May 29, 2010 6:17 AM
>> To: [hidden email]
>> Subject: Re: [Pharo-project] SUnit Time out
>>
>> yes think it's a good idea. I'm not sure the granularity that's
>> required though.
>>
>> mike
>>
>> On Saturday, May 29, 2010, stephane ducasse <[hidden email]> wrote:
>>> Hi guys
>>>
>>> in Squeak andreas introduced the idea of test time out
>>> Do you think that this is interesting?
>>>
>>> Stef
>>>
>>> SUnit
>>> -----
>>> All test cases now have an associated timeout after which the test is considered failed. The purpose of the timeout is to catch issues like infinite loops, unexpected user input etc. in automated test environments. Timeouts can be set on an individual test basis using the <timeout: seconds> tag or for an entire test case by implementing the #defaultTimeout method.
>>> _______________________________________________
>>> Pharo-project mailing list
>>> [hidden email]
>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>
>>
>> _______________________________________________
>> Pharo-project mailing list
>> [hidden email]
>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>
>> _______________________________________________
>> Pharo-project mailing list
>> [hidden email]
>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>
>
>
>
> --
> Lukas Renggli
> www.lukas-renggli.ch
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project

Re: [squeak-dev] Re: SUnit Time out

hernan.wilkinson
In reply to this post by Chris Muller-3
I completely agree 

On Sun, May 30, 2010 at 2:52 PM, Chris Muller <[hidden email]> wrote:
(Copying squeak-dev too).

I'm not sold on the whole test timeout thing.  When I run tests, I
want to know the answer to the question, "is the software working?"

Putting a timeout on tests trades a slower, but definitive, "yes" or
"no" for a supposedly-faster "maybe".  But is getting a "maybe" back
really faster?  I've just incurred the cost of running a test suite,
but left without my answer.  I get a "maybe", what am I supposed to do
next?  Find a faster machine?  Hack into the code to fiddle with a
timeout pragma?  That's not faster..

But, the reason given for the change was not for running tests
interactively (the 99% case), rather, all tests from the beginning of
time are now saddled with a timeout for the 1% case:

 "The purpose of the timeout is to catch issues like infinite loops,
unexpected user input etc. in automated test environments."

If tests are supposed to be quick (and deterministic) anyway, wouldn't
an infinite loop or user-input be caught the first time the test was
run (interactively)?  Seriously, when we make software changes, we
run the tests interactively first, and then the purpose of the night-time
automated test environment is to catch regressions in the merged
code..

In that case, the high-level test-controller which spits out the
results could and should be responsible for handling "unexpected user
input" and/or putting in a timeout, not each and every last test
method..

IMO, we want short tests, so let's just write them to be short.  If
they're too long, then the encouragement to shorten them comes from
our own impatience of running them interactively.  Running them in
batch at night requires no patience, because we're sleeping, and
besides, the batch processor should take responsibility for handling
those rare scenarios at a higher-level..

Regards,
 Chris


On Sat, May 29, 2010 at 2:53 AM, stephane ducasse
<[hidden email]> wrote:
> Hi guys
>
> in Squeak andreas introduced the idea of test time out
> Do you think that this is interesting?
>
> Stef
>
> SUnit
> -----
> All test cases now have an associated timeout after which the test is considered failed. The purpose of the timeout is to catch issues like infinite loops, unexpected user input etc. in automated test environments. Timeouts can be set on an individual test basis using the <timeout: seconds> tag or for an entire test case by implementing the #defaultTimeout method.
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>



_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project