Unifying Testing Ideas


Unifying Testing Ideas

Sean P. DeNigris
We have some cool matchers (Phexample, StateSpecs), some nice mocking libraries (Mocketry, BabyMock), and Phexample's acknowledgement that tests build on each other. The problem is, it's hard to cherry-pick and build one's perfect test environment. For example:
- Phexample and Mocketry, whose matching frameworks break each other
- Phexample and BabyMock both require subclassing from their own TestCase subclass, so they can't be used together.

I wonder what the solution is? Maybe some registration scheme? It'd be nice to be more coordinated here. As a first step, I manually merged BabyMock and Phexample to produce the following (IMHO) gorgeous tests...

    testNew
        self library should
                receive: #FMOD_System_Create:;
                with: FMOD_SYSTEM new;
                does: [ :h |
                        h handle: 20.
                        0 ].
        self library should
                receive: #FMOD_System_Init:;
                with: [ :a | a handle = 20 ];
                answers: 0.
       
        ^ FmodSystem new.

    testNewSound
        | soundFile system |

        system := self given: #testNew.

        soundFile := FileLocator vmBinary.
        self library should
                receive: #FMOD_System_CreateSound:to:with:;
                with: soundFile fullName and: FMOD_SOUND new and: [ :h | h = system handle ];
                answers: 0.
       
        ^ system newSoundFromFile: soundFile.

    testPlaySound
        | sound |
        sound := self given: #testNewSound.

        self library should
                receive: #FMOD_System:PlaySound:on:;
                with: sound system handle and: sound handle and: FmodChannel new;
                answers: 0.
       
        ^ sound play.

    testChannelIsPlaying
        | channel |
        channel := self given: #testPlaySound.

        self library should
                receive: #FMOD_Channel_IsPlaying:storeIn:;
                with: channel and: NBExternalAddress new;
                does: [ :c :isPlaying | isPlaying value: 1 ].
       
        ^ channel isPlaying.

The tests... let's call them specifications... clearly state how the object should talk to the FMOD library. The neat part of Phexample is that even though each specification uses the result of the last, it's smart enough not to fail the dependent one if the #given: fails; it moves it into "expected failures" instead. This is important because otherwise a failure deep in the system would be difficult to pinpoint, since there could be many failing specifications.
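For anyone who hasn't seen Phexample, here is the chaining idea in miniature (both example methods are invented for illustration):

    testSeven
        | seven |
        seven := 3 + 4.
        self assert: seven equals: 7.
        ^ seven

    testSevenIsOdd
        "Reuses the value produced by #testSeven; if #testSeven fails,
        this one is reported as an expected failure, not a second failure."
        | seven |
        seven := self given: #testSeven.
        self assert: seven odd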

N.b. in this case, I'm not sure that mocking is the best strategy and I may end up using a hand-written stub, but I wanted to do a case study. One area where I've really found mocks to shine is internal to one's system. That is, I'm writing an object and discover that it must talk to another, as-yet-unwritten object. So I mock it out and in so doing define its API, which I then go implement. Also, the mocks use a custom extension to BabyMock that allows you to pass the arguments to #does:, so this will not run on vanilla BabyMock.

Cheers,
Sean

Re: Unifying Testing Ideas

Attila Magyar
Sean P. DeNigris wrote
- Phexample and BabyMock both require subclassing from their own TestCase subclass, so they can't be used together.
The reason for having the base class is to verify the expectations at the end of the tests automatically. Doing this manually is possible (context assertSatisfied), but people would probably forget it without the automatic mechanism. In Java, for example, mock libraries use custom JUnit runners to do this, but I haven't found anything like that in SUnit. If there is a better way to do it, please let me know.
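In SUnit terms, the base class hooks the test lifecycle roughly like this (a sketch, not BabyMock's actual code):

    BabyMockTestCase >> tearDown
        "Fail the test automatically if any expectation was left
        unsatisfied, so test methods don't have to remember to check."
        context assertSatisfied.
        super tearDown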

Re: Unifying Testing Ideas

Sean P. DeNigris
Attila Magyar wrote
The reason for having the base class is to verify the expectations at the end of the tests automatically. ...
I understand the motivation. My question is: how do we create appropriate hooks so that we don't get into these conflicts?

I made a few enhancements to BabyMock:
- anyArgument is now an inst var of BabyMockTestCase, similar to BmAnyMessage
- #does: now optionally takes arguments

The second one turns:

    handle := FMOD_SYSTEM new.

    library should
        receive: #FMOD_System_Create:;
        with: [ :a |
                a = handle
                        ifTrue: [ handle := a. handle handle: 20. true ]
                        ifFalse: [ false ] ];
        answers: 0.
    library should
        receive: #FMOD_System_Init:;
        with: [ :a | a = handle ];
        answers: 0.
       
...into:
               
    library should
        receive: #FMOD_System_Create:;
        with: FMOD_SYSTEM new;
        does: [ :h | h handle: 20. 0 ].
    library should
        receive: #FMOD_System_Init:;
        with: [ :a | a handle = 20 ];
        answers: 0.

... eliminating the temporary, and the separate #with: and #answers: sends in the first expectation.

I pushed them to http://smalltalkhub.com/mc/SeanDeNigris/FMOD/main/ since that's where I was using them, but if you give me access to the BM repo and you like the changes, I will clean them up and push them there.
Cheers,
Sean

Re: Unifying Testing Ideas

Dennis Schetinin
I see only one correct way to build a good testing environment: tests should fundamentally be objects, not methods.



Re: Unifying Testing Ideas

Attila Magyar
In reply to this post by Sean P. DeNigris
Sean P. DeNigris wrote
I understand the motivation. My question is: how do we create appropriate hooks so that we don't get into these conflicts?
I don't know yet; I'm open to discussion.

Sean P. DeNigris wrote
- anyArgument is now an inst var of BabyMockTestCase, similar to BmAnyMessage
How do you use anyArgument? #with: is optional, so

mock should receive: #a:b:c:

accepts #a:b:c: with any arguments by default, unless you restrict it with #with:and:and:.

Or did you want something like this?

mock a: anyArgument b: exactArgument1 c: exactArgument2

This seems to be a valid need.
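In expectation form, that would presumably read something like this (a sketch):

    mock should
        receive: #a:b:c:;
        with: anyArgument and: exactArgument1 and: exactArgument2;
        answers: 0.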

Sean P. DeNigris wrote
- #does: now optionally takes arguments
I rarely use #does:; as far as I remember, it is not even documented. I don't know the code behind the test, but based on the names it looks like an adapter-like thing (e.g. a thin wrapper over a third-party API without much logic inside). If that is the case, normally I would test it with integration tests instead of mocks.


Re: Unifying Testing Ideas

Esteban A. Maringolo
In reply to this post by Dennis Schetinin
2013/12/3 Dennis Schetinin <[hidden email]>
>
> I see the only correct way to build a good testing environment: tests should be basically objects, not methods.

They are objects. Instances of TestCase.

But for simplicity they're implemented in a single "factory" class,
the TestCase subclass...




Re: Unifying Testing Ideas

Dennis Schetinin
Sure, they are objects: it's Smalltalk, methods are objects too :) And that's quite right: SUnit is the simplest thing that (still) works. But is it still enough? Isn't it time to take further steps and improve?

I just mean it would be great to have a test as a distinct object that I can explore, add behavior and properties to, relate to other objects in the system (other tests, test suites, groups and more), open in specific tools (yet to be created), log all runs and results of… and associate with the frameworks it should be run within… In general, to create a full-fledged and easy-to-use environment for TDD: controlling all the development steps with tests, using them as documentation, etc.



Re: Unifying Testing Ideas

Sean P. DeNigris
In reply to this post by Attila Magyar
Attila Magyar wrote
did you want something like this?
mock a: anyArgument b: exactArgument1 c: exactArgument2
Exactly.

Attila Magyar wrote
I rarely use #does:... based on the names it looks like an adapter-like thing... If that is the case, normally I would test it with integration tests instead of mocks.
Yes, it's a wrapper, and it should be tested with integration/acceptance tests. But it wraps callouts to a C dynamic library via NB, so it's hard to test - i.e. it crashes the VM, etc. Interactions with it deserve to be tested via unit tests (in general I drop down to unit tests inside a failing integration/acceptance test anyway).

Anyway, #does: is very important. Stubbing (can & (does: | answers:)) and mocking (should) seemed kind of like a nature-vs-nurture debate for a while, but there is a lot of overlap. Sometimes you need one, sometimes the other, and sometimes both (as shown in my example, although I kept thinking that in practice I'd probably write a test double by hand to be reused in other places).

In fact, #does: and #answers: could easily be merged, using double dispatch to pass either a block or a value.
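For example, the merge could be a conversion-style double dispatch like this (BmExpectation, #asMockAnswer and the answerBlock ivar are invented for illustration):

    BmExpectation >> answers: aBlockOrValue
        "One selector accepts either a literal value or a block."
        answerBlock := aBlockOrValue asMockAnswer

    Object >> asMockAnswer
        "Plain values simply answer themselves."
        ^ [ :args | self ]

    BlockClosure >> asMockAnswer
        "Blocks get evaluated with the intercepted message's arguments."
        ^ self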

Unrelatedly, it would be slightly clearer to rename "anything" to "anyMessage". "anything" makes it seem like it could be passed as an argument, which doesn't work. Or alternatively, you could probably rename AnyMessage to Anything and combine its API with that of AnyArgument, if they don't clash...
Cheers,
Sean

Re: Unifying Testing Ideas

Denis Kudriashov
In reply to this post by Esteban A. Maringolo
The problem with the TestCase approach is that it mixes two separate concerns. The first is how we want to define system specifications; the second is how we persist those specifications. When we stay at the test-case level, we lose freedom in defining specifications: we are restricted to Smalltalk language artifacts.
For example, how can we group our tests? With the classic test case approach we have four levels: packages, classes, protocols and methods. Now remember how many times you have written a long test name like #shouldDoSomethingWhenSomeSituationHappensDuringAnotherSituationButWhenSituation3DoesNotHappen.
And most of the time you have a test for each separate situation, all with similarly long names.
In Smalltalk, the only way to refactor that duplication in test names is to extract the shared "when" condition into separate test case classes. But that means building classes with very long names, while the test names mostly stay big (just somewhat shorter). And do you know a way to group such classes? The only way is to put them into their own class category, and nobody does that.

Now imagine we don't care how tests (specifications) are persisted. We can model specifications with explicit objects that represent #when, #should and the other test artifacts, and that can group specifications in arbitrary ways. With true objects we don't have to name our specifications with a method-name convention: we can name tests in natural language, and we can give the "when" and "should" expressions natural-language names too.
With objects we can build a "Smalltalk vision" of a BDD framework. In a live Smalltalk system, the testing environment can really be very different from other languages'. The BDD frameworks of other languages just renamed the TDD vocabulary; they all stay restricted by their language's syntax.
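To make that concrete, a purely hypothetical sketch (every class and selector here is invented):

    | spec |
    spec := Specification named: 'playing a sound yields a live channel'.
    spec
        when: [ system playSound: sound ];
        should: [ :channel | channel isPlaying ].
    (SpecificationGroup named: 'FMOD wrapper') add: spec.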

So that is my vision. I really want to build such a system, but I have no time for it yet.
Maybe Sean is doing something similar?




Re: Unifying Testing Ideas

Attila Magyar
In reply to this post by Sean P. DeNigris
Sean P. DeNigris wrote
In fact, #does: and #answers: could easily be merged, using double dispatch to pass either a block or a value.
True, but it would make it difficult to return a block. E.g.

    mock can receive: #msg; answers: [ [..] ]

Sean P. DeNigris wrote
Unrelatedly, it would be slightly clearer to rename "anything" to "anyMessage". ...
I agree. I'll try to combine them first, but anyMessage/anyArgument seems fine too.

Anyway, I'm working on the next version with a slightly different API. My plan is to fully separate the DSL from the core. This will make it possible to play with different syntaxes without touching the core; users will even be able to define their own syntax.

Re: Unifying Testing Ideas

Sean P. DeNigris
On Dec 3, 2013, at 2:43 PM, Attila Magyar [via Smalltalk] <[hidden email]> wrote:
> True, but it would make it difficult to return a block. E.g.
>
>     mock can receive: #msg; answers: [ [..] ]
I'll trade an extra pair of brackets in a less common case for a simpler API any day ;)

> Anyway, I'm working on the next version with a slightly different API. My plan is to fully separate the DSL from the core. ...
Are you interested in integrating my changes into the current API? If so, I'll refactor and clean them up. I personally would find it difficult to use the library without them. Coming from Ruby, a proper test double framework is the thing I miss most. RSpec was amazing; BabyMock is close, and with these changes it has the test double features I commonly relied on RSpec for...
Cheers,
Sean

Re: Unifying Testing Ideas

Dennis Schetinin
In reply to this post by Denis Kudriashov
This is a task I would really like to participate in. Although, just like you, I have no time at all, I will still find at least a few hours a week :)



Re: Unifying Testing Ideas

Stéphane Ducasse

On Dec 4, 2013, at 5:18 AM, Dennis Schetinin <[hidden email]> wrote:

This is a task I would really like to participate in. Although, just like you, I have no time at all, I will still find at least a few hours a week :)

welcome to our nice club of busy people



Re: Unifying Testing Ideas

Attila Magyar
In reply to this post by Sean P. DeNigris
Sean P. DeNigris wrote
Are you interested in integrating my changes into the current API? If so, I'll refactor and clean them up. ...
Yes - anyArgs can be useful, and I was thinking about the #does: extension too; I've convinced myself that it is OK and useful. I gave you access to the repo.

The reason for my initial hesitation is that my goal is not to have overly powerful features like RSpec has. E.g. in RSpec it is easy to stub out global class objects, and I see lots of Ruby programmers doing it. I don't think that is the right solution to the problem.

Re: Unifying Testing Ideas

Frank Shearar-3
On 4 December 2013 10:01, Attila Magyar <[hidden email]> wrote:

> The reason for my initial hesitation is that my goal is not to have overly
> powerful features like RSpec has. E.g. in RSpec it is easy to stub out
> global class objects, and I see lots of Ruby programmers doing it. I don't
> think that is the right solution to the problem.

Can you give an example of what you mean?

Being able to stub out the file system, or the clock, seems like an
extremely useful thing to do. (See https://github.com/defunkt/fakefs
for instance.)

frank


Re: Unifying Testing Ideas

Sean P. DeNigris
In reply to this post by Attila Magyar
The reason for my initial hesitation is that... in RSpec, it is easy to stub out global class objects and... I don't think that is the right solution to the problem.

Ha ha, you were right to be cautious. I've been thinking for over a year about how to do that without method wrappers. But I will control myself in your project ;)
Cheers,
Sean

Re: Unifying Testing Ideas

Frank Shearar-3
On 4 December 2013 11:34, Sean P. DeNigris <[hidden email]> wrote:
> Ha ha, you were right to be cautious. I've been thinking for over a year
> about how to do that without method wrappers. But I will control myself in
> your project ;)

It's still a work in progress, but Squeak's Environments ought to
provide this: your test runs in a custom, temporary Environment into
which you import your own DateAndTime or FileReference or whatever.
Code inside the Environment just calls DateAndTime, blissfully unaware
that it's calling your custom mocked implementation.
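Something along these lines - a hypothetical sketch, not the actual Environments API:

    | sandbox |
    "create a scratch environment and shadow the global binding in it"
    sandbox := Environment named: 'TestSandbox'.
    sandbox at: #DateAndTime put: StubbedDateAndTime.
    "anything compiled inside sandbox now resolves the name DateAndTime
    to StubbedDateAndTime, without touching the Smalltalk globals"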

frank


Re: Unifying Testing Ideas

Attila Magyar
In reply to this post by Frank Shearar-3
I'll try.

I think mocking the file system and other third-party, low-abstraction stuff can lead to brittle tests and duplication.
For example, say we want to store certificate files in PEM format in the file system.

Mocking out the fs would look like this:

file = mock('file')
File.should_receive(:open).with("filename", "w").and_yield(file)
file.should_receive(:write).with("pem content of the cert")

I don't know Ruby too well, so I don't know how many different ways there are to write some content into a file, but I suspect there are a lot.
E.g. if I change the "write" to "puts", the test will fail even though the functionality still works. That's why it is brittle.
There is also duplication between the test and the production code, because both have the very same structure.

Introducing a CertificateStore as a dependency with a more abstract interface solves this problem and simplifies the test as well.

cert_store.should_receive(:store_cert).with(certificate)

I'm no longer passing the content as a string but as a certificate object, and I don't care about the low-level details of the storing mechanism. As a matter of fact, the store may keep the cert somewhere other than the file system, and the client won't care.

Then the question is: how do I test the certificate store? This is the point where I would switch from mocking to an integration test, because I want to be sure that I understood the file system API correctly and that it physically writes the file to disk. The previous test didn't provide this confidence.

Using an in-memory file system to speed up the test is OK IMHO, but there is of course a risk that it behaves differently from the real one. Personally, I wouldn't use an in-memory file system in a test like this.



Re: Unifying Testing Ideas

Sean P. DeNigris
In reply to this post by Denis Kudriashov
Denis Kudriashov wrote
The problem with the TestCase approach is that it mixes two separate concerns. The first is how we want to define system specifications; the second is how we persist those specifications.
...
So that is my vision. I really want to build such a system, but I have no time for it yet.
Maybe Sean is doing something similar?
Yes, yes, yes!!! This is something I've been working on and thinking about since I came to Smalltalk. Right now we have something like:
1. Use standard Smalltalk artifacts and tools to write serialized descriptions of tests
2. Have SUnit automagically create live test objects, which we never really see or reason about
3. Use some Test-specific tools, like TestRunner and Nautilus icons

This is writing in "testing assembly language". We work with the serialized implementation details, which then bubble up all the way to the UI - i.e. TestRunner lists packages in the left pane and TestCase subclasses in the next, neither of which has anything to do with the domain of testing. It's similar to the way we encode files (e.g. images) as strings returned by methods, because we can't version them easily with Monticello.

Every time I try to create "what TDD/BDD would look like in a live, dynamic environment", I start with a custom test-creation UI and work backward. The UI creates a testing domain model, which may be serialized as packages, classes, and methods, but neither the test writer, the runner, nor the reader should ever have to deal with those directly.

I have had quite a few false starts, but I've learned a lot. I think a real testing domain model could create huge leverage for our community. In fact, I came to Smalltalk from Ruby because I instantly recognized that manipulating text via files and the command line had missed the point. In short, when you're ready, I'd love to collaborate...
Cheers,
Sean