[ntp-GSoC] Unity and TAP

Tomasz Flendrich t.flendrich at gmail.com
Mon Jun 1 19:26:12 UTC 2015


Hi all,

below is a short summary of what I am trying to achieve, for the benefit of
people who weren't involved before.

As you all know, we are now converting NTP's tests from GTest to Unity.
Harlan wanted one feature in Unity: a way to mark a particular test as
expected to fail. Where would it be useful? If there is a bug, we first
write a test that actually fails, then we fix the bug and run the tests
again: now the test should pass.

My idea for achieving that goal is to make a TAP producer. TAP (the Test
Anything Protocol) is a protocol for expressing test results that is
programming-language agnostic. It's very simple but versatile; it can be
used with Jenkins, for example.
It has a feature we can use: a "TODO" directive that marks tests as to-do.
If a TODO test fails, it's okay. If it passes, it arouses suspicion.
Raw TAP output is very long: if we had 10 000 test cases, we would have
10 000 lines of logs and no one would be able to read them. This is what
test harnesses are for. They gather the test results and present them in a
nice, short form. I used one of them and took some screenshots to show you
what it looks like.
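
To make the TODO directive concrete, here is a minimal Python sketch of a
TAP producer. It is only an illustration, not the converter I wrote; the
function name and the test names are made up.

```python
def tap_lines(results):
    """Render (name, passed, todo_reason) tuples as TAP lines."""
    lines = [f"1..{len(results)}"]  # the plan: how many tests to expect
    for number, (name, passed, todo_reason) in enumerate(results, start=1):
        status = "ok" if passed else "not ok"
        line = f"{status} {number} - {name}"
        if todo_reason:
            # The TODO directive goes after a '#' on the result line.
            line += f" # TODO {todo_reason}"
        lines.append(line)
    return lines


# Hypothetical results, for illustration only.
example = [
    ("test_ParserBasic", True, None),
    ("test_HypotheticalBug", False, "bug not fixed yet"),
]
print("\n".join(tap_lines(example)))
```

A harness reading this output treats the second test as an expected
failure rather than a real one.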

We have one file of 5 tests, 2 of which failed. What we see is:
http://i.imgur.com/TU02bFm.jpg

But we want to see which tests failed! There is an option to show the
failed tests:
http://i.imgur.com/GEkeHCn.jpg

Here is how it looks when the tests pass. Note that I used the --verbose
option so that I see all the tests. The yellow ones are TODO.
http://i.imgur.com/2VNKFwC.jpg

What about PASS TODO tests? This is exactly what we want. This test harness
explicitly warns us that a TODO test unexpectedly succeeded.
http://i.imgur.com/2xIuSRb.jpg

Please note that this test harness doesn't count PASS TODOs as failures, so
if all the tests pass and some of them are TODO, we get:
http://i.imgur.com/QoW9ks3.jpg
Are we happy with it? Or should a "PASS TODO" count as a fail? It's no
problem for me to do either. Harlan, what do you think?
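
The two policies could be sketched like this in Python. This is not the
harness from the screenshots; the function and the "strict_todo" switch
are hypothetical names, used only to show the difference.

```python
import re

# A TAP result line starts with "ok" or "not ok".
TAP_LINE = re.compile(r"^(not ok|ok)\b(.*)$")


def summarize(tap_text, strict_todo=False):
    """Count TAP results; strict_todo makes a passing TODO test a failure."""
    counts = {"passed": 0, "failed": 0, "todo_failed": 0, "todo_passed": 0}
    for line in tap_text.splitlines():
        match = TAP_LINE.match(line)
        if match is None:
            continue  # plan line, diagnostics, etc.
        passed = match.group(1) == "ok"
        is_todo = re.search(r"#\s*TODO\b", match.group(2), re.IGNORECASE)
        if is_todo:
            counts["todo_passed" if passed else "todo_failed"] += 1
        else:
            counts["passed" if passed else "failed"] += 1
    unexpected = strict_todo and counts["todo_passed"] > 0
    counts["success"] = counts["failed"] == 0 and not unexpected
    return counts
```

With strict_todo left off, a run with a PASS TODO is still reported as
successful; with it on, the same run is reported as failed.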




I have already written the Unity-output-to-TAP converter. We now need a way
to mark a test as one that shouldn't pass.
There are two ways to do that.

a) Say it in the name of the test case
If we have a test named "test_ParserSomething" and we want it to fail, we
could rename it to "test_ParserSomethingFAIL", or replace the word "FAIL"
with "ShouldFail", "TODO" or something like that.
(Please note that we cannot say it in an assertion's message, because
messages are printed only when a test fails.)
This option works only for Unity tests written in C.
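
A rough Python sketch of option a) follows. It is not my actual converter.
It assumes Unity's default result lines have the form
"file:line:test_name:STATUS" with an optional ": message" at the end, and
the suffix list is just the candidates named above.

```python
import re

# Assumed shape of a Unity result line: file:line:name:PASS|FAIL|IGNORE[: msg]
UNITY_LINE = re.compile(r"^(.*?):(\d+):(\w+):(PASS|FAIL|IGNORE)(?::\s*(.*))?$")

# Name suffixes that mark a test as expected to fail (hypothetical choice).
TODO_SUFFIXES = ("FAIL", "ShouldFail", "TODO")


def unity_to_tap(unity_output):
    """Convert Unity result lines to TAP, marking suffixed tests as TODO."""
    body = []
    for line in unity_output.splitlines():
        match = UNITY_LINE.match(line.strip())
        if match is None:
            continue  # separator and summary lines carry no test result
        _file, _lineno, name, status, message = match.groups()
        tap = ("not ok" if status == "FAIL" else "ok")
        tap += f" {len(body) + 1} - {name}"
        if status == "IGNORE":
            tap += " # SKIP" + (f" {message}" if message else "")
        elif name.endswith(TODO_SUFFIXES):
            tap += " # TODO expected failure (marked in test name)"
        body.append(tap)
    return [f"1..{len(body)}"] + body
```

The assertion message of a failing test is dropped here to keep the sketch
short; it could be emitted as a TAP diagnostic line instead.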

b) Leave the C code alone and operate on the TAP output
We would need a TAP-to-TAP converter and an external file listing the tests
that shouldn't pass. The converter would edit the TAP output accordingly,
marking those tests as TODO.
With this solution, we could also look at NTP's current revision and do
something with that knowledge. Would we even use the revision information,
though?
The fact that we leave the test code alone is both an advantage and a
disadvantage. The external file refers to tests by name, so how do we keep
track of which tests should pass and which shouldn't if the test names
change? We would also have to record each such test in the external file.
On the other hand, that doesn't happen often: only with bugs, and only once
per bug.
If we choose this option, it would also work for other programming
languages, because it operates on TAP. We may have some Python tests soon.
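
Option b) could look roughly like this in Python. The file format (one
test name per line, '#' starting a comment) and all names are assumptions
made for this sketch; nothing like this exists yet.

```python
import re

# TAP result line: status, optional number, optional "- ", description,
# optional "# directive".
RESULT = re.compile(r"^(not ok|ok)\b\s*(\d*)\s*(?:-\s*)?([^#]*)(#.*)?$")


def load_expected_failures(lines):
    """Read test names from the external file, one per line."""
    names = set()
    for line in lines:
        name = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if name:
            names.add(name)
    return names


def mark_todo(tap_lines, expected_failures):
    """Return the TAP lines with listed tests marked as TODO."""
    out = []
    for line in tap_lines:
        match = RESULT.match(line)
        if match:
            description = match.group(3).strip()
            has_directive = match.group(4) is not None
            if description in expected_failures and not has_directive:
                line = f"{line.rstrip()} # TODO listed as expected failure"
        out.append(line)
    return out
```

In real use, the names would come from open() on the external file and the
TAP lines from the test program's standard output. A renamed test would
silently stop matching, which is the disadvantage described above.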


Which option is better? Did I miss any pros or cons of either option? Does
anyone have any other ideas, suggestions or questions?
I personally think b) is more versatile and powerful.

Does anyone have any preferences or ideas regarding the TAP harness? (This
is the thing that executes the tests, takes their output and gives us
output that is human-readable.) I plan to compare the available ones and
pick the one that suits us best.

Please speak your mind.

Tomasz Flendrich

