Monday, May 4, 2015

Why xUnit Should Be Called xTest

JUnit is one of the simplest, yet perhaps the most effective, of all the software frameworks I've ever seen: it solves a very specific class of problems (writing and executing automated tests in Java) very well, with very little code and very little ceremony. This is especially true of version 4, which takes great advantage of Java annotations. Consequently, I use it whenever I'm writing Java code using the Test Driven Development (TDD) method.
There is one big problem with JUnit, however, that keeps popping up over and over again: its name.
Every now and then, I will see Joe the developer broadcasting an e-mail to explain that Mary from the other group broke several unit tests of Joe's component by changing the behavior of her component that Joe's component depends on. This is typically followed by a comment from Bob saying that he's very well aware of Mary's unfortunate change, because it also broke five unit tests of Bob's component.
What's going on here is that a typical developer these days seems to think that the mere fact that a test is written using JUnit, or any other xUnit framework (where x is typically replaced by the first letter, or an acronym, of the programming language the framework supports), automatically makes that test a "unit test".
The majority of automated tests written using JUnit or any other xUnit testing framework that I've had the opportunity to see are in reality integration tests: they test not only the unit that the author is interested in testing (typically a single class, or a cluster of tightly coupled classes), but also every bit of code (including 3rd party libraries, services and systems) that this unit happens to depend on. That's probably because xUnit tests become integration tests by mere inertia, while making them unit tests requires a focused effort on mocking out dependencies.
Why does incorrectly categorizing such tests matter? Because the properties of unit and integration tests are extremely different! For example, by calling the JUnit-based test that you've written a "unit test", you are implicitly signalling to other developers that it can be run quickly (in a matter of milliseconds) and reliably, without impacting any database or other system. But if that test turns out to be an integration test, these implicit assumptions are going to be violated and cause a lot of grief, potentially not just to the person who ran the test (e.g., by having to wait 30 minutes for it to finish), but also to the people who depend on this test (e.g., by corrupting their database, and thus breaking the automated tests they've grown accustomed to depending on).
Over the years, I've adopted a simple but effective naming scheme that minimizes confusion about what kind of test a particular xUnit class implements: if the class is a collection of true unit tests, it has the suffix "UnitTest", and if it's a collection of integration tests, its suffix is "IntegrationTest". This does make the test class names a bit longer than with the usual xUnit suffix "Test", but it takes the guesswork out of the question. It also allows automated running of only the unit tests, or of all automated tests, by simple pattern matching on class names.
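A minimal sketch of the pattern matching this naming scheme enables (the class and helper names here are hypothetical, not from any real build tool):

```java
import java.util.ArrayList;
import java.util.List;

// Selects test classes by name suffix, mimicking what a build tool's
// include pattern (e.g. "*UnitTest") does when picking which tests to run.
class TestSelector {
    static List<String> select(List<String> classNames, String suffix) {
        List<String> matched = new ArrayList<>();
        for (String name : classNames) {
            if (name.endsWith(suffix)) {
                matched.add(name);
            }
        }
        return matched;
    }
}
```

With this convention, matching on "UnitTest" picks only the fast tests, while the plain "Test" suffix still matches every test class, so both selective and full runs fall out of the naming scheme for free.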
Perhaps the biggest advantage of such an approach, when applied with discipline, is that it forces me to think about the type of test I'm about to write before I write a single line.
If I have the choice, I will almost always write a unit test, because it's always going to be faster, more reliable and easier to maintain than an integration test of the same functionality. But when the code I want to test depends on a library or framework that's hard to replace with a testing dummy or a mock, an integration test may just be the right kind of test to write.
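Mocking out a dependency can be as simple as hiding it behind an interface and substituting a hand-written test double; a minimal sketch with hypothetical names:

```java
// The dependency to be mocked out (in reality, say, a slow remote service).
interface ExchangeRateService {
    double rateFor(String currency);
}

// The unit under test: depends only on the interface, not the real service.
class PriceConverter {
    private final ExchangeRateService rates;

    PriceConverter(ExchangeRateService rates) {
        this.rates = rates;
    }

    double toLocal(double amount, String currency) {
        return amount * rates.rateFor(currency);
    }
}

// A stub used only in unit tests: returns a fixed rate instantly and
// reliably, with no network, database or other system involved.
class FixedRateStub implements ExchangeRateService {
    public double rateFor(String currency) {
        return 2.0;
    }
}
```

A unit test can now exercise `new PriceConverter(new FixedRateStub())` in milliseconds; when a dependency cannot be isolated behind an interface this way, an integration test is the fallback.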
Recently, I ran into such a case when using the Apache Velocity templating engine, which allows adding a few lines of its (quite awkward) proprietary scripting language to an HTML page in order to populate it with data. This is accompanied by a few lines of Java code that provide that data to the Velocity framework. While it might even be possible to mock Velocity's Java APIs using an advanced mocking technology, that kind of unit test would not be particularly valuable, as it would not exercise the script code embedded in the HTML. So, in this case, I decided to create a class that hides the presence of the Velocity library from the rest of the system, and then write an integration test for this class, which tested its implementation together with its interactions with Velocity and the script embedded in the HTML. It did not have to use any database, but it did end up reading and writing files on the local file system. I used the HtmlUnit testing framework to verify that the output HTML was as expected for a simplified, but still non-trivial, set of inputs.
This testing strategy was useful because it addressed the highest risk of the component being tested: ensuring that all the pieces of the puzzle, each simple enough in its own right, fit together and produce the expected output. This integration test was definitely not as fast as a unit test: one full run would take up to a minute. It was also harder to write and maintain than a unit test: having a non-trivial set of input data meant there was a non-trivial set of outputs, which in turn required a lot of assertions in the test. But a minute-long feedback loop was still useful enough to allow repeated incremental execution, and this integration test was still an order of magnitude faster than an equivalent end-to-end test would be. It was also much easier to write and maintain, since the input data set was an order of magnitude simpler than it would have been in an end-to-end test (where the input data set for this component would have the full complexity of a real world system).
This integration test paid off very quickly, when it allowed me to detect several issues, each caused by a single missing or misplaced character (did I mention that the syntax of the Velocity scripting language is really awkward?). Being able to quickly detect such an issue, see the error message produced by the Velocity framework, and iterate to find the fix was priceless. If I had been depending on a unit test, I would never have noticed the problem in the first place.
And if I had relied solely on an end-to-end test, it could have easily taken several days, instead of minutes to find the right fix and verify it.
In conclusion, integration tests lie between unit tests on one end of the test automation spectrum and end-to-end tests on the other. Most of the time, they provide the least bang for the buck, and developers are better off covering their code with unit tests and addressing the highest project risks with a few well-chosen end-to-end tests. But there are some special circumstances where an integration test may be the most cost-effective choice. The key is to be clear about what kind of automated test one is writing, and to make that decision consciously by assessing the project risks and associated constraints before actually starting to write the test. Following a naming scheme like the one I've described above helps enforce that, and avoids the trap of automatically considering every test written using an xUnit framework a unit test.

Monday, March 2, 2015

Cucumber Tip: Dry Run

Working with Cucumber, you run scenarios all the time. Typically, you care about the results of a set of scenarios, so you want Cucumber to execute each matched scenario step by step and report the results of this execution.

However, a few tasks come up quite regularly where actually executing the step definitions is not necessary. One example: you'd like to find all scenarios with a particular tag. Another example: you'd like to find the location of a step definition that matches a particular step. Yet another example: you'd like to verify that a step definition's regular expression matches the text in the scenario you're working on.

When working on such tasks, you can save a lot of time by using the dryRun option of the Cucumber runner. It instructs Cucumber to find the scenarios to run by matching the given tags, just like it does during a normal run, but then to skip each step while still passing the step info to the formatters, again just like in a normal run.

Using Cucumber-JVM, you can get Cucumber to do a dry run by setting the dryRun parameter in the CucumberOptions annotation, like this:

import org.junit.runner.RunWith;

import cucumber.api.CucumberOptions;
import cucumber.api.junit.Cucumber;

/**
 * Executes only a dry run of released scenarios!
 *
 * @author Sanjin Tulac
 */
@RunWith(Cucumber.class)
@CucumberOptions(
    dryRun = true,
    tags = { "@released" },
    monochrome = true,
    format = { "pretty" },
    glue = { "com.tulac.stepDefs" },
    features = { "src/main/resources/com/tulac/features" })
public class DryRunOfReleasedScenarios {}

In this example, I'm using the "pretty" formatter to list all the matched scenarios and their step definitions, including their locations. This can be very useful in case your IDE does not support finding matching step defs on its own.

To give you a sense of the output without overcomplicating things with too many details, here's a partially shortened example of the dry run output for a simple secondary scenario:

Feature: ... (the rest of feature name is listed here)
  
  This feature ... (the rest of feature description is listed here)

  Background:                                 # campaignApis.feature:6
    Given subscription with ReST APIs enabled # FeatureStepDefs.subscription_with_ReST_APIs_enabled()

  @released ... (the rest of scenario tags are listed here)
  Scenario: asking a trigger campaign to process more than 100 leads results in a 1003 # campaignApis.feature:27
    Given a brand new global static list "101 leads"                                   # ListStepDefs.a_brand_new_global_static_list(String)
    And 101 leads in the list "101 leads"                                              # ImportLeadStepDefs.leads_are_imported_into_list(Integer,String)
    Then deleting leads in list "101 leads" via a trigger campaign fails with "1003"   # CampaignStepDefs.deleting_leads_in_list_via_a_trigger_campaign_fails_with(String,String)

1 Scenarios (1 skipped)
4 Steps (4 skipped)
0m0.000s

Notice that both background and scenario steps are followed by comments (starting with '#') listing the file and function name of the matching step def. Similarly, the feature file that contains the matching scenario, and the line number at which it begins, are listed as a comment for each scenario and its associated background.

In conclusion, using the dryRun option when running Cucumber scenarios can save you a lot of time by skipping all matched step defs while still producing the specified reports, so remember to use it when appropriate!

Tuesday, February 24, 2015

Jenkins Plug-In Development: 10 Lessons Learned

Jenkins is by far the most popular continuous integration server around, with hundreds of plug-ins extending its core functionality, all of it license-free and open-sourced. Sooner or later, however, one encounters some needed functionality that is not readily available, at which point it's time to build a new Jenkins plug-in. Recently, I spent several weeks developing such a plug-in, and despite the many tutorials available (e.g., by the Jenkins team, Miel Donkers, Anthony Dahanne), this experience taught me several lessons that were anything but obvious from the tutorials, which I'd like to share here with you.

1. Plan to spend at least a few weeks

Even though you likely won't need to write a lot of Jenkins-specific glue code, it's going to take quite a bit of experimentation to get it right. Jenkins provides hundreds of different, scarcely documented ways for your code to plug into it, and figuring out the best way is going to take some time, so don't expect to be done in a day or two!

2. Learn from the source code of existing plug-ins

The Jenkins code base is huge, fairly complex, and neither particularly well organized nor well documented. But there are several hundred existing plug-ins available as open source software. So, the best way to minimize your time to market is to pick one or more existing plug-ins that do something similar to what your plug-in should do, and try to understand how they do it by looking at their source code before you start writing your own plug-in code.

3. Pick appropriate base version of Jenkins

Once you're ready to start writing your own plug-in code, you'll need to pick a version of Jenkins on which to base it. This is an important decision, so don't take it lightly. Your plug-in will not be able to run on Jenkins instances running any version older than the one you pick as your base, so you may want to be conservative and pick a fairly old version. On the other hand, new plug-in APIs are constantly being added, so if you pick too old a version, you won't be able to take advantage of those new APIs. Also, old Jenkins APIs are constantly being deprecated and replaced by new ones, so if you base your plug-in code on a Jenkins version that's too old, you may end up using deprecated APIs. This in turn will make your plug-in much harder to maintain, should it need to upgrade to a newer Jenkins base version to take advantage of its new features. Your target installed base of Jenkins servers definitely needs to be factored into this trade-off. stats.jenkins-ci.org shows the cumulative view of the installed base for each published Jenkins version, but your target set of Jenkins servers may have a different version distribution.

4. Consider 3rd party dependencies and their versions

Jenkins depends on a significant set of 3rd party open source Java libraries (e.g., JFreeChart), which get deployed as part of the Jenkins WAR file. Should your plug-in want to use one of these 3rd party libraries, the simplest way to make it work is to use the same version that Jenkins depends on. In this case, your pom.xml file should not explicitly declare such a dependency, but rather inherit it from the parent pom.xml file. Failing to do so will result in run-time exceptions, as the version of the 3rd party library actually used will be the one specified and deployed by Jenkins, unless you override the class loader used by your plug-in by following these instructions.
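To make the inheritance explicit, here is a sketch of the relevant part of a plug-in's pom.xml (the parent version shown is illustrative, not a recommendation):

```xml
<!-- Illustrative sketch: the plug-in inherits from the Jenkins plugin parent pom. -->
<parent>
  <groupId>org.jenkins-ci.plugins</groupId>
  <artifactId>plugin</artifactId>
  <version>1.580.1</version>
</parent>

<dependencies>
  <!-- Deliberately no entry for a library like JFreeChart that Jenkins
       already bundles: pinning a different version here than the one
       deployed in the Jenkins WAR is what leads to the run-time
       exceptions described above. -->
</dependencies>
```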

5. Be conservative when choosing target Java version

Jenkins is completely written in Java, so the plug-in code you write will also need to be in Java. Jenkins executes all of its plug-in code in the same JVM in which it runs, which means that if Jenkins runs in a JVM whose major version is lower than the major version of Java your plug-in code is using, the JVM will throw run-time exceptions instead of executing your code. So, if you're targeting the general installed base of Jenkins servers, you'll want to forgo the bells and whistles of the latest version of Java and use the lowest version still officially supported.

6. Develop and test in short iterations

Due to scarce documentation, a lot of your development time will be lost on experimentation, e.g., figuring out exactly which directory to put your Jelly files in. To minimize development time, be disciplined about sticking to the shortest development iterations possible. In other words, try to make only a single change between any two test runs of your glue code and configuration. Only this way will you be able to catch any regressions as soon as they happen, and avoid costly "ghost busting".

7. Test plug-in code using hpi:run Maven target

Short iterations can be quite expensive if you're constantly manually uploading the latest version of your plug-in to a stand-alone Jenkins server, even one running on your own machine. Use the Maven target hpi:run provided by the Jenkins plug-in pom.xml to minimize the overhead of testing the latest version. It automatically deploys the latest version of the plug-in you're developing on a freshly started instance of your base Jenkins version.

8. Identify your plug-in with a precise version number

Jenkins APIs allow plug-in code to write to standard output, which is captured as part of the job's console output. It's a good idea to prefix any output from your plug-in with the plug-in name within angle brackets, so that there's no confusion about which output is coming from your plug-in and which from Jenkins or some other plug-in. Also, including the exact version of the plug-in (which you increment in each short iteration) will turn out to be very useful in case a deployment issue prevents the latest version of the plug-in from being deployed. Such deployment issues are quite common, so this disciplined approach is likely to pay off handsomely sooner rather than later.
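Such a prefix can live in one small helper; a sketch with a hypothetical plug-in name and version:

```java
// Prefixes every console line with the plug-in's name and exact version,
// so its output is unambiguous in the Jenkins job console output.
class ConsoleOutput {
    static final String PLUGIN_NAME = "my-plugin"; // illustrative name
    static final String VERSION = "1.0.17";        // bumped every iteration

    static String prefixed(String message) {
        return "<" + PLUGIN_NAME + " " + VERSION + "> " + message;
    }
}
```

Every line the plug-in prints then reads like `<my-plugin 1.0.17> Deploying artifacts...`, so a stale deployment is immediately visible from the version in the prefix.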

9. Use java.util.logging

Console output is useful for giving general status information to the end user, but it does not scale to debugging information, where you need the ability to choose the level of detail of the debugging data displayed. Thus, for debugging purposes, your plug-in code needs to use the java.util.logging APIs, which are the only logging APIs supported by Jenkins. So make sure you're using these APIs from the start; otherwise you'll be in for an ugly surprise when the alternative logging APIs you may be used to working with turn out to be disabled by Jenkins.
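A minimal sketch of what that looks like in plug-in code (the class name and log message are hypothetical):

```java
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

// Uses java.util.logging, the only logging API supported by Jenkins. The
// logger is named after the class, so its level can be tuned through the
// Jenkins log configuration without redeploying the plug-in.
class BuildAnalyzer {
    private static final Logger LOG = Logger.getLogger(BuildAnalyzer.class.getName());

    int countArtifacts(List<String> artifacts) {
        // FINE-level messages stay hidden unless debugging is enabled.
        LOG.log(Level.FINE, "Counting {0} artifacts", artifacts.size());
        return artifacts.size();
    }
}
```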

10. Loosely couple plug-in code and the main code

Chances are that beyond simply plugging into Jenkins, your plug-in will need to implement some logic and/or computations. Testing such code using manual testing cycles that include restarting Jenkins will cause you to waste a lot of time unnecessarily. And while this manual testing is pretty much necessary for the glue code, because Jenkins APIs do not allow writing simple unit tests for it, this limitation should not extend to the core of your code just because the glue code calls it. Instead, make sure you strictly separate the glue code, which depends on Jenkins, from the core of your plug-in. The glue code should depend on both Jenkins and the core code, but the core code should depend on neither Jenkins nor the glue code. Such loose coupling will allow you to write comprehensive unit tests of your core code, and thus speed up your development cycle (provided, of course, you're practicing Test-Driven Development (TDD), which I strongly recommend).
In other words, the limited testability of Jenkins APIs should not prevent you from developing the vast majority of your plug-in in short iterations starting with an automated unit test, whose execution does not require starting a Jenkins server. As a secondary benefit, porting your plug-in to another platform (should the need arise in the future) will be much easier than if all the plug-in code were tightly coupled with Jenkins APIs.
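The separation described above can be sketched like this (names are hypothetical): the core class below compiles and tests without any Jenkins dependency, while a thin glue class would extract plain data from Jenkins objects and pass it in.

```java
// Core logic: no Jenkins imports, so it is trivially unit-testable.
class BuildStabilityCore {
    // Computes the fraction of successful builds from plain booleans;
    // the glue code is responsible for extracting these from Jenkins.
    static double successRate(boolean[] results) {
        if (results.length == 0) {
            return 1.0; // no builds yet: treat as fully stable
        }
        int successes = 0;
        for (boolean result : results) {
            if (result) {
                successes++;
            }
        }
        return (double) successes / results.length;
    }
}
```

A unit test can feed this method any combination of results in milliseconds, with no Jenkins server in sight, which is exactly the fast feedback loop the glue code alone cannot give you.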