This post is part of an ongoing series about Unit Testing.

In addition to the single-object unit tests that we’ve been discussing, we also write tests at other levels. We need to test the interactions between objects, the integration of several objects or subsystems, and the system as a whole.

We can think of these different kinds of tests as being in layers. At the lowest layer are unit tests where we test individual objects. At the next layer are integration tests where we test the interaction of several objects. Finally, there are system tests that test the entire system. There are other kinds of tests that fit between these, but let’s focus on these three broad categories.
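To make the difference between the two lowest layers concrete, here is a minimal sketch in Python. The `PriceCalculator` and `Cart` classes are hypothetical, invented only for illustration; they are not from any earlier post in the series. The first test exercises a single object on its own, while the second checks that two real objects work together.

```python
import unittest


class PriceCalculator:
    """Hypothetical object: sums prices and applies a percentage discount."""

    def total(self, prices, discount_percent=0):
        subtotal = sum(prices)
        return subtotal * (100 - discount_percent) / 100


class Cart:
    """Hypothetical object that delegates pricing to a calculator."""

    def __init__(self, calculator):
        self.calculator = calculator
        self.prices = []

    def add(self, price):
        self.prices.append(price)

    def checkout(self, discount_percent=0):
        return self.calculator.total(self.prices, discount_percent)


class PriceCalculatorUnitTest(unittest.TestCase):
    """Unit layer: one object, tested in isolation."""

    def test_applies_discount(self):
        calculator = PriceCalculator()
        self.assertEqual(calculator.total([100, 100], discount_percent=10), 180)


class CartIntegrationTest(unittest.TestCase):
    """Integration layer: Cart and a real PriceCalculator working together."""

    def test_checkout_uses_real_calculator(self):
        cart = Cart(PriceCalculator())
        cart.add(50)
        cart.add(150)
        self.assertEqual(cart.checkout(discount_percent=50), 100)
```

Saved in a file, both test classes can be run with `python -m unittest`. A system test would sit one layer higher still, driving the whole application through its user interface or public API rather than constructing objects directly.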

In the agile world, system tests can also be called acceptance tests, customer tests, feature tests, or story tests, because they should be specified by, or at least approved by, our customer or stakeholder and should relate to the story or feature we’re working on. These tests are very important because they help us determine whether we are building the right system. The lower-level tests help us determine whether we’re building the system right.

Over the years, there have been several variations of test-driven development that put the initial focus on these higher-level tests. Variously called Acceptance Test Driven Development (ATDD), Storytest-Driven Development (STDD) or outside-in testing, these methods all start with writing one or more tests for the desired system behavior before drilling into the integration and unit layers.

All of these test layers are important, but they carry very different costs in development time, maintenance time, and run time. System tests are often written at the GUI or browser level, which can make them slow and fragile if they are not written carefully. The higher the layer of the tests, the longer they take to run. This observation led Mike Cohn to introduce the testing pyramid:

[Figure: The Test Pyramid (image from http://martinfowler.com/bliki/TestPyramid.html)]

As the pyramid indicates visually, Cohn suggests that we should have relatively few tests at the higher layers and relatively many at the lower layers. This makes intuitive sense, but how do we accomplish this feat while still ensuring that our application works as desired?

We’ll look at this in the next few posts.