Pointing to Pyramids


When we look at the collection of test automation pyramids that have been published over the last couple of years, it is hard to get a clear picture of the pyramid's (original) purpose and the significance of its variations. It appears that every test automation pyramid we encounter is a new model in itself due to sometimes slight and sometimes fundamental adjustments. There is no doubt about its popularity as a model in software testing. But its development and use are troubled by the rather reckless treatment of its lineage.

There are many reasons why the field of software testing easily loses track of what has been produced in the past. One of the reasons is that it is hard to find good references. Finding references requires the study of literature and this is something that is often taken for granted. As an example of how the field of software testing obfuscates rather than clarifies the history, the evolution and the use of its models, I would like to take a few lines from an article that was recently published in the magazine Tea-time with Testers.

Up front I must mention that Tea-time with Testers is a magazine that is offered for free to the testing community. It is a platform for those who desire to contribute to the field. I think we should appreciate any initiative that tries to improve the state of testing, and that we should respect the people, such as Lalitkumar Bhamare (editor of Tea-time with Testers), who invest their time in these initiatives. Furthermore, we should respect the writers who invest their time in sharing their experiences with the community.

In the January edition of Tea-time with Testers there is an article entitled The Agile Testing Pyramid: Not so much about Tools But more about People And Culture. In the article a reference is made to Mike Cohn’s test automation pyramid. The article itself is an experience report of the implementation of a test automation strategy in accordance with the test automation pyramid. There are some interesting conclusions and on the whole it is nice to read about the application of the pyramid in practice.

It is not my goal to criticize the article. Rather, I want to use the way it references the test automation pyramid as an example of how inaccuracies in referencing can obscure the intentions of a model. I could have taken another article or even a book about software testing to show that same lack of accuracy, or sometimes even the total lack of acknowledgement of the existence of a predecessor.

At the end of the article the following is stated:

The Agile Testing Pyramid is an agile test automation concept developed by Mike Cohn.


For more information: Succeeding with Agile: Software Development Using Scrum. Mike Cohn, Pearson Education, 2010

The Agile Testing Pyramid as shown in the article

The picture beside the text in the article shows the test automation pyramid with the three layers: unit, service and UI, as developed by Mike Cohn. The pyramid that is shown in the article strongly resembles the pyramid that is published in the book Succeeding with Agile, which was published by Addison-Wesley in November 2009, ISBN 0321579364. It is highly likely that the (2010) reference in the article actually intends to point to this book, but we cannot be one hundred percent sure. Here is why…

The article states that Mike Cohn’s concept is called the Agile Testing Pyramid. But in the text of Succeeding with Agile, there is no mention of an Agile Testing Pyramid. Cohn calls his concept the test automation pyramid throughout the entire book. So can we assume that there has been a slight inaccuracy, that the naming of the pyramids (remember there are many of them) has gotten mixed up, but that the article still intends to point to Cohn’s 2009 pyramid? Maybe we can.

There are in fact quite a number of articles and blogs that refer to Cohn’s 2009 pyramid as the Agile Testing Pyramid, so it is not uncommon to make this mistake. There is even a book called Agile Swift by Godfrey Nolan that refers to Cohn’s 2009 pyramid as the Agile Testing Pyramid and displays an image of a totally different pyramid. Any awkwardness in this area, however, can be avoided by looking up the pyramid in Cohn’s book and referring to his work correctly.

The test automation pyramid by Mike Cohn, 2009

The question we should ask ourselves is whether test automation pyramid means something different from agile testing pyramid. It seems that the testing community believes that these names can be used interchangeably. And practitioners in the field throw other names into the mix, such as the testing pyramid, the software testing pyramid, the test pyramid, or the agile test automation pyramid. But if we do not use the name that was assigned to the model by its author, how can we be sure we are pointing to that particular model? We can, for example, reference the source. Without that, the name Agile Testing Pyramid can mean just about anything. Luckily, the article mentions the source of the pyramid. But as we saw, the reference is slightly incorrect. We already suspected that the authors intended to point to the 2009 book. If there is any remaining doubt at all, it is removed by the fact that a picture of a model closely resembling Cohn’s 2009 model is shown in the article. These three facts combined, though they each have their flaws, point to the intended model. In the article that lies before us we need this three-step method for establishing that we have the correct model in mind:

  • Indicate the name of the model.
  • Make a reference to the publication of the model.
  • Add a picture of the model.

If the name and the reference had been correct, the first two steps would have sufficed. And yet we are lucky that the authors of the article spent some effort trying to guide us to the intended model. Sometimes we are not so lucky. Sometimes Cohn’s 2009 pyramid is referred to simply as Cohn’s pyramid. The fact that we know of at least two disparate pyramids that were published by Mike Cohn (one in 2004 and one in 2009) means that referring to Cohn’s pyramid is not enough to point to the intended model.

Another delightful example of creating mysteries using references is the attempt to point to Cohn’s 2009 pyramid by referring to it as the original pyramid. This phrase is used in the text of the article too.

The original agile testing pyramid knows three levels: unit tests, services and UI tests

The other original pyramid: the test automation pyramid by Lisa Crispin and Janet Gregory, 2009

At this point in the article we already know which model is intended by the original agile testing pyramid. Nevertheless, the phrase is wrong. If we define original as earliest, then Cohn’s 2009 pyramid is not the original. There are at least a couple of test automation pyramids that were published before November 2009. None of these pyramids seem to have been popular enough to leave a lasting impression on mainstream testing. But to designate Cohn’s 2009 pyramid as the original would be distorting the truth. Even if we leave out the lesser-known models, there is one contender for the title of original testing pyramid: the model that was published by Lisa Crispin and Janet Gregory in the popular book Agile Testing in January 2009.

One of the root causes of pyramid mayhem in software testing is that there is a huge amount of uncertainty about what came first and what came in which order. Because of this it is nearly impossible to discern any pattern or direction in the development of the test automation pyramid as a concept. If we take the Utopian view of the development of software testing, we hope that it will proceed along the lines of the scientific method: a model evolves from a certain initial model, and newer versions are created by testing, building on, extending and refining that initial model. In reality what we have is a Cambrian explosion of models, because we do not know which models exist and how they have been tried. Each sloppy reference just stirs the soup and makes it a little bit murkier. It is because of this that the following situations may happen in practice.

  • The author is blissfully unaware of any test automation pyramid. It seems to him that his idea is new to the field of software testing. Must. Publish.
  • The author has heard about the existence of a test automation pyramid. He googles for it and after having skimmed the first three search results decides that his idea is new and fresh and the world needs to know about it.
  • The author has heard about the existence of a test automation pyramid. He googles for it and reads the first search result. The pyramid in the first search result was published a couple of years ago and does not reference any older pyramid. He decides that his pyramid is an improvement over the pyramid shown in the first search result and decides to publish, referencing the first search result.
  • The author has a bit more knowledge of the field of software testing. He has read the book Agile Testing by Crispin and Gregory. He knows there is a pyramid in that book. He publishes a piece about the development of the Agile pyramid and references the book as containing the original Agile pyramid.
  • The author has found Mike Cohn’s 2009 test automation pyramid. He references the pyramid in his own work and places beside it a picture of a totally different pyramid. I have actually found two examples of this, so there are probably more.

Not a Conference on Test Strategy


A response to this blog post was written by Colin Cherry on his weblog. His article is entitled (In Response to DEWT5) – What Has a Test Strategy Ever Done for Us?

On page one, line two of my notes of the 5th peer conference of the Dutch Exploratory Workshop on Testing — the theme was test strategy — the following is noted:

Test (strategy) is dead!

And scribbled in the margin:

Among a conference of 24 professionals there seems to be no agreement at all on what test strategy is.

In putting together a talk for DEWT5 I struggled to find examples of me creating and handling a test strategy. In retrospect, perhaps this struggle was not as much caused by a lack of strategizing on my part, as it was caused by my inability to recognize a test strategy as such.

Still I find it utterly fascinating that in the field of study that we call ‘software testing’ — which has been in existence since (roughly) the 1960s — we are at a total loss when we try to define even the most basic terms of our craft. During the conference it turned out that there are many ways to think of a strategy. During the open season after the first talk, by the very brave Marjana Shammi, a discussion between the delegates turned into an attempt to come to a common understanding of the concept of test strategy. Luckily this attempt was nipped in the bud by DEWT5 organizers Ruud Cox and Philip Hoeben.

For the rest of the conference we decided to put aside the nagging question of what we mean when we call something a test strategy, and just take the experience reports at face value. In hindsight, I think this was a heroic decision, and it proved to be right because the conference blossomed with colourful takes on strategy. In particular, Richard Bradshaw’s persistent refusal to call his way of working — presented during his experience report — a ‘strategy’ now stands out not so much as an act of defiance but as an act of sensibility.

A definition of test strategy that reflects Richard’s point of view, and was mentioned in other experience reports as well, is that a strategy is “the things (that shape what) I do”.

And yet I couldn’t help overturning the stone one more time during lunch on Sunday with Joep Schuurkes and Maaret Pyhäjärvi. Why is it that we are in a field of study that is apparently in such a mess that even seasoned professionals among themselves are unable to find agreement on definitions and terms? I proposed that the field of surgery, for example, will have very specific and exact definitions of, say, the way to cut through human tissue. Why don’t we have such a common language?

Maaret offered as an answer that there may have been a time in our field of study when the words ‘test strategy’ meant the same thing to a relatively large number of people. At least we have books that testify to a test strategy in a confident and detailed way. The fact that the participants of the fifth conference of the Dutch Exploratory Workshop on Testing in 2015 are unable to describe ‘strategy’ in a common way perhaps reflects the development of the craft since then.

The Tower of Babel by Pieter Bruegel the Elder (1563)

As a personal thought I would like to add to this that we should not necessarily think of our craft as a thing that progresses (constantly). It goes through upheavals that are powerful enough to destroy it, or to change it utterly. It may turn out that DEWT5 happened in the middle of one of these upheavals; one that forced us to rethink the existence of a common language. The biblical tale of the tower of Babel suggests that without a common language, humans are unable to work together and build greater things. Perhaps the challenge of working together and sharing knowledge without having access to a common language is what context-driven testing is trying to solve by adhering to experience reports. ISTQB and ISO 29119 are trying to fix the very same problem by declaring the language and forcing it upon the testing community. This is a blunt, political move, but, like the reaction from the context-driven community, it is also an attempt to survive.

With regard to my ‘surgery’ analogy, Joep suggested that surgeons deal with physical things and as such have the possibility to offer a physical representation of a definition. Software testing deals with the intangible, and as such our definitions are, forever, abstractions. If we want to look for analogies in other domains, then perhaps the field of philosophy is closer to software testing. And in philosophy the struggle with definitions is never-ending; it runs through the heart of the field. Maybe it is something we just need to accept.

Refactoring and Testing the Testing Timeline


While there are still at least 20 recent events in the History of Software Testing that need to be edited and published, I managed to do a little refactoring on the timeline – the graphical representation of the history.

The main thing that I did was to rein in the text that was overflowing from some of the boxes. The image is generated (from database entries) by a script that uses the GD library of PHP. The script is legacy code that I refactored a couple of times because it ran out of memory or because I wanted to parameterize certain settings. The GD library allows you to pick the font for your text. I was using Tahoma, but since that was becoming a little bit boring, I thought of selecting a new font.

For an image in which the text is very small, but needs to be readable (also when printed) and has to fit in the boxes, selecting a font is not very easy. The text in the boxes is cut off (to form a new line) at 22 characters, so the ultimate width of a box is 22 characters. Whether or not these 22 characters overflow depends a lot on the type, the spacing and the weight of the font. Testing – I did this in production, I must admit – is the only way to find out how a font fits. Some of the selected fonts also made it to print (using a Canon PIXMA MX360), and were tested for readability under low-light circumstances.
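To give an impression of the mechanics, here is a minimal sketch of how one such box could be rendered with PHP's GD functions. It is not the actual timeline script: the font path, box dimensions and colours below are made up for the example, and the real script reads its entries from a database and computes positions dynamically.

    <?php
    // Minimal sketch: render a single timeline box with wrapped text using the GD library.
    // All values are hypothetical; the real timeline script works from database entries.

    $fontFile = __DIR__ . '/fonts/Signika-SemiBold.ttf'; // any TrueType font file
    $text     = 'An event description that needs more than one line';

    // Break the text into lines of at most 22 characters, as the timeline boxes do.
    $lines = explode("\n", wordwrap($text, 22, "\n", true));

    // Create a white box that is tall enough for all the wrapped lines.
    $image = imagecreatetruecolor(160, 20 + 14 * count($lines));
    $white = imagecolorallocate($image, 255, 255, 255);
    $black = imagecolorallocate($image, 0, 0, 0);
    imagefilledrectangle($image, 0, 0, imagesx($image) - 1, imagesy($image) - 1, $white);

    // Draw each line; imagettftext() positions text by its baseline, and whether
    // 22 characters actually fit in the box depends entirely on the chosen font.
    $y = 18;
    foreach ($lines as $line) {
        imagettftext($image, 9, 0, 8, $y, $black, $fontFile, $line);
        $y += 14;
    }

    imagepng($image, 'box.png');
    imagedestroy($image);

Trying out a different font then comes down to swapping the ttf file and regenerating the image, which is essentially the exploratory loop described in the next paragraph.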

Using this exploratory approach I tried several fonts and ended up with Signika (semi-bold), which is an open source font from Google Fonts. At first I more or less randomly picked some sans-serif fonts, but through testing I learned more about what I require from a font. Google Fonts was a great tool because it allowed me to select fonts using criteria, add fonts to a collection and download them all at once in the right format (ttf). This certainly sped up development.

I could have refactored the script further (to allow, for example, runtime selection of a font when generating the timeline) but decided that it wasn’t worth the effort. And I still have to verify the qualities (color representation) of the Canon printer, which is a second-hand printer that I recently bought for next to nothing. But all in all I am satisfied with the result.

The renewed timeline can be found here.



Taking the History of Software Testing beyond 2009


I have had some requests to take the History of Software Testing beyond the year 2009. After all, the craft of software testing has not stood still in recent years.

In the table below is a first draft of some of the events that I would like to add. This list is probably by no means complete. Also, a certain amount (some will say a huge amount) of (selection) bias on my part is involved. As can be concluded from the list, I lean toward context-driven testing, Agile and test automation. These are the things that I am familiar with.

This brings us to the topic of how to select which events should be in the history and which should not. I have some selection criteria, but I think I violate each of these criteria somewhere over the whole of the history. Perhaps then this should be the starting point of a discussion on what we deem important enough to inscribe in our collective memory. Sometimes, though, what we deem important is a highly individual opinion. It is therefore improbable that a study of history is 100% objective.

Also, it may be a fool’s errand to appraise at this moment what was important last year, or the year before that. It is a given that the importance of events will be seen in a completely different light 50 years from now.

In short, I am aware that my selection is not the ‘right’ selection, nor perhaps even a satisfactory selection.

I will write about some of the selection criteria that I employ. For now, being aware of the fact that it could start a hellish debate, I sort of encourage you to post the most obvious things I forgot in the comments. I do not guarantee inclusion of anything, for the obvious reason that mediating discussions on what’s important and what’s not is likely to be a full-time job. If you feel that a discussion is necessary, do feel free to use social media to start one.

[table id=1 /]

The ‘Gartner bias’ in software testing


Yesterday, I stumbled upon the call for papers for the Dutch Testing Day 2013. The programme committee has decided that the theme of this year’s conference should be the future of testing. As the future is always a source of copious speculation and therefore discussion, it probably serves well as a conference theme. It may also prove to be interesting.

However, in the case of the Dutch Testing Day 2013, it is likely that the programme committee did not do much more than skim some leaflets and pick some terms from the trend reports by Gartner, Ovum and Forrester. Whether the committee’s effort actually stretched beyond this simple reading exercise does not become clear from their call for papers, which acutely suffers from the ‘Gartner bias’.

Below is the list of themes that will be important in the future of software testing, according to the programme committee. I know this list is suffocatingly boring as it is piped through every possible software development marketing channel worldwide. As I am not out to torture you, I would have left it out, but there is a point to be made.

  • Continuous delivery
  • Mobile
  • Crowd
  • Big data
  • Security
  • Privacy
  • Lean testing
  • Social media

Now compare this list to a 2012 Gartner report entitled ‘Top 10 strategic technology trends for 2013’. Gartner mentions the main trends listed below and segments them into 10 topics.

  • Mobile
  • Cloud
  • Big data

Sounds familiar, right? If you want to add security, privacy and whatever else you like, go see the Ovum 2013 Trends to Watch and copy, paste. Plenty of stuff to discuss, and you’re done creating a programme in less than a minute. The only slightly annoying problem that remains is that you’re doing the craft of software testing a huge disservice. This way of discussing software testing should be considered – the Merriam-Webster dictionary states it correctly – a harmful act. In other words, the list of topics presented by the programme committee was not created by software testers, because apparently the first question in software testing was never asked: “Is that really so?”.

The first reason why software testing should not be equated with the latest marketing fads in software development is that the trends are exactly that: moving targets and fleeting infatuations. Even Gartner and Ovum make their predictions just for the year ahead. They know (and they probably earn a nice amount of money from the fact) that next year things could be different. Wanting to guide the craft of software testing into the future by fixating solely on trends is like trying to cross the Atlantic while just being tossed around by the currents and the winds, without using instruments to turn the forces of nature in your favor. Sure, there may be a very slight chance that you reach the other end of the ocean… alive, hopefully.

Time and again, when we link software testing to infatuations, we take away focus from the essentials of the craft. Furthermore, with this kind of thinking, we do not encourage software testers to look for anything beyond trends. We just tell them to learn whatever technology is in vogue and to learn the next big thing a couple of years later, without ever thinking about how software testing is done and why it is done that way. This is a way to move backward, not forward.

The second reason is that software testing is not technology-driven. Software testing is, and always has been, about the investigation of software. How and to what end the software is investigated depends on what is to be tested and what questions we try to answer. However, the instruments of reasoning that we use in software testing – the fundamental paradigms driving software testing – are not going to change because the application is written in Java or C++, or because of whatever means is used to store data.

The instruments of reasoning are essential to software testing and when there is a discussion about the advancement of software testing, I am expecting a discussion of developments in, for example, anthropology, the philosophy of science or linguistics. Anyone coming up with the next technological infatuation just isn’t taking the craft seriously.

The third reason is that software testing is not going to be driven by the next software development or management trend. As said above, software testing is an investigation into software. This investigation is bounded by many factors, such as the software development paradigm du jour, but the paradigms driving this investigation are not bounded by the trend. If they were, it would be like saying that in test-driven development we test the software only and exclusively using Boolean algebra, while in lean software development we are only and exclusively going to use deductive reasoning. This, clearly, is nonsense.

My question to the programme committee is whether they truly thought about the goal, as stated in the call for papers,

Our goal is to create a test event that will inspire testers and prepare them for the future.

and if they can explain why they think their current approach is the best way to reach that goal.

Slides of my presentation on the history of software testing


Last Monday (27 May) I presented on the history of software testing for the Community of Practice Testing of Capgemini in the Netherlands. It was a pleasant evening and the auditorium was filled with a good and eager crowd. Among those present were Eric Roeland and Eddy Bruin. The presentation is entitled History Class – For software testers. Its aim is to make testers a bit more aware of the background of our craft.

To summarize the 37 slides: everything you know about testing right now was invented in the 1970s (or earlier). And now a bit more seriously: the history of software testing is all about the vantage point. Not so long ago I argued that the history is mainly used as a sales argument. My point of view on the history should also be regarded as such. History is a difficult field.

One remark I got on the history and my take on the history of context-driven testing was that (traditional) software testing and context-driven testing may have totally different starting points. From a philosophical point of view, there is a certain amount of truth in that. Context-driven testing is based on a different set of (scientific) concepts. I am going to develop this idea.

Until then, these are the slides of my presentation. Hope you can make sense of them.