The moment when it clicks


There are moments when something suddenly clicks. Something that appeared to be veiled and impossible to understand suddenly becomes intelligible and clear. It is a very astonishing moment that occasionally happens when we study something. My understanding of my own learning of a particular subject—whether it be a tool or a domain—is that it happens gradually. I add pieces to the puzzle and over time a more complete picture evolves. It can be a tedious affair. Sudden insight, as if passing through a door that unexpectedly opens, does not happen to me a lot.

Yet last week I had such a moment. I had been toying around with Kibana during the last couple of weeks, but without a lot of success. We use Kibana to sift through the logging that is generated in the production environment. We try to gather relevant statistics and signals through the aggregation of the data that is logged. I personally think the tester should familiarize himself with the usage of logs to analyze what is going on in production. The data gathered can inform testing, can tell him about the actual usage of the product and can reveal risk and help him direct his testing.

So, since our team uses Kibana (Kibana 3, to be precise), I felt like I had no excuse to dodge that bullet. I probably could have gotten away with avoiding looking at the logging. In my team there are at least two engineers who regularly look at the dashboards and I could have left it up to them to monitor the production environment and perhaps do some requests for me. But I personally wanted to get more out of monitoring and so I had to try to tackle the Elastic Stack.

For weeks I struggled with the Kibana dashboard. The queries and filtering seemed counter-intuitive and the results almost random. The creation of rows and panels (the layout of the dashboard) baffled me. It was my first encounter with Log4j and Tomcat logging and my inexperience with many of the parts of the Elastic Stack caused frustration. I would spend a couple of hours creating some queries but never ended up with the right result. The Elastic query DSL just failed to make a logical connection in my head. I looked up tutorials and some instructional videos on YouTube, but I did not advance. It was like knocking at the same door all the time to find it shut tight.

And last week the door suddenly opened. Within an hour I went from hitting keys in frustration to freely and joyfully playing around with the tool. I do not think there is a single thing that unlocked the door, but in retrospect there are some things that helped. I’d like to offer a quick examination of those things.

First off, last week, I set myself a small, well-defined Kibana task, prompted by the following. My team uses a Grafana dashboard to keep track of the errors that are generated in the production environment. The dashboard is shown on a wide screen television that is on all the time. Errors appear on our dashboard but it seems that we pay only marginal attention to them. The lack of interest that I noticed is a common one. It is the same lack of interest that can be observed when putting the results of flaky automated tests on a dashboard. Over time, the lack of trust in the results of these tests causes a kind of boredom, the shutting out of the false alarm. Since the Grafana dashboard does not facilitate the splitting up of the errors by root cause but Kibana does, my only task was to split up the errors by root cause and thereby increase our insight into the errors. This task was within my reach. The fact that there were some examples, created by other teams, readily available also helped.
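To give an idea of what splitting errors by root cause amounts to, here is a minimal sketch in Python rather than in Kibana itself. It assumes Log4j-style lines in which the exception class follows a " - " separator. The sample lines, the pattern and the function name are all hypothetical; they are not taken from our production logs and this is not how Kibana does it internally.

```python
import re
from collections import Counter

# Hypothetical Log4j-style lines; the real log format is an assumption.
SAMPLE_LOG = """\
2015-11-02 10:15:01 ERROR OrderService - java.lang.NullPointerException: customer is null
2015-11-02 10:15:02 INFO OrderService - order 42 accepted
2015-11-02 10:15:07 ERROR PaymentClient - java.net.SocketTimeoutException: Read timed out
2015-11-02 10:16:11 ERROR OrderService - java.lang.NullPointerException: customer is null
2015-11-02 10:17:30 ERROR ReportJob - report failed for unknown reason
"""

# Assumption: the "root cause" is the exception class named right after " - ".
EXCEPTION_PATTERN = re.compile(
    r"\bERROR\b.*? - (?P<cause>[\w.]+(?:Exception|Error))\b"
)


def count_errors_by_root_cause(log_text):
    """Count ERROR lines per exception class; the rest go to 'unclassified'."""
    counts = Counter()
    for line in log_text.splitlines():
        match = EXCEPTION_PATTERN.search(line)
        if match:
            counts[match.group("cause")] += 1
        elif " ERROR " in line:
            # An error line without a recognizable exception class.
            counts["unclassified"] += 1
    return counts


if __name__ == "__main__":
    for cause, count in count_errors_by_root_cause(SAMPLE_LOG).most_common():
        print(f"{count:3d}  {cause}")
```

A single total of errors hides the fact that a few causes usually account for most of them. Once the total is broken down like this, each line on the dashboard becomes something a team member can actually act on.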

Second, I finally took the time to notice the things that were going on in the Kibana dashboard. I should have paid attention to them long ago, but I think my frustration got in the way. For example, it is pretty easy to create a query in Kibana that will run indefinitely. Setting the scope of the query to a large number of days can do that for you. It will leave you guessing endlessly about the flakiness of your query unless you notice the tiny, tiny progress indicator running in the upper right corner of the panel.

Also, different panels of the dashboard will react differently to the results of the query. The table panel, which shows a paginated table of records matching your query, can show results pretty quickly, but a graph potentially takes a lot of time to build up. This seems downright obvious and yet understanding this dynamic takes away a lot of the frustration of working with a Kibana dashboard. It is a delicate tool and you have to think through each query in terms of performance.

Thirdly, I think determination also contributed to the click moment. I desperately wanted to win the battle against Kibana and I wanted to take away some of the fuzziness of the dashboard. Last week I noticed a difference between the number of errors as shown in the Grafana dashboard and the number of errors (for the same time period) as gathered from Kibana. So there was a bug in our dashboard. Then I knew for certain that Kibana can serve as a testing tool. Once I was fully aware of its potential, I knew there was only one way forward.

 

 

 


Giving the good advice


In my previous post I tried to explain that software testing, as it happens in practice, cannot be represented as a logical and orderly (coherent) sequence of events. There are too many factors (specific circumstances) that influence the decision making process that guides testing. The study of testing is the study of this decision making process. It is driven by detailed information about the circumstances. And yet, in many publications about software testing, these details are disregarded. What remains is a representation of testing that is orderly, classified and coherent; a representation that is often lacking details (facts) by which we can verify the selected approach, method or model.

One such representation—which I mentioned in my comment on the previous post—is the catalog; a list of systematically arranged items. Usually, such a list contains items that are abstracted from reality. Items can be resources (such as software testing tools), skills, characteristics, methods, practices, preferences and many other things. The list can have many purposes; it can serve, for example, as a checklist, an overview, a heuristic, a process description or a guideline. In software testing we have a wide diversity of catalogs and my argument is that we focus too often on achieving an orderly, classified and coherent representation, while forgetting about reality.

As an example I would like to discuss a catalog that is presented in the EuroSTAR blog post G(r)ood testing 25: Tips for how to boost Unit testing as a Functional Tester. The catalog that is offered is called ‘Tips for boosting Unit testing’. In order to understand what the list looks like, a part of it is presented below.

Tips for boosting Unit testing

  • Focus on doing useful tests (tests deliver information) rather than discussing whether the tests are unit tests or functional tests
  • Ask the developer for Help (or offer him to help?), rather than telling him he needs to do UT
  • Pair the tester and developer together while defining the tests
  • Demonstrate the damage of not doing UT – E.g. showing that late found defects slow everyone down
  • Organize coding Dojo’s where you put unit testing on the agenda
  • Introduce test design techniques to the developers
  • Get an understanding about the build and deployment process

I am currently involved in setting up unit testing in an Agile team, so the topic of unit testing—and putting unit testing on the agenda—is familiar to me. When I read through the list that is mentioned in the blog post, at first glance I feel that the advice that is presented is sensible and practical. I even applied some of the tips that are on the list. It appears to contain solid and practical advice to someone wanting to make more of unit testing.

But is it really solid and practical advice? The fact that it looks very familiar to me might suggest that it is folklore, which is defined by Merriam-Webster as “an often unsupported notion, story, or saying that is widely circulated.” Could it be that these tips are widely circulated and therefore look very familiar to me? This could be an interesting area for further investigation. And then the next question: is it good advice? Some state that “most advice is terrible”. This statement is made in one of the first links that turned up when I searched for the phrase “good or bad advice” in Google. It is interesting, because it suggests that this list of tips for unit testing that we are looking at is likely to be terrible. But it also suggests that we probably should look further and come to a more nuanced perspective of what advice means. In order to get to this nuanced perspective we need to study the nature of the advice, its context, the persons involved etc…

In order to evaluate the advice that is given to us, we need to know more about it. The blog post reveals quite a number of things. The list of tips was the outcome of a brainstorm session that was held during the 21st testing retreat in Château de Labusquière à Montadet in France. We learn that senior testers (twelve in total) from various countries were present. The fact that all testers were senior suggests that the advice has been derived from years of experience in software testing. With the means at hand it is not possible to find out if that experience encompasses unit testing, so we have to assume this. But all in all, the word ‘senior’ suggests that the advice should be good, since it must have been tested in practice extensively.

There are other aspects of the advice that can be learned from the text. For example, a motivation for giving the advice is stated (to help testers in their struggle) and in a very general sense a couple of situations are described in which the advice might be of use. But apart from the fact that the advice is likely to come from experience, we have no other indications that the advice is good. Moreover, we lack information by which we can verify whether the advice is good or not. Sure, one can apply each of the tips to the best of one’s abilities, but if this leads to (horrible) failure, how does this reflect on the quality of the advice? Perhaps the advice was bad indeed but the failure to deliver may also have been caused by shortcomings on the tester’s part. Perhaps she did not understand the advice or she lacked the skills to apply it. Furthermore there may have been circumstances adverse to implementing the advice. Perhaps the timing was wrong, the order in which it was applied was incorrect or certain preconditions were not met.

By now it should be obvious that we lack two things. Firstly, the criteria by which we can evaluate the success or failure of the advice that is given are not present. And secondly we lack information about the context in which the advice is applicable and what is needed to apply it. Without these two things it is possible to give any sort of advice. If I wanted for example to get from Utrecht in the Netherlands to New York, the advice might be to get on a plane. This advice contains a huge amount of implicit assumptions that are essential to the success of the advice. In other words, the advice is useless. Perhaps the same can, for example, be said of the advice to “introduce test design techniques to the developers.” I can see how test design techniques help us make an informed decision about domain coverage, but aren’t test design techniques usually based on detailed functional specifications? What is there to cover when we are coding and learning about the functioning of the application in parallel? And how do we know what technique to apply if we know little about risk? And if a test design technique tells us that we should write say thirty unit tests, how would I deal with the boredom of writing those tests? How would I handle the frustration of the developer who wrote these elaborate design-technique-based tests and has to throw them out three sprints later because of new insights with regard to the functionality? And which tests should I write that are not based on test design techniques? And what should they be based on?

As an afterthought, it may be a nice exercise in critical evaluation to add tips to the list and see if the list gets better or deteriorates because of it. If creativity is needed just look up the Celestial Emporium of Benevolent Knowledge, which is a brightly shining example of a catalog.

Never in a straight line


The theme of the seventh annual peer conference of the Dutch Exploratory Workshop on Testing (DEWT7) is lessons learned in software testing. In the light of that theme I want to share a lesson recently learned.

Broadly stated, the lesson learned is that nearly any effort in software testing develops in a non-linear way. This may seem like a wide open door, but I find that it contrasts with the way software testing is portrayed in many presentations, books and articles. It is likely that due to the limitations of the medium, decisions must be made to focus on some key areas and leave out seemingly trivial details. When describing or explaining testing to other people, we may be inclined to create coherent narratives in which a theme is gradually developed, following logical steps.

Over the last couple of months I came to realize something that I’ve been experiencing for a longer time: the reality of testing is not a coherent narrative. Rather, it is a series of insights based on a mixture of (intellectual) effort and will, craftsmanship, conflicts and emotions, personality and personal interests and, last but certainly not least, circumstance, including chance and serendipity. The study aimed at the core of testing is the study of the decision making process that the software tester goes through.

My particular experience is one of balancing many aspects of the software development process in order to move towards a better view of the quality of the software. I spent six full weeks refactoring an old (semi) automated regression test suite in order to be able to produce test data in a more consistent manner. As expected, there was not enough time to complete this refactoring. Other priorities were pressing, so I got involved in the team’s effort to build a web service and assist in setting up unit testing. My personal interest in setting up unit testing evolved out of my conviction that the distribution of automated tests as shown in Cohn’s Test Automation Pyramid is basically a sound one. The drive to make more of unit testing was further fueled by a presentation by J.B. Rainsberger (Integrated Tests Are A Scam). I used unit testing to stimulate the team’s thinking about coverage. I was willing to follow through on setting up a crisp and sound automation strategy, but having set some wheels in motion I had to catch up with the business domain. With four developers in the team mainly focusing on code, I felt (was made to feel) that my added value to the team was in learning as much as needed about why we were building the software product. To look outward instead of inward. And this is where I am at the moment, employing heuristics such as FEW HICCUPS and CRUSSPIC STMPL (PDF) to investigate the context. It turns out that my investment in the old automated regression test suite to churn out production-like data is now starting to prove its worth. Luck or foresight?

All this time a test strategy (a single one?) is under development. Actually, there have been long (and I mean long) discussions about the test approach within the team. I could have ‘mandated’ a testing strategy from my position as being the person in the team who has the most experience in testing. Instead I decided to provide a little guidance here and there but to keep away from a formal plan. Currently the test strategy is evolving ‘by example’, which I believe is the most efficient way and also the way that keeps everyone involved and invested.

The evolution of the understanding of the quality of the software product is not a straight path. Be skeptical of anything or anyone telling you that testing is a series of more or less formalized steps leading to a more or less fixed outcome. Consider that the evolution of the understanding of quality is impacted by many factors.

 

 

The Last Retrospective


The Bustle in a House

The Morning after Death

Is solemnest of industries

Enacted opon Earth –

 

The Sweeping up the Heart

And putting Love away

We shall not want to use again

Until Eternity.

Emily Dickinson

 

Thirty odd sticky notes hang on the whiteboard, carefully categorized this time. Written upon in neat, thick pencil strokes, they await inspection, analysis.

The last retrospective didn’t happen. It was canceled at the last minute by the person who was assigned to be the scrum master, because the fixing of incidents in production had priority over the retrospective session. The stickies have been on that whiteboard for two weeks now, unattended. I like to believe they serve as a monument, as a tombstone for our exploits into an Agile way of working. One of these days though, I am going to have to clean up the board.

It would have been the last retrospective anyway because we are returning from an Agile way of working to waterfall software development – which is, euphemistically, being sold as Kanban. We tried Agile for some ten months but it never took hold. On the ‘what went well’ side of the retrospective board there are stickies stating ‘At least we try to keep on smiling’ and ‘Somehow we manage to deliver some work’. On the ‘what could have been better’ side there is an exhaustive to the point of masochistic description of the swamp that we’ve been ploughing through for the last couple of months. The stickies state that we do not work as a team, there is no focus, there are no priorities, it is not clear who does what, there is no clear goal, we are continuously interrupted and there is lack of understanding of quality.

I feel it is okay to move away from Agile. In fact, we (the team) discussed this a couple of weeks ago as a viable option. We decided it would be beneficial to let go of the Agile expectations. I think the main reason for not succeeding at Agile was lack of commitment from management at every bloody step of the way. While the company envisioned to charge into the future the Agile way, middle management just basically ignored that message. There is an abundance of other reasons as well. I like to think that I tried to make Agile work but that the opposition was just overwhelming. In retrospect, I think it was.

Agile Testing Days 2015 – At the Lab


This year I attended the Agile Testing Days for the second time and I can say for sure that I will be attending in 2016. It is worth returning to this conference, because it has a delightful mix of topics across the Agile spectrum. The program offers technical talks and workshops that focus on coding, debugging, frameworks and test automation. It also features talks and workshops on Agile leadership and collaboration, with focus on human aspects, motivation and learning. And it offers sessions on many aspects of testing, such as exploratory testing, note-taking, modeling and the relation of testing to Agile methodologies. There is more than enough to go round for the modern tester.

CodeBug


From this year’s conference I took away some particular things. I spent quite some time in the Anything Build Party & TestLab, expertly hosted by James Lyndsay and Bart Knaack. I played around with CodeBug, which is a programmable and wearable device with 25 LEDs (see picture). The device can be programmed using a Blockly-based programming interface, which makes it easy to build controls for the LEDs. I spent a while thinking about what I would like to create and then got stuck halfway through coding. At that point I paired up with another participant trying to improve the program that I wrote. We seemed to run into the limits of the programming interface, but it was fun doing this together and we were both energized by the experience.

I also finally took a serious look at some black box puzzles, created by James Lyndsay. At the Belgium Testing Days last year I struggled with a black box puzzle, turned into a physical puzzle by Altom. Now I wanted to look at the other puzzles. With some hints from James I found a satisfying description of the functional behavior of puzzle 8. After that success I picked up puzzle 7 which I was unable to fully figure out, also because the conference was drawing to an end.

I learned the following things from the puzzles.

  • Taking notes helps. Using the notes as a guideline, it is easier to explain your testing. At least, that was how I was able to explain my testing when people who stopped by asked what I had been testing so far. Taking notes in a notebook with a pen or pencil is quicker, much more flexible and much more intuitive than taking notes in the form of a mind map. I tried both the mind map and the pencil. Creating a mind map (even though it appears to be a very lightweight tool) seemed to interrupt the flow of thoughts and the flow of testing exactly when I needed that flow. I chucked the mind map and grabbed the pencil. I always thought of a mind map as an ‘informal’ tool that enables creativity. I do so now to a lesser degree.
  • Having a mental model helps you devise theories and experiments. The mental model is the idea you have about how the subject under test might function. Whether the idea is right or wrong doesn’t matter that much. I got pretty far in explaining the behavior of puzzle 8 using a wrong (but very useful!) mental model. The beauty is that once you have an idea about how an application might function, you can test that theory. Often this provides you with more information about its behavior and about the validity of your model. Having gained that information I was able to move to a more advanced model. I think I moved through several mental models during puzzle 7 and 8.
  • Frequent interaction with the subject under test is important. With regard to the puzzles I tried, I found it not very useful to sit back and philosophize a whole lot about the exercise. Reflection is needed in order to move ahead through the theories about the functioning of the device, but you need data to reflect on. Interacting with the subject under test makes you notice certain behavior and it generates test data. This is the data that you need in order to build a theory. I observed people playing the dice game during the Agile Games Market on Wednesday evening and one participant in this game took long pauses (of up to a minute!) between ‘throws’. I was almost yelling ‘Come on, test, test, test!!’ to urge him to gather data faster. The brain needs something to work with.
  • And yet, taking a break helps. This has been mentioned time and again in testing literature and blog posts. It is true that when one detaches himself from the situation, usually the mind starts offering clues as to how the application functions. I took a cigarette break and it helped me. James told me a story about a man who tried one of his puzzles at a conference and got stuck. During the ensuing keynote he had an epiphany, went back to the lab and presented, in one go, the answer to the puzzle.
  • Use tools. Since puzzles 7 & 8 are about timing, it is nice to have a timer and use it. I took my smartphone and used the timer to at least get an idea of the time between the flashes of the red light. It helped me to be a bit more confident about the reproducibility of certain situations and it helped me to falsify (or confirm) my theories. Yet at my current assignment quite a number of testers seem to be very wary of using tools, whether it be a SQL query tool or a tool to transfer loan data from one database to another, to make notes, to call a SOAP interface, to view logs, to drive a GUI, to debug the environment etc… I think it is inexperience in using such tools that is holding them back.

Since I didn’t ‘solve’ puzzle 7, James gave me a final hint and he urged me to look at it at home. I will do this, but in the meantime I want to thank James for creating these puzzles, and James and Bart for running the testlab at so many conferences, making it possible for testers to reflect on their testing and to learn from it.

On Organization by Circumstance


One of the books that influenced my thinking in the past couple of months is The Peter Principle by the Canadian teacher and author Laurence J. Peter. The book is famous for its principle, which goes as follows:

In a hierarchy every employee tends to rise to his level of incompetence given enough time and enough levels in the hierarchy.

And there is Peter’s Corollary to this principle.

In time, every post tends to be occupied by an employee who is incompetent to carry out its duties.

At first glance it appears that the book is an attempt at satire or parody. In many ‘case studies’ Peter humors the way that employees move upward in an organization to their level of incompetence and paints a somewhat melancholic and bleak picture of the employee who is caught at this level, like a rat in a cage.

Once you progress through the book and read about the symptoms and syndromes of ‘final placement’, you start to realize that this is actually happening all around you. The principle is viciously simple and Peter shows over and over again that when you try to explain why the hierarchy of the organization is the way it is, the Peter Principle is the only way to account for that.

Though the principle is a philosophic contemplation rather than a scientific fact, it has made me realize that the hierarchy of an organization is not formed of individuals being placed in these positions because they are the best fit for the job. I know that this, like the realization in my previous post, is a wide open door. And yet it made me look at the organization as an organism; as an entity consisting of people who are organized along other guiding principles than you might expect or suspect.

Especially the fact that you expect people in a certain position in the hierarchy to behave in a way or to show traits that are characteristic of that position, reduces your chances of interacting with the organization in a meaningful way. Right now I am looking at the organization as a system in which people move around rather like molecules in a gas, bouncing off other molecules. Thus, the reasons for a person to be in a certain position are circumstantial and should be analyzed through the evolution of his or her environment, rather than from the perspective of organizational intent.

On Being Part of the Problem


He that is not with me is against me; and he that gathereth not with me scattereth abroad.

Matthew 12:30 King James Version

Two weeks ago I visited the Agile Testing Days in Potsdam. It is an awesome conference and if you have the opportunity to be there in 2015, do consider attending. I am sifting through the talks that I attended and the conversations that I had, and will try to come up with some form of retrospective. Today something struck me about ‘being agile’ that actually never occurred to me before. I think it was my visit to the Agile Testing Days that actually set the train of thought in motion that led me to this realization.

Actually, the thing that struck me is probably about as revealing as the fact that the earth rotates around the sun. And yet it takes an actual experience to really know what it means. The realization is that when you are a member of an Agile team, this implies that you are part of the solution for the problem that the team is trying to solve. It means that you are enlisted to throw all the capabilities you have at the problem. You may not have the capabilities to solve the problem entirely so that is why we have teams of people who complement each other. But you are part of the solution.

The thing that made me feel this was that I was pleading my case before a team member, inviting him to help, assist or guide me in whichever way possible. The way I actually said it was: “Please feel free at any time to correct my insights, come up with a better plan, teach me things that I do not know or do anything else to make this solution that we as a team are working on, a success.” I made this plea because this same person provided me with some critical remarks in the days before. Remarks such as “I do not think this is documented in the right way”, “I think there is a better way to report these results”, “I think there are persons who are in a better position to evaluate this situation” and “I think we should take another look at the decisions that we made.” Without expanding on what he actually thought.

So this plea was probably my final attempt to get this person on board, a final invitation to show me what was on his mind and how things could be improved. I probably could have handled a 45 minute rant detailing how he thought we should move forward from this. Yet the sphinx-like answer I got was that I should be asking him the right questions, based on which he might offer me some of his insights. Which is a game that was last played in ancient Greece.

In an Agile team, when you criticize the work of a team member and after that vigorously and consistently refrain from offering help, then you’re not part of the solution. And if you’re not part of the solution, you’re not part of the team.