A Very Edgy Sequel: Testing on the Edge II

August 13, 2017 by Jeremy Wenisch | 2 Comments

Fourteen months and two posts ago, I described several ways that I am a tester on the edge; that is, I had noticed several “tensions within myself while I test software: I tend to teeter on the edge between sets of two things – tactics, concepts, mindsets, emotions.” The life of a tester (for me, at least) seems to be a life of balances.

Since putting a name to the phenomenon, I have noticed even more examples, and I’ve expanded on a handful below.

Overreporting vs. Underreporting

Here’s a feeling I hate: A release goes out, a week later a bug report from a user comes in, and I recognize it immediately. I caught that bug during testing, but I successfully convinced myself that it wasn’t worth reporting, at least not yet. Maybe we were in a crunch for the release and either I didn’t think it was critical or I didn’t think it was new to this release, or I thought it would take too much time to investigate and I didn’t want to report it without pinning it down, or maybe we weren’t in a crunch but I didn’t think it was likely a user would run into it or I didn’t see a crucial risk in the bug or I didn’t think it would get fixed. I caught the bug, I didn’t report it, and it bugged a user.

Here’s another feeling I hate: I report a bug, and it gets closed as won’t-fix, or deferred to Someday. Maybe I overvalued how badly it would bug a user or how likely it would be to occur, or maybe I didn’t uncover or include enough evidence to make the risk claim credible, or maybe the fix would be too invasive or destabilizing, or maybe it came down to aesthetic nit-picking not worth addressing. I caught the bug, I reported it, and it wasn’t fixed. When this happens too often, credibility with developers takes a dip.

So I find myself teetering on the edge between actions that avoid those feelings I hate; between overreporting to avoid missing important bugs and underreporting to avoid losing credibility.

Analysis vs. Evidence

Here’s another way that a tester’s credibility with developers can suffer: too often taking a guess at the root cause of a bug. This can take several forms, among them:

Statement of apparent fact: “The widget is breaking when I enter a value of zero because the underlying function isn’t handling divide-by-zero properly.”
Accusation: “The widget is breaking when I enter date values in the past because you didn’t initialize the field correctly.”
Wild guess hedged with a question mark: “The widget is breaking when I enter text. I think because the field type is wrong?”

I’m wary of trying to identify a bug’s root cause too often, no matter how tactfully I present it, because I am not the developer and I do not know the code like the developer does; I could too easily be wrong in my analysis, and every time I’m wrong my credibility slips just a bit further.

But it also seems there are valid reasons to try to suggest the cause of a bug in the first place. Maybe the evidence I’ve provided isn’t quite enough, but I have a good hunch based on experience; maybe I’ve looked at the code for the most recent fix, and I actually do see the problem, or at least have a good idea of what sort of issue is causing the symptom or symptoms I found. I’ve hesitantly offered my idea of the root cause of a bug before only to be pleasantly surprised by a “Thanks, that saved me a lot of time!” note from the developer.

So, I teeter on the edge between wanting to venture an analysis of a bug’s root cause to help the developer and wanting to stick to the evidence to avoid looking silly.

Rejecting vs. Accepting “No user would ever do that”

Here’s a scenario: A developer submits a fix for a complex bug and says, “I fixed it so that when a user does A, the system no longer does Z, but I didn’t prevent Y from happening when a user does B, because no user would ever do B.” If you’re a tester, that’s a smell, right? The little red critical-thinking light above your head starts spinning and flashing and you start asking questions like, “How do we know that no user would ever do that? Would a user do something similar to that? Could something similar still lead to unwelcome behavior? Can we find evidence of whether users do things like this? Is there a more severe version of the unwelcome behavior possible? Could a user accidentally do this? Might a mischievous user do it?”

But here’s another question: What is this action called B? Is it something like clicking outside of a form where you wouldn’t expect a user to click? Or entering a value that you wouldn’t expect a user to enter? Or, is it something like a billion unique users submitting a form at the same precise moment? Or a user opening the dev tools and modifying the HTML in a form?

And another question: What is this outcome called Y? Is it something catastrophic, like a server crash or irreparable data loss? Or is it something mild, like a goofy-looking form or a slight delay in loading time?

“No user would ever do that” is a smell that there might be more wrong than a developer realizes, but it can also be a smell that a tester might not be prioritizing their time well, perhaps spending too much of it hunting down low-frequency, low-impact issues.

So I stay on the edge between rejecting and accepting “No user would ever do that.”

Multi-tasking vs. Flow

In my first Testing on the Edge post, I wrote about getting into a flow state in the context of staying on the edge between taking notes and staying in flow:

When I test uninterrupted for awhile, I can get into a flow state, where I keep most new information in my brain’s working memory, interacting with the software, asking and answering questions on the fly.

Note-taking deals with managing how I spend my time while working on a particular testing task — testing a feature, testing a bug fix, touring a new app, sense-making, reproducing a mystery bug. But what about managing how I spend my time overall, among many tasks? I find myself teetering on the edge between two approaches.

The first approach is to select a small handful of tasks to work on in my mental “now” bucket. Why do this? Say I’m working on Task A, testing a new feature. I’ve tested every test idea I can think of. But I’m not convinced I’ve thought of everything — I have that nagging feeling that I’ve forgotten or overlooked something. If I’m only working on one task at a time — finish one, then move on to the next — then I’d have two options: (1) keep pushing myself through the mental block until I’m satisfied I’m done or (2) declare “Done!” and move on. But if I have Task B and Task C sitting in my bucket as well, I can just set Task A back in the bucket, pull out Task B, and be productive. Task B is exploring a redesigned part of the software to find regression bugs. At some point, I find myself dragging through Task B, staring at the screen without really doing anything. But, hey! I think I have another idea for Task A. I’ll drop Task B back in the bucket and pull out Task A again.

It’s starting to sound like I’m presenting a full endorsement of this sort of task management, and not one side of a teeter, but this multi-tasking does come with a price: that “now” bucket of tasks constantly occupies mental space. This can be both draining and disruptive. Maybe things are going along great with Task B, and that’s when my new idea for Task A (which I didn’t declare as “Done!”) decides to pop up. Now I have to spend mental energy making a decision: do I drop Task B to tend to Task A, or do I keep my flow and risk losing that Task A idea?

The second approach is to limit the “now” bucket to one task at a time. Keep flow as much as possible. When things drag or I think I’m done but have the nagging feeling of not-done, then I pick up something unproductive, like a puzzle, or go for a walk — but I don’t clutter the mental space with additional tasks.

Which side I teeter on seems to depend on the nature of the tasks at hand, and on my mood.

Goal vs. Deadline

In my current role, I’m asked to test the same product most of the time. That product has different customers, though, with different service agreements and contracts, and as a result not every release that I test has the same level of urgency. Some releases have an informal goal date we’re shooting for, but we can cut the release whenever we feel ready, and some have a hard deadline date when the release will be installed, ready or not.

I’ve observed that I approach testing differently depending on the level of urgency. When the urgency is low, I allow myself to be more reflective, to think through risks more thoroughly, to chase suspicions and curiosities. When the urgency is high, I more actively prioritize tasks and ideas, I keep myself more focused and dawdle less, I avoid rabbit holes by taking note of potential issues to investigate later rather than immediately.

Neither mode of being is perfect, and each can be beneficial in different ways. When I’m goal-oriented, I tend to gain a deeper understanding of the product and more often identify patterns over time; when I’m deadline-oriented, I tend to be more efficient and more often trust my intuition about potential issues rather than fall into a trap of over-thinking.

And so, even when there is no external goal or deadline in place, I’ve found that I still teeter on the edge between being goal-oriented and deadline-oriented.

Once again, I’d love to hear from you. Do you teeter too? In what ways are you a tester on the edge? (Or even a non-tester on the edge! I was excited when my wife said a few of the examples above resonated with the non-testing work she’s doing right now.)

Comments

Robert Day says:

August 14, 2017 at 5:07 am

Overreporting vs. Underreporting: I take the view (and I think all my colleagues agree) that the tester should report all bugs they find. (Hopefully, the bug tracking system you use allows assessments of scale of impact and importance.) It is a business decision as to whether the bug demands fixing, i.e. demands resources being deployed to make the fix. That business decision has to take the likely impact on users into account as well as reputational damage – but for us (a specialist software vendor), that is specifically a BUSINESS decision.

Analysis vs. evidence: I do not have a background in coding; so if I ever speculate on what is causing a bug, I am usually pretty circumspect about pointing the finger in case I’m laying a false trail for the developer to follow. I will say “This looks as if…”, “This looks similar to…”, or I will even phrase it as a question: “Is this related to…?” or “Is this because…?”. I don’t want a bug to go unfixed because a developer follows a false trail that I may have laid due to my lack of knowledge, but at the same time I want to be as helpful to the developer as possible.

“No user would ever do that”: Oh yes they would. And worse. No-one ever predicts exactly how users will mistreat your product. And they will often find ways of misusing or breaking the product that no-one has ever thought of. But we, as testers, should always try to anticipate ways users will interact with the application.

Tester's Notebook