JVM Advent

The JVM Programming Advent Calendar

ACRUMEN: What is “software quality” anyway?!


First, let’s level-set some expectations.  This definition isn’t meant for the extreme levels of quality usually associated with the software used in avionics, implanted medical devices, nuclear power plants, heavy machinery, weapons systems, and so on.  It’s meant for the other five-nines of us, writing consumer-grade systems like web or mobile apps, where, if something goes wrong, or it’s unclear and the user makes a mistake, there may be frustration on their part and embarrassment on ours, but nobody’s going to die.  Those other kinds of industries already have their own approaches, often including regulations and much closer inspection than our software will ever get.

Now let’s start things off with a question.  Do you like low-quality software?  Presumably not!  So let’s try another question.  Have you written any low-quality software?  I know I sure have!  To those of you who said yes, congratulations!  As the saying goes, Step Number One is to realize you have a problem!  For the rest of you: welcome to software development; I hope you enjoy this career you’ve obviously just started.

So, we’ve got plenty of people writing low-quality software, but we don’t like it.  It seems pretty clear to me: we need more software quality!  But that leads us to one tiny little question: what is it?!  If we don’t have a usable definition, it’s hard to improve even our own software quality, let alone the entire state of the art.

Several years ago, I was looking for a good definition, but the ones I found all had serious problems.  Most were long lists of complicated terms, full of developer jargon.  Jargon is fine for talking among ourselves, but I wanted a definition that other people would understand, even non-technical people, so they could understand our challenges better, and give us more precise feedback about exactly how our software sucks.

Some definitions were proprietary, requiring us to buy expensive tools or documents.  Some were only applicable within the context of certain technologies, often also proprietary.  I felt that all of that was just plain wrong.  I wanted something that everybody could use, for free.

Some definitions focused exclusively on issues of interest to us developers, ignoring the needs of the users and other stakeholders.  For instance, many were completely about maintainability (which nobody else knows or cares about, at least not directly), and omitted other important things, like whether the software is easy to use.  Management may know and care about the effects of poor maintainability, like changes taking longer and introducing bugs and developer headaches.  However, they generally don’t know that these symptoms are caused by poor maintainability, and wouldn’t recognize poorly maintainable software if it bit them in the proverbial posterior.

Some definitions weren’t even about the software at all, but all about the process, or the byproducts, dictating that you must hold these meetings or produce those documents.  Some of these meetings and documents may be helpful, but to make them the definition misses the whole point.  It’s certainly possible to do all that and yet produce horrible software, or to produce great software without them.  I wanted something more flexible, and more focused on the software itself.

I didn’t see any that I liked, nor that were commonly accepted, so in the spirit of XKCD, I decided to make my own.  To keep it simple, I zoomed out from down in the weeds, where we developers tend to live, past the 40,000-foot view, up to about low earth orbit, so I could look at continents, not pebbles.  That let me trim it down to just six aspects, with simple names and relatively simple explanations.  The result is so short, it literally fits easily on the back of my business card.

The Big Reveal

I call this list of aspects ACRUMEN, but what does that mean?  Originally, it was a Latin word, meaning sour fruit, like grapefruit, limes, and  lemons.  But what is it in this context?  The acronym ACRUMEN (try saying that ten times fast!), simply takes those six aspects, and puts them in priority order.  By now you’re probably wondering, SO WHAT ARE THE @#$%^&* ASPECTS ALREADY?!  They are that software should be Appropriate, Correct, Robust, Usable, Maintainable, and Efficient.  But what does all that mean?!

First and foremost, it needs to be doing what the stakeholders need it to do, in other words, do the right job.  Then it needs to be doing that job correctly, or in other words, do the job right.  It should be hard for anyone to make it malfunction, which is mainly about not being insecure or fragile, or even seeming so, and we’ll get much deeper into that later.  However, it should be easy for the users to use and for the developers to change.  (The other way round, not so much.  Generally, you don’t want your users changing what your software does, and if we find our own software easy to use, but we’re not the intended users, then what good is that?)  Last, dead last despite how we developers tend to worship this, it should be easy on resources, not only the technical ones that we usually think of, but other kinds as well, and again we’ll get deeper into that later.

To put that all together in one easily grabbable chunk:

Appropriate: doing the right job
Correct: doing the job right
Robust: hard to make it malfunction, or seem to
Usable: easy for the users to use
Maintainable: easy for the developers to change
Efficient: easy on resources

Now, I’ve said a few times that it consists of six aspects, and I’ve told you about six, but ACRUMEN has seven letters!  So, what does the N stand for?  Nnnnnnothing, I just tacked it on to make a real word, even if an obsolete one.  🙂


While the basic definition is fresh in your minds, I’ll address a few frequently asked questions.

Aside from going into detail on the tips, how do we actually use ACRUMEN itself, the list?

Mainly, we can keep it in mind as a checklist when writing or evaluating software.  We can ask, is it appropriate, is it correct, and so on, or how good is it in each aspect, on a scale of 1 to 10, or by simple triage, or is it good enough for our needs?  And if the answer is ever that it’s not good enough, we can ask what can be done to make it so?  In the more immediate term, we can ensure that our current projects are likely to meet these criteria.  In the longer term, we can ensure that our processes support these criteria, by including various helpful activities and requirements, and maybe even an explicit evaluation against the ACRUMEN aspects.  We can also set targets, for how good we need it to be in each aspect.
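As a rough sketch of what an explicit evaluation against targets could look like, the Java snippet below keeps one score and one target per aspect and lists the aspects that fall short.  The class name `AspectScorecard`, the method `belowTarget`, and all the numbers are hypothetical, made up purely for illustration.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

class AspectScorecard {
    // The six ACRUMEN aspects, declared in priority order.
    enum Aspect { APPROPRIATE, CORRECT, ROBUST, USABLE, MAINTAINABLE, EFFICIENT }

    // Lists the aspects whose score is below its target, in priority order.
    // An aspect missing from either map is treated as 0.
    static List<Aspect> belowTarget(Map<Aspect, Integer> scores, Map<Aspect, Integer> targets) {
        List<Aspect> shortfalls = new ArrayList<>();
        for (Aspect aspect : Aspect.values()) {
            int score = scores.getOrDefault(aspect, 0);
            int target = targets.getOrDefault(aspect, 0);
            if (score < target) {
                shortfalls.add(aspect);
            }
        }
        return shortfalls;
    }

    public static void main(String[] args) {
        // Hypothetical scores and targets (out of 10), invented for this sketch.
        int[] scoreValues  = {8, 9, 6, 5, 7, 3};
        int[] targetValues = {8, 8, 7, 7, 6, 5};
        Map<Aspect, Integer> scores = new EnumMap<>(Aspect.class);
        Map<Aspect, Integer> targets = new EnumMap<>(Aspect.class);
        for (Aspect aspect : Aspect.values()) {
            scores.put(aspect, scoreValues[aspect.ordinal()]);
            targets.put(aspect, targetValues[aspect.ordinal()]);
        }
        System.out.println("Needs work: " + belowTarget(scores, targets));
    }
}
```

Keeping a separate number per aspect, rather than one combined score, is what lets a report like this say where the work is needed.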

How can we quantify this, and boil it down to one number that shows the quality of a piece of software?

Mainly, I advise that you don’t do that!  Instead, at the very least, keep six numbers, one for each aspect.  Otherwise, you lose too much valuable information.  A single number might tell you that the software is good or bad, but a set of six numbers will tell you how.  For instance, with a chart like this:

[Table: Aspect | Score (out of 10)]

we’re probably talking about a program that does most of what’s needed, does it absolutely correctly, but not very efficiently, I would bet slowly, impacting the usability.  It’s also fairly robust and maintainable, but could still use some improvement there too.  These numbers can help prioritize further work on it.

Is ACRUMEN, or rather ACRUME, always the right ordering?  Some projects seem a little different.

No, ACRUMEN is just the typical case.  Your mileage may well vary.  Consider the case of a company-internal command-line physics simulation tool, using a standard algorithm that will never change.  It needs to do the right job, may need to be very efficient, but maybe we can make do with a rough approximation, rather than a precisely correct number that would take much longer to calculate.  It might not need to be so usable because it’s for ourselves, not customers, nor so robust because of the limited interfaces and fewer things to go wrong, nor so maintainable because the logic is never going to change.  So, its list may well look more like AECURM, rather than ACRUME.

The only real constant is that appropriate will always be at the top.  We’ll see shortly why because now we’re going to look at each aspect in more detail, and up first is of course:


Appropriate

If our software doesn’t have this, then Nothing Else Matters.  If our software is doing the wrong job, then it doesn’t matter how well it’s doing the wrong job.  So, appropriateness is not only more important than any other aspect, it’s even more important than all the others put together!  And yet, we developers are generally not taught that this is even a thing, let alone one that we need to think about.

To prove the importance of being appropriate, let’s try a little thought experiment.  Suppose you want a program to play checkers, and I write for you the world’s greatest chess playing program.  It’s as correct, robust, usable, maintainable, and efficient as anyone could ever want.  But will you be happy with it?  Probably not.  But why not, if it’s such a great program?  Because it’s not checkers!  It’s not what you asked for.  It’s not what you need.  Or in ACRUMEN terms, it’s not appropriate.

So, now that we know how important this is, how do we achieve it?  In an ideal world, we would have frequent direct contact with the stakeholders.  Ideally face to face, or as close to that as possible.  We can ask what they want, and break it down into smaller and smaller pieces.  (Developers should be good at that, it’s how programming basically works!)  But, we should go a step further, and ask why they want things.  This will help reveal what they really need, which is what we really need to satisfy, as opposed to what they say they want, which makes it two steps removed from there.

Unfortunately, we don’t usually get that opportunity.  Second best is to bring in the experts, which in this case would be Requirements Analysts.  But we usually don’t get those either, at least outside huge companies.  So, we usually have to settle for occasional remote or indirect contact with at least a representative of some stakeholders, like a Product Owner in Scrum.  It doesn’t work quite as well, but having some communication with someone with a clue, is vital.

Once we think we have a good grasp of their needs, we can show them mockups and prototypes of what we intend to do, and demos of what we have done.  This gives them a chance to correct our wrong ideas of their needs, before we go too far down the wrong rabbit-hole.  I think we’ve all been there, wasting time implementing the wrong thing.  Ideally, show them these frequently, as a sort of continuous course correction.  Frequent feedback from the stakeholders is even more important than being able to ask them questions.

There’s another thing, though, that I’ll be returning to over and over in this talk.  We can propose tests!  In particular, I recommend the Given/When/Then pattern:

  • Given these preconditions, such as data being in a certain state;
  • When this happens, usually some kind of input from users, or a timer, or a sensor, or another system over a queue or an API;
  • Then this is the result, usually either something the user sees, or data being in a desired new state.

This makes a great link between the worlds of business and tech because the business people can understand it, and we can turn it into a runnable test.
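To make the pattern concrete, here is a minimal sketch in plain Java of one scenario turned into a runnable check.  The `ShoppingCart` class and the scenario itself are hypothetical, invented for this example; in a real project the same structure would normally live in a test framework rather than a `main` method.

```java
import java.util.ArrayList;
import java.util.List;

class GivenWhenThenExample {
    // Returns true when the scenario passes.
    static boolean addingAnItemIncreasesTheTotal() {
        // Given a cart that already holds one item costing $5.00
        ShoppingCart cart = new ShoppingCart();
        cart.add(500);

        // When the user adds an item costing $2.50
        cart.add(250);

        // Then the total is $7.50
        return cart.totalInCents() == 750;
    }

    public static void main(String[] args) {
        System.out.println(addingAnItemIncreasesTheTotal() ? "PASS" : "FAIL");
    }
}

// Hypothetical domain class, present only so the scenario has something to exercise.
class ShoppingCart {
    private final List<Integer> pricesInCents = new ArrayList<>();

    void add(int priceInCents) {
        pricesInCents.add(priceInCents);
    }

    int totalInCents() {
        int total = 0;
        for (int price : pricesInCents) {
            total += price;
        }
        return total;
    }
}
```

The three comments are the part a business person could read and correct; the code between them is the part we own.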


Correct

If our software doesn’t have this, then, it has bugs.  It could be giving obviously incorrect results, or worse yet, subtly incorrect, so we don’t notice so soon.  It could be putting data, whether correct or not, in the wrong records or files, or even deleting them!

Nothing can actually stop us from writing code that isn’t correct, at least with the decently productive tools we have today.  So, the big question is: just like the Thermos (or in British English, the Dewar Flask) that keeps hot things hot and cold things cold, how do we know?  I mentioned the answer just a moment ago: tests prove whether our code is correct, assuming of course that the tests themselves are correct.  (Actually, even then, it’s not quite true, but I’ll get to that shortly.)

I’ll skip over a lot of the advice about how many of what kinds of tests to write, and how, as you can find that in a bazillion other articles, blog posts, videos, books, and so on.  But, I will point out that typical types of tests, like end-to-end/system, feature, integration, unit, and so on, can only prove the correctness of cases that we thought to test.  There are some advanced techniques, though, that can help find unusual cases we didn’t think of.

Property-based testing tests whether some desired property of our code, what formal computer scientists would call an “invariant”, holds true for all valid inputs.  A property testing tool makes up lots of random test data to try, somewhat like the security concept of “fuzzing”, but staying within defined bounds of validity, rather than trying to find and exceed them.  If it finds an input that makes our property fail, that means that there is an edge case that we didn’t consider.
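The idea can be sketched by hand in a few lines of Java.  This is only an illustration, not a real property-testing tool: the class `PropertyCheckSketch` and its method `findCounterexample` are made up for this example, and dedicated libraries (jqwik is one for the JVM) add much smarter data generation and shrinking of failing inputs.  The sketch tries a handful of boundary values first, then random ones, and reports the first input that breaks the property.

```java
import java.util.Optional;
import java.util.Random;
import java.util.function.IntPredicate;

class PropertyCheckSketch {
    // Tries boundary values first, then random ones.
    // Returns the first input for which the property is false, if any.
    static Optional<Integer> findCounterexample(IntPredicate property, int randomTries, long seed) {
        int[] boundaries = {0, 1, -1, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int candidate : boundaries) {
            if (!property.test(candidate)) {
                return Optional.of(candidate);
            }
        }
        Random random = new Random(seed);  // fixed seed keeps runs repeatable
        for (int i = 0; i < randomTries; i++) {
            int candidate = random.nextInt();
            if (!property.test(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // A property we might casually assume: an absolute value is never negative.
        // It fails for Integer.MIN_VALUE, whose absolute value overflows.
        System.out.println(findCounterexample(x -> Math.abs(x) >= 0, 1000, 42L));

        // A property that holds for every int: negating twice gives back the original.
        System.out.println(findCounterexample(x -> -(-x) == x, 1000, 42L));
    }
}
```

The first property looks obviously true, yet one valid input breaks it, which is exactly the kind of edge case we tend not to write an example-based test for.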

Mutation testing runs our tests against slightly altered versions of our code.  Each altered version should make at least one test fail.  If not, that means that our code isn’t “meaningful” enough for the mutation to make a difference in its behavior, such as if it’s redundant or unreachable, or our tests aren’t strict enough to catch the difference the mutation made, or maybe both.  (I also speak on mutation testing at conferences, and you can check out my YouTube playlist of versions of that talk.)
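Here is a hand-made illustration of that in Java.  Real mutation testing tools (PIT is a well-known one for the JVM) generate the mutants and run the tests automatically; in this sketch the single mutant is written out by hand, and the names `isAdult`, `weakSuitePasses`, and `strictSuitePasses` are invented for the example.

```java
import java.util.function.IntPredicate;

class MutationSketch {
    // Original rule: adults are 18 or older.
    static boolean isAdult(int age) {
        return age >= 18;
    }

    // Hand-made mutant: ">=" changed to ">", the kind of small edit a tool would apply.
    static boolean isAdultMutant(int age) {
        return age > 18;
    }

    // A weak suite: checks only values far from the boundary.
    static boolean weakSuitePasses(IntPredicate implementation) {
        return implementation.test(30) && !implementation.test(5);
    }

    // A stricter suite: also pins down both sides of the boundary.
    static boolean strictSuitePasses(IntPredicate implementation) {
        return weakSuitePasses(implementation)
                && implementation.test(18)
                && !implementation.test(17);
    }

    public static void main(String[] args) {
        // The weak suite passes for both versions, so the mutant "survives".
        System.out.println("weak, original:   " + weakSuitePasses(MutationSketch::isAdult));
        System.out.println("weak, mutant:     " + weakSuitePasses(MutationSketch::isAdultMutant));
        // The strict suite passes for the original but fails for the mutant, "killing" it.
        System.out.println("strict, original: " + strictSuitePasses(MutationSketch::isAdult));
        System.out.println("strict, mutant:   " + strictSuitePasses(MutationSketch::isAdultMutant));
    }
}
```

A surviving mutant, like the one the weak suite lets through, points at a behavior that no test is currently checking.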

We should have enough test coverage, of assorted kinds and levels, and verified to actually test our code rather than game a metric, to have strong confidence in the correctness of our code.


Robust

If our software doesn’t have this, then, at best, it may simply show a lot of error messages, and seem fragile and unreliable, or it may crash a lot and actually be fragile and unreliable.  It may even get hacked because Robustness includes Security.

The short explanation is that it’s hard to make the software malfunction (or even seem to), but what does that even mean?!  There are a few other things, but most of what I mean is covered by a core concept of information security: the CIA Triad.  No, it’s nothing to do with spies and gangsters, it’s the triangle of Confidentiality, Integrity, and Availability.  So, robust software does not reveal data when it’s not supposed to, alter data when it’s not supposed to, or become unavailable when it’s not supposed to, even when an attacker is trying to force it to violate them.

So, how do we achieve all that?
Once again, we could bring in the experts, and in this case, that would be… penetration testers, or for short, pen testers.  (You can see why I couldn’t resist using that image!)  The good news is, you don’t have to work for a huge company to use them.  Many work for independent computer security companies, that you can hire on contract.  However, they are usually expensive, and disruptive because they need to test the production system.

So, once again, we’ll usually have to do without the experts, but, we can use some of their tools, especially software such as static analyzers (which examine our code for likely defects without running it), fuzzers (which test our program’s reactions to various kinds of invalid inputs, in the “fuzzing” technique I mentioned earlier), and probes (which test our system for vulnerability to specific known attacks).  Many of these are available as open source.

Even without their software, we can still get a long way by using their mindset.  The main part of that is to ask ourselves what could go wrong.  Here the tone of voice is critical, it’s not “What could go wrong?”, as though we think nothing could, but almost statement-like, “What could go wrong.”, as if to say, “I know a lot could go wrong, I’m trying to list it, I don’t need a demonstration thankyouverymuch, God!”

For instance, if the system wants the user to type a filename, the user could type it wrong, or type correctly the name of a file they don’t have access to, and so on.  The program should not crash, or show a mysterious error message like “ENOENT” or “HTTP 500”, but instead show a clear and friendly error message, and let the user try again.
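As a minimal sketch of the filename case in Java, the method below turns the low-level failures into messages a user can act on.  The class name `FriendlyFileErrors`, the method `problemWith`, and the message wording are all assumptions made for this example; the exception types are the standard ones from `java.nio.file`.

```java
import java.io.IOException;
import java.nio.file.AccessDeniedException;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.NoSuchFileException;
import java.nio.file.Paths;
import java.util.Optional;

class FriendlyFileErrors {
    // Tries to read the named file.  Returns an empty Optional on success,
    // or a message suitable for showing to the user on failure.
    static Optional<String> problemWith(String fileName) {
        try {
            Files.readAllBytes(Paths.get(fileName));
            return Optional.empty();
        } catch (InvalidPathException e) {
            return Optional.of("That does not look like a valid file name. Please try another.");
        } catch (NoSuchFileException e) {
            return Optional.of("We could not find \"" + fileName
                    + "\". Please check the spelling and try again.");
        } catch (AccessDeniedException e) {
            return Optional.of("You do not have permission to open \"" + fileName
                    + "\". Please choose a different file.");
        } catch (IOException e) {
            // Anything else: still no stack trace or error code for the user.
            return Optional.of("We could not read \"" + fileName + "\". Please try again.");
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "no-such-file.txt";
        System.out.println(problemWith(name).orElse("File read successfully."));
    }
}
```

An interactive program would call something like this in a loop, showing the message and asking again until the read succeeds or the user gives up.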

There may even be external factors, like losing a network connection, or other hardware problems.  Our software should handle all reasonably foreseeable types of problems as gracefully as practical.

That may sound like a lot, but so far, we’ve only covered innocent mistakes and mishaps.  To make it really robust, we must make it secure, which means we must think like an attacker.  We must ask ourselves, what are the system’s weak points?  What can attackers make happen, that would get them one step closer to their goal?  In what unusual ways can someone get information out of – or into – our system?

Once we’ve brainstormed and run out of answers to such questions, then for everything we’ve come up with, we must somehow handle it.  Yes, that’s extremely vague, but how to handle something is going to vary immensely, depending on exactly what it is.  If the user types a bad filename, ask for another!  But if the system detects an attack in progress and data may be getting corrupted, the proper response may be to shut the whole thing down, and not bring it back until someone goes into the data center and presses the big green button!  In-between, there are many possibilities.  Perhaps we can prevent the situation, mitigate the negative effects, or recover from them, perhaps with the help of insurance.  But whatever response we decide on, we must test it, as it is now an important part of our system.

Our next aspect is one often seen as a tradeoff with security:


Usable

If our software doesn’t have this, our users will become frustrated, and may stop using or recommending our software.  That could be disastrous for a software vendor, or a software-as-a-service company!  Also, hard-to-use software can lead the user to do the wrong thing.  Remember what happened in Hawai’i in January 2018, due to software that was hard to use?  They had a false alarm about an incoming nuclear missile!  Just think what could happen if that were the launch system, not just an alarm!

Unfortunately, if we Google software usability, we find mostly things about ensuring that users with various challenges can use our software about as well as the rest of us.  In other words, accessibility.  That’s a good goal in itself, but I’m adding on that it should be easy for everyone to use, not just equally difficult!

To go into more depth: it should be clear at all times what the user can do, should do, and must do, how they can do it, and what else the software can do, especially any help facilities.  And, all of that should be easy to do, despite any challenges the user may be facing.

We can start with the things that accessibility usually addresses, like lack of vision, color vision, hearing, fine motor control, and so on.  But there are other whole types of challenges we should be aware of, like lack of literacy, at least in our character set.  The user may lack certain knowledge, such as culture references, like the significance of traffic light colors.  They may even be of low intelligence!  Yes, we may joke about stupid users, but statistically, about half of them will be below average.

Also, again, there may be external factors, like a noisy or shaky environment.  Imagine someone uses your mobile app on a small phone, while standing up on a crowded bus in downtown rush-hour traffic!  I don’t know what that’s like where you live, but at least in Washington DC, accurate tapping is Not Happening.

Another often-overlooked part of usability is that all software should be usable, whether it’s a web app, a mobile app, a desktop GUI app, or a command-line app — or an API, be it through function calls like with a library or framework, or a wire protocol, whether binary or textual, or whatever.

So, how do we achieve all this?  Once again, ideally we can bring in the experts.  The bad news is, the people called Usability Experts are mostly really about accessibility.  The good news is, we have a wide range of other professions we can get help from!  We mainly want a User Experience expert, or at least a User Interface expert.  But even a web designer, or even an old-fashioned print graphic designer, has training in principles of practical visual design that can help us, at least in that aspect of usability.  However, as usual, we’ll often have to do without any help, but we can go a long way by applying the principles of these experts.  For instance:

here we see an illustration of the KISS Principle, meaning “Keep It Simple, Stupid!”  (Or if we don’t want to be so negative, “Keep It Super-Simple”.)  Note the simplicity of these stereotypical apps from two highly successful companies with reputations for simple ease of use, compared to the cluttered unusable mess from “your company”.  I think many of us, even the front-enders who are usually expected to do better at visual design than back-enders like myself, will recognize some of our own work in that.

Another thing we can do, if the software is something we can use ourselves, is to “eat our own dog food”.  But remember, if we find our own software easy to use, that does not mean that our users will!  We have inside knowledge, that makes it much easier.  But if there’s anything we find difficult or unclear, it will be much worse for our users.  So, dog food it mainly to find the pain points.

Lastly, it may not be as definable and quantifiable as correctness, but a user interface can still be tested!  We can bring in some of our typical users, even ones that don’t already know our system, and have them try to do common tasks.  We can watch them use it (which is what’s going on in the photo above), and look for signs, on their faces and screens, of confusion or frustration, or if we’re lucky, satisfaction or happiness.  Afterward, ask them what they found hard or easy, unclear or obvious?  Then fix their pain points, do more of the good parts, and lather, rinse, repeat.

The next aspect is the one we usually think of most:


Maintainable

If our software doesn’t have this, then changes take longer, and are more likely to introduce bugs, and developer headaches.  Delays could make the company miss opportunities.  Bugs damage the company’s reputation.  Developer headaches are bad enough for us, but for the company, they could make key personnel quit in frustration.  I’d bet most of us have been there, as either the quitter or a survivor who had to pick up the slack.

We’d probably all agree that the basic concept is that “maintainable” software is easy to change.  (Thank you, Captain Obvious!)  But I’m going to add that it’s easy to change, with low chance of error, and low fear of error, even for a novice programmer, who is also new to our project.

So how do we achieve all this?  For better or worse, the vast majority of software engineering advice is aimed squarely at this.  So, rather than expound on countless generic principles like good naming, or the Single Responsibility Principle, or low coupling and high cohesion, I’m going to stick to my theme and tell you how testing can help with maintainability.

Some of you may already usually use tests as a sort of documentation of how the code should be used.  But old tests, like the ones we wrote to verify any prior changes we made, like adding a feature or fixing a bug, can be useful in other ways.  They form a regression test suite, to catch anything we break that used to work.  Just knowing that that is there, will reduce our fear of error, like a safety net.  And that will allow us to progress at a quick pace with a clear and focused mind, rather than creeping along slowly because we’re terrified of breaking something accidentally and not discovering it until users complain.  And that is why I mentioned fear at all.

There are also numerous tools we can use, like linters, complexity analyzers, and just cranking up the warnings on our compilers or interpreters.  These will give us plenty of hints how to improve our code, mainly in its maintainability, and occasionally uncovering subtle bugs.  It’s astonishing how many nasty bugs you can catch, just by cranking up the warnings!


Efficient

If we don’t have this, then our programs may run slowly, or make the users buy more resources.  They could clog the network, or even crash machines by running out of memory or disk space, or drain other resources.  Mainly we know about technical resources, but there are others, such as the user’s patience and brainpower, and the company’s money!

So, how do we achieve efficiency?  Just as there are many kinds of resources, there are many different kinds of inefficiency we could fix, but for this discussion I’m going to focus on fixing the most obvious and common kind: slowness.

I’m sure we’ve all had a program run slowly, then we stare at the code, spot where we think it’s inefficient, spend a long time optimizing that little piece, run the program again, and…  it’s still slow!  So … don’t do that!

Measure it instead!  Humans aren’t really good at spotting the inefficiencies, but there are profilers and packet capture programs and such, that will tell us exactly where, or at least when, we’re using too much CPU, RAM, bandwidth, etc.

Once we’ve found where or when it’s slow, though, there’s still the question of why it’s slow.  Certain kinds of programs tend to have certain problems.  For instance, a distributed system may be doing too much communication, or using a slow network.  A database-driven system may have an inefficient query or data model.  But in the general case, usually the problem is either something architectural, which is more complex than I want to get into right now, or a bad algorithm.  Maybe we’re using something with a polynomial or exponential runtime, when thinking about the problem a little differently could let us use a better algorithm, such as one with linear, square-root, logarithmic, or maybe even constant runtime.  Perhaps we’re using a bad data structure, and that is forcing us to use a bad algorithm.
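A small Java example of a data structure driving the algorithm: checking an array for duplicates.  The class and method names are made up for this sketch.  With only the array, the obvious approach compares every pair, which is quadratic; adding a hash set to remember what has been seen brings it down to roughly linear time on average.

```java
import java.util.HashSet;
import java.util.Set;

class DuplicateCheck {
    // Quadratic: compares every pair of elements.
    static boolean hasDuplicateQuadratic(int[] values) {
        for (int i = 0; i < values.length; i++) {
            for (int j = i + 1; j < values.length; j++) {
                if (values[i] == values[j]) {
                    return true;
                }
            }
        }
        return false;
    }

    // Roughly linear on average: a hash set remembers the values seen so far.
    static boolean hasDuplicateLinear(int[] values) {
        Set<Integer> seen = new HashSet<>();
        for (int value : values) {
            // add() returns false when the value was already in the set.
            if (!seen.add(value)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] sample = {3, 1, 4, 1, 5};
        System.out.println(hasDuplicateQuadratic(sample));
        System.out.println(hasDuplicateLinear(sample));
    }
}
```

Both methods give the same answers; the difference only shows up as the input grows, and the faster one pays for its speed with extra memory for the set.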

The upshot is that we should be familiar with the basic common data structures and algorithms, and how to recognize them when we see them in real-world problems, analyze and compare their demands on our assorted resources, and choose and change and combine them.  That way, we can use solutions that have stood the test of time, sometimes with ready-made implementations that are well tested and maybe even optimized.  Once it’s fast enough, we can slap a performance test around it (you knew I had to mention writing some kind of test eventually!), to ensure we don’t have that kind of regression.
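A performance test can be as simple as running the code and checking the elapsed time against a budget.  The Java sketch below is only a rough illustration: the names `sumOfSquares`, `elapsedMillis`, and `withinBudget` and the budget value are invented here, and wall-clock timing of a single run on the JVM is noisy, so a real project would more likely use a benchmarking harness such as JMH and a generous margin.

```java
class PerformanceBudgetSketch {
    // Stand-in for the code whose speed we care about.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    // Runs the task once and returns the elapsed wall-clock time in milliseconds.
    static long elapsedMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    // True when the task finishes within the given budget.
    static boolean withinBudget(Runnable task, long budgetMillis) {
        return elapsedMillis(task) <= budgetMillis;
    }

    public static void main(String[] args) {
        // Hypothetical budget: generous on purpose, to catch regressions rather than noise.
        boolean ok = withinBudget(() -> sumOfSquares(1_000_000), 5_000);
        System.out.println(ok ? "PASS" : "FAIL: slower than budget");
    }
}
```

A check like this will not find the cause of a slowdown, but it will tell us soon after a change that one has crept in.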

In conclusion, if we make sure that our software is Appropriate, Correct, Robust, Usable, Maintainable, and Efficient, then Nobody should have any cause to be sour about the fruits of our labors.

Author: Dave Aronson

Dave is a semi-retired software development consultant, and frequent international conference speaker, with over 37 years of professional experience in a wide variety of languages, platforms, domains, techniques, etc. He is the T. Rex of Codosaurus, his one-person consulting company (which explains how he can get such a cool title), in Fairfax, Virginia, a suburb of Washington DC. In his spare time, he brews mead, and teaches other people how.
