Good Resilience vs Bad Resilience

Pretty much everyone agrees that resilience is good, especially in software. When one of a million external dependencies fails, you don’t want your software falling over like a poorly designed tower.

So how could resilience possibly be bad? Resilience goes bad when an application gracefully – and silently – handles an error even though the error was really, really bad. (One of these examples came up at work today, but I won’t say which one.) And apologies to Jeff Lash.

Good Resilience: Your application is flexible enough to let users install it on machines that don’t exactly meet the minimum resource requirements. After all, let’s not be picky.

Bad Resilience: Your application is willing to let users install it on a machine that will run so poorly that it will make the application look bad. Users might say “Does it really need two gigabytes of memory? Really?” but if the application is using some sort of SQL-based database storage, yeah, it probably does. And when it runs like a dog? It’s your company’s fault.
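One way to picture the difference between the two installer behaviors is a three-way check instead of a yes/no one. The sketch below is only an illustration: the 2 GiB minimum, the 25% tolerance, and the names `check_memory` and `InstallDecision` are all made up for this example, and the caller is assumed to supply the machine's available memory.

```python
from enum import Enum


class InstallDecision(Enum):
    PROCEED = "proceed"   # meets the stated minimum
    WARN = "warn"         # slightly short: install, but tell the user
    REFUSE = "refuse"     # far too short: the product would look bad


# Hypothetical values: a 2 GiB stated minimum and a 25% tolerance below it.
MIN_MEMORY_BYTES = 2 * 1024**3
TOLERANCE = 0.25


def check_memory(available_bytes: int,
                 minimum_bytes: int = MIN_MEMORY_BYTES,
                 tolerance: float = TOLERANCE) -> InstallDecision:
    """Decide whether an install should proceed, warn, or refuse."""
    if available_bytes >= minimum_bytes:
        return InstallDecision.PROCEED
    # Good resilience: don't be picky about a machine that is a little short.
    if available_bytes >= minimum_bytes * (1 - tolerance):
        return InstallDecision.WARN
    # Bad resilience would be returning PROCEED here as well.
    return InstallDecision.REFUSE
```

Where exactly to draw the tolerance line is a product decision, not a coding one, which is rather the point.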

Good Resilience: Your application doesn’t scream and crash when a file it expects to find isn’t present. Yay for not crashing.

Bad Resilience: When an expected file is missing, your application merrily continues along and provides the user with the wrong results. As in the previous example, the application is a bit too resilient.
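The missing-file case comes down to whether the file is optional or whether the results depend on it. Here is a minimal sketch, assuming a JSON settings file that has sensible defaults and a plain-text input file that does not. The function names and `MissingInputError` are hypothetical, invented for this example.

```python
import json
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


class MissingInputError(Exception):
    """Raised when a file the results depend on is absent."""


def load_optional_settings(path: Path, defaults: dict) -> dict:
    """Good resilience: fall back to defaults, but leave a trace in the log."""
    try:
        return json.loads(path.read_text())
    except FileNotFoundError:
        logger.warning("Settings file %s not found; using defaults", path)
        return dict(defaults)


def load_required_data(path: Path) -> list:
    """Refuse to continue when a missing file would make the results wrong."""
    try:
        return path.read_text().splitlines()
    except FileNotFoundError as exc:
        raise MissingInputError(
            f"Required input {path} is missing; results would be wrong without it"
        ) from exc
```

Neither function crashes the application outright. The second one simply hands the caller an error it can show to the user instead of a wrong answer.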

Good Resilience: The application lets any user run it.

Bad Resilience: The application doesn’t check whether the user has sufficient privileges to carry out the operation, and reports that something is missing when, in fact, it only appears missing because the user can’t access it. Windows apps may not care about this much, but for those “UNIX-Enterprise” apps, this can be a real headache.
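Telling "missing" apart from "not allowed" is often cheap, because the operating system already reports them as different errors. In Python they surface as `FileNotFoundError` and `PermissionError`. The helper below, `check_readable`, is a hypothetical name for a rough sketch of the idea, not a complete access check.

```python
from pathlib import Path


def check_readable(path: Path) -> str:
    """Report why a file can't be read instead of lumping all failures together."""
    try:
        with path.open("rb"):
            return "ok"
    except FileNotFoundError:
        # The file (or one of its parent directories) really isn't there.
        return "missing"
    except PermissionError:
        # The file may well exist; this user just isn't allowed to read it.
        return "permission denied"
```

With the two cases separated, the error message can say "you don't have access to X" rather than "X is missing", which should save the field staff a round of head-scratching.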

Most of the time, requirements capture what happens when all the stars align and everything works correctly. But spend some time with your pre- or post-sales field staff and it quickly becomes apparent that requirements also have to describe what should happen in bad situations that can be reliably predicted (or, more realistically, how the next version should handle the situations that have the field people pulling their hair out).