Use the Ladder of Errors to improve your code quality

Software that crashes is a terrible experience. The Ladder of Errors is a suite we can use to surface software issues sooner in the development process.

Software that crashes on users is a terrible experience. People have some tolerance for software issues, but they will eventually stop using an app if they keep running into friction. Some of the most frequent negative user reviews in the App Store, Google Play store, etc. are the result of software behaving unexpectedly or crashing altogether.

As engineers, we have many tools at our disposal to mitigate software issues before they make it out into the wild. We can use things like compile-time errors or intentional runtime errors to give us a nudge when something isn't right. This article introduces the Ladder of Errors as a suite we can use to surface software issues sooner in the development process and improve the quality of the software that is shipped to users. The ladder has four levels (compile-time errors, pull-request errors, runtime errors, and error logs in production), with each level moving closer to being a user-facing problem.

Compile-Time Errors

The first and most helpful errors are those that occur when you compile your code. Because they appear immediately, they are also the easiest to address. This category is not limited to compiler errors; it also includes other build-time checks such as linters and build scripts.

Let's take a look at an example in Swift of how we can design a system in a way that the compiler will throw an error if the system isn't changed properly. For the sake of this example, let's assume we are building a system for a dog hotel where there are different instructions for different breeds of dogs. We'll start by defining an enum with the two breeds we want to handle for now.

enum DogBreed {
    case poodle
    case borderCollie
}

We have two important functions at our dog hotel: feeding and grooming the dogs. Let's define those functions:

func feed(_ dog: DogBreed) {
    switch dog {
    case .poodle, .borderCollie:
        print("feed kibble!")
    }
}

func groom(_ dog: DogBreed) {
    switch dog {
    case .poodle:
        print("give bath then dry!")
    case .borderCollie:
        print("give bath then dry then brush!")
    }
}

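Now, when we expand our hotel to support other dog breeds, we will get a compiler error if we don't provide feeding or grooming instructions. This works because Swift requires a switch over an enum to be exhaustive, and neither function falls back on a default case that would silently absorb a new breed.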
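As a rough sketch of what that expansion could look like, the snippet below adds a third breed. The husky case, its instructions, and the renamed functions that return a String instead of printing are assumptions made for illustration, not part of the original system.

```swift
// Hypothetical third breed added to the enum from the example above.
enum DogBreed {
    case poodle
    case borderCollie
    case husky // new breed (an assumption for illustration)
}

// Returns the instruction instead of printing it so it is easy to check.
func feedInstruction(for dog: DogBreed) -> String {
    switch dog {
    case .poodle, .borderCollie:
        return "feed kibble!"
    case .husky:
        // Deleting this case makes the build fail because the
        // switch is no longer exhaustive.
        return "feed kibble and fish!"
    }
}

func groomInstruction(for dog: DogBreed) -> String {
    switch dog {
    case .poodle:
        return "give bath then dry!"
    case .borderCollie, .husky:
        return "give bath then dry then brush!"
    }
}

print(feedInstruction(for: .husky))
print(groomInstruction(for: .husky))
```

The design choice that matters here is the absence of a default case. Adding one would let the code compile with a new breed that nobody wrote instructions for, which moves the problem further down the ladder.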

While compile-time errors are great because they provide near-immediate feedback, we simply can't catch everything at this stage. The next level of the ladder is dependent on your workflow using pull requests to merge code, but it is a valuable tool nonetheless.

Pull-Request Errors

Pull requests are an opportunity for others to review your code and suggest improvements. They are also a great time to run longer operations that don't make sense to perform every time you build your project. This category is relatively broad and has a lot of space for creativity. Here are a few examples of things you can do during pull requests to improve the quality of your code:

Automate common code review tasks with tools like Danger (e.g. make sure all strings are added to the translations file)
Run continuous integration (CircleCI, Jenkins, etc.) to make sure your entire project builds
Run all unit tests for your project and ensure they pass before merging
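To make the first item concrete, here is a minimal sketch of the kind of check a pull-request job might run to confirm every string has a translation. The function name, the keys, and the sample data are all hypothetical; a real check would read them from the project files, and this is not how Danger itself is configured.

```swift
// Returns every key that is used in code but absent from the translations table.
func missingTranslationKeys(usedKeys: Set<String>, translations: [String: String]) -> [String] {
    return usedKeys.filter { translations[$0] == nil }.sorted()
}

// Hypothetical sample data standing in for the project's strings.
let usedKeys: Set<String> = ["checkin.title", "checkin.confirm", "groom.title"]
let translations = ["checkin.title": "Check in", "groom.title": "Grooming"]

let missing = missingTranslationKeys(usedKeys: usedKeys, translations: translations)
if missing.isEmpty {
    print("All strings are translated")
} else {
    print("Missing translations: \(missing.joined(separator: ", "))")
    // A CI script would exit with a non-zero status here to fail the pull request.
}
```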

Once your code is out of your hands, it becomes more difficult to detect and understand errors. Sometimes errors result in crashes and sometimes they simply cause your app to get into an unintended state. This is where runtime errors come into play.

Runtime Errors

One of the most helpful ways to surface errors while using your app is to intentionally crash it. While this may sound a bit weird, it has saved me numerous times: crashing the app makes an issue explicit when it might otherwise have gone unnoticed and caused problems down the road.

Let's look at an example. Let's say we wanted to provide an image to display alongside each dog breed:

import Foundation

enum DogBreed {
    case poodle
    case borderCollie

    var imageURL: URL {
        switch self {
        case .poodle:
            guard let url = URL(string: "not a valid url .com") else {
                fatalError("Invalid image url for poodle!")
            }
            return url
        case .borderCollie:
            guard let url = URL(string: "htp://borderCollie.com/image.jpg") else {
                fatalError("Invalid image url for borderCollie!")
            }
            return url
        }
    }
}

// do something with image of dog
print(DogBreed.poodle.imageURL.absoluteString)

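The URL(string: ...) initializer returns an optional URL?. In this example, the "not a valid url .com" string is meant to be one that URL cannot parse, so the initializer returns nil and we hit the fatalError. Exactly which strings are rejected depends on the Foundation version; newer releases percent-encode some invalid characters, such as spaces, instead of returning nil. Note also that URL(string:) does not validate the scheme: a typo such as htps instead of https (or the htp in the borderCollie case above) still produces a non-nil URL, so catching that kind of mistake requires an explicit check on the scheme.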

The other nice benefit of using fatalError() is that callers can work with a non-optional URL instead of a URL?. With an optional, many resort to force unwrapping, which has its use cases and would also crash on nil, but unwrapping explicitly and passing a contextual message to fatalError makes the failure much easier to debug.
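If the goal is to catch scheme typos as well, one option is to wrap the check in a small helper. The sketch below is only an illustration: validatedURL, requiredURL, URLValidationError, and the example.com address are hypothetical names introduced here, and it assumes https is the only scheme the app expects.

```swift
import Foundation

// Hypothetical error type describing why a hard-coded URL was rejected.
enum URLValidationError: Error, Equatable {
    case unparseable(String)
    case unexpectedScheme(String)
}

// Parses the string and also checks the scheme, which URL(string:) does not do.
func validatedURL(_ string: String, allowedSchemes: Set<String> = ["https"]) throws -> URL {
    guard let url = URL(string: string) else {
        throw URLValidationError.unparseable(string)
    }
    guard let scheme = url.scheme?.lowercased(), allowedSchemes.contains(scheme) else {
        throw URLValidationError.unexpectedScheme(string)
    }
    return url
}

// Crashes with a contextual message, in the same spirit as the fatalError calls above.
func requiredURL(_ string: String, context: String) -> URL {
    do {
        return try validatedURL(string)
    } catch {
        fatalError("Invalid URL for \(context): \(error)")
    }
}

print(requiredURL("https://example.com/poodle.jpg", context: "poodle").absoluteString)
```

Splitting the throwing check from the crashing wrapper keeps the validation testable on its own, while call sites that truly cannot proceed without the URL still get a non-optional value.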

Error Logs in Production

The last level in the error ladder is using error logs in production to understand when things aren't working as they should. Crash logs are very helpful, but sometimes can be vague and are not always straightforward when it comes to pinpointing the root cause. Furthermore, it can be helpful to understand when things aren't working as expected even when your application does not crash.

Many error reporting tools provide the ability to leave contextual messages around events that occur in the moments leading up to an error. The feature goes by different names (Crashlytics calls it Custom Logging, Bugsnag calls them Breadcrumbs), but they all serve the same purpose: providing more information about what caused your app to crash.
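The idea behind these contextual messages can be sketched with a small in-memory trail that keeps only the most recent events. BreadcrumbTrail and its methods are hypothetical and are not the API of Crashlytics, Bugsnag, or any other tool; real SDKs handle the storage and upload for you.

```swift
// Hypothetical in-memory breadcrumb trail; not the API of any reporting tool.
struct BreadcrumbTrail {
    private(set) var entries: [String] = []
    let capacity: Int

    init(capacity: Int) {
        self.capacity = capacity
    }

    // Records a message and drops the oldest ones once capacity is exceeded.
    mutating func leave(_ message: String) {
        entries.append(message)
        if entries.count > capacity {
            entries.removeFirst(entries.count - capacity)
        }
    }
}

var trail = BreadcrumbTrail(capacity: 3)
trail.leave("Opened booking screen")
trail.leave("Selected breed: poodle")
trail.leave("Tapped confirm")
trail.leave("Started image download")

// Only the three most recent events remain to be attached to a crash report.
print(trail.entries)
```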

I also mentioned logging errors that don't cause your app to crash. It is not a good experience for users to have an app crash while using it, so it is good practice to handle errors where possible. Most error reporting tools (Crashlytics and Bugsnag included) provide the ability to report handled errors, which can be extremely helpful for understanding how often particular errors occur.
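As a minimal sketch of that pattern, the snippet below recovers from an error in a catch block and still records it. BookingError, HandledErrorReporter, and book(kennelsAvailable:) are hypothetical stand-ins; a real reporting SDK would send the error to its backend rather than count it in memory.

```swift
// Hypothetical errors the dog hotel app can recover from.
enum BookingError: Error {
    case kennelFull
    case missingVaccinationRecord
}

// Hypothetical reporter that counts handled errors by description;
// a real tool would send them to its backend instead.
final class HandledErrorReporter {
    private(set) var counts: [String: Int] = [:]

    func report(_ error: Error) {
        counts[String(describing: error), default: 0] += 1
    }
}

func book(kennelsAvailable: Int) throws -> String {
    guard kennelsAvailable > 0 else { throw BookingError.kennelFull }
    return "Booked!"
}

let reporter = HandledErrorReporter()

for available in [1, 0, 0] {
    do {
        print(try book(kennelsAvailable: available))
    } catch {
        // Recover gracefully for the user, but still record that it happened.
        reporter.report(error)
        print("Sorry, no kennels are free right now.")
    }
}

print(reporter.counts)
```

Counting by error kind is what makes the frequency question answerable: a handled error that fires once a month deserves different attention than one that fires on every launch.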

Conclusion

At the end of the day, there are many tools at our disposal and it is up to each of us how we utilize them to improve the quality of our work. I laid out this Ladder of Errors as a platform-agnostic tool to help engineers understand how to use various design paradigms and error reporting tools to improve the stability of their code and surface errors earlier in the development process.