Death by a thousand existential checks

August 12, 2020 on Eric Bower's blog

Existential checks are when we have to detect whether or not a variable has a value - that is, checking to see if a variable exists. If the value is null, undefined or otherwise falsy, then it fails the check. This usually takes the form of an if-statement.

if (thingThatExists) {
  // do something with `thingThatExists`
}

They are a natural, and often necessary, part of codebases. However, an overabundance of them hurts readability. When existential checks are nested within existential checks, it becomes difficult to understand the context of the code we are trying to read.

In this article I will try to demonstrate that where they are used has a dramatic effect on code reuse, readability, and maintainability.

What’s the problem with existential checks?

It increases code nesting

As code structure determines its function, the graphic design of code determines its maintainability. Indentation, while necessary for visualizing the control flow of a program, is often assumed to be merely an aesthetic concern. However, what if indentation can help reveal unnecessary code complexity? Abrupt code indentation tends to convolute control flow with minor details. Linus Torvalds thinks that more than three levels of indentation is a code smell that points to a greater design flaw:

Now, some people will claim that having 8-character indentations makes the code move too far to the right, and makes it hard to read on a 80-character terminal screen. The answer to that is that if you need more than 3 levels of indentation, you’re screwed anyway, and should fix your program.

Jeff Atwood thinks that nested code has a high cyclomatic complexity value, which is a measure of how many distinct paths there are through the code. A lower cyclomatic complexity value correlates with more readable code, and also indicates how easily the code can be tested.
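To make the cost of nesting concrete, here is a minimal sketch. The `User` shape and both function names are hypothetical, invented only for this illustration. Both functions perform the same existential checks and return the same result; the second uses guard clauses so the main path stays at a single level of indentation.

```typescript
// Hypothetical shape, invented for this illustration.
interface User {
  address?: {
    city?: string;
  };
}

// Nested existential checks: the useful line sits three levels deep.
function cityNested(user?: User): string {
  if (user) {
    if (user.address) {
      if (user.address.city) {
        return user.address.city;
      }
    }
  }
  return 'unknown';
}

// Guard clauses: the same checks, but each one exits early,
// so the code never indents past one level.
function cityFlat(user?: User): string {
  if (!user) return 'unknown';
  const address = user.address;
  if (!address) return 'unknown';
  if (!address.city) return 'unknown';
  return address.city;
}
```

Note that flattening only changes the shape of the code: the number of branches is the same in both versions. Removing the checks themselves requires changing the data, not the control flow.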

Deeply nested structures are a bad idea.

I started my career in software engineering by trying many different programming languages. I took something along with me after diving into writing pragmatic and idiomatic code for each language.

For Python, PEP 20 describes a set of design principles that every Python developer should think about when architecting a codebase. There’s one line in these principles that I think about all the time:

Flat is better than nested

This guiding principle has led me to what I believe is more maintainable, readable code. When we apply this principle to data structures, that means we should avoid deeply nested data structures.

Deeply nested structures are difficult to understand – even when strongly typed – and they tend to promote cases where a nested object could be empty. This has the consequence of requiring developers to make many existential checks, especially if the data is used often.

Redux also has a great set of recommendations for organizing application state, and it strongly recommends normalizing that state.

There are many articles about how to normalize state, but the tldr is to think of an application’s state like a relational database: each object is a database table where the key is the id and the value is the object data. This makes data easier to query, update, and reuse.
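As a rough illustration of the "relational database" idea, the sketch below flattens a nested API-style payload into two tables keyed by id. All names here (`NestedBlog`, `AuthorRow`, `BlogRow`, `normalizeBlogs`) are hypothetical and not taken from Redux or any library; this is one simple way to do it by hand.

```typescript
// Hypothetical nested shape, as an API might return it.
interface NestedBlog {
  id: string;
  body: string;
  author: { id: string; name: string };
}

// Normalized "tables": each is keyed by id, and a blog refers to
// its author by id instead of embedding the author object.
interface AuthorRow {
  id: string;
  name: string;
}

interface BlogRow {
  id: string;
  body: string;
  authorId: string;
}

interface NormalizedState {
  authors: Record<string, AuthorRow>;
  blogs: Record<string, BlogRow>;
}

function normalizeBlogs(nested: NestedBlog[]): NormalizedState {
  const result: NormalizedState = { authors: {}, blogs: {} };
  for (const blog of nested) {
    // The same author appearing on many blogs is stored only once.
    result.authors[blog.author.id] = {
      id: blog.author.id,
      name: blog.author.name,
    };
    result.blogs[blog.id] = {
      id: blog.id,
      body: blog.body,
      authorId: blog.author.id,
    };
  }
  return result;
}

const normalized = normalizeBlogs([
  { id: 'b1', body: 'first post', author: { id: 'a1', name: 'Eric' } },
  { id: 'b2', body: 'second post', author: { id: 'a1', name: 'Eric' } },
]);
```

With this layout, updating an author's name touches exactly one record, and looking up a blog or an author is a single keyed access.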

It sets poor expectations for other developers and makes it harder for them to grok the codebase.

When objects can be empty, it sets terrible expectations for the end developer. Here are some questions it can raise:

At any project size, it is important for us to set clear expectations with our code. Setting expectations leads to more readable and manageable code. When interfaces are littered with optional or nullable properties, we set terrible expectations. Therefore, we make a concerted effort to minimize the number of optional or nullable properties in our front-end codebase.

It usually means we forgot an important step in our data pipeline.

When it comes to web development, I regularly have to build a pipeline to consume an API. The API is usually a separate service, made up of a set of HTTP endpoints whose data we have to extract, transform, and load into front-end applications. When I first started building front-end web applications, I got into a really bad habit of skipping the transform step. I would take the API response and load it directly into my application state. Ignoring the transformation step made it harder to update the codebase when an API endpoint changed. APIs are not always built with the consumers in mind. They are built in terms of being RESTful, with strict rules on how data should be formed and sent to their consumers.

Another side-effect of ignoring the transformation step is pushing optional properties to the view layer (e.g. react components). There’s no quicker way to complicate a react component than to make a bunch of existential checks inside the render body even if it uses the new syntax sugar of optional chaining.

import { useSelector } from 'react-redux';

interface Author {
  id: string;
  username: string;
  name: string;
}

interface Blog {
  id: string;
  body: string;
  author: Author | null;
}

interface Props {
  blogId: string;
}

const selectBlogs = (state) => state.blogs || {};
const selectBlogById = (state, { id }: { id: string }) =>
  selectBlogs(state)[id];

const BlogArticle = ({ blogId }: Props) => {
  const blog = useSelector(
    (state) => selectBlogById(state, { id: blogId })
  );
  if (!blog) {
    return <div>Could not find blog article</div>;
  }

  return (
    <div>
      <div>{blog.body}</div>
      written by: {blog.author?.name}
    </div>
  );
};

This example is meant to demonstrate how code can become more complicated when there are existential checks inside our react components. We have made no guarantees for the data we are sending to the view layer and as a result we have to make many existential checks and fallbacks to accommodate.

How do we avoid existential checks?

Make optional properties the exception, not the rule

Instead of accepting what the backend sends us, we should instead create a consistent and reliable set of data structures that our app uses. Optional properties should be an exception, not the rule. Let’s build some ideal interfaces without optional or nullable properties and then figure out how to build our state with it.

interface Author {
  id: string;
  username: string;
  name: string;
}

interface Blog {
  id: string;
  body: string;
  author: Author;
}

It’s a pretty simple exercise: we go through and remove the possibility of a property not existing or being null. This is great, but how do we make this interface a reality with the data we are being provided?

Build entity factories

The general approach to retrieving data from our redux store is to always send the object that the user is requesting. Instead of our selector potentially returning null or undefined, we can return a default blog object, with sane defaults for each property. This concept is inspired by golang, where variables without an explicit initial value are set to their zero value. In practice, this means that every entity in our front-end codebase has an entity creation function that accepts a partial of that entity and does a simple merge.

const defaultAuthor = (author: Partial<Author> = {}): Author => {
  return {
    id: '',
    username: '',
    name: '',
    ...author,
  };
};

const defaultBlog = (blog: Partial<Blog> = {}): Blog => {
  return {
    id: '',
    body: '',
    author: defaultAuthor(blog.author),
    ...blog,
  };
};
/*
  console.log(
    defaultBlog({ id: '123', body: 'blog content!' })
  );
  {
    id: '123',
    body: 'blog content!',
    author: {
      id: '',
      username: '',
      name: '',
    }
  }
*/

Creating default entities, or fabricators, is a concept used in ruby. It is primarily used for specs, but it can be used for anything.

By spending a little up-front time building entity factories, we save a ton of time for every end-developer that needs to create a new entity. It seems tedious, but we’ve been able to scale this concept to even massive entities with good ROI. Our human labor will pay off. We’re not using a library to do this for us: it’s straightforward and easy to copy/paste.

Default entity functions help:

Transform data from HTTP requests

Back to the fundamentals of ETL, it is imperative that we do not skip building the T in ETL. The way we do this is by creating a deserializer for each entity in our API responses.

You can see in our responses where our original interfaces came from: this is what the API is sending us. We shouldn’t continue the trend of maybe having properties or maybe having an object.

interface AuthorResponse {
  id: string;
  user_name: string;
  name: string;
}

interface BlogResponse {
  id: string;
  body: string;
  author: AuthorResponse | null;
}

// always type your API responses!
interface BlogCollectionResponse {
  blogs: BlogResponse[];
}

// always create a deserializer for each entity
function deserializeBlog(blog: BlogResponse): Blog {
  return {
    id: blog.id,
    body: blog.body,
    author: deserializeAuthor(blog.author),
  };
}

function deserializeAuthor(author: AuthorResponse | null): Author {
  if (!author) {
    return defaultAuthor();
  }

  // you can see here that we change
  // the API response from `user_name` to `username`
  return {
    id: author.id,
    username: author.user_name,
    name: author.name,
  };
}

async function fetchBlogs() {
  const resp = await fetch('/blogs');
  if (!resp.ok) {
    // TODO: figure out error handling
    return;
  }
  const data: BlogCollectionResponse = await resp.json();
  const blogs = data.blogs.map(deserializeBlog);
  // TODO: save to redux
}

This seems tedious, but a developer writes it once and now we have:

This ETL structure is the basis of our front-end business logic and has scaled well to date.
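The `// TODO: save to redux` step above is the "load" in ETL. One way it could look is sketched below as a plain reducer that stores deserialized blogs in a map keyed by id. The action names, `addBlogs`, and `blogsReducer` are hypothetical; the sketch deliberately avoids any redux helper library, and a real codebase may organize this differently.

```typescript
interface Author {
  id: string;
  username: string;
  name: string;
}

interface Blog {
  id: string;
  body: string;
  author: Author;
}

type BlogMap = Record<string, Blog>;

// Hypothetical actions for this sketch.
type BlogAction =
  | { type: 'ADD_BLOGS'; payload: Blog[] }
  | { type: 'RESET_BLOGS' };

const addBlogs = (payload: Blog[]): BlogAction => ({
  type: 'ADD_BLOGS',
  payload,
});

function blogsReducer(state: BlogMap = {}, action: BlogAction): BlogMap {
  switch (action.type) {
    case 'ADD_BLOGS': {
      // Copy the previous map so the update stays immutable,
      // then index every incoming blog by its id.
      const next: BlogMap = { ...state };
      for (const blog of action.payload) {
        next[blog.id] = blog;
      }
      return next;
    }
    case 'RESET_BLOGS':
      return {};
    default:
      return state;
  }
}
```

Because the reducer only ever receives fully formed `Blog` objects from the deserializers, nothing downstream of it has to ask whether `author` exists.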

Avoid existential checks in react components

All of the work in the previous sections should pay off now. Let’s see what it looks like.

import { useSelector } from 'react-redux';

interface Author {
  id: string;
  username: string;
  name: string;
}

interface Blog {
  id: string;
  body: string;
  author: Author;
}

interface Props {
  blogId: string;
}

const fallbackBlog = defaultBlog();
const selectBlogs = (state) => state.blogs;
// here we use a fallback blog for when
// we cannot find the blog article
const selectBlogById = (state, { id }: { id: string }) =>
  selectBlogs(state)[id] || fallbackBlog;

const BlogArticle = ({ blogId }: Props) => {
  const blog = useSelector((state) => selectBlogById(state, { id: blogId }));
  return (
    <div>
      <div>{blog.body}</div>
      written by: {blog.author.name}
    </div>
  );
};

What did we accomplish?

This last point is interesting: we don’t need a loader to prevent this code from throwing an error, it’s safe to use while data is being fetched. We can defer implementing loading states until later.
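The claim can be checked without React at all. The sketch below repeats the entity factories from earlier, adds a hypothetical `State` type, and runs the selector against a store where no blogs have been loaded yet.

```typescript
interface Author {
  id: string;
  username: string;
  name: string;
}

interface Blog {
  id: string;
  body: string;
  author: Author;
}

// Hypothetical shape of the store, for this sketch only.
interface State {
  blogs: Record<string, Blog>;
}

const defaultAuthor = (author: Partial<Author> = {}): Author => ({
  id: '',
  username: '',
  name: '',
  ...author,
});

const defaultBlog = (blog: Partial<Blog> = {}): Blog => ({
  id: '',
  body: '',
  author: defaultAuthor(blog.author),
  ...blog,
});

// Created once, so every miss returns the same object reference.
const fallbackBlog = defaultBlog();

const selectBlogById = (state: State, { id }: { id: string }): Blog =>
  state.blogs[id] || fallbackBlog;

// Nothing has been fetched yet.
const emptyState: State = { blogs: {} };
const missing = selectBlogById(emptyState, { id: '123' });

// No optional chaining needed: `author` is always an object.
const byline = `written by: ${missing.author.name}`;
console.log(byline); // prints "written by: "
```

Creating `fallbackBlog` once, outside the selector, also means repeated misses return the identical reference, which matters because `useSelector` compares results by reference by default.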

One could argue showing an empty blog post isn’t much of an improvement. But the point isn’t that we are missing a critical message to the user, it’s the fact that we can defer the decision to handle the blog not found error case until later. It’s hard to articulate with a trivial example the impact this change has on a codebase, but let’s say we want to add messaging. In this case, we will need at least one existential check. What we normally do is create a helper function that performs the check for us and then use that inside the react component.

const hasBlog = (blog: Blog): boolean => blog.id !== '';
if (!hasBlog(blog)) {
  return <div>Could not find blog article</div>;
}
// ...

Conclusion

The goal of flattening our objects is not to save on lines of code; rather, it’s to build a scalable, readable, and maintainable architecture that is predictable to use.

Architecting code is hard. With a little planning and pushing a few existential checks to the transform layer of ETL, we end up with a repeatable pattern for dealing with optional or nullable properties, and a view layer that is easier to build. You can keep your optional chaining.

