Node.js context across async calls

2021-03-26 ← Back to home

Introduction

Say we’ve got some Node.js application that handles HTTP requests, and for the lifetime of each of these requests we want to keep track of some context.

Or more generally: an application does multiple things, and for each of these things we desire to propagate a context along with function calls sync and async ones.

I happened to desire this when integrating logging with winston into an application that runs code both on servers and in browsers, and is built with react.

Typical solutions

I don’t think the desire to pass data along some async execution is new, and there are different solutions to the problem:

Having the context globally accessible.
- We desire to avoid this to have less global state.
- In case of several concurrent requests this doesn’t allow us to distinguish between requests.
Threading the context trough all calls that have to deal with it.
- In case of Express the canonical way appears to pass it along with the response object.
- If this spans multiple layers of function calls there is a risk to end up with many functions that just pass the context along, but don’t require usage of it themselves and by that leading to less concise code.
In our code we often desire to isolate impure code and prefer to keep the majority of functions pure. By isolating impurities it can be possible to isolate the context along with it. That is until there is either a need to use that context in the pure code or unless most of the code is concerned with plumbing together different impurities.
In our scenario we had the happy coincidence that react comes with a context concept itself. This allows us to pass the tools for instrumentation and logging trough a component tree just using it in places where we actually need to measure something or log some errors.

So we’ve got a solution that works nicely for our code and doesn’t force additional noise onto most of the code. The situation left me wondering whether there are any approaches that offer something like the react context but that still work well with async callbacks.

I’ve noted that when handling such simultaneous processes sometimes thread specific storage of context data is used. That way the data lives as long as the desired execution which is as long as the thread lives. This doesn’t work when dealing with an event-loop instead of worker threads, which is the usual scenario in Node.js projects.

The documentation on worker threads specifically states:

Workers (threads) are useful for performing CPU-intensive JavaScript operations. They do not help much with I/O-intensive work. The Node.js built-in asynchronous I/O operations are more efficient than Workers can be.

Looking for more

So I was searching the web and stumbled across a StackOverflow Post: Is it possible to have “thread” local variables in Node?

Following that I learned about two related talks by Thomas Watson.

He held a talk on this at NodeConf Oslo in 2016, and the slides are online.
He held a somewhat similar talk at Node.js Interactive Europe in 2016, which has also made its way to YouTube, and is embedded below.

Instrumentation and Tracing in Node.js - Thomas Watson, Opbeat

📜 Read the Goolge Privacy Policy

The problem that Thomas Watson describes for Opbeat, which was later acquired and became Elastic APM seemed very fitting to what I had in mind - but less fuzzy and way more articulated and clear.

Notes from the talks

In the slides the problem is described with goals such as these:

Track HTTP requests as transactions
Instrument IO and CPU expensive operations
Associate exceptions with transactions
Stop manually passing context
Achieve small overhead

The suggested solution in his talks comes down to:

Recording the context when a callback is queued
Restoring the context when dequeuing
Having a globally accessible state that gets switched to the fitting context depending on events

Experimenting

After looking at the talks I found some packages on npm that I wanted to take a closer look at:

For this I was also in need of a test problem to experiment with.

The experiment setup

The experiment makes use of these imports:

import range from "lodash/range.js";
import cls from "continuation-local-storage";
import nodeCls from "node-cls";

To simulate some work we’re just waiting a random amount of time. We then act as if we’d done the work to read or write a number:

const work = () =>
  new Promise((resolve) => {
    setTimeout(resolve, Math.floor(500 * Math.random()));
  });

const readNumber = async (label) => {
  await work();
  const n = Math.floor(Math.random() * 1000);
  console.log(`${label} reads number: ${n}`);
  return n;
};

const writeNumber = async (label, n) => {
  await work();
  console.log(`${label} writes number: ${n}`);
};

Given that the main part of our test will be to have 5 concurrent cases of our testWork, where each case gets labeled with a number and states which step of work it is at.

Our desire here is to eliminate the need to pass the label into our readNumber and writeNumber functions. We imagine these functions to be placeholders for bigger parts of work that can contain deeply nested async callbacks.

const testWork = async (label) => {
  console.log(`${label} started`);
  const n = await readNumber(label);
  await writeNumber(label, n);
  console.log(`${label} stopped`);
};

range(5).map(testWork);

I’m testing this code with node v14.5.0, saving it in a file named experiment.mjs. The output looks something like this:

$ node experiment.mjs
0 started
1 started
2 started
3 started
4 started
2 reads number: 142
3 reads number: 555
4 reads number: 180
4 writes number: 180
4 stopped
2 writes number: 142
2 stopped
0 reads number: 662
1 reads number: 355
0 writes number: 662
0 stopped
3 writes number: 555
3 stopped
1 writes number: 355
1 stopped

continuation-local-storage

Next I wanted to have a look at continuation-local-storage.

This library makes use of async-listener and emitter-listener to note events happening and adjust a context accordingly.

emitter-listener has a mandatory disclaimer:

There are times when it’s necessary to monkeypatch default behavior in JavaScript and Node. However, changing the behavior of the runtime on the fly is rarely a good idea, and you should be using this module because you need to, not because it seems like fun.

I find this to be fine and clearly communicated. The current interest is to learn about possibilities and not about building stable or maintainable software.

continuation-local-storage also follows continuation-passing style, a style of functional programming in which ‘control is passed on in the form of a continuation’. I remember somewhat vividly stumbling across the concept of continuations in Haskell before and feeling quite entertained by it. At the same time I also had no clear memories of how the concept works and what that would mean for the usage in JavaScript. Luckily the package documentation came with some examples 😅.

The idea with continuation-passing style is loosely, that functions should accept an additional parameter - we’ll call it next - which is a function, and instead of directly returning a value x they will return the result of executing next with their desired return value.

// boring usual style:
const add = (x) => x + 1;

// continuation-passing style:
const addCps = (x, next) => next(x + 1);

For the continuation-local-storage package this means that we’ll have to place our callbacks inside .run() or .runAndReturn() calls.

The resulting code looks like this:

const clsNamespace = cls.createNamespace("first-experiment");

const provideNamespaceLabel = (label) => {
  clsNamespace.set("label", label);
};

const useNamespaceLabel = () => {
  return clsNamespace.get("label");
};

const clsReadNumber = () => {
  const label = useNamespaceLabel();
  return clsNamespace.runAndReturn(() => readNumber(label));
};

const clsWriteNumber = (n) => {
  const label = useNamespaceLabel();
  return clsNamespace.runAndReturn(() => writeNumber(label, n));
};

const clsTestWork = (label) =>
  clsNamespace.runAndReturn(() => {
    console.log(`${label} started`);
    provideNamespaceLabel(label);
    clsReadNumber()
      .then((n) => clsNamespace.runAndReturn(() => clsWriteNumber(n)))
      .then(() => {
        console.log(`${label} stopped`);
      });
  });

range(5).map(clsTestWork);

This works fine and without passing the label parameter to the clsReadNumber and clsWriteNumber functions. However in my opinion it adds quite a bit of noise to the code with all the callbacks instead of Promises.

node-cls

Following that I had a look at node-cls. This library comes without additional dependencies and makes direct use of async_hooks.

If I’m not mistaken async_hooks have been introduced with node 8, but are still experimental with the current node release v15.12.0. If I recollect correctly earlier versions to 8 had some C/C++ API that could be used instead. However I didn’t look into any of the specific versions as I considered that to be out of scope.

I wrote the following code as an experiment with node-cls:

const contextName = "second-experiment";

const provideContextLabel = async (label, withContext) => {
  const context = nodeCls.create(contextName);
  context.label = label;
  await context.run(withContext);
  nodeCls.exit(contextName);
};

const useContextLabel = () => nodeCls.get(contextName).label;

const nodeClsReadNumber = async () => {
  const label = useContextLabel();
  return readNumber(label);
};

const nodeClsWriteNumber = async (n) => {
  const label = useContextLabel();
  return writeNumber(label, n);
};

const nodeClsTestWork = async (label) => {
  console.log(`${label} started`);

  await provideContextLabel(label, async () => {
    const n = await nodeClsReadNumber();
    await nodeClsWriteNumber(n);
  });

  console.log(`${label} stopped`);
};

range(5).map(nodeClsTestWork);

This turned out to work just fine, and I’m very happy with how the context could be used in nodeClsReadNumber and nodeClsWriteNumber.

Nested contexts

Finally I wanted to see whether nesting of contexts would work, too. For that I came up with this recursive experiment:

const nodeClsRecursiveTestWork = async (label, count) => {
  if (label >= count) {
    return;
  }

  console.log(`${label} started`);

  await provideContextLabel(label, async () =>
    Promise.all([
      nodeClsRecursiveTestWork(label + 1, count),
      nodeClsReadNumber().then((n) => nodeClsWriteNumber(n)),
    ])
  );

  console.log(`${label} stopped`);
};

nodeClsRecursiveTestWork(0, 5);

The output of that looks like this:

$ node experiment.mjs
0 started
1 started
2 started
3 started
4 started
3 reads number: 544
2 reads number: 335
4 reads number: 951
1 reads number: 415
2 writes number: 335
1 writes number: 415
0 reads number: 18
3 writes number: 544
4 writes number: 951
4 stopped
3 stopped
2 stopped
1 stopped
0 writes number: 18
0 stopped

Note how the ‘stopped’ messages are now ordered due to the nature of the recursively nested calls of Promise.all.

Conclusion

It was quite interesting for me to have a look at these packages and get a better idea of what APM libraries are worrying about.

For most of the practical problems I’ve encountered so far a combination of pushing impure code to the outside - and I’m counting logging/instrumentation under that - and availability of react context have offered fine solutions so far.

It feels nice to me, to be a bit more aware of async_hooks than I was before.