Work Behind The Scenes – Using Background Workers

One of the major selling points of Pagoda Box (the forerunner to Nanobox), for me, was the distinction between web components and worker components. This is fairly standard practice, now, with PaaS (and micro-PaaS) offerings, but it's still a little bit obscure. The basic principles of "push heavy work to the background" and "do long-running tasks away from the users" are easy enough to grasp. But how does one actually push heavy work to the background, and keep long-running tasks away from users?

Well, perhaps obviously, one element is a worker component, a separate space for background tasks to be performed. The better the separation between the worker component and your regular web component, the better the resulting infrastructure will perform, as the work done by one won't overwhelm that done by the other. The other element is some form of message bus – a way to communicate between the web and worker components. Preferably, this will take the form of something that neither side has to interrupt the other to use.

Let's take a deeper look at the pieces, and how to use them together to improve your apps.

Code Components

There are two main logical abstractions when it comes to describing the form and function of your code: web code, and worker code. Web code is the external interface to your app, the way users and other consumers of your app's content and functionality gain access. Worker code, meanwhile, runs continuously in the background, only interacting directly with the outside world during conversations it starts, if then. All your app's code can be grouped into one of these two categories.

Many PaaS and micro-PaaS offerings (including Nanobox) encapsulate this logical distinction by providing separate environments for each type of code. Web components are accessible from the outside world (albeit usually only through a load balancer of some sort), while worker components are completely blocked off, able to initiate a connection, but not to accept one. This has a number of advantages, chief among them that high load in one won't affect the other, and that each component can be scaled completely separately.

This separation does mean that some methods of communicating back and forth between the two types of code are limited. The file system, for example, cannot be used to shuttle files back and forth, since it's not the same file system in both places. There are ways to address this, but it actually serves us well. It forces us to look at other ways to communicate between components, many of which are much more resilient when operating at high scale.

Messaging

There are actually several ways to communicate between components. A network-attached filesystem can be mounted into each component for sharing files between them. A relational database can be accessed by either component to add or manipulate instructions and other data. A cache server can be accessed by both for storing and using temporary data. But the most common mechanism for communicating between components, especially with regard to controlling the work done in the background, and signalling its progress back to the web components, is what's called a "queue".

Queues are generally made up of several parts. On one end, you have the enqueuer, the piece of code that adds tasks to a job list. In between is that job list, the queue itself, generally stored in some ordered form by a piece of software that specializes in that kind of thing. On the other end is the reserver, which reads tasks from the queue and sets up the actual processes which will perform them. And finally, there's a part that monitors each task's progress, and reports that data back to the rest of the app, usually via a database, or possibly even the storage mechanism that holds the task list(s).
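To make those parts concrete, here's a minimal sketch in Python. Everything in it is an assumption for illustration: the queue and the status store are plain in-memory objects rather than a real backing, and the names (enqueue, reserve_and_run, send_welcome) are made up. A real queueing engine does the same jobs with more care.

```python
import json
from collections import deque

# Hypothetical in-memory stand-ins; a real app would keep both in the queue backing.
job_queue = deque()   # the queue itself: an ordered list of pending tasks
job_status = {}       # where progress is reported back to the rest of the app


def enqueue(job_id, task, **args):
    """Enqueuer: runs in the web component and appends a task to the list."""
    job_queue.append(json.dumps({"id": job_id, "task": task, "args": args}))
    job_status[job_id] = "queued"


def send_welcome(address):
    """An example task; here it only builds the message it would send."""
    return f"Welcome, {address}!"


TASKS = {"send_welcome": send_welcome}


def reserve_and_run():
    """Reserver: runs in the worker, takes the oldest task, and performs it."""
    if not job_queue:
        return None
    job = json.loads(job_queue.popleft())
    job_status[job["id"]] = "working"
    try:
        result = TASKS[job["task"]](**job["args"])
        job_status[job["id"]] = "done"
        return result
    except Exception:
        job_status[job["id"]] = "failed"
        raise


enqueue("job-1", "send_welcome", address="user@example.com")
print(job_status["job-1"])   # queued
print(reserve_and_run())     # Welcome, user@example.com!
print(job_status["job-1"])   # done
```

Note that the web side only ever touches enqueue and job_status, so it never has to wait on the work itself.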

Practical Design

The Queue

So, how do you use this knowledge in your own apps? Well, for starters, you'll want to select a queueing engine. Ruby users may select the GitHub-created Resque or the widely used Sidekiq, either of which can sit behind the Active Job interface that ships with Rails. PHP users may find that the Laravel Queue project suits their needs. Other languages and frameworks have their own options available, so shop around a bit to find one that has the features you want to use.

Your choice of target environment will influence your available options for queueing engines, somewhat. Most PaaS offerings (Nanobox included) support running Redis within your app, which is what I personally recommend for queue backing. It may be tempting to select the backing first, but it's best to consider the available libraries for your language and framework of choice first, and then select the best backing from what they support.

Once you have both of those selected, you'll need to add them to your app. The process for installing the queueing engine will depend solely on your target language and/or framework, as well as the engine itself. Consult the relevant documentation to perform this step.

The backing, meanwhile, depends on your deployment solution. Since this is a Nanobox article, I'll show you how to add a Redis component to your app with Nanobox. Add the following YAML block to your boxfile.yml:

data.queue:
  image: nanobox/redis:3.0

That's it! The next time you nanobox run or nanobox deploy, you'll have a container included in your app's infrastructure that runs Redis, and the environment variable (DATA_QUEUE_HOST) to connect to it from all your other components.
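As a rough sketch of how a component might use that variable, the Python below builds a Redis connection URL from it. Only DATA_QUEUE_HOST comes from Nanobox; the function name, the localhost fallback, the port (6379, Redis's default), and the database number are my assumptions, so check them against your own setup.

```python
import os


def queue_url(env=os.environ):
    """Build a Redis URL from the host that Nanobox exposes to each component."""
    # The fallback host, port, and database number are assumptions for this sketch.
    host = env.get("DATA_QUEUE_HOST", "127.0.0.1")
    return f"redis://{host}:6379/0"


print(queue_url({"DATA_QUEUE_HOST": "10.0.0.5"}))  # redis://10.0.0.5:6379/0
```

Most queueing engines accept a URL or a host setting like this in their configuration, so the same value can be shared by your web and worker code.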

The Worker

Now that you have a queueing engine added to your project, and a backing for jobs to be stored in while awaiting processing, you can start splitting off the long-running portions of your app to be run by a worker component. We can start this process pretty much anywhere, but in this case, I'm going to start with the easy part – the worker component itself. Once again to boxfile.yml:

worker.main:
  start: [worker start command]

You can use most of the other options offered by web components, here – notable exceptions include routes, because workers aren't meant to be accessible from the outside world. At the very least, you'll probably want to include any network_dirs that your app uses, so your worker can process files if needed.

You can view all the options in the worker documentation.

In development, your worker process(es) won't run automatically, since dev environments are an entirely manual thing (dev is such a drastically different setup from production, and you're very likely to start, stop, and restart a number of chunks of your code in the course of normal development tasks). So you'll need to open a separate console (another tab, another window, maybe a tmux or screen session) and nanobox run [worker start command] there to fire the worker(s) up in development mode.

The exact value of [worker start command] is determined by your queueing engine. Consult its documentation to determine that.

Jobs

Now for the part that actually performs the tasks. The exact way you build, identify, and select jobs will depend on your language and queueing engine, but generally speaking, each task will have a job type (usually a class name in OOP languages) and arguments, which is what the enqueuer is responsible for adding to the task list. The job code itself is then loaded up by the queueing engine's reserver, and started up with the arguments passed along to help determine the exact behavior the task will have.
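Here's a small Python sketch of that "job type plus arguments" idea. The ResizeImage class, the JOB_TYPES registry, and the JSON payload layout are all hypothetical; your queueing engine will have its own conventions, but the shape is usually similar.

```python
import json


class ResizeImage:
    """Hypothetical job class: one job type, identified by its class name."""

    def perform(self, path, width):
        # A real job would open and rescale the file; this only describes the work.
        return f"resize {path} to {width}px wide"


# Registry the reserver uses to turn a job type name back into code.
JOB_TYPES = {cls.__name__: cls for cls in (ResizeImage,)}


def encode_job(job_class, *args):
    """What the enqueuer stores: the job type and its arguments, nothing more."""
    return json.dumps({"class": job_class.__name__, "args": list(args)})


def run_job(payload):
    """What the reserver does: look up the job type, then call it with the args."""
    job = json.loads(payload)
    return JOB_TYPES[job["class"]]().perform(*job["args"])


payload = encode_job(ResizeImage, "uploads/cat.png", 200)
print(payload)           # {"class": "ResizeImage", "args": ["uploads/cat.png", 200]}
print(run_job(payload))  # resize uploads/cat.png to 200px wide
```

Because only a name and some simple arguments cross the queue, the web and worker components just need to agree on the job classes, not share any running state.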

Each job should do one thing, and do it well. If you need to send out a newsletter, your job might run through the list of subscribed users and fire off an email to each one, or you might enqueue one job per email. If the job is meant to preprocess an image file to rescale it and generate a thumbnail, your job would load up the original from disk (probably a network share), perform the manipulations, then save the results back to the disk for use elsewhere. A job is a single task, encapsulated in a block of code that is called whenever that job is needed. It can run as long as you need it to, but keep in mind that the longer a single task takes, the longer the worker running it will be tied up before it can start the next one.
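The "one job per email" option for the newsletter can be sketched like this in Python. The deque is a hypothetical in-memory stand-in for the real queue, and the function names are made up for the example.

```python
from collections import deque

pending = deque()  # hypothetical in-memory stand-in for the real queue


def enqueue_newsletter(issue, subscribers):
    """Fan out: enqueue one small job per recipient instead of one big job."""
    for address in subscribers:
        pending.append(("send_email", issue, address))
    return len(pending)


def work_one():
    """Perform a single queued job; any idle worker could pick up the next one."""
    task, issue, address = pending.popleft()
    return f"{task}: issue {issue} -> {address}"


print(enqueue_newsletter(12, ["a@example.com", "b@example.com"]))  # 2
print(work_one())  # send_email: issue 12 -> a@example.com
```

Small jobs like these keep each task short, and a failure only affects one recipient rather than the whole mailing.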

Status

To communicate job status back to the main application, there are generally a few options. First, most queueing engines have some form of tracking for the progress of their queues. Second, many tasks will make changes to data on disk or in a database, so the presence of these changes can be used to verify success. Finally, you can set up your jobs to enqueue a status flag in a "return queue", which the main app can check to verify progress, if that makes sense for it to do. The key, here, is that your job's actual actions can be an indicator of success/failure, but most of the time you'll communicate more explicitly using the same backing you used to queue the task in the first place.
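The "return queue" option might look something like the following Python sketch. Both queues are hypothetical in-memory stand-ins for lists held in your queue backing, and the layout of the status flag is just one possibility.

```python
from collections import deque

work_queue = deque()    # web -> worker (hypothetical in-memory stand-in)
return_queue = deque()  # worker -> web: the "return queue" of status flags


def worker_step():
    """Worker side: perform one job, then enqueue a status flag for the web side."""
    job_id, numbers = work_queue.popleft()
    try:
        total = sum(numbers)
        return_queue.append({"id": job_id, "status": "ok", "result": total})
    except TypeError as err:
        return_queue.append({"id": job_id, "status": "failed", "error": str(err)})


def collect_statuses():
    """Web side: drain the return queue without waiting on the worker."""
    statuses = {}
    while return_queue:
        flag = return_queue.popleft()
        statuses[flag["id"]] = flag["status"]
    return statuses


work_queue.append(("job-1", [1, 2, 3]))
work_queue.append(("job-2", [1, "two"]))
worker_step()
worker_step()
print(collect_statuses())  # {'job-1': 'ok', 'job-2': 'failed'}
```

The web side checks for flags whenever it's convenient, so neither component ever interrupts the other.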

Onward!

That's really all there is to know about designing an app with background workers in mind. Some queueing engines provide more advanced options for job handling, such as scheduling jobs to be added to the queue at a later time, removing jobs before they are executed, and preventing duplicate jobs from being added in the first place. But these extra features affect your selection of queueing engine far more than they affect your app's design or architecture. So what's presented here is really all you'll need, in nearly all cases.


Did this article help you? Do you have any feedback to offer on how it could be improved? Please, use the comment section below to let us know!

Daniel Hunsaker

Author, Father, Programmer, Nut. Dan contributes to so many projects he sometimes gets them mixed up. He'll happily help you out on the Nanobox Slack server when the staff are offline.

@sendoshin Idaho, USA
