Josh Thompson     about     archive     tags

Sidekiq and Background Jobs for Beginners

I’ve recently had to learn more about background jobs (using Sidekiq, specifically) for some bugs I was working on.

I learned a lot. Much of it was extremely basic. Anyone who knows much at all about Sidekiq will say “oh, duh, of course that’s true”, but at the time, it wasn’t obvious to me.

The reason I needed such basic overviews is because prior to my current job, I’d had just a few hours of exposure to background jobs, and understood little of those hours. And I got dropped into a project that has dozens of jobs, handling hundreds of thousands of actions a day.

As is my style, when I don’t understand something, I like to go to the very basics.

Most of the interesting stuff is way down at the bottom, on watching Redis do it’s thing

In this case, I went back to Turing! I found the background jobs lesson from Mod 3, and worked through it.

I very much enjoy seeing evidence of things working “under the hood”, rather than just accepting that BackgroundWorker.perform_later(foo.id) works differently than BackgroundWorker.new.perform(foo.id), etc. So, this post will focus not as much on using Sidekiq, but seeing that it’s working.

If you want to follow along, do the above tutorial. This is what my repo looks like right now. I’ll recap most of what’s in the tutorial.


To run the app, run each of the following, using multiple terminal tabs as needed:

rails s
redis-server
sidekiq
mailcatcher

The app should now working. Navigate to http://localhost:3000/, and you should see Missy Elliot in all her glory.

Open up http://localhost:3000/sidekiq/ to see the sidekiq dashboard, and then over to http://localhost:1080/ for mailcatcher.

You’ll notice that when sending emails via the app, nothing is happening on [http://localhost:3000/sidekiq/], and the redis terminal window is untouched:

redis

Convert a non-background-job to a background job

The essence of a background job is to do stuff in the background, without making the Rails app sit around doing all the work.

To simulate the pain of waiting for synchronous jobs, when you use this app, the “send email” method has a five-second sleep in it.

Lets make this a background job:

  1. create the job. (you can use rails generate job <job_name>, per the ActiveJob docs)
  2. Call the notify user job from the controller, instead of calling the user notifier directly.

Make the job

We’ll hand-roll this. Make app/jobs/send_user_gif_job.rb.

class SendUserGifJob < ActiveJob::Base
  queue_as :default
  
  def perform(*args)
    # do da ting
  end
  
end

(Deviated slightly from the docs with ActiveJob::Base. I’m working with Rails 4.2)

Make a test

Working through the rubyonrails.org docs on testing jobs, I’ll set up the following:

# test/jobs/send_user_gif_job_test.rb

require 'test_helper'

class SendUserGifJobTest < ActiveJob::TestCase 
  test 'that email is sent' do
    SendUserGifJob.perform_async("[email protected]", "hello")
    # literally no idea what to assert here...
    # assert 
  end
  
end

I’ve no idea what to assert just yet, but we’ll get there. Lets run the test!

Unfortunately, this test passes. :(

After taking a look at the testing Sidekiq docs, I’ve got some ideas.

Messed up Sidekiq?

After a bit of playing in the rails console, I had a bunch of bad jobs that Sidekiq was trying to process. Every time I started Sidekiq, it broke with a stack trace for “uninitialized constant”, for a job/class/worker that didn’t exist.

To clear out everything in Sidekiq, run the following from the rails console:

Sidekiq::Queue.all.each(&:clear)
Sidekiq::RetrySet.new.clear
Sidekiq::ScheduledSet.new.clear
Sidekiq::DeadSet.new.clear

As usual, I found the answer on Stack Overflow (I could see this being a very dangerous command to run in any sort of production environment. Don’t do that, please.)

After clearing out the queue, I can run Sidekiq just fine.

Reworked the test and worker

These are workers and not jobs. ActiveJob jobs live in /jobs. So, if you want a worker, don’t put it in the /jobs directory, put it in the /workers directory.

Don’t ask me why I know this.

I had problems because of where I stuck these files, and had some other problems because of naming conventions decided naming conventions are important.

So, I threw away all the work and did rails g sidekiq:worker SendGifToUserWorker.

Here’s what I’ve got right now, after the rails g and taking some examples from the testing docs:

My SendGifToUserWorker

# app/workers/send_gif_to_user_worker.rb

class SendGifToUserWorker
  include Sidekiq::Worker

  def perform(*args)
    # Do something
  end
end

My SendGifToUserWorkerTest

# test/workers/send_gif_to_user_worker_test.rb

require 'test_helper'

class SendGifToUserWorkerTest < ActiveJob::TestCase
  test 'that email is sent' do
    SendGifToUserWorker.perform_async("[email protected]", "hello")
    # literally no idea what to assert here...
    # assert 
  end
  
  test 'that job is pushed to queue' do
    assert_equal 0, SendGifToUserWorker.jobs.size
    SendGifToUserWorker.perform_async("[email protected]", "hello")
    assert_equal 1, SendGifToUserWorker.jobs.size
  end
end

Unfortunately, the tests pass. This tells me the job is running fine (I guess), but no clue what is happening under the hood.

correction: the second test passes every-other-time or so. The jobs.size queue isn’t always starting at 0, so it fails the first assertion of 0.

A fix was to add the following setup method:

# test/workers/send_gif_to_user_worker_test.rb:5

def setup
  Sidekiq::Worker.clear_all
end

Making Sidekiq do stuff via the Rails Console

Since the tests don’t push actual jobs to Sidekiq, I don’t see any indication that anything interesting is happening in Sidekiq web, or Redis, or the Sidekiq terminal window. :(

I updated the mail model in the application to actually use Sidekiq (no failing test quite right now, sorry) and here’s my worker:

# app/workers/send_gif_to_user_worker.rb
class SendGifToUserWorker
  include Sidekiq::Worker

  def perform(email, thought)
    UserNotifier.send_randomness_email(email, thought).deliver_now
  end
end

And it’s getting called from the mailers controller, like so:

# app/controllers/mailers_controller.rb
class MailersController < ApplicationController
  
  def create
    SendGifToUserWorker.perform_async(params[:mailers][:email], params[:mailers][:thought])
    flash[:message] = "You did it! Email sent to #{params[:mailers][:email]}"
    redirect_to "/sent"
  end

  def sent
  end
end

With that setup, in my rails console, I can do something like SendGifToUserWorker.perform_async("[email protected]", "hello"), and I get back some sort of GUID:

main:0> SendGifToUserWorker.perform_async("[email protected]", "hello")
=> "08e6a309cf7c46dc0178c53f"
main:0> SendGifToUserWorker.perform_async("[email protected]", "hello")
=> "8b962d28217ae177564f0fd7"

Each of these talks to Sidekiq, and you can see these jobs go by in the logs:

2018-07-27T17:13:55.023Z 10221 TID-ovusw76r0 SendGifToUserWorker JID-08e6a309cf7c46dc0178c53f INFO: start
2018-07-27T17:13:55.023Z 10221 TID-ovusw76r0 SendGifToUserWorker JID-08e6a309cf7c46dc0178c53f INFO: done: 0.0 sec
2018-07-27T17:13:57.521Z 10221 TID-ovusw781o SendGifToUserWorker JID-8b962d28217ae177564f0fd7 INFO: start
2018-07-27T17:13:57.521Z 10221 TID-ovusw781o SendGifToUserWorker JID-8b962d28217ae177564f0fd7 INFO: done: 0.0 sec

This is what it looks like, in the logs and sidekiq web, running the jobs from the Rails console:

rails console, sidekiq web, and sidekiq

So, cool. My job still isn’t doing anything, but at least it’s running. I guess.

Restart Sidekiq when you make a change to a worker

It makes sense that the Sidekiq worker test might assert JUST that jobs get queued correctly.

I’m still not content - my tests are passing, without the sidekiq worker actually doing anything. I’d feel great about a red test related to it.

Everything to this point is on commit 38f5750, if you’re following along.

Sidekiq is queuing the job as I’d expect it to, even though it’s not doing anything.

.

.

.

I just spent an embarrassing amount of time “troubleshooting” why my worker wasn’t doing what I thought it should do. Turns out you need to restart Sidekiq if you change a Sidekiq job. Maybe this isn’t always true, but if you’re saying

why isn’t showing up in Sidekiq?

just restart Sidekiq.

Watching Redis

I want to make sure that this stuff is getting in and out of Redis as expected. Redis is a super fast key:value store, and we should see stuff getting written to, and read from Redis.

Use redis-server to start Redis running. It’s pretty boring to watch, and doesn’t show you any information about data placed in/out of it, so not that helpful.

boring redis

Once you have redis-server running, you can run (in another terminal tab) redis-cli monitor, which dumps you into something that prints a TON of logs when it’s not doing anything. (all of the hmset macbookpro stuff is still me doing nothing. It’s reading “server status” details like a hyperactive chipmunk on crack.)

slow your roll, Redis

I found the signal-to-noise ratio of redis-cli monitor to make it near-useless. What we care about in Redis are hset, and lpush. Maybe more, but this is sufficient for finding what I want.

So, once you’ve got redis running, to watch the logs for JUST hsets and lpushes, run:

redis-cli monitor | grep -E "(hset|lpush)"

And you’ll see nothing, until Sidekiq pushes jobs to Redis, and reads from it:

thank you, redis

Here’s those lines, formatted for easier reading:

lpush

1532784329.095661 [0 127.0.0.1:53832] lpush queue:default 
  {
    class:SendGifToUserWorker,
    args:[flip,flop],
    retry:true,
    queue:default,
    jid:adfa15f29ed6c09cda7f6973,
    created_at:1532784329.095427,
    enqueued_at:1532784329.095466
  }

hset

1532784332.778327 [0 127.0.0.1:53803] hset MacBook-Pro-6715.local:32134:cc8d1568c5c6:workers ow3kb5tjc 
  {
    queue:default,
    payload:
      {
        class:SendGifToUserWorker,
        args:[flip,flop],
        retry:true,
        queue:default,
        jid:adfa15f29ed6c09cda7f6973,
        created_at:1532784329.095427,
        enqueued_at:1532784329.095466
      },
    run_at:1532784329
  }

By the way, Redis is a bit cleaner if you run the server as a background process. To do this, do:

redis-server & (the & makes it a background process).

To stop Redis, just do redis-cli shutdown

What does/does not occur

So, how can we be sure that our Sidekiq job is actually firing? Lets see what it looks like, using this worker with Sidekiq, and without.

Compare these two lines:

SendGifToUserWorker.new.perform(params[:mailers][:email], params[:mailers][:thought])
SendGifToUserWorker.perform_async(params[:mailers][:email], params[:mailers][:thought])

Which one is using Sidekiq?

The first line is creating a new instance of the worker class, and calling perform directly on it. It’s not hitting Sidekiq, and if the job fails, it won’t requeue.

At least, I think this is the case.

Here’s what rails server (left), redis (filtering for hset and lpush events, top right) and Sidekiq (bottom right) look like when I trigger the email worker:

using sidekiq

Here’s how I can see a job using Worker.new.perform doesn’t “hit” sidekiq:

not using sidekiq

The Rails server just sits for a while, and redis and Sidekiq seem clueless of the event?

Conclusion

One of the bugs I was working with, the root of the problem was the worker/job seemed to not be requeuing when it failed. I spent time touching up the worker, and logging around the specific error (it was a timeout on an internal API endpoint), but even after rescuing the timeout, the job wasn’t happening again.

Someone else on the team wisely pointed out that the entire worker was getting called with Worker.new.perform(), which was sidestepping (oh, the puns) the Sidekiq infrastructure. I understood this to be true, but I still wanted to see it for myself.

In the process of working through this lesson, I found a Stack Overflow post about “how to make Sidekiq execute a job immediately”. The only answer suggested calling Worker.new.perform, and that must be, for some reason, why this bit of code ended up in our codebase. The developer wanted it to happen immediately, and perhaps did not expect it to fail at any point.

Anyway, I’m content that I understand the basic difference between Worker.perform_async and Worker.new.perform, and next time I encounter an issue like this, I’ll much more quicly wrap my head around it.

Resources

Get occasional emails

If I've written any new posts, you'll get an email with summaries on Friday. If I've not, you'll not hear from me.