The Parallel Gem and PostgreSQL (oh, and Rails)

Some Preface

An app I’m working on has a rake task that basically gathers data from the web. Doing things serially is really, REALLY slow, and to get around this I started using the (fantastic) parallel gem by Michael Grosser.

This gem is great: it’s simple to use, and about 3 minutes after installing it I had my task running in 10 processes and absolutely murdering the work that needed to be done.

The Issue

I swapped databases not too long ago, moving from MySQL to PostgreSQL for full-text search (which I don’t use anymore) and to stay in line with my host, Heroku.

Today I attempted to run the task against my PostgreSQL db and got an error I had never seen before:

message type 0x5a arrived from server while idle

WHUT?

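After some investigating on Google I found the root cause: a PostgreSQL connection can’t be used by more than one thread or process at the same time. (0x5a is ASCII “Z”, the protocol’s ReadyForQuery message, so the client was receiving a reply it wasn’t waiting for.)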

That’s pretty straightforward.

I believe the issue was that I had something like 10 Ruby processes, all forked from the 1 process that was holding the db connection, so every one of them was sharing that single connection. Not allowed!
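To make the sharing concrete, here is a small sketch using only Ruby’s standard library. It is not PostgreSQL code: a `UNIXSocket` pair stands in for the database connection, and the method name is made up for illustration. It assumes a Unix-like system, since `fork` is not available on Windows.

```ruby
require 'socket'

# Stand-in for a database connection: one socket pair opened BEFORE forking.
# (Hypothetical helper name, for illustration only.)
def lines_written_by_forked_children(children = 3)
  client, server = UNIXSocket.pair

  pids = Array.new(children) do |i|
    fork do
      # The child never opened a connection of its own. It writes to the
      # socket it inherited from the parent.
      client.syswrite("hello from child #{i}\n")
      exit!(0) # leave immediately, skipping at_exit hooks copied from the parent
    end
  end
  pids.each { |pid| Process.wait(pid) }

  client.close # last open copy of the client end, so the server side sees EOF
  lines = server.read.lines.map(&:chomp)
  server.close
  lines.sort
end

puts lines_written_by_forked_children(3)
```

Every child’s line shows up on the same server-side socket. With a real database connection, that means several processes are sending queries and reading replies over one connection, and neither side can tell whose message is whose.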

The Solution

The solution is actually very straightforward as well. To get around this, you simply need to reconnect to the database inside each process or thread you spawn, so that every worker has its own connection.

What does this look like in code?

Before (bad):
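The original gist isn’t reproduced here, so this is only a minimal sketch of what the problematic version could look like. It assumes a Rails rake task, where Bundler has already loaded the parallel gem. The method name and the `runner` keyword are hypothetical, added so the sketch can be tried without the gem installed.

```ruby
# Sketch only. In a Rails rake task the parallel gem is already loaded,
# so the default `runner: Parallel` resolves when the method is called.
def scrape_all_sharing_connection(records, workers: 10, runner: Parallel)
  runner.each(records, in_processes: workers) do |record|
    # No reconnect here: every forked worker talks to PostgreSQL over the
    # single connection it inherited from the parent process.
    yield record
  end
end

# Hypothetical usage (Page and fetch_body are made-up names):
#   scrape_all_sharing_connection(Page.all.to_a) do |page|
#     page.update(body: fetch_body(page.url))
#   end
```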

After (GOOD):
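The original gist isn’t reproduced here, so this is a minimal sketch of the fix rather than the exact code. It assumes a Rails rake task, where Bundler has already loaded ActiveRecord and the parallel gem. The method name and the `runner` and `db` keywords are hypothetical, added so the sketch can be tried without a database.

```ruby
# Sketch only. In a Rails rake task both defaults below are already loaded,
# and Ruby resolves them only when the method is called without overrides.
def scrape_all_with_reconnect(records, workers: 10, runner: Parallel, db: ActiveRecord::Base)
  runner.each(records, in_processes: workers) do |record|
    # Drop the connection inherited from the parent and open a new one
    # before this worker touches the database.
    db.connection.reconnect!
    yield record
  end
end

# Hypothetical usage (Page and fetch_body are made-up names):
#   scrape_all_with_reconnect(Page.all.to_a) do |page|
#     page.update(body: fetch_body(page.url))
#   end
```

This reconnects once per item, which is simple but wasteful. If the reconnect cost matters, it could be done only once per worker process instead. The parallel gem’s README lists a few ActiveRecord variations on this pattern that are worth checking against your Rails version.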

 

And that’s pretty much it.

  • Reid Thompson

    code is not visible

    • Could you check now? It looks fine to me… Maybe GitHub was having problems serving Gists at the time you looked.
