Unpacking ActiveRecord binary blob to hex string drops escape characters ("%25" converted to "%")

August 8, 2015

I have a Ruby on Rails app that accepts a binary file upload, stores it as an ActiveRecord object in a local database, and passes a hex equivalent of the binary blob to a back-end web service for processing. This usually works great.

Two days ago, I ran into a problem with a file containing the byte sequence \x25\x32\x35, which is %25 in ASCII. The binary representation of the file was stored properly in the database, but the hex string representation of the file that resulted from

sample.binary.unpack('H*').to_s

was incorrect. After investigating, I found that those three bytes were converted to the hex string 25, the representation for %. It should have been 253235, the representation for %25.
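
As a sanity check, unpack('H*') on its own emits two hex digits for every byte it is given. The following is a minimal sketch in plain Ruby, using literal strings as stand-ins for the blob (the variable names are illustrative and not taken from the app):

```ruby
# Stand-in for the three bytes in question: 0x25 0x32 0x35, i.e. "%25".
original = "%25"

# unpack returns a one-element array; take the hex string out of it.
hex = original.unpack('H*').first
puts hex            # 253235

# If the bytes were already collapsed to "%" before unpack ran,
# the result matches the wrong value reported above.
collapsed_hex = "%".unpack('H*').first
puts collapsed_hex  # 25
```

If this holds in the app's Ruby version as well, a result of 25 suggests the blob had already lost two bytes before unpack was called.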

It makes sense for Ruby or Rails or ActiveRecord to do this: %25 is the proper URL-encoded value for %. However, I need to turn off this optimization or validation or whatever it is. I need blob.unpack('H*') to include a hex equivalent for every byte of the blob.

One (inefficient) way to solve this is to store a hex representation of the file in the database. Grabbing the file directly from the HTTP POST request works fine:

params[:sample].read.unpack('H*').to_s 

That stores the full 253235. Something about the round trip to the database (SQLite) or the HTTPClient post from the front-end web service to the back-end web service (hosted within WEBrick) is causing the loss of fidelity.
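
One way to narrow down which hop loses the bytes is to compare the upload against the data at each later stage and report the first offset where they differ. This is only a sketch: first_byte_mismatch is a hypothetical helper, and the two strings are stand-ins for params[:sample].read and the blob read back from the database.

```ruby
# Hypothetical helper: returns the byte offset of the first difference
# between two binary strings, or nil if they are identical.
def first_byte_mismatch(expected, actual)
  a = expected.bytes.to_a
  b = actual.bytes.to_a
  limit = [a.length, b.length].max
  (0...limit).each do |i|
    return i if a[i] != b[i]
  end
  nil
end

uploaded = "abc%25def"  # stand-in for the bytes read from the HTTP POST
from_db  = "abc%def"    # stand-in for the blob after the database round trip

offset = first_byte_mismatch(uploaded, from_db)
puts offset.inspect     # 4
```

Running the same comparison on the payload received by the back-end service would show whether the HTTP hop changes anything further.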

Eager to hear any ideas, willing to try whatever to test out suggestions. Thanks.

One Response to “Unpacking ActiveRecord binary blob to hex string drops escape characters ("%25" converted to "%")”

  1. This is a known issue with Rails and its SQLite adapter:

    There is a bug filed here in the old Rails ticket system (with patch):
    https://rails.lighthouseapp.com/projects/8994/tickets/5040

    And a new bug filed here in the new Rails issue tracking system:
    https://github.com/rails/rails/issues/2407

    Any string that contains ‘%00’ will be mangled when converting to binary and back. A binary that contains the string ‘%25’ will be converted to ‘%’, which is what you are seeing.
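
    The behaviour described in this response can be reproduced with a small percent-escaping scheme. The sketch below is an assumption about the general mechanism, not the adapter's actual code: escape_binary and unescape_binary are hypothetical names, and the sample string is made up.

```ruby
# Hypothetical escaping scheme of the kind the response describes:
# NUL and "%" are percent-escaped before storage and restored on read.
def escape_binary(value)
  value.gsub(/\0|%/) { |b| b == "\0" ? "%00" : "%25" }
end

def unescape_binary(value)
  value.gsub(/%00|%25/) { |b| b == "%00" ? "\0" : "%" }
end

raw = "100%25 done"

# Balanced round trip: escape once, unescape once. Nothing is lost.
round_trip = unescape_binary(escape_binary(raw))
puts round_trip == raw  # true

# Unbalanced: unescape is applied to data that was never escaped,
# so the literal "%25" collapses to "%".
mangled = unescape_binary(raw)
puts mangled            # 100% done
```

    Under that assumption, the data is only safe when every unescape is paired with exactly one escape, which would explain why reading the upload directly from the request gives the full 253235.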
