Alieniloquent


Broken Window in ActiveRecord: ActiveRecord::StatementInvalid

March 3rd, 2008

I love ruby, and I love Rails, but in some ways it really is a ghetto. It has a lot of broken windows that only serve to encourage bad coding from developers who should know better. Today I ran into an example of one of those broken windows and I was beside myself. I could not believe what I was reading.

One of the projects I work on for my employer is an import process that takes a long time. In order to make it resilient to database fail-overs, I wanted to catch the exception that is raised when the connection dies, wait a few seconds, and then try to reconnect. The idea is simple, and it works once I account for the broken window, but I am not pleased with the code I had to write.

When the database connection disappears, the database driver throws an exception. ActiveRecord::Base catches that exception and does this:

# Find this in Rails 2.0.2
# active_record/connection_adapters/abstract_adapter.rb:121

rescue Exception => e
  # Log message and raise exception.
  # Set last_verfication to 0, so that connection gets verified
  # upon reentering the request loop
  @last_verification = 0
  message = "#{e.class.name}: #{e.message}: #{sql}"
  log_info(message, name, 0)
  raise ActiveRecord::StatementInvalid, message
end

This is the exception handler that catches all exceptions raised during a query run by ActiveRecord. As you can see, it grabs the class name and the message from the exception, then throws the object away and re-raises as ActiveRecord::StatementInvalid. So, if your database driver’s exception class (such as Mysql::Error) carries one of hundreds of error codes that tell you specifically what went wrong, you have lost them.

So ActiveRecord provides one exception that covers everything from primary key violations to database connection errors, and the only way to distinguish them is by inspecting the message. Surely, that can’t be true, right? I dig further and find this:

# Find this in Rails 2.0.2
# active_record/connection_adapters/mysql_adapter.rb:244
#
# Note: I snipped the error message because it is very long

rescue ActiveRecord::StatementInvalid => exception
  if exception.message.split(":").first =~ /Packets out of order/
    raise ActiveRecord::StatementInvalid, snipped_error_message
  else
    raise
  end
end

That is just completely unacceptable. I can find it in my heart to forgive the abstract adapter for doing something that throws away implementation-specific information, but the Mysql adapter should remedy that. Instead, it willingly lets its exception information be cast aside and goes about inspecting what the abstract adapter had the decency to keep around.

“But that information is good enough to tell what the exception is,” you might say.

Until the Mysql folks change the error message. The Mysql API exposes numeric constants, and I’m sure they’re very careful to keep them the same, but do you think they take the same approach to error messages? I doubt it. They provide a function that will give you an error message given the numeric constant, and encourage you to use it. That’s what the Mysql bindings for ruby do.

Expecting developers to inspect the exception message is essentially promoting programming with magic numbers. Sure, they’re string literals, but they’re still duplicated information, and extremely brittle.
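
To make the brittleness concrete, here is a minimal sketch of the kind of retry wrapper described at the top of this post. It is not the code from my import process: StatementInvalidStandIn stands in for ActiveRecord::StatementInvalid so the snippet runs on its own, with_reconnect and its parameters are hypothetical names, and the message fragments in CONNECTION_LOST are assumptions about what the driver happens to say today.

```ruby
# Stand-in for ActiveRecord::StatementInvalid so this sketch is self-contained.
class StatementInvalidStandIn < StandardError; end

# Assumed message fragments. This is exactly the brittle part: if the
# driver rewords its messages, this pattern silently stops matching.
CONNECTION_LOST = /server has gone away|Lost connection/

# Hypothetical helper: run the block, and if it fails with what looks like
# a dropped connection, wait, reconnect, and try again.
def with_reconnect(attempts: 3, wait: 5, reconnect: -> {})
  tries = 0
  begin
    yield
  rescue StatementInvalidStandIn => e
    # Anything that is not a connection error is re-raised untouched.
    raise unless e.message =~ CONNECTION_LOST
    tries += 1
    raise if tries >= attempts
    sleep wait
    reconnect.call
    retry
  end
end
```

The only thing separating "wait and reconnect" from "primary key violation, give up" is a regular expression over a human-readable message.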

All I’d want is an inner_exception attribute available on ActiveRecord::StatementInvalid or maybe its parent, and then assign it when doing reraises. Is that too much to ask for?
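
A rough sketch of what I have in mind, using stand-in classes so it runs outside of Rails. WrappedStatementInvalid, FakeDriverError, and run_query are all hypothetical names, and 2006 is used only as an example of a driver error number; none of this is ActiveRecord's actual code.

```ruby
# Hypothetical replacement for ActiveRecord::StatementInvalid that keeps
# a reference to the exception it was built from.
class WrappedStatementInvalid < StandardError
  attr_reader :inner_exception

  def initialize(message = nil, inner_exception = nil)
    super(message)
    @inner_exception = inner_exception
  end
end

# Stand-in for a driver exception (such as Mysql::Error) carrying a numeric code.
class FakeDriverError < StandardError
  attr_reader :errno

  def initialize(message, errno)
    super(message)
    @errno = errno
  end
end

# Same shape as the handler in the abstract adapter, except the original
# exception object is handed along instead of being thrown away.
def run_query(sql)
  yield sql
rescue StandardError => e
  message = "#{e.class.name}: #{e.message}: #{sql}"
  raise WrappedStatementInvalid.new(message, e)
end
```

A caller can then rescue WrappedStatementInvalid and branch on e.inner_exception.errno rather than on the message text. Ruby 2.1 and later record the rescued exception automatically as Exception#cause when you raise inside a rescue clause, which gives much the same effect without a custom attribute.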

String transforms using Enumerable#inject

February 15th, 2008

I love functional programming, and I love Ruby. One of the most awesome things about Ruby is how much it borrows from the functional programming mindset. One of the most powerful concepts that functional programming brings to the table is higher-order functions. Ruby’s Enumerable module is a great example of how it embraces the idea of higher-order functions to abstract out the various things you do with a collection and let you focus on the operation for each item.

One of the most mysterious methods on Enumerable is Enumerable#inject. The example that’s always given is this:

irb> [1, 2, 3, 4].inject(0) {|sum, i| sum + i}
10

That’s fine, and usually makes sense. But when you try to branch out into more esoteric uses of inject, it can get confusing. So I’m going to give an example of accomplishing something practical with inject that you will hopefully find useful.

I always find myself doing a sequence of substitutions on a string. For example, when I implement a Telnet client, I like to normalize the line endings I’m sending so that they’re sane. I accomplish that by translating “\r\n” to “\n”, then translating “\r” to “\n”, then translating “\n” to “\r\n”. It’s a simple thing to do, and I could do it like this:

string.gsub("\r\n", "\n").gsub("\r", "\n").gsub("\n", "\r\n")

But that’s not very extensible. I’d like to apply this idea of a sequence of substitutions in an abstract way so that I can do it dynamically. And while I could do something with Object#send, that’s like cheating. This is where inject comes to the rescue.

def normalize_line_endings(string)
  transforms = [proc {|s| s.gsub("\r\n", "\n")},
                proc {|s| s.gsub("\r", "\n")},
                proc {|s| s.gsub("\n", "\r\n")}]
  transforms.inject(string) {|s, transform| transform.call(s)}
end

Kernel#proc (or Kernel#lambda if you prefer) is Ruby’s way of making functions you can pass around as values. It returns a Proc object, which you can then call with an argument. In the above code, I make an array of transforms that each take a string and return a string. The call to inject at the end is where the magic happens. It calls the first transform with string, which was provided as the argument to inject. Then it calls the second transform with the result of the first, and it calls the third transform with the result of the second. That list could be as big as you want. It could even be dynamically generated.

That’s nice, but it’s still a little verbose. I like to hide my use of Kernel#proc behind a declarative interface when I’m doing this sort of thing with it. So here’s how we can rewrite the method.

def transform(string, specifications = [])
  transforms = specifications.collect do |spec|
    proc {|s| s.gsub(spec[:from], spec[:to])}
  end
  transforms.inject(string) {|s, transform| transform.call(s)}
end

def normalize_line_endings(string)
  transform(string, [{:from => "\r\n", :to => "\n"},
                     {:from => "\r", :to => "\n"},
                     {:from => "\n", :to => "\r\n"}])
end

Of course, at that point, we don’t really need to create the procs. We can just use inject right on the specifications array, so the final code I came up with for this was:

def transform(string, specifications = [])
  specifications.inject(string) do |s, spec|
    s.gsub(spec[:from], spec[:to])
  end
end

def normalize_line_endings(string)
  transform(string, [{:from => "\r\n", :to => "\n"},
                     {:from => "\r", :to => "\n"},
                     {:from => "\n", :to => "\r\n"}])
end

Now that can be used with any list of transformations. Those transformations can be dynamically generated, and it’s a very clean implementation. That is the power of Enumerable#inject.
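
As a small illustration of the "dynamically generated" part, here is a sketch that builds the specifications from a plain list of pairs. The transform method is repeated from above so the snippet runs on its own; ESCAPES and escape_markup are hypothetical names made up for this example.

```ruby
def transform(string, specifications = [])
  specifications.inject(string) do |s, spec|
    s.gsub(spec[:from], spec[:to])
  end
end

# Hypothetical data: order matters, since "&" has to be escaped first.
ESCAPES = [["&", "&amp;"], ["<", "&lt;"], [">", "&gt;"]]

def escape_markup(string)
  # Generate the specifications at run time instead of writing them out.
  specs = ESCAPES.map { |from, to| {:from => from, :to => to} }
  transform(string, specs)
end

puts escape_markup("a < b && b > c")
# prints: a &lt; b &amp;&amp; b &gt; c
```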

Living In the House That Rails Built

January 29th, 2008

I wanted to share a snippet of code. This code will print a call stack to STDOUT every time a Ruby class definition is evaluated. It is particularly useful when you find that class constants are being mysteriously redefined.

class Foo
  puts "\nRequired from:\n  #{Kernel.caller.join("\n  ")}"
  # ...
end

What inspired me to write that code? Rails did. The key to writing Ruby on Rails is that you’re writing Ruby on Rails. You don’t follow the Rails best practices because they’re convenient. You follow the Rails best practices because your program won’t work unless you do. Just like trains, you stay on the track and everything is great. If you try to take your train off-track, then it’s gruesome enough to make the nightly news.

How did I derail my application such that I cared how and where a file was being required? I wrote a unit test that explicitly required a model object. Oops. Remember that the semantics of require is load-once based on the name. So:

require "foo"

and:

require "models/foo"

are very different to require. Rails is super helpful and requires everything that it makes for you. So it requires models for you, even when you run your unit tests.

So take this code:

class Foo < ActiveRecord::Base
  RAILS_IS_A_GHETTO = true
end

And then write a test for something that Rails didn’t generate (such as something in the lib directory like I did):

# Require some other stuff
require "foo"

class TestTruth < Test::Unit::TestCase
  def test_truth
    assert true
  end
end

If you rake test you will get an error complaining that RAILS_IS_A_GHETTO was reinitialized, and that’s because Rails loads it for you as “models/foo” and you load it as “foo” so it gets loaded twice.

The moral of the story is: let Rails load the things it built, and you load the things you built.

Base32 0.1.1 Released

June 29th, 2007

Quickly on the heels of the initial release of my Base32 library, I have an update. I should have tried to compile it on Linux, as the GCC settings on my Gentoo box caught some silly things I had done.

It’s all better now, and the gem can install on both Mac OS X and Gentoo Linux. I assume other Linuxes are probably fine, as are BSDs and other *NIXes.

To download it go here.

Base32 0.1.0 Released

June 28th, 2007

As you may know, I’ve been working with base32 encoding. Well, I decided to share my work with the world in the form of a library.

This first release simply contains the code I needed for my original project, but I’ve packaged it up as a nice Ruby extension.

You can visit the project page here.
You can download the release here.

Base32 Encoded Freedom

June 5th, 2007

So I’m writing the license-key generation code for the store-front for a shareware program my friend Tyler and I are preparing to release (more about that later). We’ve decided to use cryptography to reduce the likelihood that our licensing scheme will be compromised (for relatively little effort on our part). We also decided to base32 encode the actual keys to make them easier to read.

Well, the store-front is going to be a Rails app, of course. Ruby has a module to base64 encode, but it doesn’t have one to base32 encode. So, I wrote one, and I did it test first (of course).

The first four tests were easy. Really short strings, but they worked out most of the kinks. But, I wanted something that would boost my confidence further. So I wrote the following test which ended up being quite patriotic.

def test_constitution_preamble
  plaintext =<<-EOT
    We the people of the United States, in order to form a more perfect union,
    establish justice, insure domestic tranquility, provide for the common
    defense, promote the general welfare, and secure the blessings of liberty
    to ourselves and our posterity, do ordain and establish this Constitution
    for the United States of America.
  EOT
  encoded = %W(
    EAQCAIBAEBLWKIDUNBSSA4DFN5YGYZJAN5TCA5DIMUQFK3TJORSWIICTORQXIZLTFQQGS3RA
    N5ZGIZLSEB2G6IDGN5ZG2IDBEBWW64TFEBYGK4TGMVRXIIDVNZUW63RMBIQCAIBAEAQGK43U
    MFRGY2LTNAQGU5LTORUWGZJMEBUW443VOJSSAZDPNVSXG5DJMMQHI4TBNZYXK2LMNF2HSLBA
    OBZG65TJMRSSAZTPOIQHI2DFEBRW63LNN5XAUIBAEAQCAIDEMVTGK3TTMUWCA4DSN5WW65DF
    EB2GQZJAM5SW4ZLSMFWCA53FNRTGC4TFFQQGC3TEEBZWKY3VOJSSA5DIMUQGE3DFONZWS3TH
    OMQG6ZRANRUWEZLSOR4QUIBAEAQCAIDUN4QG65LSONSWY5TFOMQGC3TEEBXXK4RAOBXXG5DF
    OJUXI6JMEBSG6IDPOJSGC2LOEBQW4ZBAMVZXIYLCNRUXG2BAORUGS4ZAINXW443UNF2HK5DJ
    N5XAUIBAEAQCAIDGN5ZCA5DIMUQFK3TJORSWIICTORQXIZLTEBXWMICBNVSXE2LDMEXAU===).join
  assert_equal(encoded, Base32.encode(plaintext))
end
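
For anyone curious what such an encoder involves, here is a minimal pure-Ruby sketch of base32 encoding with the standard RFC 4648 alphabet. It is not the code I wrote for the store-front; the module name Base32Sketch is made up for this illustration, and it trades speed for readability by working on strings of bits.

```ruby
module Base32Sketch
  ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567".freeze

  # Encode a string by taking 5 bytes (40 bits) at a time and emitting
  # one alphabet character per 5 bits, padded with "=" to 8 characters.
  def self.encode(data)
    data.bytes.each_slice(5).map do |chunk|
      bits = chunk.map { |byte| byte.to_s(2).rjust(8, "0") }.join
      # A short final chunk is zero-filled up to a multiple of 5 bits.
      bits += "0" * ((5 - bits.length % 5) % 5)
      encoded = bits.scan(/.{5}/).map { |quint| ALPHABET[quint.to_i(2)] }.join
      encoded.ljust(8, "=")
    end.join
  end
end

puts Base32Sketch.encode("foobar")
# prints: MZXW6YTBOI======
```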

Quirky Behavior in String#gsub

August 31st, 2006

At my office I develop in Delphi. We use Delphi 2006. As far as IDEs go, it’s not that great. For example, when you tell the Delphi 2006 IDE to do a build all (something you’d think developers do quite frequently), it has a very annoying behavior: it eats up scads of memory. In fact when the build all operation completes on our project group, Delphi has laid claim to over 1GB of memory, and it won’t let it go until you quit the application. But, this post isn’t about Delphi or its buggy IDE. It’s about ruby. More specifically, it’s about a quirk (read: bug) in ruby.

The String class in ruby has a method called gsub. This method takes two parameters and each can take two types of object. The first parameter can either be a Regexp or a String, and it represents what is to be replaced. The second can either be a String or a block, and supplies the value with which to replace it. This seems perfectly natural.

Now, if you’ve ever used regular expressions, you probably know about back-references. When you use the grouping operator in a regular expression (e.g. ^a(ab)b$) it stores a numbered back-reference to the matched value of each group. In ruby you can reference these with the special variables $1, $2, and so on. But, if you are passing a string as the replacement, it will only be interpolated once and those back-references won’t be correct. So, what gsub does is let you put in \1 and \2 instead.
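
A quick sketch of both styles, using a made-up name-swapping example. The single-quoted replacement string uses \1 and \2; the block form can use $1 and $2 because the block runs once per match.

```ruby
name = "John Smith"

# String replacement: \1 and \2 refer to the groups of the current match.
swapped = name.gsub(/(\w+) (\w+)/, '\2, \1')

# Block replacement: $1 and $2 are set before each call to the block.
swapped_with_block = name.gsub(/(\w+) (\w+)/) { "#{$2}, #{$1}" }

puts swapped             # prints: Smith, John
puts swapped_with_block  # prints: Smith, John
```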

That behavior is awesome, and exactly what you want, if you’re matching a regular expression. But if you’re just matching a string literal, there is absolutely no reason to do it. In fact, if all you’re doing is matching a string literal those back-references will all be the empty string.

So, how do I know all this? Well, because Delphi 2006’s build all operation bites, we wrote a ruby script to replace it. This script has to do file-name manipulation and all sorts of other string manipulation in order to get all of the correct compiler options. One of the things it does is replace strings like $(CodeBase) with a path such as c:\svn\trunk. Well, we have separate code bases for our branches, and they have names like c:\svn\2006. You see that \2 there? Yeah, that one, right in the middle of the path. Even though the script was matching a string literal, gsub was replacing back-references. Since the path happened to have a \2 in it, it would end up coming out of gsub as c:\svn006, and that certainly wasn’t right.

Thankfully, there is a simple workaround. Instead of providing a string for the replacement, we can provide a block. That block gets called for every match, and the value it returns is used verbatim as the replacement.
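
Here is a small sketch of the problem and the fix. The template string and variable names are made up for illustration; the path is the one from the story above.

```ruby
template  = "$(CodeBase)\\bin"   # i.e. $(CodeBase)\bin
code_base = 'c:\svn\2006'

# String replacement: gsub treats the \2 in the path as a back-reference,
# which is empty because a literal pattern has no groups.
mangled = template.gsub("$(CodeBase)", code_base)

# Block replacement: the block's return value is inserted as-is.
intact = template.gsub("$(CodeBase)") { code_base }

puts mangled  # prints: c:\svn006\bin
puts intact   # prints: c:\svn\2006\bin
```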

Strings, arrays, and duck typing goodness

February 14th, 2006

We use more and more ruby around my shop every day, and that just tickles me pink. One thing that we’ve been using it a lot for is managing our Subversion working copies. We have a script that will delete unversioned files. We have a script that will delete ignored files. I wanted to write a script that would do both of those things and also revert any modified files (thus returning the working copy to a pristine state, essentially).

There was a lot of duplication between the two scripts. In fact, the only thing different was one character in a regex: it was ‘?’ for the unversioned and ‘I’ for the ignored. I went through and wrote a new class to represent these things and then I wanted to write a method named delete_if_status which would take a list of statuses and delete any items in the checkout that matched any of them.

I thought it would be cool if I could call it like this:

list.delete_if_status(['?', 'I'])

But also call it like this:

list.delete_if_status('?I')

Naturally, I figured Ruby would have a duck-typing answer to this problem, but the way it solved it surprised me (just a little; actually, now that I think about it, it’s unsurprising). Here is an IRB log that demonstrates just what I discovered.

irb(main):001:0> 'I?'.split
=> ["I?"]
irb(main):002:0> 'I?'.split('')
=> ["I", "?"]
irb(main):003:0> 'I?'.to_a
=> ["I?"]
irb(main):004:0> ['I','?'].to_s
=> "I?"
irb(main):005:0> ['I','?'].to_s.split('')
=> ["I", "?"]

So what I ended up with was this method:

def delete_if_status(spec)
  status_list = spec.to_s.split('')
  self.delete_if do |item|
    status_list.include? item.status
  end
end
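
One caveat: the log above is from Ruby 1.8. From Ruby 1.9 onward, Array#to_s returns the inspect form (brackets, quotes and all) and String#to_a is gone, so the to_s trick no longer works. A sketch that keeps both call styles on newer Rubies follows; Item and StatusList are hypothetical stand-ins for my real classes.

```ruby
# Hypothetical stand-in for an entry in the working copy.
Item = Struct.new(:path, :status)

# Hypothetical stand-in for the list class the method lives on.
class StatusList < Array
  def delete_if_status(spec)
    # Array() wraps a lone string and leaves an array alone; join then
    # chars yields one status character per element either way.
    status_list = Array(spec).join.chars
    delete_if { |item| status_list.include?(item.status) }
  end
end

list = StatusList.new([Item.new("a.txt", "?"),
                       Item.new("b.rb", "M"),
                       Item.new("c.log", "I")])
list.delete_if_status("?I")
puts list.map { |item| item.path }.inspect
# prints: ["b.rb"]
```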

I love Ruby.

Edited: Fixed some formatting with code and output snippets.

Ruby Talk

October 18th, 2005

I know, I know. It’s been a while since I last blogged (9 days, and longer since it was anything of content). I’ve been busy! Every so often I think to myself, “Do I want to blog about what I’m doing, or just do it?” Between work, school, home-ownership, and sleeping, there’s not much time to do all the things I find fun. Sadly, blogging seems to fall by the wayside.

But, I am getting better about sharing with other people. For the third time in two months, I gave a talk about Ruby at a user group meeting here in Omaha. This time it was OJUG, and I gave them Jim Weirich’s OSCON 2005 talk, “10 Things Java Programmers Should Know About Ruby” (yay for creative commons). Well, at least I gave a talk using Jim’s slides, as they were just perfect.

I’ve really enjoyed doing this, so I’m going to put this out there. If you are a group in the Omaha area, and would like to have me come talk about Ruby, shoot me an email. I’m cheap. All I need is some free food, and I’m your man.

Layout, design, graphics, photography and text all © 2005-2007 Samuel Tesla unless otherwise noted.
