Alieniloquent


String transforms using Enumerable#inject

February 15th, 2008

I love functional programming, and I love Ruby. One of the most awesome things about Ruby is how much it borrows from the functional programming mindset. One of the most powerful concepts that functional programming brings to the table is higher-order functions. Ruby’s Enumerable module is a great example of how it embraces the idea of higher-order functions to abstract out the various things you do with a collection and let you focus on the operation for each item.

One of the most mysterious methods on Enumerable is Enumerable#inject. The example that’s always given is this:

irb> [1, 2, 3, 4].inject(0) {|sum, i| sum + i}
10

That’s fine, and usually makes sense. But when you try to branch out into more esoteric uses of inject, it can get confusing. So I’m going to give an example of accomplishing something useful with inject that you hopefully find useful.

I always find myself doing a sequence of substitutions on a string. For example, when I implement a Telnet client, I like to normalize the line endings I’m sending so that they’re sane. I accomplish that by translating “\r\n” to “\n”, then translating “\r” to “\n”, then translating “\n” to “\r\n”. It’s a simple thing to do, and I could do it like this:

string.gsub("\r\n", "\n").gsub("\r", "\n").gsub("\n", "\r\n")

But that’s not very extensible. I’d like to apply this idea of a sequence of substitutions in an abstract way so that I can do dynamically. And while I could do something with Object#send, that’s like cheating. This is where inject comes to the rescue.

def normalize_line_endings(string)
  transforms = [proc {|s| s.gsub("\r\n", "\n")},
                proc {|s| s.gsub("\r", "\n")},
                proc {|s| s.gsub("\n", "\r\n")}]
  transforms.inject(string) {|s, transform| transform.call(s)}
end

Kernel#proc (or Kernel#lambda if you prefer) is Ruby’s way of making higher-order functions. It returns a block which you can then call with an argument. In the above code, I make an array of transforms that take a string and return a string. The call to inject at the end is where the magic happens. It calls the first transform with string which was provided as the argument to inject. Then it calls the second transform with the result of the first, and it calls the third transform with the result of the second. That list could be as big as you want. It could even be dynamically generated.

That’s nice, but it’s still a a little verbose. I like to hide my use of Kernel#proc behind a declarative interface when I’m doing this sort of thing with it. So here’s how we can rewrite the method.

def transform(string, specifications = [])
  transforms = specifications.collect do |spec|
    proc {|s| s.gsub(spec[:from], spec[:to])}
  end
  transforms.inject(string) {|s, transform| transform.call(s)}
end

def normalize_line_endings(string)
  transform(string, [{:from => "\r\n", :to => "\n"},
                     {:from => "\r", :to => "\n"},
                     {:from => "\n", :to => "\r\n"}])
end

Of course, at that point, we don’t really need to create the procs. We can just use inject right on the specifications array, so the final code I came up with for this was:

def transform(string, specifications = [])
  specifications.inject(string) do |s, spec|
    s.gsub(spec[:from], spec[:to])
  end
end

def normalize_line_endings(string)
  transform(string, [{:from => "\r\n", :to => "\n"},
                     {:from => "\r", :to => "\n"},
                     {:from => "\n", :to => "\r\n"}])
end

Now that can be used with any list of transformations. Those transformations can be dynamically generated, and it’s a very clean implementation. That is the power of Enumerable#inject.

OCaml Talk

February 6th, 2008

So, I was going to give a talk on OCaml at ODYNUG last night. But, well, snow happened, and the meeting was canceled.

I will be giving the talk next month, on March 4th along with Brent Adkisson who will be giving a talk about Android.

Make SLIME load faster

December 29th, 2007

I have joined up with some of the guys from ODYNUG who have started meeting for breakfast and learning Common Lisp together. We are all using some version of Emacs, SLIME, and SBCL.

Blaine shared a cool way to make SLIME load much faster by taking advantage of the fact that Lisp uses images like Smalltalk (or more accurately, Smalltalk uses images like Lisp). He posted it for the group to see, but I wanted to post it here for my readers.

SBCL allows you to specify an image, or as they call it a core, by passing the --core option along with (as far as I can tell) an absolute path to the core file (well, at least it doesn’t know that ~ means $HOME). It, of course, also provides a way to create these core files, so you can load a bunch of stuff in, and then save a core file that has all of that already loaded.

So first, go into your SLIME directory and copy swank-loader.lisp to swank-loader-original.lisp. Then make swank-loader.lisp look like this (changing slime-dir to be wherever your SLIME is, of course):

(if (not (find-package 'swank-loader))
    ;; Edit SLIME-DIR to be where you have SLIME installed.
    (let ((slime-dir (merge-pathnames ".elisp/slime/" (user-homedir-pathname))))
      (load (merge-pathnames "swank-loader-original" slime-dir))))

Then, make a file called bootstrap.lisp with the following content:

;; Load Swank
(load (merge-pathnames ".elisp/slime/swank-loader" (user-homedir-pathname)))

;; Save image
(sb-ext:save-lisp-and-die "sbcl-with-slime.core")

And run this command:

$ sbcl --load bootstrap.lisp

Then copy sbcl-with-slime.core somewhere safe, I put mine in with my slime code to keep it all together. Then you just have to add the following to your .emacs:

(let* ((slime-dir (concat elisp-dir "/slime"))
       (core-file (concat slime-dir "/sbcl-with-slime.core")))
  (setq inferior-lisp-program (concat "sbcl --core " core-file)))

Then you can M-x slime and it will be super fast.

A little more lambda

December 23rd, 2007

Alonzo Church invented the lambda calculus. He also figured out how to encode many kinds of data as lambda expressions. Take your simple booleans, for example.

This is true:

fn x y. x

And this is false:

fn x y. y

That makes the identity function the if then else construct:

> (fn p. p) (fn x y. x) a b;
a
> (fn p. p) (fn x y. y) a b;
b

And similarly you can get a logical and:

> (fn p. p) ((fn p q. p q p) (fn x y. x) (fn x y. x)) a b;
a
> (fn p. p) ((fn p q. p q p) (fn x y. x) (fn x y. y)) a b;
b
> (fn p. p) ((fn p q. p q p) (fn x y. y) (fn x y. x)) a b;
b

Fiddling around with these church booleans revealed several bugs in my code, which I’ve fixed. I’ve additionally added a new node to the parse tree to represent the () grouping that is typed into the code so that when it is formatted for display it looks better.

You can get the newest code here.

Currying Function Parameters

December 23rd, 2007

One of the first things I wanted to do to improve the readability of my language was to add the currying of function parameters. Since it is such a common pattern to have three or four abstractions right in a row to bind variables, there is a syntax for expressing them more consisely.

So this:

fn x. fn y. fn z. x y z

Becomes this:

fn x y z. x y z

Adding the code do this was nearly trivial, and all in the parser. First I wrote a function that given a list of variables and an expression for the body, would be able to construct the parse tree for a curried function:

let curry ids body =
  List.fold_right (fun id expr -> Abstraction(id, expr)) ids body

Then I took the existing production for recognizing expressions:

expr:
  aexprs {apply $1}
| FN VAR PERIOD expr {Abstraction ($2, $4)}
;

And turned it into this:

expr:
  aexprs {apply $1}
| FN ids PERIOD expr {curry $2 $4}
;

ids:
  VAR {[$1]}
| VAR ids {$1::$2}
;

That ids production is using the OCaml :: operator which performs the cons operation. So as I recurse on the right, I’m building up a list and consing each new id onto it all the way up.

And just like that I’ve added currying to my language.

The Lambda Calculus

December 23rd, 2007

Earlier this fall I wrote a little functional programming language. However, the guts of it were not based on the lambda calculus. I used more of a denotational semantics approach to the evaluation, which worked fine. But, I still wanted to implement an actual lambda calculus interpreter.

So, now that I am done with school and have some free time, I threw a little something together. I used it as an introductory project to OCaml, and really enjoyed writing it.

So what is the lambda calculus, you might ask?

There are three basic concepts in lambda calculus. There are variables:

x

There are abstractions:

fn x. x

And there are applications:

f x

Applications are left associative so:

f x y

is the same as:

(f x) y

So for a more complicated example from the REPL:

> (fn f. fn x. f x) (fn y. y) z;
z

The first part declares a function which binds f and returns a function which binds x and returns the application of f to x. We pass to that the identity function (fn y. y) and the variable z. That all reduces to just z.

You can download my code here. I will be posting snippets of it in future posts. I will also be blogging as I extend it to add more features.

Remember ML-Yacc makes error-correcting parsers

October 9th, 2007

So I got my language working. But there are still some things I want to add to it. One thing that was bothering me was that both this code:

fn x => x

and this code:

x => x

parsed to the same thing.

I banged my head against this. My grammar had the production right:

expr : ...
     | FN ident RARROW expr (T.FnDef (ident,expr))
     ...

and my lexer produced the tokens just fine:

<INITIAL> "fn" => (Tokens.FN(!pos, !pos));
<INITIAL> "=>" => (Tokens.RARROW(!pos, !pos));

So, I was confused. I downloaded the source for SML/NJ in hopes that their grammar and lexer would shed insight on what I was (obviously) doing wrong. But, inasmuch as SL is like SML, the grammar and lexer were the same.

Sleep beckoned, so I went. This morning I banged my head at it some more. Then once I started combing over the documentation, it hit me. ML-Yacc produces error-correcting parsers. It will perform single-token substitutions in order to get a valid parse. And, if you notice, it only has to make a single-token correction to get from the bad code to the good code.

My solution? The same as SML/NJ’s, set the lookahead to zero for interactive sessions and fail fast, so that if you are trying stuff interactively (or from unit tests) it will be relentless about grammar. On the other hand, if you are parsing a file, my interpreter will be forgiving. After all, why should it fail the whole file if all you’re missing is fn?

A Gentle Introduction to Erlang

October 4th, 2007

I gave a talk at ODYNUG on Tuesday about one of my favorite dynamic languages: Erlang. It went pretty well, I think. Unlike my Lisp talk from last year, I don’t think I caused too many heads to explode.

I’ve posted my slides and some example code here.

I’ll be giving a talk on ML on Febuary 5, 2008.

Layout, design, graphics, photography and text all © 2005-2007 Samuel Tesla unless otherwise noted.

Portions of the site layout use Yahoo! YUI Reset, Fonts & Grids.