Alieniloquent


Three little, two little, one little-endian

April 24th, 2007

I recently found myself wanting a Cocoa class that represents a set of 8-bit bytes. Cocoa has NSCharacterSet, but that is for unichar, not uint8_t. So I wrote one. It was easy enough: I gave it an array of UINT8_MAX + 1 booleans, one for each possible byte value, and said that if a particular element in the array was YES then that byte was in the set, and not if the element was NO.
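As a rough illustration of that design, here is a plain C sketch of the same idea. The names ByteSet, byteset_init, byteset_add and byteset_contains are made up for this example and are not the Cocoa class from the post; the only thing carried over is the array of flags, one per possible byte value (256 in all).

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical C analogue of the byte set: one flag per possible byte value.
   A byte can take UINT8_MAX + 1 (256) values, so that is the array size. */
typedef struct {
  bool contains[UINT8_MAX + 1];
} ByteSet;

/* Start with an empty set: every flag is false. */
static void byteset_init(ByteSet *set)
{
  memset(set->contains, 0, sizeof set->contains);
}

/* Mark a byte as a member of the set. */
static void byteset_add(ByteSet *set, uint8_t byte)
{
  set->contains[byte] = true;
}

/* Answer the membership question for a single byte. */
static bool byteset_contains(const ByteSet *set, uint8_t byte)
{
  return set->contains[byte];
}
```

Because the index is a uint8_t, every possible argument lands inside the array, including 255.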

Initially the class only knew how to answer questions of membership: is a byte in the set or not? But then I found a number of places where I was enumerating all possible values and testing for membership, so I figured adding a method that would return an NSData with just the bytes included in the set would be useful.

So I wrote this:

- (NSData *) dataValue
{
  NSMutableData *result = [NSMutableData data];
  for (unsigned i = 0; i <= UINT8_MAX; ++i)
  {
    if (contains[i])
      [result appendBytes: &i length: 1];
  }
  return result;
}

I had unit tests that proved it worked, and they all passed, so I checked in. All was good in the world.

Five days later, I flip open my laptop and decide to use the program this code is part of. I always try to eat my own dog food, and I prefer the freshest dog food I can get. So, whenever I want to use this application, I delete it, update from our Subversion repository, and build it.

Much to my surprise, when I built it on my laptop, some of those tests did not pass. I was expecting the NSData returned from -dataValue to have certain bytes in it. The NSData I actually got back did have the correct number of bytes, but they were all zeroes.

I banged my head against it for about twenty minutes, until I had a flash of insight. My desktop machine at home is an iMac, and inside it is an Intel Core Duo processor. My laptop is a PowerBook, and inside it is a Motorola G4 processor. The Core Duo, like most other Intel processors, stores numbers in the little-endian format, whereas the G4 stores them in big-endian format.

Endianness is a computer topic that makes a lot of programmers’ heads hurt. Unfortunately, Cocoa programmers do have to think about this now. Since Apple switched from their old, big-endian, Motorola platform to their new, little-endian, Intel platform, applications that are meant to run on both have to be aware of byte-order issues.

Computers store data in bytes, which are eight bits long. However, eight bits is only enough to store a number up to 255. In order to store larger numbers, computers just concatenate bytes together. A 16-bit number is made up of two bytes, and a 32-bit number is made up of four. The endianness of a system determines what order those bytes are stored in.

When you read a decimal number like 4242, you read it from left to right. The most significant digit is the left-most digit. Similarly, when you read a binary number like 1000010010010, the most significant digit is the left-most digit. If we divide that number into bytes, 00010000 10010010, the left-most byte is called the most significant byte, or the high-order byte. The right-most byte is called the least significant byte, or the low-order byte.

A big-endian processor, like the G4, stores numbers exactly like you’d read them. So if you read a 16-bit integer in big-endian order, the first byte you read is the high-order byte. Now, if the number is 255 or less, for example 42, you’ll get this: 00000000 00101010.

A little-endian processor, like the Core Duo, stores numbers just the opposite of how you’d expect. The first byte you read is the least significant byte, followed by the next least significant byte, and so on. So when we read our binary number in we’ll get 10010010 00010000 instead of what we expected. Now, if we look at that small number again, we’d get this: 00101010 00000000.
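The two layouts can be spelled out in C by writing the bytes explicitly, which gives the same answer on any machine. This is only a sketch; store_big_endian16 and store_little_endian16 are names invented for the example.

```c
#include <stdint.h>

/* Write a 16-bit value high-order byte first (big-endian, as on the G4). */
static void store_big_endian16(uint16_t value, uint8_t out[2])
{
  out[0] = (uint8_t)(value >> 8);    /* most significant byte */
  out[1] = (uint8_t)(value & 0xFF);  /* least significant byte */
}

/* Write the same value low-order byte first (little-endian, as on the Core Duo). */
static void store_little_endian16(uint16_t value, uint8_t out[2])
{
  out[0] = (uint8_t)(value & 0xFF);  /* least significant byte */
  out[1] = (uint8_t)(value >> 8);    /* most significant byte */
}
```

For 4242 the big-endian bytes are 0x10 0x92 (00010000 10010010) and the little-endian bytes are 0x92 0x10. For 42 they are 0x00 0x2A and 0x2A 0x00.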

So, to bring this back to my bug: the unsigned type is actually an unsigned 32-bit integer on both of my machines. Since my code was manipulating a set of 8-bit numbers, every single number would fit into the low-order byte of that unsigned, thus leaving the other three bytes all zero.

The line of code where I do this:

[result appendBytes: &i length: 1];

is a clever little trick I’ve used to avoid having to actually declare a one-byte array when I want to append just one byte. It works great if i is actually a uint8_t. It also works great if i is an unsigned and stored in little-endian format, since the first byte happens to be the byte I’m interested in. However, on a big-endian processor, &i points at the most significant byte of the number instead, and since i never gets any bigger than UINT8_MAX (which is 11111111 in binary), that byte will always be zero.
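The difference between "the first byte in memory" and "the low-order byte" can be shown in plain C. This is a sketch with made-up helper names, and it assumes unsigned is wider than one byte, as it is on both machines in this story.

```c
#include <stdint.h>
#include <string.h>

/* The byte that &value points at: whichever byte the host stores first. */
static uint8_t first_byte_in_memory(unsigned value)
{
  uint8_t first;
  memcpy(&first, &value, 1);
  return first;
}

/* The low-order byte, obtained by converting the value itself.
   This does not depend on the host byte order. */
static uint8_t low_order_byte(unsigned value)
{
  return (uint8_t)value;
}

/* Returns 1 on a little-endian host and 0 otherwise. */
static int host_is_little_endian(void)
{
  return first_byte_in_memory(1u) == 1;
}
```

On a little-endian host first_byte_in_memory(42) is 42; on a big-endian host with a 32-bit unsigned it is 0. low_order_byte(42) is 42 on both.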

So now the code looks like this:

- (NSData *) dataValue
{
  NSMutableData *result = [NSMutableData data];
  uint8_t byte[1];
  for (unsigned i = 0; i <= UINT8_MAX; ++i)
  {
    if (contains[i])
    {
      byte[0] = i;
      [result appendBytes: byte length: 1];
    }
  }
  return result;
}

The compiler knows to do the correct conversion between the 32-bit and 8-bit types when assigning from one to another, so the new code now works on both of my machines.

Update: The title is a joke that Erica made up when I told her about this bug. All blame for its terribleness should go to her, I just recognized how apropos it was for the post.

Podcast

September 4th, 2006

So I decided to start a podcast. Check it out: The Agile Mac.

J3Testing 1.0

September 1st, 2006

I decided to make my J3TestCase code into an actual framework. This way I don’t have to copy the files each time I want to use the class.

I’ve put up a disk image with binaries and one with source. The source is worth looking at, especially to see how I made targets to automatically build those disk images.

These replace the old J3TestCase code I had posted, and I’ve removed that tarball. So, sorry if I broke that link. This is a much better way to deploy anyway.

YAGNI: byte-order conversion explained

August 30th, 2006

In my previous post I talked about this problem I ran into with byte-ordering and tests failing. Brian and Joe both felt that I left them hanging by going into all the technical details of what my problem was and not going into the technical details of why I shouldn’t have used htons to start with. Rather than editing the other post, here’s a new one.

The code that I was using htons in was my SOCKS5 implementation. For those who aren’t familiar with SOCKS5, it is a proxying protocol. The computer connects to the proxy and sends it a hostname and port for the actual connection. The proxy then makes the connection and relays packets for the remainder of the session.

Some of you may be guessing what I was calling htons on already: the port to be serialized. Here are the two offending lines of code:

[buffer append:(0xFF00 & htons(port)) >> 8];
[buffer append:(0x00FF & htons(port))];

That buffer was then, in turn, written out across the socket to the server (or inspected by unit tests). But since I was extracting each byte individually, the byte order didn’t matter: masking and shifting work on the value of port, not on how its bytes are laid out in memory, so 0xFF00 will be in the same byte order as port every time.
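A minimal C sketch of the extraction without htons follows. The function names are invented for the example, and the buffer from the original code is left out.

```c
#include <stdint.h>

/* High-order byte of the port: the one that goes on the wire first. */
static uint8_t port_high_byte(uint16_t port)
{
  return (uint8_t)((port & 0xFF00) >> 8);
}

/* Low-order byte of the port: the one that goes on the wire second. */
static uint8_t port_low_byte(uint16_t port)
{
  return (uint8_t)(port & 0x00FF);
}
```

For port 1080 (0x0438) this yields 0x04 and then 0x38 on both PPC and i386, which is already network byte order. Wrapping port in htons first would leave the result unchanged on PPC but swap the two bytes on i386.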

I hope this explains things a little better.

YAGNI: byte-order conversion

August 29th, 2006

So I have this project, and it compiles for the Mac on both PPC and i386 architectures. Naturally, I have unit tests for this project.

One of the things that I’ve had to write for this is a simple SOCKS5 implementation (because of issues with Apple’s implementation). As part of this I had to do some manipulation of a port number and get the high and low bytes.

I’ve done this sort of thing before. That’s what htons and friends are for. So naturally, I went ahead and put this in where I thought it mattered and went on my merry way. On my PPC PowerBook, my tests all passed.

Yesterday, when I ran the unit tests on my Intel Mac for the first time, I discovered that a test failed. It was failing because of some byte-order problem. After troubleshooting it, I narrowed it down to the htons calls. It turns out I did not need them.

See, the thing is, on PPC, things are already in network byte order, so htons does absolutely nothing. However, on i386, host byte order is different from network byte order. So the reason it worked on PPC was that it was essentially as if I hadn’t called it.
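To make that concrete, here is a sketch of what htons has to accomplish. sketch_htons is a hypothetical stand-in written for this example, not the system implementation.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for htons: returns a value whose bytes in memory
   are in network (big-endian) order, whatever the host order is. */
static uint16_t sketch_htons(uint16_t host_value)
{
  uint8_t wire[2];
  uint16_t result;

  wire[0] = (uint8_t)(host_value >> 8);    /* high-order byte goes first */
  wire[1] = (uint8_t)(host_value & 0xFF);  /* low-order byte goes second */
  memcpy(&result, wire, sizeof result);
  return result;
}
```

On a big-endian host the bytes were already in that order, so the result equals the input. On a little-endian host the result, read back as a number, has its two bytes swapped, which is why masking and shifting it afterwards picks out the wrong bytes.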

It turns out that the particular thing I was doing didn’t need the conversion to work correctly. I removed it, and everything worked peachy on both machines.

Linker warning: -bind_at_load

August 29th, 2006

Apple has had Intel machines for about a year. When they came out with the new architecture, they came up with the idea of a universal binary. It is a binary that will run natively on either the PPC or Intel architectures. None of the apps I work on have been compiled as universal binaries until today.

I just got my iMac yesterday, and it’s beautiful. The most interesting thing about it, though, is that it is an Intel machine. So naturally, when I got my projects all set up on it, I finally had to bite the bullet and make things work cross-platform.

That meant I had to recompile some frameworks to be universal binaries themselves. I also had to twiddle some build options. It was all well documented, and easy enough. But then I ran into a weird linker warning that I couldn’t figure out:

/usr/bin/ld: warning suggest use of -bind_at_load, as lazy binding may result in errors or different symbols being used
symbol _atan2f used from dynamic library
/Developer/SDKs/MacOSX10.4u.sdk/usr/lib/gcc/powerpc-apple-darwin8/4.0.1/../../../libSystem.dylib(floating.o) not from earlier dynamic library /usr/lib/libmx.A.dylib(single module)

After much searching, I finally found an article, and it explained what I needed to do. All I had to do was add -lSystem to my linking flags.

Now all I have to do is fix my one byte-order issue, and the program is all better.

Cocoa + Google = Yay!

August 23rd, 2006

So, I’m going to be writing an app to interface with Google Calendar, so I need to learn how to speak the Google Data APIs. The first part of this is being able to authenticate using their client login protocol.

I figured I’d throw together a simple little Cocoa app that just has text inputs for all the things that need to be sent and then I could do that.  I figured once I had the code written, I could extract it out into a class that I could use in my real app.

I spent 30 minutes remembering how to throw the GUI together (I’m a bit rusty).

I spent an hour or so understanding the NSURL API and figuring out how to URL-encode in Cocoa (it turns out there isn’t anything in the Foundation classes).

I spent another fifteen minutes not understanding why it told me my authentication was bad, despite it being correct, and then realizing I wasn’t actually sending my authentication information (or any part of the request).  It only took five minutes to fix it.

I now get a successful response back from Google when I log in using my little login testing application.  How cool is that?

Unit Testing of Cocoa Apps

February 23rd, 2006

There are plenty of articles out there telling you how to use OCUnit, especially now that it ships with Xcode 2.0. They’re all excellent. You should read them, especially Apple’s. This article is simply about a little trick that I’ve come up with to solve a minor annoyance I have with using Apple’s method of hooking up what they call a dependent test bundle.

A dependent test bundle is a really neat idea, actually. It uses your actual application as a framework at link time and then uses some nifty magic to launch the test bundle from inside your application and run all of the tests. Why bother with all of that? It means you can keep your application code in one target and your test code in another target — completely separate. That’s a worthy goal.

The only thing I don’t like about this, though, is that if you’re developing a GUI Cocoa application (and let’s face it, most people are) then the process of running the tests from within the application has the consequence of popping up the GUI and letting it sit there until you quit manually. That’s annoying.

So I came up with a solution. It’s really nothing that fancy, but I thought I’d share. I make a Cocoa Shell Tool target, and I name it something like stub. I make an Objective-C file and name it stub-main.m. In that file I put the following three lines:

int main(int argc, char *argv[]) {
  return 0;
}

Then I add all of the application source files that my tests are going to need to link to and make sure it all compiles. I just use that as the bundle loader for my tests, and it’s all good. No GUI popping up.

For an example of how to do this, you can look at the source for OCFit.

Layout, design, graphics, photography and text all © 2005-2007 Samuel Tesla unless otherwise noted.
