Software archaeology: Apple’s cctools

One of the things I’ve been working on in Tigerbrew is backporting modern Apple build tools. The latest official versions, bundled with Xcode 2.5, are simply too old to build some software. (For example, the latest GCC version available is 4.0.)

In the process, I’ve found some pretty fascinating bits of history littered through the code and makefiles for Apple’s build tools. Here are some findings from Apple’s cctools[1] package:

This comment from near the top of cctools’s Makefile lists some of the valid build targets, including:

  • Kodiak, which was the Mac OS X public beta from September, 2000
  • Gonzo (Developer Preview 4), Bunsen (Developer Preview 3), and Beaker (PR2)
  • Rhapsody (internal name for the OS X project as a whole), Hera (Mac OS X Server 1.0, released 1999), and teflon (unknown to me)
  • OPENSTEP, NeXT’s implementation of their own OpenStep API

From further down in the same Makefile:

A lot of familiar cats here, along with a couple of early iOS versions (SugarBowl, BigBear) and a lot of names I’m not familiar with. (Please leave a comment if you have any insight!) As far as I know “Vail” was the Mac LC III from 1993 with no NeXT connection, but I’m sure it must be referring to something else.

From elsewhere in the tree, there’s code to support various CPU architectures. Aside from the usual suspects (PPC, i386), there are some other interesting finds:

  • HP/PA, aka PA-RISC, a CPU family from HP; some versions of NeXTSTEP were shipped for this
  • i860, an Intel CPU used in the NeXTdimension graphics board for NeXT’s computers
  • M68000, the classic Motorola CPU family, used in the original NeXT computers
  • M88000, a Motorola CPU family; NeXT considered using this in their original hardware but never shipped a product using it
  • SPARC, a CPU family from Sun; some versions of NeXTSTEP were shipped for this

I find it fascinating that, even now, cctools still carries the (presumably unmaintained) code for all of these architectures Apple no longer uses.

  1. Apple’s equivalent of binutils.

Tiger’s `which` is terrible; or, Necessity is the mother of invention

One of the most useful things about running software in unusual configurations is that sometimes it exposes unexpected flaws you never knew you had.

The which utility is one of those commandline niceties you never really think about until it’s not there anymore. While sometimes implemented as a shell builtin[1], it’s also frequently shipped as a standalone utility. Apple’s modern version, which is part of the shell_cmds package and crystallized around Snow Leopard, works like this:

  • If the specified tool is found on the path, prints the path to the first version found (i.e., the one the shell would execute), and exits 0.
  • If the specified tool isn’t found, prints a newline and exits 1.

This version of the tool is really useful in shell scripts to determine a) if a program is present, and b) where it’s located, and until fairly recently Homebrew used it extensively. Unfortunately, early on in my work on Tigerbrew, I discovered that Tiger’s version was… deficient. It works like this:

  • If the specified tool is found on the path, prints the path to the first version found, and exits 0.
  • If the specified tool isn’t found, prints a verbose message to stdout, and exits 0.

The lack of a meaningful exit status and the error message on stdout are both pretty poor behaviour for a CLI app, and broke Homebrew’s assumptions about how it should work.

To work around this, I replaced Homebrew’s wrapper function with a two-line native Ruby method for Tigerbrew, like so:

As it turns out, not only does it work better on Tiger, but this method is actually faster[2] than shelling out like Homebrew did; process spawning is relatively expensive. As a result, I ended up using the new helper in Homebrew even though it wasn’t strictly necessary.
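For illustration, a native-Ruby lookup along these lines might look like the sketch below. It is not necessarily the exact method Tigerbrew uses: the name `which`, its signature, and the choice to return a Pathname (or nil when nothing is found) are all assumptions made for the example.

```ruby
require "pathname"

# Hypothetical helper (name and signature are assumptions for this sketch).
# Returns a Pathname for the first executable file named `cmd` found in the
# directories listed in `path`, or nil if there is no match.
def which(cmd, path = ENV["PATH"])
  dir = path.to_s.split(File::PATH_SEPARATOR).find do |d|
    candidate = File.join(d, cmd)
    File.file?(candidate) && File.executable?(candidate)
  end
  dir && Pathname.new(File.join(dir, cmd))
end
```

Because a miss is reported as nil rather than as text on stdout, callers can test the result directly (`if which("gcc")`) without spawning a process or parsing output.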

(And as for the commandline utility, Tigerbrew has a formula for the shell_cmds collection of utilities.)

  1. zsh does; bash doesn’t.
  2. On the millisecond scale, at least.

Adventures with Ruby 1.8.2

Homebrew has always used the version of Ruby which comes with OS X,[1] a design decision I kept for Tigerbrew. Tiger comes with Ruby 1.8.2, built on Christmas Day, 2004, and with a version of Ruby that old I went in steeling myself for the inevitable ton of compatibility issues.

On the whole I was pleasantly surprised. Most of what Homebrew uses is provided in exactly the same form, and while there are differences that range from puzzling[2] to major[3], pretty much everything Just Works.

Except, at first, for Pathname. Ruby’s Pathname class, which is an object-oriented wrapper around the File and Dir classes, is at the heart of Homebrew’s file management. The first time I tried to install something with the newborn Tigerbrew, I was quickly treated to a strange exception with an equally mysterious backtrace: Errno::ENOTDIR: Not a directory.

Curious, I dug in. I soon discovered that the bug occurred while Homebrew was unlinking an existing version of a package before beginning to install an upgrade. (For those not in the know, Homebrew installs software into isolated versioned prefixes. The active version of a given package is symlinked into the standard /usr/local locations.) Most of the files were linked and unlinked just fine, but a few files caused the method Pathname#unlink to throw an exception every time. Eventually I noticed a pattern — every symlink that Pathname choked on represented a directory. Once I noticed that, it clicked.

For those who don’t know, symlinks are actually treated on the filesystem level as special files containing their target as text. For most operations, symlinks transparently act as their targets. However, applications which hit the filesystem directly will see them as files — even when they point to directories. Since Pathname handles files and directories differently, handing its instance methods off to File or Dir as appropriate, the bug happened something like this:

  1. The #unlink method is called on a Pathname object representing a symlink to a directory.
  2. Pathname examines the object to see if it represents a file or directory, in order to determine whether to call File.unlink or Dir.unlink.
  3. In doing so, Pathname follows the symlink to its target and examines the properties of the target.
  4. Seeing that the target is a directory, Pathname calls Dir.unlink on the original symlink.
  5. Dir.unlink raises Errno::ENOTDIR because, of course, the symlink isn’t a directory.

Ruby 1.8.2 is of course quite old at this point, but it’s pretty mindboggling to see a bug this major in a programming language’s standard library. Thankfully Ruby supports monkeypatching, so I was able to override the method with a version which acts sanely.
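As a rough sketch of the idea (not necessarily Tigerbrew’s actual override), the fix amounts to checking whether the path is a symlink before asking whether it is a directory, since the directory test follows symlinks. The method body below is an assumption written for illustration:

```ruby
require "pathname"

class Pathname
  # Illustrative override, not the original Tigerbrew code.
  # directory? follows symlinks, so a symlink pointing at a directory must be
  # ruled out first; symlinks are always removed with File.unlink.
  def unlink
    if !symlink? && directory?
      Dir.unlink(to_s)
    else
      File.unlink(to_s)
    end
  end
end
```

With this in place, unlinking a symlink to a directory removes only the link and leaves the target directory untouched.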

The overridden version of the method can be found here. The rest of Tigerbrew’s current backports are in Tigerbrew’s file extend/tiger.rb, for the curious.

  1. For predictability, and so the user doesn’t have to install Ruby before installing Homebrew.
  2. String’s [] operator always returns the sliced character’s ASCII ordinal, not a string.
  3. File#flock doesn’t exist in any form.

Introducing Tigerbrew

Some of you may know that my other gig is Homebrew, the package manager for Mac OS X. Over the last few months, I’ve been spending some time on a fork of Homebrew that’s starting to become usable enough that I think it’s ready to be announced.

When I was attending the AMIA[1] conference in December, my partner and I were travelling together; while I was at the conference during the day, he worked from various places in Seattle on his laptop. Since it’s practically impossible to attend a modern conference without a laptop, and he uses a desktop at home, I dug out my 2005-era PowerBook G4 to take notes. It may be eight years old, but as soon as I opened it up I remembered why I loved that laptop so much. It’s still in great shape, and it feels like a crime to leave it sitting unused so much of the time.

It’s slow by modern standards, of course, but the thing really keeping it from being usable all the time is software. Apple left PowerPC behind after Mac OS X Leopard[2], and so have nearly all developers at this point. There are still a few developers carrying the torch (shoutouts to TenFourFox), but as a commandline junkie what I really need is an up-to-date shell[3] and CLI software. And as a big Homebrew fan, as well as a developer, MacPorts just wasn’t going to cut it. Tigerbrew was born.

The first version of Tigerbrew was pulled together over an evening at the hotel after the first day of the conference, and I’ve been plugging away at it regularly since. At this point I’m proud to say that a significant number of packages build flawlessly,[4] and thanks to some backports from newer versions of OS X[5] Tigerbrew can supply a much more modern set of essential development tools than Apple provides.

Tigerbrew’s still very much an alpha, and there’s some more work needed until it’s mature, but at this point I consider it ready enough to announce to the world.[6] If you have a PowerPC Mac yearning to be used again, why not give it a go?

  1. Association of Moving Image Archivists
  2. And many hardcore PowerPC users stick with their old Macs for OS 9 compatibility, which was last supported in Tiger.
  3. bash 2.05b doesn’t cut it.
  4. Even complex software with a lot of moving parts, like FFmpeg.
  5. I’m very indebted to the MacPorts developers, whose portfiles served as a reference for the buildsystems for several of these.
  6. Development’s been happening in the public for months, of course, and there are already a few other users out there.

PSA: homebrew-digipres repository now available!

Outside of archivy, I’m also a collaborator on Homebrew, the awesome, lightweight package manager for OS X. I’ve been building a private repository of niche packages which aren’t available in the core repository for some reason or another, and ended up collecting enough digital preservation tools to create a new digital preservation-focused repository. You can find the new homebrew-digipres here:

https://github.com/mistydemeo/homebrew-digipres

I’d welcome any contributions if you want to improve an existing formula, submit updates, or add a new package! Fork away.

Sigma SD1 – update

Heartbreak :’(

http://dpreview.com/news/1105/11052010sigmasd1.asp

Hiatus

Apologies for the long hiatus. I didn’t announce it on the blog, but I began a new job in January and moved to another province to take it.

I plan to get back to posting in the next bit. It will be a while until things get set up, but I promise I’ll have cool stuff to share – and I’ll get back to posting digitization tutorials in the near future as well.

Tech to watch out for – Sigma SD1

Photokina, the world’s biggest photography exhibition, took place last month and I was eagerly scanning the headlines for good book scanning news. It may not have been the biggest news of the show for photography buffs, but I was very excited to see the announcement (via DPReview) of Sigma’s new SD1 camera. I’ve been holding out posting in hopes that more concrete details or sample images would show up, but since it looks like there may not be any news for a few months I decided to go ahead.

The SD1 is the newest camera Sigma has released using Foveon sensor technology, which has been interesting for a while but never seemed quite ready until now. The SD1 marks the first time Foveon has been competitive with traditional cameras, and could mean substantially better colour reproduction than is possible right now.

Why this could be so good takes a little explanation. A computer monitor treats colour by mixing together primary colours; every pixel on your monitor has three lights, representing red, green and blue (RGB). Essentially all cameras, on the other hand, use something called a “Bayer array” on their sensor. Instead of capturing all three primary colours for every pixel, the way a monitor displays them, they capture a single colour (red, green or blue) for each pixel. When the data is processed to produce the photo image, the colour information of each pixel is then averaged with the adjacent pixels to produce a full set of RGB values for each pixel. This means that the image has full lightness, or luma, resolution for each pixel but a lower colour, or chroma, resolution because each pixel has only one true colour value.


Bayer sensor pixels. Each pixel is coloured to show the colour it detects. Illustration from Wikipedia.
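To make the averaging step concrete, here is a deliberately naive sketch in Ruby. It assumes an RGGB pixel layout and a simple 3×3 neighbourhood average; the function names are made up for the example, and real cameras use considerably more sophisticated interpolation.

```ruby
# Which primary an RGGB Bayer sensor records at (row, col).
# The RGGB layout is an assumption for this sketch.
def bayer_channel(row, col)
  if row.even?
    col.even? ? :r : :g
  else
    col.even? ? :g : :b
  end
end

# Naive reconstruction of one pixel: for each primary, average the raw values
# of every pixel in the 3x3 neighbourhood (including the pixel itself) that
# recorded that primary. `raw` is a 2D array holding one number per pixel.
def demosaic_pixel(raw, row, col)
  sums = Hash.new(0.0)
  counts = Hash.new(0)
  (-1..1).each do |dr|
    (-1..1).each do |dc|
      r = row + dr
      c = col + dc
      next if r < 0 || c < 0 || r >= raw.length || c >= raw[0].length
      channel = bayer_channel(r, c)
      sums[channel] += raw[r][c]
      counts[channel] += 1
    end
  end
  # Returns [red, green, blue] as floats.
  [:r, :g, :b].map { |ch| counts[ch].zero? ? 0.0 : sums[ch] / counts[ch] }
end
```

Only one of the three returned values at any pixel comes from that pixel’s own measurement; the other two are borrowed entirely from its neighbours, which is where the lower chroma resolution comes from.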

Foveon works in a very different way. Instead of laying the pixels out flat in a single layer, it uses three stacked layers, each sensitive to one primary. The “stacks” of pixels are associated with each other in the same way that the three colour sub-pixels in a computer monitor pixel make up one single pixel, which means that every pixel has three true colour values and a chroma resolution equal to its luma resolution.

The Sigma SD1 marks the first time that Sigma’s cameras really have the chance to wow people. Their previous highest-resolution sensors have been 4.7 megapixels, which meant that the theoretical advantage of the technology was a bit moot in the face of the competition’s raw number of pixels. The SD1, on the other hand, has a 15.4 megapixel sensor, which means that it has the potential to compete on detail with the best cameras currently available if it delivers on Sigma’s promises.

My understanding is that Foveon-based technology has been used in industrial applications for some time now, but a high-resolution Foveon sensor for consumer use would mark the first time it’s usable for those interested in building book scanners. If colour accuracy isn’t compromised, this is very promising. I’m keeping an eye open, and I’ll report on any updates.