sprezzatech

sprezzatech blog #0005

Dirty South Supercomputers and Waffles

UNIX, HPC, cyberwarfare, and chasing perf in Atlanta.

autoconfization? autotoolizing? autoconfiscation? adding autotools, pt. 1.
tue 2012-02-21 04:29:11 est

Anyone who's worked with much open source software is familiar with the invocation ./configure && make && sudo make install. In such usages, the configure script and resulting Makefile are (aside from the case of MPlayer, which — being run by mad Hungarians — does everything differently) the work of the GNU Autotools. I can't do better than the GNU project page itself for a description:

“Autoconf is an extensible package of m4 macros that produce shell scripts to automatically configure software source code packages. These scripts can adapt the packages to many kinds of UNIX-like systems without manual user intervention. Autoconf creates a configuration script for a package from a template file that lists the operating system features that the package can use, in the form of m4 macro calls.”

m4 is a POSIX-standard (1003.1) general-purpose macro processing facility courtesy of Kernighan and Ritchie. Besides Autotools, it's used by Sendmail and some RATFOR implementations; I use it to send my Christmas cards. YMMV.

I had until now forgone Autotools in my own packages, for any number of reasons (Linux bigotry, horrible stories of buggy ./configure scripts, knowing how difficult it is to properly handle cancellation (especially portably) in scripts and Makefiles, the belief that portability issues can be more properly addressed in code itself, habit).

A digression, if you'll allow one, regarding make and particularly GNU Make. First off, if you've never read it, please go read, reread, internalize and never forget Peter Miller's 1997 classic, “Recursive Make Considered Harmful”. It's the single best bit of advice regarding make I've ever read, and I end up having to print it out and thrust it upon my peers everywhere I go work. Check out Paul Smith's Rules of Makefiles while you're at it, and my own GNU Make page has some useful animadversions. If you follow the suggestion about using a portable make (i.e. GNU Make) rather than writing portable Makefiles, you'll want USE_GMAKE=yes in your FreeBSD Port (FreeBSD uses Berkeley Make by default). It's best in any case to rely on portable tools, however, so take GNU's advice: include SHELL:=/bin/sh [0] in your Makefile's header, and avoid utilities beyond awk, cat, cmp, cp, diff, echo, egrep, expr, false, grep, install-info, ln, ls, mkdir, mv, printf, pwd, rm, rmdir, sed, sleep, sort, tar, test, touch, tr, and true [1], [2].

After Steve Jenson (of Twitter's infrastructure team) asked about whether Libtorque could run on Mac OSX, however, I realized it was about time that I took the plunge and studied Autotools. Since then, I've added (fairly minimal) Autotools support to Omphalos; this post details such additions, and the lessons learned.

One of my major blocks regarding Autotools was its lack of integration into standard texts (and generally shitty man pages), and absence from the POSIX standards. You'll find no mention of it in Stevens and Rago's Advanced Programming in the UNIX® Environment or Kerrisk's Linux Programming Interface. ESR's Art of UNIX Programming has just over a page of material, with no usage notes. At least to me, there's long seemed something ad hoc and disreputable about Autotools. I've not yet read Calcote's Autotools: A Practitioner's Guide (No Starch Press is very much a hit-or-miss publishing house, largely motivated by authors (like O'Reilly), marked by dubious editing, and at times little better than a bound copy of the program manuals), but it looks pretty decent. Red Hat makes freely available an older book, GNU Autoconf, Automake and Libtool. For now, your best reference appears to be the Autoconf manual on the GNU site or in your Autotools distribution. Throughout this article, I'll be referring to the 2010-09-22 publication of the manual, corresponding to the 2.68 release.

My goals regarding Autotools included:

Replacement of ad hoc environment variables used to control the build (well, they weren't all ad hoc; where possible, I use identifiers from the GNU Makefile Conventions, including $DESTDIR and the Standard Targets). They required explicit documentation and were, I felt, hard to use correctly (they needed to be maintained across make invocations; I can never remember whether they need be defined via env/export or if they can be defined as arguments to make; ditto when they get exported to subprocesses; the rules regarding undefined-vs-empty variables (of much relevance to the ?= operator syntax and GNU Make's origin function) are baroque indeed, &c.). By using Autotools, I'd get standard configuration syntax (e.g. --prefix=PREFIX), online configuration help (./configure --help), and the options provided would be written to a file maintained across builds (this last seems both a good and a bad thing).
Eased porting to other platforms. Experiences building shared libraries on Mac OSX in 2010 had left an ashen taste in my mouth. Moving to FreeBSD had always involved some annoyance (off the top of my head: absence of mremap(2), different semantics of mmap(2), different prototypes on sendfile(2), slightly different getopt_long(3)s, &c. ad nauseam). OSX support is pretty much a requirement these days, and I personally care to use that platform as absolutely little as possible. libnl is constantly in a state of flux, and I track GCC quite closely; even if Autotools could just manage use of -march=native, that would be a nice win.
Least Astonishment. People expect open source projects to use Autotools. Despite eschewing it for years, I myself become wary when a project lacks good ol' configure. Conforming to expectations goes a long way towards overcoming curmudgeonly users' natural resistance to newfangled software.
Minimal changes to my existing source or semantics. I didn't want to introduce Autotools-specific syntax all over my source tree; preferably, I'd need modify only my Makefile, and perhaps rewrite some source in terms of a new configuration header. Similarly, I wanted to retain all my various toolchain- and system-dependent functionality, when possible.

DRAMATIS PERSONAE

Autotools: The suite of GNU build configuration tools. Sometimes called merely “Autoconf” in a bit of glib synecdoche (ambiguous pars pro toto). Don't do this.
autoreconf: Runs subtools as necessary to (re)generate everything up through configure. The generated Makefile contains necessary dep information to handle rebuilding once it's created.
automake: Generates Makefile.in (input to configure) from a provided Makefile.am.
aclocal: Generates aclocal.m4, a collection of m4 macros used by configure. autoreconf ought handle this for you; I've not had to use aclocal.
autoconf: Generates configure from configure.in and various helper files. Cf. “Autotools”.
autoscan: Attempts to generate configure.in via simple analysis of source code.
libtool: Helper for building (shared) libraries.
autom4te: No clue. Haven't had to use it, for which I'm thankful, as typing 'autom4te' would be sure to piss me off. Appears to be a cute pun on “m4”.
autoupdate: Updates configure.in from one version of autoconf syntax to another. Creepy!
autoheader: Generates a template header (typically config.h.in) based off the configure.in AC_CONFIG_HEADERS directive (AM_CONFIG_HEADER is an older Automake spelling).
install-sh: First off, it ought be called install.sh from all I can tell. Portable GNU install-like script. Generated by autoreconf -i or automake -a.
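
The first goal's complaint about where a build variable must be defined can be made concrete. What follows is a minimal sketch, assuming GNU Make is installed as make; the file name demo.mk and the variable PREFIX are illustrative only. It asks Make to report the variable's value and origin when it is unset, set in the environment, set but empty in the environment, and passed as an argument.

```shell
#!/bin/sh
# Sketch: where GNU Make thinks a variable came from.
# demo.mk and PREFIX are hypothetical names chosen for illustration.

if ! make --version 2>/dev/null | grep -q 'GNU Make'; then
    echo "GNU Make not found as 'make'; skipping demo" >&2
    exit 0
fi

tmp=$(mktemp -d) || exit 1
trap 'rm -rf "$tmp"' EXIT
cd "$tmp" || exit 1

# $(info ...) runs while the makefile is parsed, so no tab-indented
# recipe is needed; "all: ;" gives the default goal an empty recipe.
cat > demo.mk <<'EOF'
PREFIX ?= /usr/local
$(info PREFIX=$(PREFIX) origin=$(origin PREFIX))
all: ;
EOF

# Unset: ?= supplies the default, so the origin is the makefile itself.
case_unset=$(unset PREFIX; make -s -f demo.mk)
# Set in the environment: ?= does nothing, the environment value is used.
case_env=$(PREFIX=/opt make -s -f demo.mk)
# Set but empty: still "defined", so ?= does not apply and PREFIX stays empty.
case_empty=$(PREFIX= make -s -f demo.mk)
# Passed as an argument to make: overrides ordinary makefile assignments.
case_arg=$(make -s -f demo.mk PREFIX=/opt)

printf '%s\n' "unset: $case_unset" "env:   $case_env" \
              "empty: $case_empty" "arg:   $case_arg"
```

The empty-but-defined case is the one that tends to bite: the default is silently skipped and the build proceeds with an empty value.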

Adoption of Autotools means partitioning your userbase into two camps: those who have Autotools installed, and those who do not. If you distribute the Autotools output (by which I primarily mean configure), the tools themselves are not necessary to build your package. It is thus recommended that release tarballs include the generated files, and indeed this is exactly what's happening when you run configure out of a downloaded tarball. As you develop your program, you'll need update your Autotools scripts in tandem, and furthermore you'll want to take advantage of new Autotools versions as they're released. It is thus necessary to retain the Autotools inputs in your source tree. Whether to keep the outputs there is a matter of debate: with them checked in, users without Autotools installed can build from a source checkout. Unfortunately, their inclusion of various line numbers &c. leads to a great deal of churn, and viewing changes across runs is not always a profitable endeavor. If you're not going to keep them checked in, be sure to add them to .gitignore files or svn:ignore properties. There's then the issue of whether your Makefile ought clean up Autotools output: since Makefile itself ends up being an output, make would need remove its own control file as part of the clean target!
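
If the generated files are kept out of version control, an ignore list along these lines is one possible starting point. This is a sketch assuming Git; the entries are files commonly produced by autoreconf -fi, Automake, and configure, and any given project may generate more or fewer of them.

```shell
#!/bin/sh
# Sketch: append commonly generated Autotools outputs to .gitignore.
# Run from the top of the source tree; prune entries your project
# does not actually produce.
cat >> .gitignore <<'EOF'
# Autotools outputs (regenerate with autoreconf -fi)
/aclocal.m4
/autom4te.cache/
/configure
/config.h.in
/install-sh
/missing
/depcomp
Makefile.in
# configure outputs
/config.h
/config.log
/config.status
/stamp-h1
Makefile
EOF
```

Note that Makefile appears in the list while Makefile.am does not: the former is an output, the latter an input that stays checked in.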

Take a second to acquaint yourself with the “Dramatis Personae” table above.

autoreconf will complain unless certain project-specific files required by the GNU Standards are present, including AUTHORS, ChangeLog, NEWS, and COPYING. I found this behavior most vexing: the files themselves are free-form; their contents are unused.
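
One low-effort way to satisfy the check, sketched here, is to create empty placeholders for whichever of those files are missing. The file list is taken from the paragraph above. An alternative, if GNU-style strictness is not wanted, is Automake's foreign option (for example AM_INIT_AUTOMAKE([foreign]) in configure.in), which relaxes the requirement.

```shell
#!/bin/sh
# Sketch: create empty placeholders for the GNU-standard files that are
# absent. Existing files are left untouched. COPYING should eventually
# hold the project's real license text rather than stay empty.
for f in AUTHORS ChangeLog NEWS COPYING; do
    [ -e "$f" ] || : > "$f"
done
```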

Thankfully, you'll rarely end up needing to use more than one command: autoreconf. It knows the various dependency chains among Autotools and their inputs, and will properly rebuild as necessary. The -f option will forcefully reconfigure, and -i will make local copies of “missing files” e.g. install-sh. I have thus added to the Omphalos README a note to run “autoreconf -fi” when building from source. Using the -m switch, autoreconf will invoke ./configure and make if necessary; this yields an end-to-end rebuild on platforms with Autotools installed.
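
The README instruction can also be wrapped in a small helper that spells out each step. The following is a sketch only: the function name bootstrap is hypothetical, and it assumes a Bourne-compatible shell with Autotools and make on the PATH. It refuses to run outside a tree containing configure.ac or configure.in, then regenerates, configures, and builds.

```shell
#!/bin/sh
# bootstrap: hypothetical helper; regenerate configure, then configure
# and build. Any arguments (e.g. --prefix=/usr/local) are passed through
# to ./configure unchanged.
bootstrap() {
    if [ ! -f configure.ac ] && [ ! -f configure.in ]; then
        echo "bootstrap: no configure.ac or configure.in in $(pwd)" >&2
        return 1
    fi
    if ! command -v autoreconf >/dev/null 2>&1; then
        echo "bootstrap: autoreconf not found; install Autotools" >&2
        return 1
    fi
    autoreconf -fi && ./configure "$@" && make
}
```

From the top of a source checkout, this would be invoked as, say, bootstrap --prefix="$HOME/.local".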

So, let's go to a project lacking Autotools support:

[skynet](0) $ make clean
rm -rf .out core
[skynet](0) $ ls
doc  GNUmakefile  LICENSE  README  src  tools
[skynet](0) $

If you're the kind who skips to the end, we're going to end up changing this in at least two ways ([3]): we'll be creating a configure.in, and renaming GNUmakefile to Makefile.am after making some changes to it. Try running autoreconf -fi:

[skynet](0) $ autoreconf -fi
autoreconf: `configure.ac' or `configure.in' is required
[skynet](1) $

Construction of configure.in appears to be, without a doubt, one of the more painful experiences in UNIX development. Even the first macro [4] seems an issue of some confusion, and you'll find divergent examples in various tutorials. As for me and my house, we shall look to the penultimate source:

Macro: AC_INIT (package, version, [bug-report], [tarname], [url])

package is a full package name, including whitespace. bug-report is an email address. Without this macro, no configure script will be generated.

AC_INIT([omphalos], [0.99.1-pre], [omphalos-dev@googlegroups.com],
	[omphalos], [http://dank.qemfd.net/dankwiki/index.php/Omphalos])

With this line added, autoreconf generates a minimal configure script, which can be invoked with e.g. --help arguments.
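
To try this in isolation, the sketch below writes a one-macro configure.ac into a scratch directory (autoreconf accepts either that name or configure.in, as its error message above indicates) and, if autoreconf is installed, generates configure and asks it for help. The package name demo, the version 0.1, and the example.invalid address are placeholders, not Omphalos's values.

```shell
#!/bin/sh
# Sketch: smallest configure.ac that yields a configure script.
# demo, 0.1 and the bug-report address are placeholders.
scratch=$(mktemp -d) || exit 1
cat > "$scratch/configure.ac" <<'EOF'
AC_INIT([demo], [0.1], [bugs@example.invalid])
EOF

if command -v autoreconf >/dev/null 2>&1; then
    # Generate configure, then show the first lines of its built-in help.
    (cd "$scratch" && autoreconf -fi && ./configure --help | head -n 5)
else
    echo "autoreconf not installed; wrote $scratch/configure.ac only" >&2
fi
```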

In the following article of this two-parter, we'll walk through what else to put in configure.in, and modifying our Makefile to build based off of configure's output.

[0] Some systems install e.g. Bash as /bin/sh [5], but POSIX-compliant shells will use Bourne-compatibility modes when invoked as /bin/sh.
[1] Actually, don't use sleep even if it is standard, because you're almost certainly just papering over a bug.
[2] See also the Autoconf manual's chapter regarding Portable Shell Programming.
[3] We'll also, of course, be setting up the necessary ignores in our source control system of choice.
[4] Technically, it must merely appear “before any macros causing output.” I've not yet run into a situation where one needs a silent macro prior to AC_INIT, and hope with some fervency that this remains the case.
[5] And some install Dash. See this lengthy Ubuntu bug and their Dash as /bin/sh Wiki page for in-depth info.
