[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

Preface

The purpose of this document is to introduce you to the GNU build system, and show you how to use it to write good code. It also discusses peripheral topics such as how to use GNU Emacs as a source code navigator, how to write good software, and the philosophical concerns behind the free software movement. The intended reader should be a software developer who knows his programming languages, and wants to learn how to package his programs in a way that follows the GNU coding standards.

This manual introduces you to the GNU build system and shows you how to develop high-quality

This manual shows you how to develop high-quality software on GNU using the GNU build system that conforms to the GNU coding standards. These techniques are also useful for software development on GNU/Linux and most variants of the Unix system. In fact, one of the reasons for the elaborate GNU build system was to make software portable between GNU and other similar operating systems. We also discuss peripheral topics such as how to use GNU Emacs as an IDE (integrated development environment), and the various practical, legal and philosophical concerns behind software development.

When we speak of the GNU build system we refer primarily to the following four packages:

The GNU build system has two goals. The first is to simplify the development of portable programs. The second is to simplify the building of programs that are distributed as source code. The first goal is achieved by the automatic generation of a `configure' shell script. The second goal is achieved by the automatic generation of Makefiles and other shell scripts that are typically used in the building process. This way the developer can concentrate on debugging his source code, instead of his overly complex Makefiles. And the installer can compile and install the program directly from the source code distribution by a simple and automatic procedure.

The GNU build system needs to be installed only when you are developing programs that are meant to be distributed. To build a program from distributed source code, you only need a working make, a compiler, a shell, and sometimes standard Unix utilities like sed, awk, yacc, lex. The objective is to make software installation as simple and as automatic as possible for the installer. Also, by setting up the GNU build system such that it creates programs that don't require the build system to be present during their installation, it becomes possible to use the build system to bootstrap itself.

Some tasks that are simplified by the GNU build system include:

The Autotoolset package complements the GNU build system by providing the following additional features:

Autotoolset is still under development and there may still be bugs. At the moment Autotoolset doesn't do shared libraries, but that will change in the future.

This effort began by my attempt to write a tutorial for Autoconf. It involved into "Learning Autoconf and Automake". Along the way I developed Autotoolset to deal with things that annoyed me or to cover needs from my own work. Ultimately I want this document to be both a unified introduction of the GNU build system as well as documentation for the Autotoolset package.

I believe that knowing these tools and having this know-how is very important, and should not be missed from engineering or science students who will one day go out and do software development for academic or industrial research. Many students are incredibly undertrained in software engineering and write a lot of bad code. This is very very sad because of all people, it is them that have the greatest need to write portable, robust and reliable code. I found from my own experience that moving away from Fortran and C, and towards C++ is the first step in writing better code. The second step is to use the sophisticated GNU build system and use it properly, as described in this document. Ultimately, I am hoping that this document will help people get over the learning curve of the second step, so they can be productive and ready to study the reference manuals that are distributed with all these tools.

This manual of course is still under construction. When I am done constructing it some paragraph somewhere will be inserted with the traditional run-down of summaries about each chapter. I write this manual in a highly non-linear way, so while it is under construction you will find that some parts are better-developed than others. If you wish to contribute sections of the manual that I haven't written or haven't yet developed fully, please contact me.

Chapters 1,2,3,4 are okay. Chapter 5 is okay to, but needs a little more work. I removed the other chapters to minimize confusion, but the sources for them are still being distributed as part of the Autotoolset package for those that found them useful. The other chapters need a lot of rewriting and they would do more harm than good at this point to the unsuspecting reader. Please contact me if you have any suggestions for improving this manual.

Remarks by Marcelo: I am currently updating this manual to the last release of the autoconf/automake tools.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

Acknowledgements

This document and the Autotools package have originally been written by Eleftherios Gkioulekas. Many people have further contributed to this effort, directly or indirectly, in various way. Here is a list of these people. Please help me keep it complete and exempt of errors.

FIXME: I need to start keeping track of acknowledgements here


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

Copying

This book that you are now reading is actually free. The information in it is freely available to anyone. The machine readable source code for the book is freely distributed on the internet and anyone may take this book and make as many copies as they like. (take a moment to check the copying permissions on the Copyright page). If you paid money for this book, what you actually paid for was the book's nice printing and binding, and the publisher's associated costs to produce it.

The following notice refers to the Autotoolset package, which includes this documentation, as well as the source code for utilities like `acmkdir' and for additional Autoconf macros. The complete GNU development tools involves other packages also, such as Autoconf, Automake, Libtool, Make, Emacs, Texinfo, the GNU C and C++ compilers and a few other accessories. These packages are free software, and you can obtain them from the Free Software Foundation. For details on doing so, please visit their web site http://www.fsf.org/. Although Autotoolset has been designed to work with the GNU build system, it is not yet an official part of the GNU project.

The Autotoolset package is also "free"; this means that everyone is free to use it and free to redistribute it on a free basis. The Autotoolset package is not in the public domain; it is copyrighted and there are restrictions on its distribution, but these restrictions are designed to permit everything that a good cooperating citizen would want to do. What is not allowed is to try to prevent others from further sharing any version of this package that they might get from you.

Specifically, we want to make sure that you have the right to give away copies of the programs that relate to Autotoolset, that you receive source code or else can get it if you want it, that you can change these programs or use pieces of them in new free programs, and that you know you can do these things.

To make sure that everyone has such rights, we have to forbid you to deprive anyone else of these rights. For example, if you distribute copies of the Autotoolset-related code, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must tell them their rights.

Also, for our own protection, we must make certain that everyone finds out that there is no warranty for the programs that relate to Autotoolset. If these programs are modified by someone else and passed on, we want their recipients to know that what they have is not what we distributed, so that any problems introduced by others will not reflect on our reputation.

The precise conditions of the licenses for the programs currently being distributed that relate to Autotoolset are found in the General Public Licenses that accompany them.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

1. Installing GNU software

Free software is distributed in source code distributions. Many of these programs are difficult to install because they use system dependent features, and they require the user to edit makefiles and configuration headers. By contrast, the software distributed by the GNU project is autoconfiguring; it is possible to compile it from source code and install it automatically, without any tedious user intervention.

In this chapter we discuss how to compile and install autoconfiguring software written by others. In the subsequent chapters we discuss how to use the development tools that allow you to make your software autoconfiguring as well.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

1.1 Installing a GNU package

Autoconfiguring software is distributed with packaged source code distributions. These are big files with filenames of the form:

 
package-version.tar.gz

For example, the file `autoconf-2.57.tar.gz' contains version 2.57 of GNU Autoconf. We often call these files source distributions; sometimes we simply call them packages.

The steps for installing an autoconfiguring source code distribution are simple, and if the distribution is not buggy, can be carried out without substantial user intervention.

  1. First, you have to unpack the package to a directory:
     
    % gunzip foo-1.0.tar.gz
    % tar xf foo-1.0.tar
    

    This will create the directory `foo-1.0' which contains the package's source code and documentation. Look for the files `README' to see if there's anything that you should do next. The `README' file might suggest that you need to install other packages before installing this one, or it might suggest that you have to do unusual things to install this package. If the source distribution conforms to the GNU coding standards, you will find many other documentation files like `README'. See section Maintaining the documentation files, for an explanation of what these files mean.

  2. Configure the source code. Once upon a time that used to mean that you have to edit makefiles and header files. In the wonderful world of Autoconf, source distributions provide a `configure' script that will do that for you automatically. To run the script type:
     
    % cd foo-1.0
    % ./configure
    
  3. Now you can compile the source code. Type:
     
    % make
    

    and if the program is big, you can make some coffee. After the program compiles, you can run its regression test-suite, if it has one, by typing

     
    % make check
    
  4. If everything is okey, you can install the compiled distribution with:
     
    % su
    # make install
    

The `make' program launches the shell commands necessary for compiling, testing and installing the package from source code. However, `make' has no knowledge of what it is really doing. It takes its orders from makefiles, files called `Makefile' that have to be present in every subdirectory of your source code directory tree. From the installer perspective, the makefiles define a set of targets that correspond to things that the installer wants to do. The default target is always compiling the source code, which is what gets invoked when you simply run make. Other targets, such as `install', `check' need to be mentioned explicitly. Because `make' takes its orders from the makefile in the current directory, it is important to run it from the correct directory. See section Compiling with Makefiles, for the full story behind `make'.

The `configure' program is a shell script that probes your system through a set of tests to determine things that it needs to know, and then uses the results to generate `Makefile' files from templates stored in files called `Makefile.in'. In the early days of the GNU project, developers used to write `configure' scripts by hand. Now, no-one ever does that any more. Now, `configure' scripts are automatically generated by GNU Autoconf from an input file `configure.in'. GNU Autoconf is part of the GNU build system and we first introduce in in The GNU build system.

As it turns out, you don't have to write the `Makefile.in' templates by hand either. Instead you can use another program, GNU Automake, to generate `Makefile.in' templates from higher-level descriptions stored in files called `Makefile.am'. In these files you describe what is being created by your source code, and Automake computes the makefile targets for compiling, installing and uninstalling it. Automake also computes targets for compiling and running test suites, and targets for recursively calling make in subdirectories. The details about Automake are first introduced in Using Automake and Autoconf.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

1.2 The Makefile standards

The GNU coding standards are a document that describes the requirements that must be satisfied by all GNU programs. These requirements are driven mainly by technical considerations, and they are excellent advice for writing good software. The makefile standards, a part of the GNU coding standards, require that your makefiles do a lot more than simply compile and install the software.

One requirement is cleaning targets; these targets remove the files that were generated while installing the package and restore the source distribution to a previous state. There are three cleaning targets that corresponds to three levels of cleaning: clean, distclean, maintainer-clean.

clean

Cleans up all the files that were generated by make and make check, but not the files that were generated by running configure. This targets cleans the build, but does not undo the source configuration by the configure script.

distclean

Cleans up all the files generated by make and make check, but also cleans the files that were generated by running configure. As a result, you can not invoke any other make targets until you run the configure script again. This target reverts your source directory tree back to the state in which it was when you first unpacked it.

maintainer-clean

Cleans up all the files that distclean cleans. However it also removes files that the developers have automatically generated with the GNU build system. Because users shouldn't need the entire GNU build system to install a package, these files should not be removed in the final source distribution. However, it is occasionally useful for the maintainer to remove and regenerate these files.

Another type of cleaning that is required is erasing the package itself from the installation directory; uninstalling the package. To uninstall the package, you must call

 
% make uninstall

from the top level directory of the source distribution. This will work only if the source distribution is configured first. It will work best only if you do it from the same source distribution, with the same configuration, that you've used to install the package in the first place.

When you install GNU software, archive the source code to all the packages that you install in a directory like `/usr/src' or `/usr/local/src'. To do that, first run make clean on the source distribution, and then use a recursive copy to copy it to `/usr/src'. The presence of a source distribution in one of these directories should be a signal to you that the corresponding package is currently installed.

Francois Pinard came up with a cute rule for remembering what the cleaning targets do:

GNU standard compliant makefiles also have a target for generating tags. Tags are files, called `TAGS', that are used by GNU Emacs to allow you to navigate your source distribution more efficiently. More specifically, Emacs uses tags to take you from a place where a C function is being used in a file, to the file and line number where the function is defined. To generate the tags call:

 
% make tags

Tags are particularly useful when you are not the original author of the code you are working on, and you haven't yet memorized where everything is. See section Navigating source code, for all the details about navigating large source code trees with Emacs.

Finally, in the spirit of free redistributable code, there must be targets for cutting a source code distribution. If you type

 
% make dist

it will rebuild the `foo-1.0.tar.gz' file that you started with. If you modified the source, the modifications will be included in the distribution (and you should probably change the version number). Before putting a distribution up on FTP, you can test its integrity with:

 
% make distcheck

This makes the distribution, then unpacks it in a temporary subdirectory and tries to configure it, build it, run the test-suite, and check if the installation script works. If everything is okey then you're told that your distribution is ready.

Writing reliable makefiles that support all of these targets is a very difficult undertaking. This is why we prefer to generate our makefiles instead with GNU Automake.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

1.3 Configuration options

The `configure' script accepts many command-line flags that modify its behaviour and the configuration of your source distribution. To obtain a list of all the options that are available type

 
% ./configure --help

on the shell prompt.

The most useful parameter that the installer controls during configuration is the directory where they want the package to be installed. During installation, the following files go to the following directories:

 
Executables   → /usr/local/bin
Libraries     → /usr/local/lib
Header files  → /usr/local/include
Man pages     → /usr/local/man/man?
Info files    → /usr/local/info

The `/usr/local' directory is called the prefix. The default prefix is always `/usr/local' but you can set it to anything you like when you call `configure' by adding a `--prefix' option. For example, suppose that you are not a privileged user, so you can not install anything in `/usr/local', but you would still like to install the package for your own use. Then you can tell the `configure' script to install the package in your home directory `/home/username':

 
% ./configure --prefix=/home/username
% make
% make check
% make install

The `--prefix' argument tells `configure' where you want to install your package, and `configure' will take that into account and build the proper makefile automatically.

If you are installing the package on a filesystem that is shared by computers that run variations of GNU or Unix, you need to install the files that are independent of the operating system in a shared directory, but separate the files that are dependent on the operating systems in different directories. Header files and documentation can be shared. However, libraries and executables must be installed separately. Usually the scheme used to handle such situations is:

 
Executables   → /usr/local/system/bin
Libraries     → /usr/local/system/lib
Header files  → /usr/local/include
Man pages     → /usr/local/man/mann
Info files    → /usr/local/info

The directory `/var/local/system' is called the executable prefix, and it is usually a subdirectory of the prefix. In general, it can be any directory. If you don't specify the executable prefix, it defaults to being equal to the prefix. To change that, use the `--exec-prefix' flag. For example, to configure for a GNU/Linux system, you would run:

 
% configure --exec-prefix=/usr/local/linux

To configure for GNU/Hurd, you would run:

 
% configure --exec-prefix=/usr/local/hurd

In general, there are many directories where a package may want to install files. Some of these directories are controlled by the prefix, where others are controlled by the executable prefix. See section Installation standard directories, for a complete discussion of what these directories are, and what they are for.

Some packages allow you to enable or disable certain features while you configure the source code. They do that with flags of the form:

 
   --with-package   --enable-feature
--without-package  --disable-feature

The --enable flags usually control whether to enable certain optional features of the package. Support for international languages, debugging features, and shared libraries are features that are usually controlled by these options. The --with flags instead control whether to compile and install certain optional components of the package. The specific flags that are available for a particular source distribution should be documented in the `README' file.

Finally, configure scripts can be passed parameters via environment variables. One of the things that configure does is decide what compiler to use and what flags to pass to that compiler. You can overrule the decisions that configure makes by setting the flags CC and CFLAGS. For example, to specify that you want the package to compile with full optimization and without any debugging symbols (which is a bad idea, yet people want to do it):

 
% export CFLAGS="-O3"
% ./configure

To tell configure to use the system's native compiler instead of gcc, and compile without optimization and with debugging symbols:

 
% export CC="cc"
% export CFLAGS="-g"
% ./configure

This assumes that you are using the bash shell as your default shell. If you use the csh or tcsh shells, you need to assign environment variables with the setenv command instead. For example:

 
% setenv CFLAGS "-O3"
% ./configure 

Similarly, the flags CXX, CXXFLAGS control the C++ compiler.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

1.4 Doing a VPATH build

Autoconfiguring source distributions also support vpath builds. In a vpath build, the source distribution is stored in a, possibly read-only, directory, and the actual building takes place in a different directory where all the generated files are being stored. We call the first directory, the source tree, and the second directory the build tree. The build tree may be a subdirectory of the source tree, but it is better if it is a completely separate directory.

If you, the developer, use the standard features of the GNU build system, you don't need to do anything special to allow your packages to support vpath builds. The only exception to this is when you define your own make rules (see section General Automake principles). Then you have to follow certain conventions to allow vpath to work correctly.

You, the installer, however do need to do something special. You need to install and use GNU make. Most Unix make utilities do not support vpath builds, or their support doesn't work. GNU make is extremely portable, and if vpath is important to you, there is no excuse for not installing it.

Suppose that `/sources/foo-0.1' contains a source distribution, and you want to build it in the directory `/build/foo-0.1'. Assuming that both directories exist, all you have to do is:

 
% cd /build/foo-0.1
% /sources/foo-0.1/configure ...options...
% make
% make check
% su
# make install

The configure script and the generated makefiles will take care of the rest.

vpath builds are preferred by some people for the following reasons:

  1. They prevent the build process form cluttering your source directory with all sorts of build files.
  2. To remove a build, all you have to do is remove the build directory.
  3. You can build the same source multiple times using different options. This is very useful if you would like to write a script that will run the test suite for a package while the package is configured in many different ways (e.g. different features, different compiler optimization, and so on). It is also useful if you would like to do the same with releasing binary distributions of the source.

Some developers like to use vpath builds all the time. Others use them only when necessary. In general, if a source distribution builds with a vpath build, it also builds under the ordinary build. The opposite is not true however. This is why the distcheck target checks if your distribution is correct by attempting a vpath build.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

1.5 Making a binary distribution

After compiling a source distribution, instead of installing it, you can make a snapshot of the files that it would install and package that snapshot in a tarball. It is often convenient to the installers to install from such snapshots rather than compile from source, especially when the source is extremely large, or when the amount of packages that they need to install is large.

To create a binary distribution run the following commands as root:

 
# make install DESTDIR=/tmp/dist
# tar -C /tmp/dist -cvf package-version.tar
# gzip -9 package-version.tar

The variable DESTDIR specifies a directory, alternative to root, for installing the compiled package. The directory tree under that directory is the exact same tree that would have normally been installed. Why not just specify a different prefix? Because very often, the prefix that you use to install the software affects the contents of the files that actually get installed.

Please note that under the terms of the GNU General Public License, if you distribute your software as a binary distribution, you also need to provide the corresponding source distribution. The simplest way to comply with this requirement is to distribute both distributions together.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

2. Writing Good Programs


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

2.1 Why good code is important

When you work on a software project, one of your short-term goals is to solve a problem at hand. If you are doing this because someone asked you to solve the problem, then all you need to do to look good in per eyes is to deliver a program that works. Nevertheless, regardless of how little person may appreciate this, doing just that is not good enough. Once you have code that gives the right answer to a specific set of problems, you will want to make improvements to it. As you make these improvements, you would like to have proof that your code's known reliability hasn't regressed. Also, tomorrow you will want to move on to a different set of related problems by repeating as little work as possible. Finally, one day you may want to pass the project on to someone else or recruit another developer to help you out with certain parts. You need to make it possible for the other person to get up to speed without reinventing your efforts. To accomplish these equally important goals you need to write good code.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

2.2 Choosing a good programming language

To write a good software, you must use the appropriate programming language and use it well. To make your software free, it should be possible to compile it with free tools on a free operating system. Therefore, you should avoid using programming languages that do not have a free compiler.

The C programming language is the native language of GNU, and the GNU coding standards encourage you to program in C. The main advantages of C are that it can be compiled with the system's native compiler, many people know C, and it is easy to learn. Nevertheless, C has weaknesses: it forces you to manually manage memory allocation, and any mistakes you might make can lead to very difficult bugs. Also C forces you to program at a low level. Sometimes it is desirable to program at a low level, but there are also cases where you want to build on a higher level.

For projects where you would like a higher-level compiled language, the recommended choice is to use C++. The GNU project distributes a free C++ compiler and nowadays most GNU systems that have a C compiler also have the free C++ compiler. The main advantage of C++ is that it will automatically manage dynamic memory allocation for you. C++ also has a lot of powerful features that allow you to program at a higher level than C, bringing you closer to the algorithms and the concepts involved, and making it easier to write robust programs. At the same time, C++ does not hide low-level details from you and you have the freedom to do the same low-level hacks that you had in C, if you choose to. In fact C++ is 99% backwards compatible with C and it is very easy to mix C and C++ code. Finally, C++ is an industry standard. As a result, it has been used to solve a variety of real-world problems and its specification has evolved for many years to make it a powerful and mature language that can tackle such problems effectively. The C++ specification was frozen and became an ANSI standard in 1998.

One of the disadvantages of C++ is that C++ object files compiled by different C++ compilers can not be linked together. In order to compile C++ to machine language, a lot of compilation issues need to be deferred to the linking stage. Because object file formats are not traditionally sophisticated enough to handle these issues, C++ compilers do various ugly kludges. The problem is that different compilers do these kludges differently, making object files across compilers incompatible. This is not a terrible problem, since object files are incompatible across different platforms anyways. It is only a problem when you want to use more than one compiler on the same platform. Another disadvantage of C++ is that it is harder to interface a C++ library to another language, than it is to interface a C library. Finally not as many people know C++ as well as they know C, and C++ is a very extensive and difficult language to master. However these disadvantages must be weighted against the advantages. There is a price to using C++ but the price comes with a reward.

If you need a higher-level interpreted language, then the recommended choice is to use Guile. Guile is the GNU variant of Scheme, a LISP-like programming language. Guile is an interpreted language, and you can write full programs in Guile, or use the Guile interpreter interactively. Guile is compatible with the R4RS standard but provides a lot of GNU extensions. The GNU extensions are so extensive that it is possible to write entire applications in Guile. Most of the low-level facilities that are available in C, are also available in Guile.

What makes the Guile implementation of Scheme special is not the extensions themselves, but the fact that it it is very easy for any developer to add their own extensions to Guile, by implementing them in C. By combining C and Guile you leverage the advantages of both compiled and interpreted languages. Performance critical functionality can be implemented in C and higher-level software development can be done in Guile. Also, because Guile is interpreted, when you make your C code available through an extended Guile interpreter, then the user can also use the functionality of that code interactively through the interpreter.

The idea of extensible interpreted languages is not new. Other examples of extensible interpreted languages are Perl, Python and Tcl. What sets Guile apart from these languages is the elegance of Scheme. Scheme is the holy grail in the quest for a programming language that can be extended to support any programming paradigm by using the least amount of syntax. Scheme has natural support for both arbitrary precision integer arithmetic and floating point arithmetic. The simplicity of Scheme syntax, and the completeness of Guile, make it very easy to implement specialized scripting languages simply by translating them to Scheme. In Scheme algorithms and data are interchangeable. As a result, it is easy to write Scheme programs that manipulate Scheme source code. This makes Scheme an ideal language for writing programs that manipulate algorithms instead of data, such as programs that do symbolic algebra. Because Scheme can manipulate its own source code, a Scheme program can save its state by writing Scheme source code into a file, and by parsing it later to load it back up again. This feature alone is one reason why engineers should use Guile to configure and drive numerical simulations.

Some people like to use Fortran 77. This is in many ways a good language for developing the computational core of scientific applications. We do have free compilers for Fortran 77, so using it does not restrict our freedom. (see section Using Fortran effectively) Also, Fortran 77 is an aggressively optimized language, and this makes it very attractive to engineers that want to write code optimized for speed. Unfortunately, Fortran 77 can not do well anything except array-oriented numerical computations. Managing input/output is unnecessarily difficult with Fortran, and there's even computational areas, such as infinite precision integer arithmetic and symbolic computation that are not supported.

There are many variants of Fortran like Fortran 90, and HPF. Fortran 90 attempts, quite miserably, to make Fortran 77 more like C++. HPF allows engineers to write numerical code that runs on parallel computers. These variants should be avoided for two reasons:

  1. There are no free compilers for Fortran 90 or HPF. If you happen to use a proprietary operating system, you might as well make use of proprietary compilers if they generate highly optimized code and that is important to you. Nevertheless, in order for your software to be freed, it should be possible to compile it with free tools on a free operating system. Because it is possible to make parallel computers using GNU/Linux (see the Beowulf project), parallelized software can also be free. Therefore both Fortran 90 and HPF should be avoided.
  2. Another problem with these variants is that they are ad hoc languages that have been invented to enable Fortran to do things that it can not do by design. Eventually, when engineers will like to do things that Fortran 90 can't do either, it will be necessary to extend Fortran again, rewrite the compilers and produce yet another variant. What engineers need is a programming language that has the ability to self extend itself by writing software in the same programming language. The C++ programming language can do this without loss of performance. The departmentalization of disciplines in academia has made it very difficult for such a project to take off. Despite that, there is ongoing research in this area. (for example, see the Blitz++ project)

It is almost impossible to write good programs entirely in Fortran, so please use Fortran only for the numerical core of your application and do the bookkeeping tasks, including input/output using a more appropriate language.

If you have written a program entirely in Fortran, please do not ask anyone else to maintain your code, unless person is like you and also knows only Fortran. If Fortran is the only language that you know, then please learn at least C and C++ and use Fortran only when necessary. Please do not hold the opinion that contributions in science and engineering are "true" contributions and software development is just a "tool". This bigoted attitude is behind the thousands of lines of ugly unmaintainable code that goes around in many places. Good software development can be an important contribution in its own right, and regardless of what your goals are, please appreciate it and encourage it. To maximize the benefits of good software, please make your software free. (FIXME: Cross reference copyright section in this chapter)


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

2.3 Developing libraries

The key to better code is to focus away from developing monolithic throw-away hacks that do only one job, and focus on developing libraries (FIXME: cross reference). Break down the original problem to parts, and the parts to smaller parts, until you get down to simple subproblems that can be easily tested, and from which you can construct solutions for both the original problem and future variants. Every library that you write is a legacy that you can share with other developers, that want to solve similar problems. Each library will allow these other developers to focus on their problem and not have to reinvent the parts that are common with your work from scratch. You should definitely make libraries out of subproblems that are likely to be broadly useful. Please be very liberal in what you consider "broadly useful". Please program in a defensive way that renders reusable as much code as possible, regardless of whether or not you plan to reuse it in the near future. The final application should merely have to assemble all the libraries together and make their functionality accessible to the user through a good interface.

It is very important for each of your libraries to have a complete test suite. The purpose of the test suite is to detect bugs in the library and to prove to you or convince you, the developer, that the library works. A test suite is composed of a collection of test programs that link with your libraries and experiment with the features provided by the library. These test programs should return with

 
exit(0);

if they do not detect anything wrong with the library and with

 
exit(1);

if they detect problems. The test programs should not be installed with the rest of the package. They are meant to be run after your software is compiled and before it is installed. Therefore, they should be written so that they can run using the compiled but uninstalled files of the library. Test programs should not output messages by default. They should run completely quietly and communicate with the environment in a yes or no fashion using the exit code. However, it is useful for test programs to output debugging information when they fail during development. Statements that output such information should be surrounded by conditional directives like this:

 
#if INSPECT_ERRORS
 printf("Division by zero: %d / %d\n",a,b);
#endif

This way it becomes easy to switch them on or off upon demand. The preferred way to manipulate a macro like this INSPECT_ERRORS is by adding a switch to your `configure' script. You can do this by adding the following lines to `configure.in':

 
AC_ARG_WITH(inspect,
  [  --with-inspect           Inspect test suite errors],
  [ AC_DEFINE(INSPECT_ERRORS, 1, "Inspect test suite errors")],
  [ AC_DEFINE(INSPECT_ERRORS, 0, "Inspect test suite errors")])

After the library is debugged, the debug statements should not be removed. If a future version of the library regresses and an old test begins to fail again, it will be useful to be able to reactivate the same error messages that were useful in debugging the test when it was first put together, and it may be necessary to add a few new ones.

The best time to write each test program is as soon as it is possible!. You should not be lazy, and you should not just keep throwing in code after code after code. The minute there is enough code in there to put together some kind of test program, just do it! When you write new code, it is easy to think that you are producing work with every new line of code that is written. The reality is that you know you have produced new work every time you write working a test program for new features, and not a minute before. Another time when you should definitely write a test program is when you find a bug while ordinarily using the library. Then, write a test program that triggers the bug, fix the bug, and keep the test in your test suite. This way, if a future modification reintroduces the same bug it will be detected.

Please document your library as you go. The best time to update your documentation is immediately after you get new test programs checking out new futures. You might feel that you are too busy to write documentation, but the truth of the matter is that you will always be too busy. In fact, if you are a busy person, you are likely to have many other obligations bugging you around for your attention. There may be times that you have to stay away from a project for a large amount of time. If you have consistently been maintaining documentation, it will help you refocus on your project even after many months of absence.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

2.4 Developing applications

Applications are complete executable programs that can be run by the end-user. With library-oriented development the actual functionality is developed by writing libraries and debugged by developing test-suites for each library. With command-line oriented applications, the application source code parses the arguments that are passed to it by the user, and calls up the right functions in the library to carry out the user's requests. With GUI (1) applications, the application source code creates the widgets that compose the interface, binds them to actions, and then enters an event loop. Each action is implemented in terms of the functionality provided by the appropriate library.

It should be possible to implement applications by using relatively few application-specific source files, since most of the functionality is actually done in libraries. In some cases, the application is simple enough that it would be an overkill to package its functionality as a library. Nevertheless, in such cases please separate the source code that handles actual functionality from the source code that handles the user interface. Also, please always separate the code that handles input/output with the code that does actual computations. If these aspects of your source code are sufficiently separated then you make it easier for other people to reuse parts of your code in their applications. You also make it easier of yourself to switch to library-oriented development when your application grows and is no longer "simple enough".

Library-oriented development allows you to write good and robust applications. In return it requires discipline. Sometimes you may need to add experimental functionality that is not available through your libraries. The right thing to do is to extend the appropriate library. The easy thing to do is to implement it as part of your application-specific source code. If the feature is experimental and undergoing many changes, it may be best to go with the easy approach at first. Still, when the feature matures, please migrate it to the appropriate library, document it, and take it out of the application source code. What we mean by discipline is doing these migrations, when the time is right, despite pressures from "real life", such as deadlines, pointy-haired bosses, and nuclear terrorism. A rule of thumb for deciding when to migrate code to a library is when you find yourself cut-n-pasting chunks of code from application to application. If you do not do the right thing, your code will become increasingly harder to debug, harder to maintain, and less reliable.

Applications should also be documented, especially the ones that are command-line oriented. Application documentation should be thorough in explaining to the user all the things that he needs to know to use the application effectively and should be distributed separately from the application itself. Nevertheless, applications should recognize the --help switch and output a synopsis of how the application is used. Applications should also recognize the --version switch and state their version number. The easiest way to make applications understand these two switches is to use the GNU Argp library (FIXME: cross reference).


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

2.5 Free software is good software

One of the reasons why you should write good code is because it allows you to make your code robust, reliable and most useful to your needs. Another reason is to make it useful to other people too, and make it easier for them to work with your code and reuse it for their own work. In order for this to be possible, you need to give worry about a few obnoxious legal issues.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

2.6 Invoking the `gpl' utility

Maintaining these legalese notices can be quite painful after some time. To ease the burden, Autotools distributes a utility called `gpl'. This utility will conveniently generate for you all the legal wording you will ever want to use. It is important to know that this application is not approved in any way by the Free Software Foundation. By this I mean that I haven't asked their opinion of it yet.

To create the file `COPYING' type:

 
% gpl -l COPYING

If you want to include a copy of the GPL in your documentation, you can generate a copy in texinfo format like this:

 
% gpl -lt gpl.texi

Also, every time you want to create a new file, use the `gpl' to generate the copyright notice. If you want it covered by the GPL use the standard notice. If you want to invoke the Guile-like permissions, then also use the library notice. If you want to grant unlimited permissions, meaning no copyleft, use the special notice. The `gpl' utility takes many different flags to take into account the different commenting conventions.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

2.7 Inserting notices with Emacs

If you are using GNU Emacs, then you can insert these copyright notices on-demand while you're editing your source code. Autotools bundles two Emacs packages: gpl and gpl-copying which provide you with equivalents of the `gpl' command that can be run under Emacs. These packages will be byte-compiled and installed automatically for you while installing Autotools.

To use these packages, in your `.emacs' you must declare your identity by adding the following commands:

 
(setq user-mail-address "me@here.com")
(setq user-full-name "My Name")

Then you must require the packages to be loaded:

 
(require 'gpl)
(require 'gpl-copying)

These packages introduce a set of Emacs commands all of which are prefixed as gpl-. To invoke any of these commands press M-x, type the name of the command and press enter.

The following commands will generate notices for your source code:

`gpl-c'

Insert the standard GPL copyright notice using C commenting.

`gpl-cL'

lnsert the standard GPL copyright notice using C commenting, followed by a Guile-like library exception. This notice is used by the Guile library. You may want to use it for libraries that you write that implement some type of a standard that you wish to encourage. You will be prompted for the name of your package.

`gpl-cc'

Insert the standard GPL copyright notice using C++ commenting.

`gpl-ccL'

Insert the standard GPL copyright notice using C++ commenting, followed by a Guile-like library exception. You will be prompted for the name of your package

`gpl-sh'

Insert the standard GPL copyright notice using shell commenting (i.e. has marks).

`gpl-shL'

Insert the standard GPL copyright notice using shell commenting, followed by a Guile-like library exception. This can be useful for source files, like Tcl files, which are executable code that gets linked in to form an executable, and which use hash marks for commenting.

`gpl-shS'

Insert the standard GPL notice using shell commenting, followed by the special Autoconf exception. This is useful for small shell scripts that are distributed as part of a build system.

`gpl-m4'

Insert the standard GPL copyright notice using m4 commenting (i.e. dnl) and the special Autoconf exception. This is the preferred notice for new Autoconf macros.

`gpl-el'

Insert the standard GPL copyright notice using Elisp commenting. This is useful for writing Emacs extension files in Elisp.

The following commands will generate notices for your source code:

`gpl-insert-copying-texinfo'

Insert a set of paragraphs very similar to the ones appearing at the Copying section of this manual. It is a good idea to include this notice in an unnumbered chapter titled "Copying" in the Texinfo documentation of your source code. You will be prompted for the title of your package. That title will substitute the word Autotools as it appears in the corresponding section in this manual.

`gpl-insert-license-texinfo'

Insert the full text of the GNU General Public License in Texinfo format. If your documentation is very extensive, it may be a good idea to include this notice either at the very beginning of your manual, or at the end. You should include the full license, if you plan to distribute the manual separately from the package as a printed book.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

3. Using GNU Emacs

Emacs is an environment for running Lisp programs that manipulate text interactively. To call Emacs merely an editor does not do it justice, unless you redefine the word "editor" to the broadest meaning possible. Emacs is so extensive, powerful and flexible, that you can almost think of it as a self-contained "operating system" in its own right.

Emacs is a very important part of the GNU development tools because it provides an integrated environment for software development. The simplest thing you can do with Emacs is edit your source code. However, you can do a lot more than that. You can run a debugger, and step through your program while Emacs shows you the corresponding sources that you are stepping through. You can browse on-line Info documentation and man pages, download and read your email off-line, and follow discussions on newsgroups. Emacs is particularly helpful with writing documentation with the Texinfo documentation system. You will find it harder to use Texinfo, if you don't use Emacs. It is also very helpful with editing files on remote machines over FTP, especially when your connection to the internet is over a slow modem. Finally, and most importantly, Emacs is programmable. You can write Emacs functions in Emacs Lisp to automate any chore that you find particularly useful in your own work. Because Emacs Lisp is a full programming language, there is no practical limit to what you can do with it.

If you already know a lot about Emacs, you can skip this chapter and move on. If you are a "vi" user, then we will assimilate you: See section Using vi emulation, for details. (2) This chapter will be most useful to the novice user who would like to set per Emacs up and running for software development, however it is not by any means comprehensive. See section Further reading on Emacs, for references to more comprehensive Emacs documentation.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

3.1 Introduction to Emacs

Emacs is an environment for running Lisp programs that manipulate text interactively. Because Emacs is completely programmable, it can be used to implement not only editors, but a full integrated development environment for software development. Emacs can also browse info documentation, run email clients, a newsgroup reader, a sophisticated xterm, and an understanding psychotherapist.

Under the X window system, Emacs controls multiple x-windows called frames. Each frame has a menu bar and the main editing area. The editing area is divided into windows with horizontal bars. You can grab these bars and move them around with the first mouse button. (3) Each window is bound to a buffer. A buffer is an Emacs data structure that contains text. Most editing commands operate on buffers, modifying their contents. When a buffer is bound to a window, then you can see its contents as they are being changed. It is possible for a buffer to be bound to two windows, on different frames or on the same frame. Then whenever a change is made to the buffer, it is reflected on both windows. It is not necessary for a buffer to be bound to a window, in order to operate on it. In a typical Emacs session you may be manipulating more buffers than the windows that you have on your screen.

A buffer can be visiting files. In that case, the contents of the buffer reflect the contents of a file that is being edited. But buffers can be associated with anything you like, so long as you program it up. For example, under the Dired directory editor, a buffer is bound to a directory, showing you the contents of the directory. When you press Enter while the cursor is over a file name, Emacs creates a new buffer, visits the file, and rebinds the window with that buffer. From the user's perspective, by pressing Enter he "opened" the file for editing. If the file has already been "opened" then Emacs simply rebinds the existing buffer for that file.

Emacs uses a variant of LISP, called Emacs LISP, as its programming language. Every time you press a key, click the mouse, or select an entry from the menu bar, an Emacs LISP function is evaluated. The mode of the buffer determines, among many other things, what function to evaluate. This way, every buffer can be associated with functionality that defines what you do in that buffer. For example you can program your buffer to edit text, to edit source code, to read news, and so on. You can also run LISP functions directly on the current buffer by typing M-x and the name of the function that you want to run. (4)

What is known as the "Emacs editor" is the default implementation of an editor under the Emacs system. If you prefer the vi editor, then you can instead run a vi clone, Viper (see section Using vi emulation). The main reason why you should use Emacs, is not the particular editor, but the way Emacs integrates editing with all the other functions that you like to do as a software developer. For example:

All of these features make Emacs a very powerful, albeit unusual, integrated development environment. Many users of proprietary operating systems, like Lose95 (5), complain that GNU (and Unix) does not have an integrated development environment. As a matter of fact it does. All of the above features make Emacs a very powerful IDE.

Emacs has its own very extensive documentation (see section Further reading on Emacs). In this manual we will only go over the fundamentals for using Emacs effectively as an integrated development environment.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

3.2 Installing GNU Emacs

If Emacs is not installed on your system, you will need to get a source code distribution and compile it yourself. Installing Emacs is not difficult. If Emacs is already installed on your GNU/Linux system, you might still need to reinstall it: you might not have the most recent version, you might have XEmacs instead, you might not have support for internationalization, or your Emacs might not have compiled support for reading mail over POP (a feature very useful to developers that hook up over modem). If any of these is the case, then uninstall that version of Emacs, and reinstall Emacs from a source code distribution.

The entire Emacs source code is distributed in three separate files:

`emacs-21.2.tar.gz'

This is the main Emacs distribution. If you do not care about international language support, you can install this by itself.

`leim-21.2.tar.gz'

This supplements the Emacs distribution with support for multiple languages. If you develop internationalized software, it is likely that you will need this.

`intlfonts-1.1.tar.gz'

This file contains the fonts that Emacs uses to support international languages. If you want international language support, you will definitely need this.

Get a copy of these three files, place them under the same directory and unpack them with the following commands:

 
% gunzip emacs-21.2.tar.gz
% tar xf emacs-21.2.tar
% gunzip leim-21.2.tar.gz
% tar xf leim-21.2.tar

Both tarballs will unpack under the `emacs-21.2' directory. When this is finished, configure the source code with the following commands:

 
% cd emacs-21.2
% ./configure --with-pop --with-gssapi
% make

The `--with-pop' flag is almost always a good idea, especially if you are running Emacs from a home computer that is connected to the internet over modem. It will let you use Emacs to download your email from your internet provider and read it off-line (see section Using Emacs as an email client). Most internet providers use GSSAPI-authenticated POP. If you need to support other authentication protocols however, you may also want to add one of the following flags:

--with-kerberos

support Kerberos-authenticated POP

--with-kerberos5

support Kerberos version 5 authenticated POP

--with-hesiod

support Hesiod to get the POP server host

Then compile and install Emacs with:

 
$ make
# make install

Emacs is a very large program, so this will take a while.

To install `intlfonts-1.1.tar.gz' unpack it, and follow the instructions in the `README' file. Alternatively, you may find it more straightforward to install it from a Debian package. Packages for `intlfonts' exist as of Debian 2.1.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

3.3 Basic Emacs concepts

In this section we describe what Emacs is and what it does. We will not yet discuss how to make Emacs work. That discussion is taken up in the subsequent sections, starting with Configuring GNU Emacs. This section instead covers the fundamental ideas that you need to understand in order to make sense out of Emacs.

You can run Emacs from a text terminal, such as a vt100 terminal, but it is usually nicer to run Emacs under the X-windows system. To start Emacs type

 
% emacs &

on your shell prompt. The seasoned GNU developer usually sets up per X configuration such that it starts Emacs when person logs in. Then, person uses that Emacs process for all of per work until person logs out. To quit Emacs press C-x C-c, or select

 
Files → Exit Emacs

from the menu. The notation C-c means CTRL-c. The separating dash `-' means that you press the key after the dash while holding down the key before the dash. Be sure to quit Emacs before logging out, to ensure that your work is properly saved. If there are any files that you haven't yet saved, Emacs will prompt you and ask you if you want to save them, before quiting. If at any time you want Emacs to stop doing what it's doing, press C-g.

Under the X window system, Emacs controls multiple x-windows which are called frames. Each frame has a menu bar and the main editing area. The editing area is divided into windows (6) by horizontal bars, called status bars. Every status bar contains concise information about the status of the window above the status bar. The minimal editing area has at least one big window, where editing takes place, and a small one-line window called the minibuffer. Emacs uses the minibuffer to display brief messages and to prompt the user to enter commands or other input. The minibuffer has no status bar of its own.

Each window is bound to a buffer. A buffer is an Emacs data structure that contains text. Most editing commands operate on buffers, modifying their contents. When a buffer is bound to a window, then you can see its contents as they are being changed. It is possible for a buffer to be bound to two windows, on different frames or on the same frame. Then whenever a change is made to the buffer, it is reflected on both windows. It is not necessary for a buffer to be bound to a window, in order to operate on it. In a typical Emacs session you may be manipulating more buffers than the windows that you actually have on your screen.

A buffer can be visiting files. In that case, the contents of the buffer reflect the contents of a file that is being edited. But buffers can be associated with anything you like, so long as you program it up. For example, under the Dired directory editor, a buffer is bound to a directory, showing you the contents of the directory. When you press RET while the cursor is over a file name, Emacs creates a new buffer, visits the file, and rebinds the window with that buffer. From the user's perspective, by pressing RET person "opened" the file for editing. If the file has already been "opened" then Emacs simply rebinds the existing buffer for that file.

Sometimes Emacs will divide a frame to two or more windows. You can switch from one window to another by clicking the 1st mouse button, while the mouse is inside the destination window. To resize these windows, grab the status bar with the 1st mouse button and move it up or down. Pressing the 2nd mouse button, while the mouse is on a status bar, will bury the window bellow the status bar. Pressing the 3rd mouse button will bury the window above the status bar, instead. Buried windows are not killed; they still exist and you can get back to them by selecting them from the menu bar, under:

 
Buffers → name-of-buffer

Buffers, with some exceptions, are usually named after the filenames of the files that they correspond to.

Once you visit a file for editing, then all you need to do is to edit it! The best way to learn how to edit files using the standard Emacs editor is by working through the on-line Emacs tutorial. To start the on-line tutorial type C-h t or select:

 
Help → Emacs Tutorial

If you are a vi user, or you simply prefer to use `vi' key bindings, then read Using vi emulation.

In Emacs, every event causes a Lisp function to be executed. An event can be any keystroke, mouse movement, mouse clicking or dragging, or a menu bar selection. The function implements the appropriate response to the event. Almost all of these functions are written in a variant of Lisp called Emacs Lisp. The actual Emacs program, the executable, is an Emacs Lisp interpreter with the implementation of frames, buffers, and so on. However, the actual functionality that makes Emacs usable is implemented in Emacs Lisp.

Sometimes, Emacs will bind a few words of text to an Emacs function. For example, when you use Emacs to browse Info documentation, certain words that corresponds to hyperlinks to other nodes are bound to a function that makes Emacs follow the hyperlink. When such a binding is actually installed, moving the mouse over the bound text highlights it momentarily. While the text is highlighted, you can invoke the binding by clicking the 2nd mouse button.

Sometimes, an Emacs function might go into an infinite loop, or it might start doing something that you want to stop. You can always make Emacs abort