--- The Detailed Node Listing ---
Installing GNU Software
Writing Good Programs
Using GNU Emacs
Compiling with Makefiles
The GNU build system
Using Automake and Autoconf
Using Libtool
Using Autotoolset
Using C and C++ effectively
Using Fortran effectively
Internationalization
Maintaining Documentation
Portable shell programming
Writing Autoconf macros
Legal issues with Free Software
Philosophical issues
Licensing Free Software
GNU GENERAL PUBLIC LICENSE
The purpose of this document is to introduce you to the GNU build system, and show you how to use it to write good code. It also discusses peripheral topics such as how to use GNU Emacs as a source code navigator, how to write good software, and the philosophical concerns behind the free software movement. The intended reader should be a software developer who knows his programming languages, and wants to learn how to package his programs in a way that follows the GNU coding standards.
This manual shows you how to use the GNU build system to develop high-quality software on GNU that conforms to the GNU coding standards. These techniques are also useful for software development on GNU/Linux and most variants of the Unix system. In fact, one of the reasons for the elaborate GNU build system was to make software portable between GNU and other similar operating systems. We also discuss peripheral topics such as how to use GNU Emacs as an IDE (integrated development environment), and the various practical, legal and philosophical concerns behind software development.
When we speak of the GNU build system we refer primarily to the following four packages:
The GNU build system has two goals. The first is to simplify the development of portable programs. The second is to simplify the building of programs that are distributed as source code. The first goal is achieved by the automatic generation of a configure shell script. The second goal is achieved by the automatic generation of Makefiles and other shell scripts that are typically used in the building process. This way the developer can concentrate on debugging his source code, instead of his overly complex Makefiles. And the installer can compile and install the program directly from the source code distribution by a simple and automatic procedure.
The GNU build system needs to be installed only when you are developing programs that are meant to be distributed. To build a program from distributed source code, you only need a working make, a compiler, a shell, and sometimes standard Unix utilities like sed, awk, yacc and lex. The objective is to make software installation as simple and as automatic as possible for the installer. Also, by setting up the GNU build system such that it creates programs that don't require the build system to be present during their installation, it becomes possible to use the build system to bootstrap itself.
Some tasks that are simplified by the GNU build system include:
make recursively. Having simplified this step, the developer is encouraged to organize his source code in a deep directory tree rather than lump everything under the same directory. Developers that use raw make often can't justify the inconvenience of recursive make and prefer to disorganize their source code. With the GNU tools this is no longer necessary.
check target available such that you can compile and run the entire test suite by running make check.
make distcheck.
The Autotoolset package complements the GNU build system by providing the following additional features:
This effort began with my attempt to write a tutorial for Autoconf. It evolved into “Learning Autoconf and Automake”. Along the way I developed Autotoolset to deal with things that annoyed me or to cover needs from my own work. Ultimately I want this document to be both a unified introduction to the GNU build system and documentation for the Autotoolset package.
I believe that knowing these tools and having this know-how is very important, and that engineering and science students who will one day go out and do software development for academic or industrial research should not miss out on it. Many students are incredibly undertrained in software engineering and write a lot of bad code. This is very sad because, of all people, it is they who have the greatest need to write portable, robust and reliable code. I found from my own experience that moving away from Fortran and C, and towards C++, is the first step in writing better code. The second step is to use the sophisticated GNU build system and use it properly, as described in this document. Ultimately, I am hoping that this document will help people get over the learning curve of the second step, so they can be productive and ready to study the reference manuals that are distributed with all these tools.
This manual of course is still under construction. When I am done constructing it some paragraph somewhere will be inserted with the traditional run-down of summaries about each chapter. I write this manual in a highly non-linear way, so while it is under construction you will find that some parts are better-developed than others. If you wish to contribute sections of the manual that I haven't written or haven't yet developed fully, please contact me.
Chapters 1, 2, 3 and 4 are okay. Chapter 5 is okay too, but needs a little more work. I removed the other chapters to minimize confusion, but the sources for them are still being distributed as part of the Autotoolset package for those that found them useful. The other chapters need a lot of rewriting and they would do more harm than good at this point to the unsuspecting reader. Please contact me if you have any suggestions for improving this manual.
Remarks by Marcelo: I am currently updating this manual to the latest release of the autoconf/automake tools.
This document and the Autotools package were originally written by Eleftherios Gkioulekas. Many people have further contributed to this effort, directly or indirectly, in various ways. Here is a list of these people. Please help me keep it complete and free of errors.
FIXME: I need to start keeping track of acknowledgements here
This book that you are now reading is actually free. The information in it is freely available to anyone. The machine readable source code for the book is freely distributed on the internet and anyone may take this book and make as many copies as they like. (Take a moment to check the copying permissions on the Copyright page.) If you paid money for this book, what you actually paid for was the book's nice printing and binding, and the publisher's associated costs to produce it.
The following notice refers to the Autotoolset package, which includes this documentation, as well as the source code for utilities like acmkdir and for additional Autoconf macros. The complete set of GNU development tools also involves other packages, such as Autoconf, Automake, Libtool, Make, Emacs, Texinfo, the GNU C and C++ compilers and a few other accessories. These packages are free software, and you can obtain them from the Free Software Foundation. For details on doing so, please visit their web site http://www.fsf.org/. Although Autotoolset has been designed to work with the GNU build system, it is not yet an official part of the GNU project.
The Autotoolset package is also “free”; this means that everyone is free to use it and free to redistribute it on a free basis. The Autotoolset package is not in the public domain; it is copyrighted and there are restrictions on its distribution, but these restrictions are designed to permit everything that a good cooperating citizen would want to do. What is not allowed is to try to prevent others from further sharing any version of this package that they might get from you.
Specifically, we want to make sure that you have the right to give away copies of the programs that relate to Autotoolset, that you receive source code or else can get it if you want it, that you can change these programs or use pieces of them in new free programs, and that you know you can do these things.
To make sure that everyone has such rights, we have to forbid you to deprive anyone else of these rights. For example, if you distribute copies of the Autotoolset-related code, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must tell them their rights.
Also, for our own protection, we must make certain that everyone finds out that there is no warranty for the programs that relate to Autotoolset. If these programs are modified by someone else and passed on, we want their recipients to know that what they have is not what we distributed, so that any problems introduced by others will not reflect on our reputation.
The precise conditions of the licenses for the programs currently being distributed that relate to Autotoolset are found in the General Public Licenses that accompany them.
Free software is distributed in source code distributions. Many of these programs are difficult to install because they use system dependent features, and they require the user to edit makefiles and configuration headers. By contrast, the software distributed by the GNU project is autoconfiguring; it is possible to compile it from source code and install it automatically, without any tedious user intervention.
In this chapter we discuss how to compile and install autoconfiguring software written by others. In the subsequent chapters we discuss how to use the development tools that allow you to make your software autoconfiguring as well.
Autoconfiguring software is distributed with packaged source code distributions. These are big files with filenames of the form:
package-version.tar.gz
For example, the file autoconf-2.57.tar.gz contains version 2.57 of GNU Autoconf. We often call these files source distributions; sometimes we simply call them packages.
The steps for installing an autoconfiguring source code distribution are simple, and if the distribution is not buggy, can be carried out without substantial user intervention.
% gunzip foo-1.0.tar.gz
% tar xf foo-1.0.tar
This will create the directory foo-1.0 which contains the package's source code and documentation. Look for the file README to see if there's anything that you should do next. The README file might suggest that you need to install other packages before installing this one, or it might suggest that you have to do unusual things to install this package. If the source distribution conforms to the GNU coding standards, you will find many other documentation files like README. See Maintaining the documentation files, for an explanation of what these files mean.
% cd foo-1.0
% ./configure
% make
and if the program is big, you can make some coffee. After the program compiles, you can run its regression test-suite, if it has one, by typing
% make check
% su
# make install
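The steps above can be collected into a small shell function. This is only a sketch: the names run, install_from_tarball and DRY_RUN are made up for this illustration and are not part of any GNU package. With DRY_RUN=1 the commands are printed instead of executed. The function stops before make install, because that step usually has to be run as root.

```shell
#!/bin/sh
# Sketch: unpack, configure, build and test an autoconfiguring package.
# run, install_from_tarball and DRY_RUN are illustrative names only.

run() {
    # Print the command; execute it only when DRY_RUN is not set to 1.
    echo "% $*"
    if [ "${DRY_RUN:-0}" != 1 ]; then
        "$@"
    fi
}

install_from_tarball() {
    tarball=$1                  # for example foo-1.0.tar.gz
    dir=${tarball%.tar.gz}      # strip the suffix to get foo-1.0
    run gunzip "$tarball" &&
    run tar xf "$dir.tar" &&
    run cd "$dir" &&
    run ./configure &&
    run make &&
    run make check
}
```

To preview the commands without touching anything, set DRY_RUN=1 and call install_from_tarball foo-1.0.tar.gz. Each step is chained with &&, so a failure in configure or make stops the sequence before the test suite runs.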
The make program launches the shell commands necessary for compiling, testing and installing the package from source code. However, make has no knowledge of what it is really doing. It takes its orders from makefiles, files called Makefile that have to be present in every subdirectory of your source code directory tree. From the installer's perspective, the makefiles define a set of targets that correspond to things that the installer wants to do. The default target is always compiling the source code, which is what gets invoked when you simply run make. Other targets, such as ‘install’ and ‘check’, need to be mentioned explicitly. Because make takes its orders from the makefile in the current directory, it is important to run it from the correct directory. See Compiling with Makefiles, for the full story behind make.
The configure program is a shell script that probes your system through a set of tests to determine things that it needs to know, and then uses the results to generate Makefile files from templates stored in files called Makefile.in. In the early days of the GNU project, developers used to write configure scripts by hand. No one does that any more: configure scripts are now generated automatically by GNU Autoconf from an input file configure.in. GNU Autoconf is part of the GNU build system and we first introduce it in The GNU build system.
As it turns out, you don't have to write the Makefile.in templates by hand either. Instead you can use another program, GNU Automake, to generate Makefile.in templates from higher-level descriptions stored in files called Makefile.am. In these files you describe what is being created by your source code, and Automake computes the makefile targets for compiling, installing and uninstalling it. Automake also computes targets for compiling and running test suites, and targets for recursively calling make in subdirectories. The details about Automake are first introduced in Using Automake and Autoconf.
The GNU coding standards are a document that describes the requirements that must be satisfied by all GNU programs. These requirements are driven mainly by technical considerations, and they are excellent advice for writing good software. The makefile standards, a part of the GNU coding standards, require that your makefiles do a lot more than simply compile and install the software.
One requirement is cleaning targets; these targets remove the files that were generated while installing the package and restore the source distribution to a previous state. There are three cleaning targets that correspond to three levels of cleaning: clean, distclean, maintainer-clean.
clean
Removes the files that were generated by make and make check, but not the files that were generated by running configure. This target cleans the build, but does not undo the source configuration by the configure script.
distclean
Removes the files that were generated by make and make check, but also cleans the files that were generated by running configure. As a result, you can not invoke any other make targets until you run the configure script again. This target reverts your source directory tree back to the state in which it was when you first unpacked it.
maintainer-clean
Removes everything that distclean cleans. However it also removes files that the developers have automatically generated with the GNU build system. Because users shouldn't need the entire GNU build system to install a package, these files should not be removed from the final source distribution. However, it is occasionally useful for the maintainer to remove and regenerate these files.
Another type of cleaning that is required is erasing the package itself from the installation directory; uninstalling the package. To uninstall the package, you must call
% make uninstall
from the top level directory of the source distribution. This will work only if the source distribution is configured first. It will work best only if you do it from the same source distribution, with the same configuration, that you've used to install the package in the first place.
When you install GNU software, archive the source code to all the packages that you install in a directory like /usr/src or /usr/local/src. To do that, first run make clean on the source distribution, and then use a recursive copy to copy it to /usr/src. The presence of a source distribution in one of these directories should be a signal to you that the corresponding package is currently installed.
Francois Pinard came up with a cute rule for remembering what the cleaning targets do:
If configure or make did it, make distclean undoes it.
If make did it, make clean undoes it.
If make install did it, make uninstall undoes it.
If you did it, make maintainer-clean undoes it.
GNU standard compliant makefiles also have a target for generating tags. Tags are files, called TAGS, that are used by GNU Emacs to allow you to navigate your source distribution more efficiently. More specifically, Emacs uses tags to take you from a place where a C function is being used in a file, to the file and line number where the function is defined. To generate the tags call:
% make tags
Tags are particularly useful when you are not the original author of the code you are working on, and you haven't yet memorized where everything is. See Navigating source code, for all the details about navigating large source code trees with Emacs.
Finally, in the spirit of free redistributable code, there must be targets for cutting a source code distribution. If you type
% make dist
it will rebuild the foo-1.0.tar.gz file that you started with. If you modified the source, the modifications will be included in the distribution (and you should probably change the version number). Before putting a distribution up on FTP, you can test its integrity with:
% make distcheck
This makes the distribution, then unpacks it in a temporary subdirectory and tries to configure it, build it, run the test-suite, and check if the installation script works. If everything is okay then you're told that your distribution is ready.
Writing reliable makefiles that support all of these targets is a very difficult undertaking. This is why we prefer to generate our makefiles instead with GNU Automake.
The ‘configure’ script accepts many command-line flags that modify its behaviour and the configuration of your source distribution. To obtain a list of all the options that are available type
% ./configure --help
on the shell prompt.
The most useful parameter that the installer controls during configuration is the directory where they want the package to be installed. During installation, the following files go to the following directories:
Executables   ==> /usr/local/bin
Libraries     ==> /usr/local/lib
Header files  ==> /usr/local/include
Man pages     ==> /usr/local/man/man?
Info files    ==> /usr/local/info
The /usr/local directory is called the prefix. The default prefix is always /usr/local but you can set it to anything you like when you call ‘configure’ by adding a ‘--prefix’ option. For example, suppose that you are not a privileged user, so you can not install anything in /usr/local, but you would still like to install the package for your own use. Then you can tell the ‘configure’ script to install the package in your home directory ‘/home/username’:
% ./configure --prefix=/home/username
% make
% make check
% make install
The ‘--prefix’ argument tells ‘configure’ where you want to install your package, and ‘configure’ will take that into account and build the proper makefile automatically.
If you are installing the package on a filesystem that is shared by computers that run variations of GNU or Unix, you need to install the files that are independent of the operating system in a shared directory, but separate the files that are dependent on the operating systems in different directories. Header files and documentation can be shared. However, libraries and executables must be installed separately. Usually the scheme used to handle such situations is:
Executables   ==> /usr/local/system/bin
Libraries     ==> /usr/local/system/lib
Header files  ==> /usr/local/include
Man pages     ==> /usr/local/man/man?
Info files    ==> /usr/local/info
The directory /usr/local/system is called the executable prefix, and it is usually a subdirectory of the prefix. In general, it can be any directory. If you don't specify the executable prefix, it defaults to being equal to the prefix. To change that, use the ‘--exec-prefix’ flag. For example, to configure for a GNU/Linux system, you would run:
% ./configure --exec-prefix=/usr/local/linux
To configure for GNU/Hurd, you would run:
% ./configure --exec-prefix=/usr/local/hurd
In general, there are many directories where a package may want to install files. Some of these directories are controlled by the prefix, where others are controlled by the executable prefix. See Installation standard directories, for a complete discussion of what these directories are, and what they are for.
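To make the relationship between the two prefixes concrete, here is a small sketch that prints where each kind of file would land. The helper install_dirs is hypothetical; it is not something configure provides, and it only mirrors the two tables shown above.

```shell
#!/bin/sh
# Sketch: show which directories follow the prefix and which follow the
# executable prefix. install_dirs is an illustrative helper, not part
# of configure.

install_dirs() {
    prefix=${1:-/usr/local}         # the default prefix
    exec_prefix=${2:-$prefix}       # defaults to the prefix itself
    echo "Executables  ==> $exec_prefix/bin"
    echo "Libraries    ==> $exec_prefix/lib"
    echo "Header files ==> $prefix/include"
    echo "Man pages    ==> $prefix/man/man?"
    echo "Info files   ==> $prefix/info"
}
```

Calling install_dirs /home/username corresponds to ‘--prefix=/home/username’, and install_dirs /usr/local /usr/local/linux corresponds to leaving the prefix alone while passing ‘--exec-prefix=/usr/local/linux’. Only the first two lines of the output change in the second case.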
Some packages allow you to enable or disable certain features while you configure the source code. They do that with flags of the form:
--with-package      --enable-feature
--without-package   --disable-feature
The --enable flags usually control whether to enable certain optional features of the package. Support for international languages, debugging features, and shared libraries are features that are usually controlled by these options. The --with flags instead control whether to compile and install certain optional components of the package. The specific flags that are available for a particular source distribution should be documented in the README file.
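When the README is silent, the output of ‘./configure --help’ is the other place to look. The following sketch filters that output down to the lines that mention optional features and components. The function name list_optional_flags is made up for this example, and the pattern assumes the usual --enable/--disable/--with/--without spelling of these flags.

```shell
#!/bin/sh
# Sketch: keep only the help lines that mention optional-feature flags.
# list_optional_flags is an illustrative helper; it reads help text on
# standard input.

list_optional_flags() {
    # -e keeps grep from treating the leading "--" as an option.
    grep -E -e '--(enable|disable|with|without)-'
}
```

A typical use would be ./configure --help | list_optional_flags, run from the top level directory of the source distribution.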
Finally, configure scripts can be passed parameters via environment variables. One of the things that configure does is decide what compiler to use and what flags to pass to that compiler. You can overrule the decisions that configure makes by setting the environment variables CC and CFLAGS. For example, to specify that you want the package to compile with full optimization and without any debugging symbols (which is a bad idea, yet people want to do it):
% export CFLAGS="-O3"
% ./configure
To tell configure to use the system's native compiler instead of gcc, and compile without optimization and with debugging symbols:
% export CC="cc"
% export CFLAGS="-g"
% ./configure
This assumes that you are using the bash shell as your default shell. If you use the csh or tcsh shells, you need to assign environment variables with the setenv command instead. For example:
% setenv CFLAGS "-O3"
% ./configure
Similarly, the variables CXX and CXXFLAGS control the C++ compiler.
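The export commands above leave the variables set for the rest of your shell session. In a Bourne-style shell you can instead write the assignments in front of the command, so that they apply to that one command only. The sketch below demonstrates this with a stand-in configure script that merely reports what it received; the stand-in is an assumption made so that the example runs anywhere, and a real configure would of course do much more.

```shell
#!/bin/sh
# Sketch: per-command environment variables in a Bourne-style shell.
# The configure script created here is a stand-in that only echoes the
# variables it sees.

unset CC CFLAGS                     # start from a known state
tmp=$(mktemp -d)
cat > "$tmp/configure" <<'EOF'
#!/bin/sh
echo "CC=${CC:-unset} CFLAGS=${CFLAGS:-unset}"
EOF

# Assignments written before the command apply to that command only.
one_shot=$(cd "$tmp" && CC=cc CFLAGS=-g sh ./configure)
echo "$one_shot"

# The next run does not see them, because nothing was exported.
after=$(cd "$tmp" && sh ./configure)
echo "$after"

rm -rf "$tmp"
```

Configure scripts generated by recent versions of Autoconf also accept the same assignments as arguments, as in ./configure CC=cc CFLAGS=-g. Check the output of ‘./configure --help’ for the package at hand before relying on that form.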
Autoconfiguring source distributions also support vpath builds. In a vpath build, the source distribution is stored in a (possibly read-only) directory, and the actual building takes place in a different directory where all the generated files are stored. We call the first directory the source tree, and the second directory the build tree. The build tree may be a subdirectory of the source tree, but it is better if it is a completely separate directory.
If you, the developer, use the standard features of the GNU build system, you don't need to do anything special to allow your packages to support vpath builds. The only exception to this is when you define your own make rules (see General Automake principles). Then you have to follow certain conventions to allow vpath to work correctly.
You, the installer, however do need to do something special. You need to install and use GNU make. Most Unix make utilities do not support vpath builds, or their support doesn't work. GNU make is extremely portable, and if vpath is important to you, there is no excuse for not installing it.
Suppose that /sources/foo-0.1 contains a source distribution, and you want to build it in the directory /build/foo-0.1. Assuming that both directories exist, all you have to do is:
% cd /build/foo-0.1
% /sources/foo-0.1/configure ...options...
% make
% make check
% su
# make install
The configure script and the generated makefiles will take care of the rest.
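The essential point of a vpath build is that configure writes its output into the directory it is run from, not the directory it lives in. The following sketch imitates this with throwaway directories and a stand-in configure script that only writes a Makefile; the directory names follow the example above, and everything else is an assumption made for illustration.

```shell
#!/bin/sh
# Sketch: a vpath build keeps generated files out of the source tree.
# The configure script below is a stand-in that writes a Makefile into
# the current directory, which is what a real configure also does.

top=$(mktemp -d)
src=$top/sources/foo-0.1
build=$top/build/foo-0.1
mkdir -p "$src" "$build"

cat > "$src/configure" <<'EOF'
#!/bin/sh
# The source tree is where this script lives; output goes to the
# directory the script was started from.
srcdir=$(dirname "$0")
echo "# generated for sources in $srcdir" > Makefile
EOF

# Run configure from the build tree, naming it by its source tree path.
(cd "$build" && sh "$src/configure")

build_files=$(ls "$build")          # the generated Makefile is here
src_files=$(ls "$src")              # the source tree is unchanged
echo "build tree:  $build_files"
echo "source tree: $src_files"

rm -rf "$top"
```

Because nothing is written to the source tree, the same sources can be configured several times, with different options, in different build trees.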
vpath builds are preferred by some people for the following reasons:
distcheck target checks if your distribution is correct by attempting a vpath build.
After compiling a source distribution, instead of installing it, you can make a snapshot of the files that it would install and package that snapshot in a tarball. It is often convenient for installers to install from such snapshots rather than compile from source, especially when the source is extremely large, or when the number of packages that they need to install is large.
To create a binary distribution run the following commands as root:
# make install DESTDIR=/tmp/dist
# tar -C /tmp/dist -cvf package-version.tar .
# gzip -9 package-version.tar
The variable DESTDIR specifies a directory, alternative to root, for installing the compiled package. The directory tree under that directory is the exact same tree that would have normally been installed. Why not just specify a different prefix? Because very often, the prefix that you use to install the software affects the contents of the files that actually get installed.
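To see what ends up inside such a tarball, the sketch below fakes a staged install tree and archives it with tar -C. The staging directory and the file foo are stand-ins created with mkdir so that the example runs without a real package; in practice the tree would be populated by make install with DESTDIR set.

```shell
#!/bin/sh
# Sketch: package a staged install tree. The tree is faked here; a real
# one would come from "make install DESTDIR=$stage".

work=$(mktemp -d)
stage=$work/dist
mkdir -p "$stage/usr/local/bin"
: > "$stage/usr/local/bin/foo"      # stand-in for an installed executable

# Archive the contents of the staging directory. The trailing "." names
# what to archive, and -C makes the stored paths relative to $stage.
tar -C "$stage" -cf "$work/foo-1.0-bin.tar" .

# The entries start at ./usr, not at the staging path.
listing=$(tar -tf "$work/foo-1.0-bin.tar")
echo "$listing"

rm -rf "$work"
```

Since the stored paths are relative to the staging directory, unpacking the archive from the root directory of another machine puts every file where make install would have put it.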
Please note that under the terms of the GNU General Public License, if you distribute your software as a binary distribution, you also need to provide the corresponding source distribution. The simplest way to comply with this requirement is to distribute both distributions together.
When you work on a software project, one of your short-term goals is to solve a problem at hand. If you are doing this because someone asked you to solve the problem, then all you need to do to look good in their eyes is to deliver a program that works. Nevertheless, regardless of how little they may appreciate this, doing just that is not good enough. Once you have code that gives the right answer to a specific set of problems, you will want to make improvements to it. As you make these improvements, you would like to have proof that your code's known reliability hasn't regressed. Also, tomorrow you will want to move on to a different set of related problems by repeating as little work as possible. Finally, one day you may want to pass the project on to someone else or recruit another developer to help you out with certain parts. You need to make it possible for the other person to get up to speed without reinventing your efforts. To accomplish these equally important goals you need to write good code.
To write good software, you must use the appropriate programming language and use it well. To make your software free, it should be possible to compile it with free tools on a free operating system. Therefore, you should avoid using programming languages that do not have a free compiler.
The C programming language is the native language of GNU, and the GNU coding standards encourage you to program in C. The main advantages of C are that it can be compiled with the system's native compiler, many people know C, and it is easy to learn. Nevertheless, C has weaknesses: it forces you to manually manage memory allocation, and any mistakes you might make can lead to very difficult bugs. Also C forces you to program at a low level. Sometimes it is desirable to program at a low level, but there are also cases where you want to build on a higher level.
For projects where you would like a higher-level compiled language, the recommended choice is to use C++. The GNU project distributes a free C++ compiler and nowadays most GNU systems that have a C compiler also have the free C++ compiler. The main advantage of C++ is that its constructors, destructors and standard containers can manage dynamic memory allocation for you. C++ also has a lot of powerful features that allow you to program at a higher level than C, bringing you closer to the algorithms and the concepts involved, and making it easier to write robust programs. At the same time, C++ does not hide low-level details from you and you have the freedom to do the same low-level hacks that you had in C, if you choose to. In fact C++ is almost completely backwards compatible with C and it is very easy to mix C and C++ code. Finally, C++ is an industry standard. As a result, it has been used to solve a variety of real-world problems and its specification has evolved for many years to make it a powerful and mature language that can tackle such problems effectively. The C++ specification was frozen and became an ANSI standard in 1998.
One of the disadvantages of C++ is that C++ object files compiled by different C++ compilers can not be linked together. In order to compile C++ to machine language, a lot of compilation issues need to be deferred to the linking stage. Because object file formats are not traditionally sophisticated enough to handle these issues, C++ compilers do various ugly kludges. The problem is that different compilers do these kludges differently, making object files across compilers incompatible. This is not a terrible problem, since object files are incompatible across different platforms anyway. It is only a problem when you want to use more than one compiler on the same platform. Another disadvantage of C++ is that it is harder to interface a C++ library to another language than it is to interface a C library. Finally, not as many people know C++ as well as they know C, and C++ is a very extensive and difficult language to master. However, these disadvantages must be weighed against the advantages. There is a price to using C++ but the price comes with a reward.
If you need a higher-level interpreted language, then the recommended choice is to use Guile. Guile is the GNU variant of Scheme, a LISP-like programming language. Guile is an interpreted language, and you can write full programs in Guile, or use the Guile interpreter interactively. Guile is compatible with the R4RS standard but provides a lot of GNU extensions. The GNU extensions are so extensive that it is possible to write entire applications in Guile. Most of the low-level facilities that are available in C, are also available in Guile.
What makes the Guile implementation of Scheme special is not the extensions themselves, but the fact that it is very easy for any developer to add their own extensions to Guile, by implementing them in C. By combining C and Guile you leverage the advantages of both compiled and interpreted languages. Performance critical functionality can be implemented in C and higher-level software development can be done in Guile. Also, because Guile is interpreted, when you make your C code available through an extended Guile interpreter, the user can also use the functionality of that code interactively through the interpreter.
The idea of extensible interpreted languages is not new. Other examples of extensible interpreted languages are Perl, Python and Tcl. What sets Guile apart from these languages is the elegance of Scheme. Scheme is the holy grail in the quest for a programming language that can be extended to support any programming paradigm by using the least amount of syntax. Scheme has natural support for both arbitrary precision integer arithmetic and floating point arithmetic. The simplicity of Scheme syntax, and the completeness of Guile, make it very easy to implement specialized scripting languages simply by translating them to Scheme. In Scheme algorithms and data are interchangeable. As a result, it is easy to write Scheme programs that manipulate Scheme source code. This makes Scheme an ideal language for writing programs that manipulate algorithms instead of data, such as programs that do symbolic algebra. Because Scheme can manipulate its own source code, a Scheme program can save its state by writing Scheme source code into a file, and by parsing it later to load it back up again. This feature alone is one reason why engineers should use Guile to configure and drive numerical simulations.
Some people like to use Fortran 77. This is in many ways a good language for developing the computational core of scientific applications. We do have free compilers for Fortran 77, so using it does not restrict our freedom (see Using Fortran effectively). Also, Fortran 77 is an aggressively optimized language, and this makes it very attractive to engineers that want to write code optimized for speed. Unfortunately, Fortran 77 cannot do anything well except array-oriented numerical computations. Managing input/output is unnecessarily difficult with Fortran, and there are even computational areas, such as arbitrary precision integer arithmetic and symbolic computation, that are not supported.
There are many variants of Fortran like Fortran 90, and HPF. Fortran 90 attempts, quite miserably, to make Fortran 77 more like C++. HPF allows engineers to write numerical code that runs on parallel computers. These variants should be avoided for two reasons:
If you have written a program entirely in Fortran, please do not ask anyone else to maintain your code, unless they are like you and also know only Fortran. If Fortran is the only language that you know, then please learn at least C and C++ and use Fortran only when necessary. Please do not hold the opinion that contributions in science and engineering are “true” contributions and software development is just a “tool”. This bigoted attitude is behind the thousands of lines of ugly unmaintainable code that goes around in many places. Good software development can be an important contribution in its own right, and regardless of what your goals are, please appreciate it and encourage it. To maximize the benefits of good software, please make your software free. (FIXME: Cross reference copyright section in this chapter)
The key to better code is to focus away from developing monolithic throw-away hacks that do only one job, and focus on developing libraries (FIXME: cross reference). Break down the original problem to parts, and the parts to smaller parts, until you get down to simple subproblems that can be easily tested, and from which you can construct solutions for both the original problem and future variants. Every library that you write is a legacy that you can share with other developers, that want to solve similar problems. Each library will allow these other developers to focus on their problem and not have to reinvent the parts that are common with your work from scratch. You should definitely make libraries out of subproblems that are likely to be broadly useful. Please be very liberal in what you consider “broadly useful”. Please program in a defensive way that renders reusable as much code as possible, regardless of whether or not you plan to reuse it in the near future. The final application should merely have to assemble all the libraries together and make their functionality accessible to the user through a good interface.
It is very important for each of your libraries to have a complete test suite. The purpose of the test suite is to detect bugs in the library and to prove to you or convince you, the developer, that the library works. A test suite is composed of a collection of test programs that link with your libraries and experiment with the features provided by the library. These test programs should return with
exit(0);
if they do not detect anything wrong with the library and with
exit(1);
if they detect problems. The test programs should not be installed with
the rest of the package. They are meant to be run after your software
is compiled and before it is installed. Therefore, they should be written
so that they can run using the compiled but uninstalled files of the library.
Test programs should not output messages by default. They should run
completely quietly and communicate with the environment in a yes or no
fashion using the exit
code. However, it is useful for test programs
to output debugging information when they fail during development. Statements
that output such information should be surrounded by conditional
directives like this:
#if INSPECT_ERRORS
printf("Division by zero: %d / %d\n",a,b);
#endif
This way it becomes easy to switch them on or off upon demand. The preferred
way to manipulate a macro like INSPECT_ERRORS is by adding
a switch to your configure script. You can do this by adding the
following lines to configure.in:
AC_ARG_WITH(inspect,
  [  --with-inspect    Inspect test suite errors],
  [ AC_DEFINE(INSPECT_ERRORS, 1, "Inspect test suite errors")],
  [ AC_DEFINE(INSPECT_ERRORS, 0, "Inspect test suite errors")])
After the library is debugged, the debug statements should not be removed. If a future version of the library regresses and an old test begins to fail again, it will be useful to be able to reactivate the same error messages that were useful in debugging the test when it was first put together, and it may be necessary to add a few new ones.
The best time to write each test program is as soon as possible! You should not be lazy, and you should not just keep throwing in code after code after code. The minute there is enough code in there to put together some kind of test program, just do it! When you write new code, it is easy to think that you are producing work with every new line of code that is written. The reality is that you know you have produced new work every time you write a working test program for new features, and not a minute before. Another time when you should definitely write a test program is when you find a bug while ordinarily using the library. Then, write a test program that triggers the bug, fix the bug, and keep the test in your test suite. This way, if a future modification reintroduces the same bug, it will be detected.
Please document your library as you go. The best time to update your documentation is immediately after you get new test programs checking out new features. You might feel that you are too busy to write documentation, but the truth of the matter is that you will always be too busy. In fact, if you are a busy person, you are likely to have many other obligations competing for your attention. There may be times that you have to stay away from a project for a long time. If you have consistently been maintaining documentation, it will help you refocus on your project even after many months of absence.
Applications are complete executable programs that can be run by the end-user. With library-oriented development the actual functionality is developed by writing libraries and debugged by developing test-suites for each library. With command-line oriented applications, the application source code parses the arguments that are passed to it by the user, and calls up the right functions in the library to carry out the user's requests. With GUI applications, the application source code creates the widgets that compose the interface, binds them to actions, and then enters an event loop. Each action is implemented in terms of the functionality provided by the appropriate library.
It should be possible to implement applications by using relatively few application-specific source files, since most of the functionality is actually done in libraries. In some cases, the application is simple enough that it would be overkill to package its functionality as a library. Nevertheless, in such cases please separate the source code that handles actual functionality from the source code that handles the user interface. Also, please always separate the code that handles input/output from the code that does actual computations. If these aspects of your source code are sufficiently separated then you make it easier for other people to reuse parts of your code in their applications. You also make it easier for yourself to switch to library-oriented development when your application grows and is no longer “simple enough”.
Library-oriented development allows you to write good and robust applications. In return it requires discipline. Sometimes you may need to add experimental functionality that is not available through your libraries. The right thing to do is to extend the appropriate library. The easy thing to do is to implement it as part of your application-specific source code. If the feature is experimental and undergoing many changes, it may be best to go with the easy approach at first. Still, when the feature matures, please migrate it to the appropriate library, document it, and take it out of the application source code. What we mean by discipline is doing these migrations, when the time is right, despite pressures from “real life”, such as deadlines, pointy-haired bosses, and nuclear terrorism. A rule of thumb for deciding when to migrate code to a library is when you find yourself cut-n-pasting chunks of code from application to application. If you do not do the right thing, your code will become increasingly harder to debug, harder to maintain, and less reliable.
Applications should also be documented, especially the ones that are
command-line oriented. Application documentation should be thorough in
explaining to the user all the things that he needs to know to use
the application effectively and should be distributed separately
from the application itself. Nevertheless, applications should recognize
the --help switch and output a synopsis of how the application is used.
Applications should also recognize the --version switch and state their
version number. The easiest way to make applications understand these two
switches is to use the GNU Argp library (FIXME: cross reference).
One of the reasons why you should write good code is that it allows you to make your code robust, reliable and most useful to your needs. Another reason is to make it useful to other people too, and make it easier for them to work with your code and reuse it for their own work. In order for this to be possible, you need to worry about a few obnoxious legal issues.
Maintaining these legalese notices can be quite painful after some time. To ease the burden, Autotools distributes a utility called ‘gpl’. This utility will conveniently generate for you all the legal wording you will ever want to use. It is important to know that this application is not approved in any way by the Free Software Foundation. By this I mean that I haven't asked their opinion of it yet.
To create the file COPYING type:
% gpl -l COPYING
If you want to include a copy of the GPL in your documentation, you can generate a copy in texinfo format like this:
% gpl -lt gpl.texi
Also, every time you want to create a new file, use the ‘gpl’ to generate the copyright notice. If you want it covered by the GPL use the standard notice. If you want to invoke the Guile-like permissions, then also use the library notice. If you want to grant unlimited permissions, meaning no copyleft, use the special notice. The ‘gpl’ utility takes many different flags to take into account the different commenting conventions.
% gpl -c file.c
the library notice with
% gpl -cL file.c
and the special notice with
% gpl -cS file.c
% gpl -cc file.cc
the library notice with
% gpl -ccL file.cc
and the special notice with
% gpl -ccS file.cc
% gpl -sh foo.pl
the library notice with
% gpl -shL foo.tcl
and the special notice with
% gpl -shS foo.pl
It does not make sense to use the library notice, if no executable is being formed from this file. If however, you parse that file into C code that is then compiled into object code, then you may consider using the library notice on it instead of the special notice. One of the features provided by Autotools allows you to embed text, such as Tcl scripts, into the executable. In that case, you can use the library notice to license the original text.
% gpl -m4 file.m4
In general, we exempt autoconf macro files from the GNU GPL because the terms of autoconf also exclude its output, the ‘configure’ script, from the GPL.
% gpl -am Makefile.am
For these we also exempt them from the GPL because they are so trivial that it makes no sense to add copyleft protection.
If you are using GNU Emacs, then you can insert these copyright notices
on-demand while you're editing your source code. Autotools bundles two
Emacs packages, gpl and gpl-copying, which provide you with
equivalents of the ‘gpl’ command that can be run under Emacs. These
packages will be byte-compiled and installed automatically for you while
installing Autotools.
To use these packages, in your .emacs you must declare your identity by adding the following commands:
(setq user-mail-address "me@here.com")
(setq user-full-name "My Name")
Then you must require the packages to be loaded:
(require 'gpl)
(require 'gpl-copying)
These packages introduce a set of Emacs commands all of which are prefixed
as gpl-. To invoke any of these commands press M-x, type
the name of the command and press enter.
The following commands will generate notices for your source code:
unnumbered
chapter titled “Copying” in the
Texinfo documentation of your source code. You will be prompted for the
title of your package. That title will substitute the word Autotools
as it appears in the corresponding section in this manual.
Emacs is an environment for running Lisp programs that manipulate text interactively. To call Emacs merely an editor does not do it justice, unless you redefine the word “editor” to the broadest meaning possible. Emacs is so extensive, powerful and flexible, that you can almost think of it as a self-contained “operating system” in its own right.
Emacs is a very important part of the GNU development tools because it provides an integrated environment for software development. The simplest thing you can do with Emacs is edit your source code. However, you can do a lot more than that. You can run a debugger, and step through your program while Emacs shows you the corresponding sources that you are stepping through. You can browse on-line Info documentation and man pages, download and read your email off-line, and follow discussions on newsgroups. Emacs is particularly helpful with writing documentation with the Texinfo documentation system. You will find it harder to use Texinfo, if you don't use Emacs. It is also very helpful with editing files on remote machines over FTP, especially when your connection to the internet is over a slow modem. Finally, and most importantly, Emacs is programmable. You can write Emacs functions in Emacs Lisp to automate any chore that you find particularly useful in your own work. Because Emacs Lisp is a full programming language, there is no practical limit to what you can do with it.
If you already know a lot about Emacs, you can skip this chapter and move on. If you are a “vi” user, then we will assimilate you: See Using vi emulation, for details. This chapter will be most useful to the novice user who would like to get per Emacs up and running for software development; however, it is not by any means comprehensive. See Further reading on Emacs, for references to more comprehensive Emacs documentation.
Emacs is an environment for running Lisp programs that manipulate text interactively. Because Emacs is completely programmable, it can be used to implement not only editors, but a full integrated development environment for software development. Emacs can also browse info documentation, run email clients, a newsgroup reader, a sophisticated xterm, and an understanding psychotherapist.
Under the X window system, Emacs controls multiple x-windows called frames. Each frame has a menu bar and the main editing area. The editing area is divided into windows with horizontal bars. You can grab these bars and move them around with the first mouse button. Each window is bound to a buffer. A buffer is an Emacs data structure that contains text. Most editing commands operate on buffers, modifying their contents. When a buffer is bound to a window, then you can see its contents as they are being changed. It is possible for a buffer to be bound to two windows, on different frames or on the same frame. Then whenever a change is made to the buffer, it is reflected on both windows. It is not necessary for a buffer to be bound to a window, in order to operate on it. In a typical Emacs session you may be manipulating more buffers than the windows that you have on your screen.
A buffer can be visiting files. In that case, the contents of the buffer reflect the contents of a file that is being edited. But buffers can be associated with anything you like, so long as you program it up. For example, under the Dired directory editor, a buffer is bound to a directory, showing you the contents of the directory. When you press <Enter> while the cursor is over a file name, Emacs creates a new buffer, visits the file, and rebinds the window with that buffer. From the user's perspective, by pressing <Enter> he “opened” the file for editing. If the file has already been “opened” then Emacs simply rebinds the existing buffer for that file.
Emacs uses a variant of LISP, called Emacs LISP, as its programming language.
Every time you press a key, click the mouse, or select an entry from the
menu bar, an Emacs LISP function is evaluated. The mode of the
buffer determines, among many other things, what function to evaluate.
This way, every buffer can be associated with functionality that defines
what you do in that buffer. For example you can program your buffer to edit
text, to edit source code, to read news, and so on. You can also run
LISP functions directly on the current buffer by typing M-x and
the name of the function that you want to run.
What is known as the “Emacs editor” is the default implementation of an editor under the Emacs system. If you prefer the vi editor, then you can instead run a vi clone, Viper (see Using vi emulation). The main reason why you should use Emacs, is not the particular editor, but the way Emacs integrates editing with all the other functions that you like to do as a software developer. For example:
All of these features make Emacs a very powerful, albeit unusual, integrated development environment. Many users of proprietary operating systems, like Lose95, complain that GNU (and Unix) does not have an integrated development environment. As a matter of fact it does: Emacs.
Emacs has its own very extensive documentation (see Further reading on Emacs). In this manual we will only go over the fundamentals for using Emacs effectively as an integrated development environment.
If Emacs is not installed on your system, you will need to get a source code distribution and compile it yourself. Installing Emacs is not difficult. If Emacs is already installed on your GNU/Linux system, you might still need to reinstall it: you might not have the most recent version, you might have XEmacs instead, you might not have support for internationalization, or your Emacs might not have compiled support for reading mail over POP (a feature very useful to developers that hook up over modem). If any of these is the case, then uninstall that version of Emacs, and reinstall Emacs from a source code distribution.
The entire Emacs source code is distributed in three separate files:
% gunzip emacs-21.2.tar.gz
% tar xf emacs-21.2.tar
% gunzip leim-21.2.tar.gz
% tar xf leim-21.2.tar
Both tarballs will unpack under the emacs-21.2 directory. When this is finished, configure the source code with the following commands:
% cd emacs-21.2
% ./configure --with-pop --with-gssapi
% make
The ‘--with-pop’ flag is almost always a good idea, especially if you are running Emacs from a home computer that is connected to the internet over modem. It will let you use Emacs to download your email from your internet provider and read it off-line (see Using Emacs as an email client). Most internet providers use GSSAPI-authenticated POP. If you need to support other authentication protocols however, you may also want to add one of the following flags:
--with-kerberos
--with-kerberos5
--with-hesiod
$ make
# make install
Emacs is a very large program, so this will take a while.
To install intlfonts-1.1.tar.gz unpack it, and follow the instructions in the README file. Alternatively, you may find it more straightforward to install it from a Debian package. Packages for intlfonts exist as of Debian 2.1.
In this section we describe what Emacs is and what it does. We will not yet discuss how to make Emacs work. That discussion is taken up in the subsequent sections, starting with Configuring GNU Emacs. This section instead covers the fundamental ideas that you need to understand in order to make sense out of Emacs.
You can run Emacs from a text terminal, such as a vt100 terminal, but it is usually nicer to run Emacs under the X-windows system. To start Emacs type
% emacs &
on your shell prompt. The seasoned GNU developer usually sets up per X configuration such that it starts Emacs when person logs in. Then, person uses that Emacs process for all of per work until person logs out. To quit Emacs press C-x C-c, or select
Files ==> Exit Emacs
from the menu. The notation C-c means <CTRL>-c. The separating dash ‘-’ means that you press the key after the dash while holding down the key before the dash. Be sure to quit Emacs before logging out, to ensure that your work is properly saved. If there are any files that you haven't yet saved, Emacs will prompt you and ask you if you want to save them, before quitting. If at any time you want Emacs to stop doing what it's doing, press C-g.
Under the X window system, Emacs controls multiple x-windows which are called frames. Each frame has a menu bar and the main editing area. The editing area is divided into windows by horizontal bars, called status bars. Every status bar contains concise information about the status of the window above the status bar. The minimal editing area has at least one big window, where editing takes place, and a small one-line window called the minibuffer. Emacs uses the minibuffer to display brief messages and to prompt the user to enter commands or other input. The minibuffer has no status bar of its own.
Each window is bound to a buffer. A buffer is an Emacs data structure that contains text. Most editing commands operate on buffers, modifying their contents. When a buffer is bound to a window, then you can see its contents as they are being changed. It is possible for a buffer to be bound to two windows, on different frames or on the same frame. Then whenever a change is made to the buffer, it is reflected on both windows. It is not necessary for a buffer to be bound to a window, in order to operate on it. In a typical Emacs session you may be manipulating more buffers than the windows that you actually have on your screen.
A buffer can be visiting files. In that case, the contents of the buffer reflect the contents of a file that is being edited. But buffers can be associated with anything you like, so long as you program it up. For example, under the Dired directory editor, a buffer is bound to a directory, showing you the contents of the directory. When you press <RET> while the cursor is over a file name, Emacs creates a new buffer, visits the file, and rebinds the window with that buffer. From the user's perspective, by pressing <RET> person “opened” the file for editing. If the file has already been “opened” then Emacs simply rebinds the existing buffer for that file.
Sometimes Emacs will divide a frame into two or more windows. You can switch from one window to another by clicking the 1st mouse button, while the mouse is inside the destination window. To resize these windows, grab the status bar with the 1st mouse button and move it up or down. Pressing the 2nd mouse button, while the mouse is on a status bar, will bury the window below the status bar. Pressing the 3rd mouse button will bury the window above the status bar, instead. Buried windows are not killed; they still exist and you can get back to them by selecting them from the menu bar, under:
Buffers ==> name-of-buffer
Buffers, with some exceptions, are usually named after the filenames of the files that they correspond to.
Once you visit a file for editing, then all you need to do is to edit it! The best way to learn how to edit files using the standard Emacs editor is by working through the on-line Emacs tutorial. To start the on-line tutorial type C-h t or select:
Help ==> Emacs Tutorial
If you are a vi user, or you simply prefer to use `vi' key bindings, then read Using vi emulation.
In Emacs, every event causes a Lisp function to be executed. An event can be any keystroke, mouse movement, mouse clicking or dragging, or a menu bar selection. The function implements the appropriate response to the event. Almost all of these functions are written in a variant of Lisp called Emacs Lisp. The actual Emacs program, the executable, is an Emacs Lisp interpreter with the implementation of frames, buffers, and so on. However, the actual functionality that makes Emacs usable is implemented in Emacs Lisp.
Sometimes, Emacs will bind a few words of text to an Emacs function. For example, when you use Emacs to browse Info documentation, certain words that correspond to hyperlinks to other nodes are bound to a function that makes Emacs follow the hyperlink. When such a binding is actually installed, moving the mouse over the bound text highlights it momentarily. While the text is highlighted, you can invoke the binding by clicking the 2nd mouse button.
Sometimes, an Emacs function might go into an infinite loop, or it might start doing something that you want to stop. You can always make Emacs abort the function it is currently running by pressing C-g.
Emacs functions are usually spawned by Emacs itself in response to an event. However, the user can also spawn an Emacs function by typing:
<ALT>-x function-name <RET>
These functions can also be aborted with C-g.
It is standard in Emacs documentation to refer to the <ALT> key with the letter ‘M’. So, in the future, we will be referring to function invocations as:
M-x function-name
Because Emacs functionality is implemented in an event-driven fashion, the Emacs developer has to write Lisp functions that implement functionality, and then bind these functions to events. Tables of such bindings are called keymaps.
Emacs has a global keymap, which is in effect at all times, and then it has specialized keymaps depending on what editing mode you use. Editing modes are selected when you visit a file depending on the name of the file. So, for example, if you visit a C file, Emacs goes into the C mode. If you visit Makefile, Emacs goes into makefile mode. The reason for associating different modes with different types of files is that the user's editing needs depend on the type of file that person is editing.
You can also enter a mode by running the Emacs function that initializes the mode. Here are the most commonly used modes:
M-x c-mode
M-x c++-mode
M-x sh-mode
M-x m4-mode
M-x texinfo-mode
M-x makefile-mode
To use Emacs effectively for software development you need to configure it. Part of the configuration needs to be done in your X-resources file. On a Debian GNU/Linux system, the X-resources can be configured by editing
/etc/X11/Xresources
In many systems, you can configure X-resources by editing a file called .Xresources or .Xdefaults on your home directory, but that is system-dependent. The configuration that I use on my system is:
! Emacs defaults
emacs*Background: Black
emacs*Foreground: White
emacs*pointerColor: White
emacs*cursorColor: White
emacs*bitmapIcon: on
emacs*font: fixed
emacs*geometry: 80x40
In general I favor dark backgrounds and ‘fixed’ fonts. Dark backgrounds make it easier to sit in front of the monitor for a prolonged period of time. The ‘fixed’ font looks nice and is small enough to make efficient use of your screen space. Some people might prefer larger fonts however.
When Emacs starts up, it looks for a file called .emacs at the user's home directory, and evaluates its contents through the Emacs Lisp interpreter. You can customize and modify Emacs' behaviour by adding commands, written in Emacs Lisp, to this file. Here's a brief outline of the ways in which you can customize Emacs:
(setq variable value)
For example:
(setq viper-mode t)
You can access on-line documentation for global variables by running:
M-x describe-variable
(setenv "variable" "value")
For example:
(setenv "INFOPATH" "/usr/info:/usr/local/info")
‘setenv’ does not affect the shell that invoked Emacs, but it does affect Emacs itself, and shells that are run under Emacs.
(global-set-key [key sequence] 'function)
For example, adding:
(global-set-key [f12 ?d] 'doctor)
to .emacs makes the key sequence F12 d equivalent to running ‘M-x doctor’. Emacs has many functions that provide all sorts of features. To find out about specific functions, consult the Emacs user manual. Once you know that a function exists, you can also get on-line documentation for it by running:
M-x describe-function
You can also write your own functions in Emacs Lisp.
(defun texi-insert-@example ()
  "Insert an @example @end example block"
  (interactive)
  (beginning-of-line)
  (insert "\n@example\n")
  (save-excursion
    (insert "\n")
    (insert "@end example\n")
    (insert "\n@noindent\n")))
We would like to bind this function to the key ‘F9’, however we would like this binding to be in effect only when we are within ‘texinfo-mode’. To do that, first we must define a hook function that establishes the local bindings using ‘define-key’:
(defun texinfo-elef-hook ()
  (define-key texinfo-mode-map [f9] 'texi-insert-@example))
The syntax of ‘define-key’ is similar to ‘global-set-key’ except it takes the name of the local keymap as an additional argument. The local keymap of any ‘name-mode’ is ‘name-mode-map’. Finally, we must ask ‘texinfo-mode’ to call the function ‘texinfo-elef-hook’. To do that use the ‘add-hook’ command:
(add-hook 'texinfo-mode-hook 'texinfo-elef-hook)
In some cases, Emacs itself will provide you with a few optional hooks that you can attach to your modes.
With the exception of simple customizations, most of the more complicated ones require that you write new Emacs Lisp functions, distribute them with your software and somehow make them visible to the installer's Emacs when person installs your software. See Emacs Lisp with Automake, for more details on how to include Emacs Lisp packages in your software.
Here are some simple customizations that you might want to add to your .emacs file:
(set-background-color "black")
(set-foreground-color "white")
You can change the colors to your liking.
(setq user-mail-address "karl@whitehouse.com")
(setq user-full-name "President Karl Marx")
Make sure the name is your real name, and the email address that you include can receive email 24 hours per day.
(display-time)
(line-number-mode 1)
(column-number-mode 1)
(global-set-key [mouse-2] 'yank)
By default, selected text in Emacs buffers is highlighted with blue color. However, you can also select and paste into an Emacs buffer text that you select from other applications, like your web browser, or your xterm.
(global-font-lock-mode t)
(setq font-lock-maximum-size nil)
(setq scroll-bar-mode nil)
The only reason that the scrollbar is on by default is to make Emacs more similar to what lusers are used to. It is assumed that the seasoned hacker, who will be glad to see the scrollbar bite it, will figure out how to make it go away.
m4-mode and editing Makefile.am takes you to makefile-mode.
(setq auto-mode-alist
      (append '(("configure.in" . m4-mode)
                ("\\.m4\\'" . m4-mode)
                ("\\.am\\'" . makefile-mode))
              auto-mode-alist))
You will have to edit such files if you use the GNU build system. See The GNU build system, for more details.
(setq load-path
      (append (list "/usr/share/emacs/site-lisp"
                    "/usr/local/share/emacs/site-lisp"
                    (expand-file-name "~lf/lisp"))
              load-path))
Note the use of ‘expand-file-name’ for dealing with non-absolute directories. If you are a user in an account where you don't have root privilege, you are very likely to need to install your Emacs packages in a non-standard directory.
Help ==> Customize ==> Browse Customization Groups
from the menu bar. You can also manipulate some common settings from:
Help ==> Options
Many hackers prefer to use the ‘vi’ editor. The ‘vi’ editor is the standard editor on Unix. It is also always available on GNU/Linux. Many system administrators find it necessary to use vi, especially when they are in the middle of setting up a system in which Emacs has not been installed yet. Besides that, there are many compelling reasons why people like vi.
The vi emulation package for the Emacs system is called Viper. To use Viper, add the following lines in your .emacs:
(setq viper-mode t)
(setq viper-inhibit-startup-message 't)
(setq viper-expert-level '3)
(require 'viper)
We recommend expert level 3, as the most balanced blend of the vi editor with the Emacs system. Most editing modes are aware of Viper, and when you begin editing the text you are immediately thrown into Viper. Some modes however do not do that. In some modes, like the Dired mode, this is very appropriate. In other modes however, especially custom modes that you have added to your system, Viper does not know about them, so it does not configure them to enter Viper mode by default. To tell a mode to enter Viper by default, add a line like the following to your .emacs file:
(add-hook 'm4-mode-hook 'viper-mode)
The modes that you are most likely to use during software development are
c-mode, c++-mode, texinfo-mode, sh-mode, m4-mode, makefile-mode.
Sometimes, Emacs will enter Viper mode by default in modes where you prefer
to get Emacs modes. In some versions of Emacs, compilation-mode
is such a mode. To tell a mode not to enter Viper by default, add a line
like the following to your .emacs file:
(add-hook 'compilation-mode-hook 'viper-change-state-to-emacs)
The Emacs distribution has a Viper manual. For more details on setting Viper up, you should read that manual.
The vi editor has these things called editing modes. An editing mode defines how the editor responds to your keystrokes. Vi has three editing modes: insert mode, replace mode and command mode. If you run Viper, there is also the Emacs mode. Emacs indicates which mode you are in by showing one of ‘<I>’, ‘<R>’, ‘<V>’, ‘<E>’ on the statusbar correspondingly for the Insert, Replace, Command and Emacs modes. Emacs also shows you the mode by the color of the cursor. This makes it easy for you to keep track of which mode you are in.
When you develop software, you need to edit many files at the same time, and you need an efficient way to switch from one file to another. The most general solution in Emacs is by going through Dired, the Emacs Directory Editor.
To use Dired effectively, we recommend that you add the following customizations to your .emacs file: First, add
(add-hook 'dired-load-hook
          (function (lambda () (load "dired-x"))))
(setq dired-omit-files-p t)
to activate the extended features of Dired. Then add the following key-bindings to the global keymap:
(global-set-key [f1] 'dired)
(global-set-key [f2] 'dired-omit-toggle)
(global-set-key [f3] 'shell)
(global-set-key [f4] 'find-file)
(global-set-key [f5] 'compile)
(global-set-key [f6] 'visit-tags-table)
(global-set-key [f8] 'add-change-log-entry-other-window)
(global-set-key [f12] 'make-frame)
If you use viper (see Using vi emulation), you should also add the following customization to your .emacs:
(add-hook 'compilation-mode-hook 'viper-change-state-to-emacs)
With these bindings, you can navigate from file to file or switch between editing and the shell simply by pressing the right function keys. Here's what these key bindings do:
To go down a directory, move the cursor over the directory filename and press <RET>. To go up a few directories, press f1 and when you are prompted for the new directory, with the current directory as the default choice, erase your way up the hierarchy and press <RET>. To jump to a substantially different directory that you have visited recently, press f1 and then when prompted for the destination directory name, use the cursor keys to select the directory that you want among the list of directories that you have recently visited.
While in the directory navigator, you can use the cursor keys to move to another file. Pressing <RET> will bring that file up for editing. However there are many other things that Dired will let you do instead:
Emacs provides another method for jumping from file to file: tags. Suppose that you are editing a C program whose source code is distributed in many files, and while editing the source for the function foo, you note that it is calling another function gleep. If you move your cursor on gleep, then pressing M-. will make Emacs jump to the file where gleep is defined. Pressing M-, jumps to other occurrences in your code where gleep is invoked. In order for this to work, you need to do two things: you need to generate a tags file, and you need to tell Emacs to load the file. If your source code is maintained with the GNU build system, you can create that tags file by typing:
% make tags
from the top-level directory of your source tree. Then load the tags file in Emacs by navigating Dired to the top-level directory of your source code, and pressing f6.
While editing a file, you may want to hop to the shell prompt to run a program. You can do that at any time, on any frame, by pressing f3. To get out of the shell, and back into the file that you were editing, enter the directory editor by pressing f1, and then press <RET> repeatedly. The default selections will take you back to the file that you were most recently editing on that frame.
One very nice feature of Emacs is that it understands tar files. If you have a tar file foo.tar and you select it under Dired, then Emacs will load the entire file, parse it, and let you edit the individual files that it includes directly. This only works, however, when the tar file is not compressed. Usually tar files are distributed compressed, so you should uncompress them first with Z before entering them. Also, be careful not to load an extremely huge tar file. Emacs may mean “eating memory and constantly swapping” to some people, but don't push it!
Another very powerful feature of Emacs is the Ange-FTP package: it allows you to edit files on other computers, remotely, over an FTP connection. From a user perspective, remote files behave just like local files. All you have to do is press f1 or f4 and request a directory or file with filename following this form:
/username@host:/pathname
Then Emacs will access for you the file /pathname on the
remote machine host by logging in over FTP as username.
You will be prompted for a password, but that will happen only once per
host. Emacs will then
download the file that you want to edit and let you make your changes locally.
When you save your changes, Emacs will use an FTP connection again to upload
the new version back to the remote machine, replacing the older version of
the file there. When you develop software on a remote computer, this feature
can be very useful, especially if your connection to the Net is over
a slow modem line. This way you can edit remote files just like you do
with local files. You will still have to telnet to the remote computer
to get a shell prompt. In Emacs, you can do this with M-x telnet.
An advantage of telnetting under Emacs is that it records your session,
and you can save it to a file to browse it later.
While you are making changes to your files, you should also be keeping a diary of these changes in a ChangeLog file (see Maintaining the documentation files). Whenever you are done with a modification that you would like to log, press f8 while the cursor is still in the same file, and preferably near the modification (for example, if you are editing a C program, be inside the same C function). Emacs will split the frame into two windows. The new window brings up your ChangeLog file. Record your changes and click on the status bar that separates the two windows with the 2nd mouse button to get rid of the ChangeLog file. Because updating the log is a frequent chore, this Emacs help is invaluable.
If you would like to compile your program, you can use the shell prompt
to run ‘make’. However, the Emacs way is to use the M-x compile
command. Press f5. Emacs will prompt you for the command that you
would like to run. You can enter something like: ‘configure’,
‘make’, ‘make dvi’, and so on
(see Installing a GNU package). The directory on which this command
will run is the current directory of the current buffer. If your current
buffer is visiting a file, then your command will run on the same directory
as the file. If your current buffer is the directory editor, then your
command will run on that directory. When you press <RET>, Emacs will
split the frame into another window, and it will show you the command's
output on that window. If there are error messages, then Emacs converts
these messages to hyperlinks and you can follow them by pressing <RET>
while the cursor is on them, or by clicking on them with the 2nd mouse button.
When you are done, click on the status bar with the 2nd mouse button to
get the compilation window off your screen.
You can use Emacs to read your email. If you maintain free software, or in general maintain a very active internet life, you will get a lot of email. The Emacs mail readers have been designed to address the needs of software developers who get endless tons of email every day.
Emacs has two email programs: Rmail and Gnus. Rmail is simpler to learn, and it is similar to many other mail readers. The philosophy behind Rmail is that instead of separating messages to different folders, you attach labels to each message but leave the messages on the same folder. Then you can tell Rmail to browse only messages that have specific labels. Gnus, on the other hand, has a rather eccentric approach to email. It is a news-reader, so it makes your email look like another newsgroup! This is actually very nice if you are subscribed to many mailing lists and want to sort your email messages automatically. To learn more about Gnus, read the excellent Gnus manual. In this manual, we will only describe Rmail.
When you start Rmail, it moves any new mail from your mailboxes to the file ~/RMAIL in your home directory. So, the first thing you need to tell Rmail is where your mailboxes are. To do that, add the following to your .emacs:
(require 'rmail)
(setq rmail-primary-inbox-list (list "mailbox1" "mailbox2" ...))
If your mailboxes are on a filesystem that is mounted to your computer, then you just have to list the corresponding filenames. If your mailbox is on a remote computer, then you have to use the POP protocol to download it to your own computer. In order for this to work, the remote computer must support POP. Many hobbyist developers receive their email on an internet provider computer that is connected to the network 24/7 and download it on their personal computer whenever they dial up.
For example, if karl@whitehouse.gov
is your email address at your
internet provider, and they support POP, you would have to add the
following to your .emacs:
(require 'rmail)
(setq rmail-primary-inbox-list (list "po:karl"))
(setenv "MAILHOST" "whitehouse.gov")
(setq rmail-pop-password-required t)
(setq user-mail-address "karl@whitehouse.gov")
(setq user-full-name "President Karl Marx")
The string "po:username" is used to tell the POP daemon which
mailbox you want to download. The environment variable MAILHOST
tells Emacs which machine to connect to, to talk with a POP daemon.
We also tell Emacs to prompt in the minibuffer to request
the password for logging in with the POP daemon. The alternative is to
hardcode the password into the .emacs file, but doing so is not
a very good idea: if the security of your home computer is compromised, the
cracker also gets your password for another system. Emacs will remember the
password however, after the first time you enter it, so you won't have to
enter it again later, during the same Emacs session. Finally, we tell Emacs
our internet provider's email address and our “real name” in the internet
provider's account. This way, when you send email from your home computer,
Emacs will spoof it to make it look like it was sent from the internet
provider's computer.
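In addition to telling Rmail where to find your email, you may also want to add the following configuration options:
Quote the message that you are replying to with the > prefix:
(setq mail-yank-prefix ">")
Send yourself a blind copy of every message that you send:
(setq mail-self-blind t)
Archive all of your outgoing messages in a separate file:
(setq mail-archive-file-name "/home/username/mail/sent-mail")
Have Rmail insert your signature in every message that you send:
(setq mail-signature t)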
and add the actual contents of your signature to .signature at your home directory.
Once Rmail is configured, to start downloading your email, run
M-x rmail
in Emacs. Emacs will load your mail, prompt you for
your POP password if necessary, and download your email from the internet
provider. Then, Emacs will display the first new message. You may quickly
navigate by pressing n to go to the next message or p to go
to the previous message.
It is much better however to tell Emacs to compile a summary of your messages
and let you navigate your mailbox using the summary. To do that, press
h. Emacs will split your frame into two windows: one window will
display the current message, and the other window the summary. A highlighted
bar in the summary indicates what the current message is. Emacs will also
display any labels that you have associated with your messages.
While the current buffer is the summary, you can navigate from message
to message with the cursor keys (up and down in particular).
You can also run any of the following commands:
--text follows this line--
Before this line you may edit the message's headers. After this line, you edit the actual body of the message. When you are done composing the message, you can do one of the following:
Believe it or not, I really don't know how to do that. I need a volunteer to explain this to me so I can explain it then in this section
When you develop free software, you must place copyright notices at every file that invokes the General Public License. If you don't place any notice whatsoever, then the legal meaning is that you refuse to give any permissions whatsoever, and the software consequently is not free. For more details see Applying the GPL. Many hackers, who don't take the law seriously, complain that adding the copyright notices takes too much typing. Some of these people live in countries where copyright is not really enforced. Others simply ignore it.
There is an Emacs package, called ‘gpl’, which is currently distributed with Autotoolset, that makes it possible to insert and maintain copyright notices with minimal work. To use this package, in your .emacs you must declare your identity by adding the following commands:
(setq user-mail-address "me@here.com")
(setq user-full-name "My Name")
Then you must require the packages to be loaded:
(require 'gpl)
(require 'gpl-copying)
This package introduces the following commands:
gpl
gpl-fsf
gpl command to insert a GPL notice for software that is assigned to the Free Software Foundation. The gpl command autodetects what type of file you are editing, from the filename, and uses the appropriate commenting.
gpl-personal
gpl command to insert a GPL notice for software in which you keep the copyright.
(setq gpl-organization "name")
after the ‘require’ statements in your .emacs.
Every once in a while, after long heroic efforts in front of the computer monitor, a software developer will need some counseling to feel better about herself. In RL (real life) counseling is very expensive and it also involves getting up from your computer and transporting yourself to another location, which decreases your productivity. Emacs can help you. Run M-x doctor, and you will talk to a psychiatrist for free.
Many people say that hackers work too hard and they should go out for
a walk once in a while. In Emacs, it is possible to do that without
getting up from your chair. To enter an alternate universe, run M-x dunnet. Aside from being a refreshing experience, it is also
a very effective way to procrastinate away work that you don't want to do.
Why do today, what you can postpone for tomorrow?
This chapter should be enough to get you going with GNU Emacs. This is really all you need to know to use Emacs to develop software. However, the more you learn about Emacs, the more effectively you will be able to use it, and there is still a lot to learn; a lot more than we can fit in this one chapter. In this section we refer to other manuals that you can read to learn more about Emacs. Unlike many proprietary manuals that you are likely to find in bookstores, these manuals are free (see Why free software needs free documentation). Whenever possible, please contribute to the GNU project by ordering a bound copy of the free documentation from the Free Software Foundation, or by contributing a donation.
The Free Software Foundation publishes the following manuals on Emacs:
In this chapter we describe how to use the compiler to compile simple software and libraries, and how to use makefiles.
It is very easy to compile simple C programs on the GNU system. For example, consider the famous “Hello world” program:
#include <stdio.h>

int
main ()
{
  printf ("Hello world\n");
  return 0;
}
To compile it, save it in a file called hello.c and type
% gcc hello.c
on your shell. The resulting executable file is called a.out and you can run it from the shell like this:
% ./a.out
Hello world
To cause the executable to be stored under a different filename pass the ‘-o’ flag to the compiler:
% gcc hello.c -o hello
% ./hello
Hello world
Even with simple one-file hacks like this, the GNU compiler can accept many options that modify its behaviour:
% gcc -g -O3 hello.c -o hello
% gcc -g -Wall hello.c -o hello
% gcc -g -Wall -O3 hello.c -o hello
To run a compiled executable in the current directory just type its
name, prepended by ‘./’. In general, once you compile a useful
program, you should install it so that it can be run from any current
directory, simply by typing its name without prepending ‘./’.
To install an executable, you need to move it to a standard directory
such as /usr/bin or /usr/local/bin. If you don't have
permissions to install files there, you can instead install them on
your home directory at /home/username/bin where username
is your username. When you write the name of an executable, the shell
looks for the executable in a set of directories listed in the environment
variable ‘PATH’. To add a nonstandard directory to your path do
% export PATH="$PATH:/home/username/bin"
if you are using the Bash shell, or
% setenv PATH "${PATH}:/home/username/bin"
if you are using the C shell (csh or tcsh).
Now let's consider the case where you have a much larger program made of source files foo1.c, foo2.c, foo3.c and header files header1.h and header2.h. One way to compile the program is like this:
% gcc foo1.c foo2.c foo3.c -o foo
This is fine when you have only a few files to deal with. Eventually, when you have more than a few dozen files, it becomes wasteful to compile all of the files, all the time, every time you make a change in only one of the files. For this reason, the compiler allows you to compile every file separately into an intermediate file called object file, and link all the object files together at the end. This can be done with the following commands:
% gcc -c foo1.c
% gcc -c foo2.c
% gcc -c foo3.c
% gcc foo1.o foo2.o foo3.o -o foo
The first three commands generate the object files foo1.o, foo2.o, foo3.o and the last command links them together to the final executable file foo. The *.o suffix is reserved for use only by object files.
If you make a change only in foo1.c, then you can rebuild foo like this:
% gcc -c foo1.c
% gcc foo1.o foo2.o foo3.o -o foo
The object files foo2.o and foo3.o do not need to be rebuilt since only foo1.c changed, so it is not necessary to recompile them.
Object files *.o contain definitions of variables and subroutines translated to machine language. Most of these definitions will eventually be embedded in the final executable program at a specific address. At this stage however these memory addresses are not known, so they are referred to symbolically. These symbolic references are called symbols. It is possible to list the symbols defined in an object file with the ‘nm’ command. For example:
% nm xmalloc.o
         U error
         U malloc
         U realloc
00000000 D xalloc_exit_failure
00000000 t xalloc_fail
00000004 D xalloc_fail_func
00000014 R xalloc_msg_memory_exhausted
00000030 T xmalloc
00000060 T xrealloc
The first column lists the symbol's address within the object file, when the symbol is actually defined in that object file. The second column lists the symbol type. The third column is the symbolic name of the symbol. In the final executable, these names become irrelevant. The following types commonly occur:
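T: a function definition.
t: a private function definition; such functions are defined in C with the keyword static.
D: a global variable.
R: a read-only (const) global variable.
U: a symbol that is used but not defined in this object file.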
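To relate these symbol types back to C source code, here is a minimal sketch of a source file that, when compiled with ‘gcc -c’ and inspected with ‘nm’, would typically list one symbol of each kind. The names demo_exit_failure, demo_msg_memory_exhausted, demo_fail and demo_malloc are made up for this illustration; this is not the source of xmalloc.o, and the exact letters reported can vary with the compiler and the platform.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Initialized global variable: typically listed by nm as 'D'.  */
int demo_exit_failure = EXIT_FAILURE;

/* Read-only (const) global variable: typically listed as 'R'.  */
const char demo_msg_memory_exhausted[] = "memory exhausted";

/* Private function, declared static: typically listed as 't'.  */
static void
demo_fail (void)
{
  fprintf (stderr, "%s\n", demo_msg_memory_exhausted);
  exit (demo_exit_failure);
}

/* Public function: typically listed as 'T'.  Library routines such
   as malloc and exit are used here but defined elsewhere, so they
   show up as 'U'.  */
void *
demo_malloc (size_t n)
{
  void *p;

  /* malloc (0) may legitimately return NULL, so require n > 0.  */
  assert (n > 0);
  p = malloc (n);
  if (p == NULL)
    demo_fail ();
  return p;
}
```

The static function is the only definition that other object files cannot refer to; everything else in this file is visible to the linker under the name written in the source.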
The job of the compiler is to translate all the C source files to object files containing a corresponding set of symbolic definitions. It is the job of another program, the linker, to put the object files together, resolve and evaluate all the symbolic addresses, and build a complete machine language program that can actually be executed. When you ask ‘gcc’ to link the object files into an executable, the compiler is actually running the linker to do the job.
During the process of linking, all the machine language instructions that refer to a specific memory address need to be modified to use the correct addresses within the executable, as opposed to the addresses within their object file. This becomes an issue when you want your program to load and link compiled object files during run-time instead of compile-time. To make such dynamic linking possible, your symbols need to be relocatable. This means that your symbol definitions must be correct no matter where you place them in memory. There should be no memory addresses that need to be modified. One way to do this is by referring to memory addresses within the object file by giving an offset from the referring address. Memory addresses outside the object file must be treated as interlibrary dependencies and you must tell the compiler what you expect them to be when you attempt to build relocatable machine code. Unfortunately some flavours of Unix do not handle interlibrary dependencies correctly. Fortunately, all of this mess can be dealt with in a uniform way, to the extent that this is possible, by using GNU Libtool. See Using Libtool, for more details.
On GNU and Unix, all compiled languages compile to object files, and it is possible, in principle, to link object files that have originated from source files written in different programming languages. For example it is possible to link source code written in Fortran together with source code written in C or C++. In such cases, you need to know how the compiler converts the names with which the programming language calls its constructs (such as variables, subroutines, etc.) to symbol names. Such conversions, when they actually happen, are called name-mangling. Both C++ and Fortran do name-mangling. C however is a very nice language, because it does absolutely no name-mangling. This is why when you want to write code that you want to export to many programming languages, it is best to write it in C. See Using Fortran effectively, for more details on how to deal with the name-mangling done by Fortran compilers.
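As an illustration of why C is convenient for this, here is a minimal sketch of a C routine written so that a Fortran program could call it. It rests on two assumptions that are common but not universal: that the Fortran compiler turns a subroutine name into a lowercase symbol with one trailing underscore, and that it passes every argument by reference. The routine name daxpy_demo_ is made up for this example. Check what your own Fortran compiler does, for instance with ‘nm’, before relying on it.

```c
#include <assert.h>

/* Hypothetical routine: y := y + a*x for vectors of length *n.
   Assumption: Fortran code saying "CALL DAXPY_DEMO (N, A, X, Y)"
   is compiled into a reference to the symbol daxpy_demo_, with
   all arguments passed as pointers.  Because C does no mangling,
   the symbol in the object file is exactly the name written here.  */
void
daxpy_demo_ (const int *n, const double *a, const double *x, double *y)
{
  int i;

  assert (*n >= 0);
  for (i = 0; i < *n; i++)
    y[i] += *a * x[i];
}
```

From C, the same routine is called by passing the addresses of the scalar arguments, for example ‘daxpy_demo_ (&n, &a, x, y)’.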
In many cases a collection of object files form a logical unit that is used by more than one executable. On both GNU and Unix systems, it is possible to collect such object files and form a library. On the GNU system, to create a library, you use the ar command:
% ar cru libfoo.a foo1.o foo2.o foo3.o
This will create a file libfoo.a from the object files foo1.o, foo2.o and foo3.o. The suffix *.a is reserved for object file libraries. Before using the library, it needs to be “blessed” by a program called ‘ranlib’:
% ranlib libfoo.a
The GNU system, and most Unix systems require that you run ranlib, but there have been some Unix systems where doing so is not necessary. In fact there are Unix systems, like some versions of SGI's Irix, that don't even have the ‘ranlib’ command!
The reason for this is historical. Originally ar was meant to be used merely for packaging files together. The more well known program tar is a descendant of ar that was designed to handle making such archives on a tape device. Now that tape devices are more or less obsolete, tar is playing the role that was originally meant for ar. As for ar, way back, some people thought to use it to package *.o files. However the linker wanted a symbol table to be passed along with the archive. So the ranlib program was written to generate that table and add it to the *.a file. Then some Unix vendors thought that if they incorporated ranlib into ar then users wouldn't have to worry about forgetting to call ranlib. So they provided ranlib but it did nothing. Some of the more evil ones dropped it altogether, breaking many people's scripts.
Once you have a library, you can link it with other object files just as if it were an object file itself. For example
% gcc bar.o libfoo.a -o foo
using libfoo.a as defined above, is equivalent to writing
% gcc bar.o foo1.o foo2.o foo3.o -o foo
Libraries are particularly useful when they are installed. To install a library you need to move the file libfoo.a to a standard directory. The actual location of that directory depends on your compiler. The GNU compiler looks for installed libraries in /usr/lib and /usr/local/lib. Because many Unix systems also use the GNU compiler, it is safe to say that both of these directories are standard in these systems too. However there are some Unix compilers that don't look at /usr/local/lib by default. Once a library is installed, it can be used in any project from any current directory to compile an executable that uses the subroutines that that library provides. You can direct the compiler to link an installed library with a set of object files to form an executable by using the ‘-l’ flag like this:
% gcc -o foo bar.o -lfoo
Note that if the filename of the library is libfoo.a, the corresponding argument to the ‘-l’ flag must be only the substring foo, hence ‘-lfoo’. Libraries must be named with names that have the form lib*.a. If you have installed the libfoo.a library in a non-standard directory, you can tell the linker to look for the library in that directory as well by using the ‘-L’ flag. For example, if the library was installed in /home/lf/lib then we would have to invoke the linking like this:
% gcc -o bar bar.o -L/home/lf/lib -lfoo
The ‘-L’ flag must appear before the ‘-l’ flag.
Some people like to pass ‘-L.’ to the compiler so they can link uninstalled libraries in the current working directory using the ‘-l’ flag instead of typing in their full filenames. The idea is that they think “it looks better” that way. Actually this is considered bad style. You should use the ‘-l’ flag to link only libraries that have already been installed and use the full pathnames to link in uninstalled libraries. The reason why this is important is because, even though it makes no difference when dealing with ordinary libraries, it makes a lot of difference when you are working with shared libraries. (FIXME: Crossreference). It makes a difference whether or not you are linking to an uninstalled or installed shared library, and in that case the ‘-l’ semantics mean that you are linking an installed shared library. Please stick to this rule, even if you are not using shared libraries, to make it possible to switch to using shared libraries without too much hassle.
Also, if you are linking in more than one library, please pay attention to the order with which you link your libraries. When the linker links a library, it does not embed into the executable code the entire library, but only the symbols that are needed from the library. In order for the linker to know what symbols are really needed from any given library, it must have already parsed all the other libraries and object files that depend on that library! This implies that you first link your object files, then you link the higher-level libraries, then the lower-level libraries. If you are the author of the libraries, you must write your libraries in such a manner, that the dependency graph of your libraries is a tree. If two libraries depend on each other bidirectionally, then you may have trouble linking them in. This suggests that they should be one library instead!
In general libraries are composed of many ‘*.c’ files that compile to object files, and a few header files (‘*.h’). The header files declare the resources that are defined by the library and need to be included by any source files that use the library's resources. In general a library comes with two types of header files: public and private. The public header files declare resources that you want to make accessible to other software. The private header files declare resources that are meant to be used only for developing the library itself. To make an installed library useful, it is also necessary to install the corresponding public header files. The standard directory for installing header files is /usr/include. The GNU compiler also understands /usr/local/include as an alternative directory. When the compiler encounters the directive
#include <foo.h>
it searches these standard directories for foo.h.
If you have installed the header files in a non-standard directory,
you can tell the compiler to search for them in that directory by
using the ‘-I’ flag. For example, to build a program bar from a source file bar.c that uses the libfoo library installed under /home/lf you would need to do the following:
% gcc -c -I/home/lf/include bar.c
% gcc -o bar bar.o -L/home/lf/lib -lfoo
You can also do it in one step:
% gcc -I/home/lf/include -o bar bar.c -L/home/lf/lib -lfoo
For portability, it is better that the ‘-I’ appear before the filenames of the source files that we want to compile.
A good coding standard is to distinguish private from public header files in your source code by including private header files like
#include "private.h"
and public header files like
#include <public.h>
in your implementation of the library, even when the public header files are not yet installed while building the library. This way source code can be moved in or out of the library without needing to change the header file inclusion semantics from ‘<..>’ to ‘".."’ back and forth. In order for this to work however, you must tell the compiler to search for “installed” header files in the current directory too. To do that you must pass the ‘-I’ flag with the current directory . as argument (‘-I.’).
In many cases a header file needs to include other header files, and it is very easy for some header files to be included more than once. When this happens, the compiler will complain about multiple declarations of the same symbols and throw an error. To prevent this from happening, please surround the contents of your header files with a C preprocessor conditional like this:
#ifndef __defined_foo_h
#define __defined_foo_h
[...contents...]
#endif
The defined macro __defined_foo_h is used as a flag to indicate that the contents of this header file have been included. To make sure that each one of these macros is unique to only one header file, please combine the prefix __defined with the pathname of the header file when it gets installed. If your header file is meant to be installed as /usr/local/include/foo.h or /usr/include/foo.h then use __defined_foo_h. If your header file is meant to be installed in a subdirectory like /usr/include/dir/foo.h then please use __defined_dir_foo_h instead.
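The effect of the guard can be seen in a minimal sketch where the contents of a hypothetical header foo.h are written out twice in one source file, which is what the preprocessor produces when the header is included twice. The structure foo_interval and the function foo_width are invented for this illustration.

```c
#include <assert.h>

/* First copy of the hypothetical header foo.h.  The flag macro is
   not defined yet, so the contents are compiled.  */
#ifndef __defined_foo_h
#define __defined_foo_h
struct foo_interval { int lo; int hi; };
#endif

/* Second copy of the same header.  The flag macro is now defined,
   so the preprocessor skips everything up to the #endif.  Without
   the guard, struct foo_interval would be defined a second time,
   which compilers have traditionally rejected.  */
#ifndef __defined_foo_h
#define __defined_foo_h
struct foo_interval { int lo; int hi; };
#endif

/* Code that uses the declarations from foo.h.  */
int
foo_width (struct foo_interval iv)
{
  assert (iv.lo <= iv.hi);
  return iv.hi - iv.lo;
}
```

Deleting the two ‘#ifndef’ and ‘#endif’ pairs from this file is a quick way to see the kind of error message that the guard prevents.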
In principle, every library can be implemented using only one public header file and perhaps only one private header file. There are problems with this approach however:
Once this decision is made, a few issues still remain:
For example, if the Foo library wants to install headers foo1.h, foo2.h and foo3.h, it can install them under /usr/include/foo and install in /usr/include/ only one header file foo.h containing only:
#include <foo/foo1.h>
#include <foo/foo2.h>
#include <foo/foo3.h>
Please name this “central” header and the directory for the subsidiary headers consistently after the corresponding library. So the libfoo.a library should install a central header named foo.h and all subsidiary headers under the subdirectory foo.
The subsidiary header files should be guarded with preprocessor conditionals, but it is not necessary to also guard the central header file that includes them. To make the flag macros used in these preprocessor conditionals unique, you should include the directory name in the flag macro's name. For example, foo/foo1.h should be guarded with
#ifndef __defined_foo_foo1_h
#define __defined_foo_foo1_h
[...contents...]
#endif
and similarly with foo/foo2.h and foo/foo3.h.
This approach creates yet another problem that needs to be addressed. If you recall, we suggested that you use the #include "..." semantics for private header files and the #include <...> semantics for public header files. This means that when you include the public header file foo1.h from one of the source files of the library itself, you should write:
#include <foo/foo1.h>
Unfortunately, if you place the foo1.h in the same directory as the file that attempts to include it, using these semantics, it will not work, because there is no subdirectory foo during compile time.
The simplest way to resolve this is by placing all of the source code for a given library under a directory and all such header files in a subdirectory named foo. The GNU build system in general requires that all the object files that build a specific library be under the same directory. This means that the C files must be in the same directory. It is okay however to place header files in a subdirectory.
This will also work if you have many directories, each containing the sources for a separate library, and a source file in directory bar, for example, tries to include the header file <foo/foo1.h> from a directory foo below the directory containing the source code for the library libfoo. To make it work, just pass ‘-I’ flags to the compiler for every directory containing the source code of every library in the package. See Libraries with Automake, for more details.
It will also work even if there are already old versions of
foo/foo1.h installed
in a standard directory like /usr/include, because the compiler
will first search under the directories mentioned in the ‘-I’ flags
before trying the standard directories.
A very common point of contention is whether or not using a software library in your program makes your program derived work from that library. For example, suppose that your program uses the readline () function, which is defined in the library libreadline.a. To do this, your program needs to link with this library. Whether or not this makes the program derived work makes a big difference. The readline library is free software published under the GNU General Public License, which requires that any derived work must also be free software and published under the same terms. So, if your program is derived work, you have to free it; if not, then you are not required to by the law.
When you link the library with your object files to create an executable, you are copying code from the library and combining it with code from your object files to create a new work. As a result, the executable is derived work. It doesn't matter if you create the executable by hand by running an assembler and putting it together manually, or if you automate the process by letting the compiler do it for you. Legally, you are doing the same thing.
Some people feel that linking to the library dynamically avoids making the executable derived work of the library. A dynamically linked executable does not embed a copy of the library. Instead, it contains code for loading the library from the disk during run-time. However, the executable is still derived work. The law makes no distinction between static linking and dynamic linking. So, when you compile an executable and you link it dynamically to a GPLed library, the executable must be distributed as free software with the library. This also means that you can not link dynamically both to a GPLed library and a proprietary library because the licenses of the two libraries conflict. The best way to resolve such conflicts is by replacing the proprietary library with a free one, or by convincing the owners of the proprietary library to license it as free software.
The law is actually pretty slimy about what is derived work. In the entertainment industry, if you write an original story that takes place in the established universe of a Hollywood serial, like Star Trek, in which you use characters from that serial, like Captain Kirk, your story is actually derived work, according to the law, and Paramount can claim rights to it. Similarly, a dynamically linked executable does not contain a copy of the library itself, but it does contain code that refers to the library, and it is not self-contained without the library.
Note that there is no conflict when a GPLed utility is invoked by a proprietary program or vice versa via a system () call. There is a very specific reason why this is allowed: When you were given a copy of the invoked program, you were given permission to run it. As a technical matter, on Unix systems and the GNU system, using a program means forking some process that is already running to create a new process and loading up the program to take over the new process, until it exits. This is exactly what the system () call does, so permission to use a program implies that you have permission to call it from any other program via system (). This way, you can run GNU programs under a proprietary sh shell on Unix, and you can invoke proprietary programs from a GNU program. However, a free program that depends on a proprietary program for its operation can not be included in a free operating system, because the proprietary program would also have to be distributed with the system.
Because any program that uses a library becomes derived work of that library, the GNU project occasionally uses another license, the Lesser GPL, (often called LGPL) to copyleft libraries. The LGPL protects the freedom of the library, just like the GPL does, but allows proprietary executables to link and use LGPLed libraries. However, this permission should only be given when it benefits the free software community, and not to be nice to proprietary software developers. There's no moral reason why you should let them use your code if they don't let you use theirs. See The LGPL vs the GPL, for a detailed discussion of this issue.
When you compile ordinary programs, like the hello world program, the compiler will automatically link to your program a library called libc.a. So when you type
% gcc -c hello.c
% gcc -o hello hello.o
what is actually going on behind the scenes is:
% gcc -c hello.c
% gcc -o hello hello.o -lc
To see why this is necessary, try ‘nm’ on hello.o:
% nm hello.o
00000000 t gcc2_compiled.
00000000 T main
         U printf
The file hello.o defines the symbol ‘main’, but it marks the symbol ‘printf’ as undefined. The reason for this is that ‘printf’ is not a built-in keyword of the C programming language, but a function call that is defined by the libc.a library. Most of the facilities of the C programming language are defined by this library. The include files stdio.h, stdlib.h, and so on are only header files that declare parts of the C library. You can read all about the C library in the Libc manual.
The catch is that there are many functions that you may consider standard features of C that are not included in the libc.a library itself. For example, all the math functions that are declared in math.h are defined in a library called libm.a which is not linked by default. So if your program is using math functions and including math.h, then you need to explicitly link the math library by passing the ‘-lm’ flag. The reason for this particular separation is that mathematicians are very picky about the way their math is being computed and they may want to use their own implementation of the math functions instead of the standard implementation. If the math functions were lumped into libc.a it wouldn't be possible to do that.
For example, consider the following program that prompts for a number and prints its square root:
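#include <stdio.h>
#include <math.h>

int
main (void)
{
  double a;
  printf ("a = ");
  /* a is a double, so scanf needs "%lf" rather than "%f" */
  scanf ("%lf", &a);
  printf ("sqrt(a) = %f\n", sqrt (a));
  return 0;
}
If you save this as dude.c, you must link the math library explicitly when you compile it:
% gcc -o dude dude.c -lm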
otherwise you will get an error message from the linker about sqrt
being an unresolved symbol.
On GNU, the libc.a library is very comprehensive. On many Unix systems
however, when you use system-level features you may need to link additional
system libraries such as
libbsd.a, libsocket.a, libnsl.a, etc.
If you are linking C++ code, the C++ compiler will link
both libc.a and the C++ standard library libstdc++.a.
If you are also using GNU C++ features however, you will explicitly need to
link libg++.a yourself.
Also if you are linking Fortran and C code together
you must also link the Fortran run-time libraries. These libraries
have non-standard names and depend on the Fortran compiler that you use.
(see Using Fortran effectively)
Finally, a very common problem is encountered when you are writing X applications. The X libraries and header files like to be placed in non-standard locations so you must provide system-dependent -I and -L flags so that the compiler can find them. Also the most recent version of X requires you to link in some additional libraries on top of libX11.a and some rare systems require you to link some additional system libraries to access networking features (recall that X is built on top of the sockets interface and it is essentially a communications protocol between the computer running the program and the computer that controls the screen in which the X program is displayed).
Because it is necessary to link system libraries to form an executable, under copyright law, the executable is derived work from the system libraries. This means that you must pay attention to the license terms of these libraries. The GNU libc library is under the LGPL license which allows you to link and distribute both free and proprietary executables. The stdc++ library is also under terms that permit the distribution of proprietary executables. The libg++ library however only permits you to build free executables. If you are on a GNU system, including Linux-based GNU systems, the legalese is pretty straightforward. If you are on a proprietary Unix system, you need to be more careful. The GNU GPL does not allow GPLed code to be linked against a proprietary library. Because the system libraries on Unix systems are proprietary, their terms also may not allow you to distribute executables derived from them. In practice they do, however, since proprietary Unix systems do want to attract proprietary applications. In the same spirit, the GNU GPL also makes an exception and explicitly permits the linking of GPL code with proprietary system libraries, provided that these libraries are a major component of the operating system (i.e. they are part of the compiler, or the kernel, and so on), unless the copy of the library itself accompanies the executable!
This includes proprietary libc.a libraries, the libdxml.a library in Digital Unix, proprietary Fortran system libraries like libUfor.a, and the X11 libraries.
To build a very large program, you need an extended set of invocations to the ‘gcc’ compiler and utilities like ‘ar’ and ‘ranlib’. As we explained (see Programs with many source files) if you make changes only to a few files in your source code, it is not necessary to rebuild everything; you only need to rebuild the object files that change because of your modifications and link those together with all the other object files to form an updated executable. The ‘make’ utility was written mainly to automate rebuilding software by determining the minimum set of commands that need to be called to do this, and invoking them for you in the right order. It can also handle many other tasks. For example, you can use ‘make’ to install your program's files in the standard directories, and clean up the object files when you no longer need them.
To learn all about ‘make’ and especially ‘GNU Make’, please read the excellent GNU Make manual. In general, to use the GNU build system you don't need to know the most esoteric aspects of the GNU make, because makefiles will be automatically compiled for you from higher level descriptions. However it is important to understand the basic aspects of ‘make’ to use the GNU build system effectively. In the following sections we will explain only these basic aspects.
The ‘make’ utility reads its instructions from a file named Makefile in the current directory. ‘make’ itself has no knowledge about the syntax of the files that it works with, and it relies on the instructions in Makefile to figure out what it needs to do. A makefile is essentially a list of rules. Each rule has the form:
Target: Dependencies
<TAB> Command
<TAB> .....
<TAB> .....
[Blank Line]
The <TAB>s are mandatory. The blank line at the end of the rule definition is not necessary when using GNU make but it is a good idea if you would like backwards compatibility with Unix.
When we talk about ‘make’ building a target, we mean that we want ‘make’ to do the following things:
If the requested target exists as a file, and there are no dependencies newer than the target, then ‘make’ will do nothing except printing a message saying that it has nothing to do. If the requested target is an action, no file will ever exist having the same name as the name describing the action, so every time you ask ‘make’ to build that target, it will always carry out your request. If one of the dependencies is a target corresponding to an action, ‘make’ will always attempt to build it and consequently always carry out that action. These three observations are only corollaries of the general algorithm.
To see how all this comes together in practice let's write a Makefile for compiling the hello world program. The simplest way to do this is with the following makefile:
hello: hello.c
<TAB> gcc -o hello hello.c
This simply says that the target hello is being built from the file hello.c by invoking the gcc command
% gcc -o hello hello.c
A more complicated way of doing the same thing is by explicitly building the intermediate object file:
hello: hello.o
<TAB> gcc -o hello hello.o

hello.o: hello.c
<TAB> gcc -c hello.c
Note that the target that we really want to build, hello, is listed first, to make sure that it is the default target. Finally, we can add two more phony targets, install and clean, to install the hello world program and clean up the build after installation. We get then the following:
hello: hello.o
<TAB> gcc -o hello hello.o

hello.o: hello.c
<TAB> gcc -c hello.c

clean:
<TAB> rm -f hello hello.o

install: hello
<TAB> mkdir -p /usr/local/bin
<TAB> rm -f /usr/local/bin/hello
<TAB> cp hello /usr/local/bin/hello
The clean target needs no dependencies since it just does what it does. However, the install target needs to first make sure that the file hello exists before attempting to install it, so it is necessary to list hello as a dependency to install.
Please note that this simple Makefile is for illustration only, and it is far from ideal. For example, we use the ‘mkdir’ command to make sure that the installation directory exists before attempting an install, but the ‘-p’ flag is not portable in Unix. Also, we usually want to use a BSD compatible version of the install utility to install executables instead of cp. Fortunately, you will almost never have to worry about writing ‘clean’ and ‘install’ targets, because those will be generated for you automatically by Automake.
Now let's consider a more complicated example. Suppose that we want to build a program foo whose source code is four source files
foo1.c, foo2.c, foo3.c, foo4.c
and three header files:
gleep1.h, gleep2.h, gleep3.h
Suppose also, for the sake of argument, that
foo: foo1.o foo2.o foo3.o foo4.o
<TAB> gcc -o foo foo1.o foo2.o foo3.o foo4.o

foo1.o: foo1.c gleep2.h gleep3.h
<TAB> gcc -c foo1.c

foo2.o: foo2.c gleep1.h
<TAB> gcc -c foo2.c

foo3.o: foo3.c gleep1.h gleep2.h
<TAB> gcc -c foo3.c

foo4.o: foo4.c gleep3.h
<TAB> gcc -c foo4.c

clean:
<TAB> rm -f foo foo1.o foo2.o foo3.o foo4.o

install: foo
<TAB> mkdir -p /usr/local/bin
<TAB> rm -f /usr/local/bin/foo
<TAB> cp foo /usr/local/bin/foo
This idea can be easily generalized for any program. If you would like to build more than one program, then you should add a phony target in the beginning that depends on the programs that you want to build. The usual way we do this is by adding a line like
all: prog1 prog2 prog3
to the beginning of the Makefile.
Unfortunately, this Makefile has a lot of unnecessary redundancy:
foo1.o, ..., foo4.o appears in at least two places.
variable = value
Then, in every other rule or variable definition, the symbol $(variable) is substituted with value.
.s1.s2:
<TAB> Command
<TAB> Command
<TAB> .....
where s1 is the suffix of the source file, s2 is the suffix of the corresponding generated file, and Command is the set of commands that generate *.s2 from *.s1. Note that no dependencies are mentioned, because dependencies don't make sense in the general case. They must be explicitly provided for each individual case separately.
.c.o:
<TAB> gcc -c $<
Similarly, the rule for building the executable file from a set of object files is:
.o:
<TAB> gcc $^ -o $@
Note that because executables don't have a suffix, we only mention the suffix of the object files. When only one suffix appears, it is assumed that it is suffix s1 and that suffix s2 is the empty string.
The suffixes involved in your abstract rules need to be listed in a directive that takes the form:
.SUFFIXES: s1 s2 ... sn
where s1, s2, etc. are suffixes. Also, if you've written an abstract rule, you still need to write rules where you mention the specific targets and their dependencies, except that you can omit the command-part since they are covered by the abstract rule.
Putting all of this together, we can enhance our Makefile like this:
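CC = gcc
CFLAGS = -Wall -g
OBJECTS = foo1.o foo2.o foo3.o foo4.o
PREFIX = /usr/local

.SUFFIXES: .c .o

.c.o:
<TAB> $(CC) $(CFLAGS) -c $<

.o:
<TAB> $(CC) $(CFLAGS) $^ -o $@

# The ".o:" rule builds a program only from the object file that
# carries the program's own name (foo.o here), so foo, which is linked
# from several object files, still needs an explicit command.
foo: $(OBJECTS)
<TAB> $(CC) $(CFLAGS) $(OBJECTS) -o $@

foo1.o: foo1.c gleep2.h gleep3.h
foo2.o: foo2.c gleep1.h
foo3.o: foo3.c gleep1.h gleep2.h
foo4.o: foo4.c gleep3.h

clean:
<TAB> rm -f foo $(OBJECTS)

install: foo
<TAB> mkdir -p $(PREFIX)/bin
<TAB> rm -f $(PREFIX)/bin/foo
<TAB> cp foo $(PREFIX)/bin/foo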
The only part of this Makefile that still requires some thinking on your part, is the part where you list the object files and their dependencies:
foo1.o: foo1.c gleep2.h gleep3.h
foo2.o: foo2.c gleep1.h
foo3.o: foo3.c gleep1.h gleep2.h
foo4.o: foo4.c gleep3.h
Note however, that in principle even that can be automatically generated. Even though the make utility does not understand C source code and can not determine the dependencies, the GNU C compiler can. If you use the ‘-MM’ flag, then the compiler will compute and output the dependency lines that you need to include in your Makefile. For example:
% gcc -MM foo1.c
foo1.o: foo1.c gleep2.h gleep3.h
% gcc -MM foo2.c
foo2.o: foo2.c gleep1.h
% gcc -MM foo3.c
foo3.o: foo3.c gleep1.h gleep2.h
% gcc -MM foo4.c
foo4.o: foo4.c gleep3.h
Unfortunately, unlike all the other compiler features we have described up until now, this feature is not portable in Unix. If you have installed the GNU compiler on your Unix system however, then you can also do this.
Dealing with dependencies is one of the major drawbacks of writing makefiles by hand. Another drawback is that even though we have moved many of the parameters to makefile variables, these variables still need to be adjusted by somebody. There is something rude about asking the installer to edit Makefile. Developers that ask their users to edit Makefile make their users' lives more difficult in an unacceptable way. Yet another annoyance is writing clean, install and such targets. Doing so every time you write a makefile gets to be tedious in the long run. Also, because these targets are, in a way, mission critical, it is really important not to make mistakes when you are writing them. Finally, if you want to use multiple directories for every one of your libraries and programs, you need to set up your makefiles to recursively call ‘make’ on the subdirectories, write a whole lot of makefiles, and have a way of propagating configuration information to every one of these makefiles from a centralized source.
These problems are not impossible to deal with, but you need a lot of experience in makefile writing to overcome them. Most developers don't want to bother as much with all this, and would rather be debugging their source code. The GNU build system helps you set up your source code to make this possible. For the same example, the GNU developer only needs to write the following Makefile.am file:
bin_PROGRAMS = foo
foo_SOURCES = foo1.c foo2.c foo3.c foo4.c
noinst_HEADERS = gleep1.h gleep2.h gleep3.h
and set a few more things up. This file is then compiled into an intermediate file, called Makefile.in, by Automake, and during installation the final Makefile is generated from Makefile.in by a shell script called configure. This shell script is provided by the developer and it is also automatically generated with Autoconf. For more details see Hello world example with Autoconf and Automake.
In general you will not need to be writing makefiles by hand. It is useful however to understand how makefiles work and how to write abstract rules.
The GNU build system has two goals. The first is to simplify the development of portable programs. The second is to simplify the building of programs that are distributed as source code. The first goal is achieved by the automatic generation of a configure shell script, which configures the source code to the installer's platform. The second goal is achieved by the automatic generation of Makefiles and other shell scripts that are typically used in the building process. This way the developer can concentrate on debugging their source code, instead of their overly complex Makefiles. And the installer can compile and install the program directly from the source code distribution by a simple and automatic procedure.
When we speak of the GNU build system we refer primarily to the following four packages:
make recursively. Having simplified this step, the developer is encouraged to organize their source code in a deep directory tree rather than lump everything under the same directory. Developers that use raw make often can't justify the inconvenience of recursive make and prefer to disorganize their source code. With the GNU tools this is no longer necessary.
check target available such that you can compile and run the entire test suite by running make check.
make distcheck.
The GNU build system needs to be installed only when you are developing programs that are meant to be distributed. To build a program from distributed source code, the installer only needs a working make utility, a compiler, a shell, and sometimes standard Unix utilities like sed, awk, yacc, lex. The objective is to make software installation as simple and as automatic as possible for the installer. Also, by setting up the GNU build system such that it creates programs that don't require the build system to be present during their installation, it becomes possible to use the build system to bootstrap itself.
If you are on a Unix system, don't be surprised if the GNU development tools are not installed. Some Unix sysadmins have never heard about them. If you do have them installed check to see whether you have the most recent versions. To do that, type:
% autoconf --version
% automake --version
% libtool --version
This manual is current with Autoconf 2.57, Automake 1.6.3 and Libtool 1.4.3.
If you don't have any of the above packages, you need to get a copy and install them on your computer. The distribution filenames for the GNU build tools, are:
autoconf-2.57.tar.gz
automake-1.6.3.tar.gz
libtool-1.4.3.tar.gz
autotoolset-0.11.6.tar.gz
Before installing these packages however, you will need to install the following needed packages from the FSF:
make-*.tar.gz
m4-*.tar.gz
texinfo-4.3.tar.gz
tar-*.shar.gz
The asterisks in the version numbers mean that the version for these programs is not critically important.
You will need the GNU versions of make, m4 and tar even if your system already has native versions of these utilities. To check whether you do have the GNU versions see whether they accept the --version flag. If you have proprietary versions of make or m4, rename them and then install the GNU ones.
You will also need to install Perl, the GNU C compiler,
and the TeX typesetting system. These programs are always installed
on a typical Debian GNU/Linux system.
It is important to note that the end user will only need a decent shell and a working make to build a source code distribution. The developer however needs to gather all of these tools in order to create the distribution.
The installation process, for all of these tools that you obtain as *.tar.gz distributions is rather straightforward:
% ./configure
% make
% make check
% make install
Most of these tools include documentation which you can build with
% make dvi
To get started we will show you how to do the Hello world program using ‘autoconf’ and ‘automake’. In the fine tradition of K&R, the C version of the hello world program is:
#include <stdio.h>

int
main (void)
{
  printf ("Howdy world!\n");
  return 0;
}
Call this hello.c and place it under an empty directory. Simple programs like this can be compiled and run directly with the following commands:
% gcc hello.c -o hello
% ./hello
If you are on a Unix system instead of a GNU system, your compiler might be called ‘cc’ but the usage will be pretty much the same.
Now to do the same thing the ‘autoconf’ and ‘automake’ way create first the following files:
Makefile.am:
bin_PROGRAMS = hello
hello_SOURCES = hello.c
configure.in:
AC_INIT(hello.c)
AM_INIT_AUTOMAKE(hello,0.1)
AC_PROG_CC
AC_PROG_INSTALL
AC_OUTPUT(Makefile)
Now run ‘autoconf’:
% aclocal
% autoconf
This will create the shell script configure. Next, run ‘automake’:
% automake -a
required file "./install-sh" not found; installing
required file "./mkinstalldirs" not found; installing
required file "./missing" not found; installing
required file "./INSTALL" not found; installing
required file "./NEWS" not found
required file "./README" not found
required file "./COPYING" not found; installing
required file "./AUTHORS" not found
required file "./ChangeLog" not found
The first time you do this, ‘automake’ will complain about a couple of things. First it notices that the files install-sh, mkinstalldirs and missing are not present, and it installs copies. These files contain boiler-plate shell scripts that are needed by the makefiles that ‘automake’ generates. It also complains that the following files are not around:
INSTALL, COPYING, NEWS, README, AUTHORS, ChangeLog
These files are required to be present by the GNU coding standards, and we discuss them in detail in Maintaining the documentation files. At this point, it is important to at least touch these files, otherwise if you attempt to do a ‘make distcheck’ it will deliberately fail. To make these files exist, type:
% touch NEWS README AUTHORS ChangeLog
and to make Automake aware of the existence of these files, rerun it:
% automake -a
You can assume that the generated Makefile.in is correct, only when Automake completes without any error messages.
Now the package is exactly in the state in which the end-user will find it when they unpack it from a source code distribution. For future reference, we will call this state autoconfiscated. Being in an autoconfiscated state means that you are ready to type:
% ./configure
% make
% ./hello
to compile and run the hello world program. If you really want to install it, go ahead and call the ‘install’ target:
# make install
To undo installation, that is to uninstall the package, do:
# make uninstall
If you didn't use the ‘--prefix’ argument to point to your home directory, or a directory in which you have permissions to write and execute, you may need to be superuser to invoke the install and uninstall commands. If you feel like cutting a source code distribution, type:
% make distcheck
This will create a file called hello-0.1.tar.gz in the current working directory that contains the project's source code, and test it out to see whether all the files are actually included and whether the source code passes the regression test suite.
In order to do all of the above, you need to use the GNU ‘gcc’ compiler. Automake depends on ‘gcc’'s ability to compute dependencies. Also, the ‘distcheck’ target requires GNU make and GNU tar.
The GNU build tools assume that there are two types of hats that people like to wear: the developer hat and the installer hat. Developers develop the source code and create the source code distribution. Installers just want to compile and install a source code distribution on their system. In the free software community, the same people get to wear either hat depending on what they want to do. If you are a developer, then you need to install the entire GNU build system, period (see Installing the GNU build system). If you are an installer, then all you need to compile and install a GNU package is a minimal ‘make’ utility and a minimal shell. Any native Unix shell and ‘make’ will work.
Both Autoconf and Automake take special steps to ensure that packages generated through the ‘distcheck’ target can be easily installed with minimal tools. Autoconf generates configure shell scripts that use only portable Bourne shell features. (see Portable shell programming) Automake ensures that the source code is in an autoconfiscated state when it is unpacked. It also regenerates the makefiles before adding them to the distribution, such that the installer targets (‘all’, ‘install’, ‘uninstall’, ‘check’, ‘clean’, ‘distclean’) do not depend on GNU make features. The regenerated makefiles also do not use the ‘gcc’ cruft to compute dependencies. Instead, precomputed dependencies are included in the regenerated makefiles, and the dependencies generation mechanism is disabled. This will allow the end-user to compile the package using a native compiler, if the GNU compiler is not available. For future reference we will call this the installer state.
Now wear your installer hat, and install hello-0.1.tar.gz:
% gunzip hello-0.1.tar.gz
% tar xf hello-0.1.tar
% cd hello-0.1
% ./configure
% make
% ./hello
This is the full circle. The distribution compiles, and by typing ‘make install’ it installs. If you need to switch back to the developer hat, then you should rerun ‘automake’ to regenerate the makefiles.
When you run the ‘distcheck’ target, ‘make’ will create the source code distribution ‘hello-0.1.tar.gz’ and it will pretend that it is an installer and see if the distribution can be unpacked, configured, compiled and installed. It will also run the test suite, if one is bundled. If you would like to skip these tests, then run the ‘dist’ target instead:
% make dist
Nevertheless, running ‘distcheck’ is extremely helpful in debugging your build cruft. Please never release a distribution without getting it through ‘distcheck’. If you make daily distributions for off-site backup, please do pass them through ‘distcheck’. If there are files missing from your distribution, the ‘distcheck’ target will detect them. If you fail to notice such problems, then your backups will be incomplete leading you to a false sense of security.
When you made the hello-0.1.tar.gz distribution, most of the files were automatically generated. The only files that were actually written by your fingers were:
hello.c:
#include <stdio.h>

int
main (void)
{
  printf ("Howdy world!\n");
  return 0;
}
Makefile.am:
bin_PROGRAMS = hello
hello_SOURCES = hello.c
configure.in:
AC_INIT(hello.c)
AM_INIT_AUTOMAKE(hello,0.1)
AC_PROG_CC
AC_PROG_INSTALL
AC_OUTPUT(Makefile)
The language of Makefile.am is a logic language. There is no explicit statement of execution. Only a statement of relations from which execution is inferred. On the other hand, the language of configure.in is procedural. Each line of configure.in is a command that is executed.
Seen in this light, here's what the configure.in commands shown do:
The AC_INIT command initializes the configure script. It must be passed as argument the name of one of the source files. Any source file will do.
AM_INIT_AUTOMAKE performs some further initializations that are related to the fact that we are using ‘automake’. If you are writing your Makefile.in by hand, then you don't need to call this command. The two comma-separated arguments are the name of the package and the version number.
AC_PROG_CC checks to see which C compiler you have.
AC_PROG_INSTALL checks to see whether your system has a BSD compatible install utility. If not then it uses install-sh which automake will install at the root of your package directory if it's not there yet.
AC_OUTPUT tells the configure script to generate Makefile from Makefile.in.
The Makefile.am is more obvious. The first line specifies the name of the program we are building. The second line specifies the source files that compose the program.
For now, as far as configure.in is concerned you need to know the following additional facts:
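If you are building a library, then your configure script must determine how to handle ‘ranlib’. To do that, add the AC_PROG_RANLIB command.
If your source code contains C++ files, you need to add AC_PROG_CXX to your configure.in.
If your source code contains ‘yacc’ and ‘lex’ files, then you need to add
AC_PROG_YACC
AC_PROG_LEX
to your configure.in.
If your source code is organized in subdirectories, then you need a Makefile.am in each subdirectory, and you need to list the corresponding makefiles in the AC_OUTPUT statement like this:
AC_OUTPUT(Makefile \
dir1/Makefile \
dir2/Makefile \
)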
Note that the backslashes are not needed if you are using the bash shell. For portability reasons, however, it is a good idea to include them. Make sure that every subdirectory where building takes place is mentioned!
Now consider the commands that are used to build the hello world distribution:
% aclocal
% autoconf
% touch README AUTHORS NEWS ChangeLog
% automake -a
% ./configure
% make
The first four commands bring the package into an autoconfiscated state. The remaining two commands do the actual configuration and building. More specifically:
The configure.in file shown above uses the AM_INIT_AUTOMAKE macro, which is not part of the standard ‘autoconf’ macros. For this reason, its definition needs to be placed in aclocal.m4. If you call ‘aclocal’ with no arguments, then it will generate the appropriate aclocal.m4 file. Later we will show you how to use ‘aclocal’ to also install your own ‘autoconf’ macros.
The configure script probes your platform and generates makefiles that are customized for building the source code on your platform. The specifics of how the probing should be done are programmed in configure.in. The generated makefiles are based on templates that appear in Makefile.in files. In order for these templates to cooperate with configure and produce makefiles that conform to the GNU coding standards, they need to contain a tedious amount of boring stuff. This is where Automake comes in. Automake generates the Makefile.in files from the more terse description in Makefile.am. As you have seen in the example, Makefile.am files can be very simple in simple cases. Once you have customized makefiles, your make utility takes over.
How does configure actually convert the template Makefile.in to the final makefile? The configure script really does two things:
First, it probes your platform, as described above. Second, it parses all of the files listed in AC_OUTPUT, and every occurrence of @FOO@ in these files is substituted with the text that corresponds to FOO. For example, if you add the following lines to configure.in, you will cause @FOO@ to be substituted with ‘hello’:
FOO="hello"
AC_SUBST(FOO)
This is how configure exports compile-time decisions to the makefile, such as what compiler to use, what flags to pass the compilers and so on. Occasionally, you want to use configure's substitution capability directly on files that are not makefiles. This is why it is important to be aware of it. See Scripts with Automake, for an example.
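To get a feel for what this substitution amounts to, here is a minimal shell sketch. It is not how configure is actually implemented; the function name substitute and the use of ‘sed’ are assumptions made purely for illustration, and the one-line template stands in for a real Makefile.in.

```shell
#!/bin/sh
# Simplified illustration of the @FOO@ substitution described above.
# "substitute" is a hypothetical helper, not part of Autoconf.

# substitute NAME VALUE: read a template on standard input and write it
# to standard output with every occurrence of @NAME@ replaced by VALUE.
substitute() {
    sed "s|@$1@|$2|g"
}

# A one-line stand-in for a template such as Makefile.in.
printf 'GREETING = @FOO@\n' | substitute FOO hello
```

Run as written, the last line prints the template with @FOO@ replaced by hello. The sketch assumes that the value contains no characters that are special to ‘sed’.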
If you inspect the output of ‘make’ while compiling the hello world example, you will see that the generated Makefile is passing ‘-D’ flags to the compiler that define the macros PACKAGE and VERSION. These macros are assigned the arguments that are passed to the AM_INIT_AUTOMAKE command in configure.in. One of the ways in which configure customizes your source code to a specific platform is by getting such C preprocessor macros defined. The definition is requested by appropriate commands in the configure.in. The AM_INIT_AUTOMAKE command is one such command.
The GNU build system by default implements C preprocessor macro definitions by passing ‘-D’ flags to the compiler. When there are too many of these flags, we have two problems: the ‘make’ output becomes hard to read, and more importantly we run the risk of hitting the buffer limits of braindead Unix implementations of ‘make’. To work around this problem, you can ask Autoconf to use another approach in which all macros are defined in a special header file that is included in all the sources. This header file is called a configuration header.
A hello world program using this technique looks like this:
configure.in:
AC_INIT(hello.c)
AM_CONFIG_HEADER(config.h)
AM_INIT_AUTOMAKE(hello,0.1)
AC_PROG_CC
AC_PROG_INSTALL
AC_OUTPUT(Makefile)
Makefile.am:
bin_PROGRAMS = hello
hello_SOURCES = hello.c
hello.c:
#ifdef HAVE_CONFIG_H
#include <config.h>
#endif
#include <stdio.h>
int main(void)
{
    printf("Howdy, pardner!\n");
    return 0;
}
The only new element in configure.in is the AM_CONFIG_HEADER command. The configuration header must be included conditionally with the following three lines:
#ifdef HAVE_CONFIG_H
#include <config.h>
#endif
It is important that config.h is the first thing that gets included. Now autoconfiscate the source code by typing:
% aclocal
% autoconf
% touch NEWS README AUTHORS ChangeLog
% autoheader
% automake -a
It is important to type these commands in the order shown. The difference between this, and what we did in Hello world example with Autoconf and Automake, is that we had to run a new program: ‘autoheader’. This program scans configure.in and generates a template file config.h.in listing all the C preprocessor macros that might be defined and comments that should accompany the macros describing what they do. When you run configure, it will load in config.h.in and use it to generate the final config.h that will be used by the source code during compilation.
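As a rough illustration of that last step, the following shell sketch turns template lines of the form ‘#undef NAME’ into ‘#define NAME VALUE’ lines. The helper name define_macro is hypothetical, and the real configure script uses a more elaborate mechanism; the sketch only shows the idea of filling in a template.

```shell
#!/bin/sh
# Simplified illustration of how a config.h.in template could be turned
# into config.h. "define_macro" is a hypothetical helper, not an
# Autoconf command.

# define_macro NAME VALUE: rewrite a line consisting of "#undef NAME"
# into "#define NAME VALUE"; all other lines pass through unchanged.
define_macro() {
    sed "s|^#undef $1\$|#define $1 $2|"
}

# A two-line stand-in for config.h.in, filled in with the values that
# were passed to AM_INIT_AUTOMAKE in the example.
printf '#undef PACKAGE\n#undef VERSION\n' |
    define_macro PACKAGE '"hello"' |
    define_macro VERSION '"0.1"'
```

A macro that configure decides not to define would simply keep its ‘#undef’ line in this sketch.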
Now you can go ahead and build the program:
% ./configure
% make
gcc -DHAVE_CONFIG_H -I. -I. -I. -g -O2 -c hello.c
gcc -g -O2 -o hello hello.o
Note that now instead of multiple -D flags, there is only one such flag passed: -DHAVE_CONFIG_H. Also, appropriate -I flags are passed to make sure that hello.c can find and include config.h.
To test the distribution, type:
% make distcheck
......
========================
hello-0.1.tar.gz is ready for distribution
========================
and it should all work out.
The config.h files go a long way back in history. In the past, there used to be packages where you would have to manually edit config.h files and adjust the macros you wanted defined by hand. This made these packages very difficult to install because they required intimate knowledge of your operating system. For example, it was not unusual to see a comment saying “if your system has a broken vfork, then define this macro”. Many installers found this frustrating because they didn't really know how to configure the esoteric details of the config.h files. With autoconfiguring source code all of these details can be taken care of automatically, shifting this burden from the installer to the developer where it belongs.
Every software project must have its own directory. A minimal “project” is the example that we described in Hello world example with Autoconf and Automake. In general, even a minimal project must have the files:
README, INSTALL, AUTHORS, THANKS, NEWS, ChangeLog
Before distributing your source code, it is important to write the real contents of these files. In this section we give a summary overview on how these files should be maintained. For more details, please see the GNU coding standards as published by the FSF.
For pretest releases only, you might also decide to distribute a file README-alpha containing special comments for your friendly pretesters. If you use the recommended version numbering scheme (see Handling version numbers), you can automate its distribution by adding the following code in your configure.in:
changequote(,)dnl
case $VERSION in
  [0-9]*.[0-9]*[a-z]) DIST_ALPHA="README-alpha";;
  [0-9]*.[0-9]*.[0-9]*) DIST_ALPHA="README-alpha";;
  *) DIST_ALPHA=;;
esac
changequote([, ])dnl
AC_SUBST(DIST_ALPHA)
In your top-level Makefile.am, add something like:
EXTRA_DIST = $(DIST_ALPHA)
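If you want to convince yourself which version numbers trigger the distribution of README-alpha, you can try the same ‘case’ patterns outside of configure.in. The sketch below wraps them in a shell function; the name dist_alpha is made up for this illustration. The ‘changequote’ lines are not needed here, because they only exist to stop m4 from swallowing the square brackets.

```shell
#!/bin/sh
# Stand-alone copy of the version test from the configure.in fragment.
# "dist_alpha" is a hypothetical name used only in this sketch.

# dist_alpha VERSION: print README-alpha if VERSION looks like a
# pretest version, and an empty line otherwise.
dist_alpha() {
    case $1 in
        [0-9]*.[0-9]*[a-z])    echo "README-alpha" ;;  # e.g. 0.1a
        [0-9]*.[0-9]*.[0-9]*)  echo "README-alpha" ;;  # e.g. 0.1.2
        *)                     echo "" ;;              # e.g. 0.1 or 1.0
    esac
}

# Show the result for a few sample version numbers.
for v in 0.1 0.1a 0.1.2 1.0; do
    printf '%s -> [%s]\n' "$v" "$(dist_alpha "$v")"
done
```

Versions written with two integers leave the brackets empty, so nothing extra is distributed for official releases.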
A standard INSTALL file is installed for you automatically by Automake. If you have something very important to say, it may be best to say it in the README file instead. The INSTALL file is mostly for the benefit of people who've never installed a GNU package before. However, if your package is very unusual, you may decide that it is best to modify the standard INSTALL file or write your own.
Authors of PACKAGE

The following contributions warranted legal paper exchanges with
[the Free Software Foundation | Your Name]. Also see files ChangeLog and THANKS
Then, list who the contributors are and what files they have worked on. Indicate whether they created the file, or whether they modified it. For example:
Random J. Hacker:
  entire files -> foo1.c , foo2.c , foo3.c
  modifications -> foo4.c , foo5.c
PACKAGE THANKS file

PACKAGE has originally been written by ORIGINAL AUTHOR. Many people further contributed to PACKAGE by reporting problems, suggesting various improvements or submitting actual code. Here is a list of these people. Help me keep it complete and exempt of errors.
The easiest policy with this file is to thank everyone who contributes to the project, without judging the value of the contribution.
Unlike AUTHORS, the THANKS file is not maintained for legal reasons. It is maintained to thank all the contributors that helped you out in your project. The AUTHORS file cannot be used for this purpose because certain contributions, like bug reports or ideas and suggestions, do not require legal paper exchanges.
You can also decide to send some kind of special greeting when you initially add a name to your THANKS file. The mere presence of a name in THANKS is then a flag to you that the initial greeting has been sent.
The GNU coding standards explain in a lot of detail how you should structure a ChangeLog, so you should read about it there. The basic idea is to record semi-permanent modifications you make to your source code. It is not necessary to continuously record changes that you make while you are experimenting with something. But once you decide that you got a modification worked out, then you should record it. Please do record version releases in the central ChangeLog (see Handling version numbers). This way, it will be possible to tell what changes happened between versions.
You can automate ChangeLog maintenance with Emacs. See Navigating source code, for more details. Recent versions of Emacs use the ISO 8601 standard for dates, which is YYYY-MM-DD (year-month-day).
A typical ChangeLog entry looks like this:
1998-05-17  Eleftherios Gkioulekas  <lf@amath.washington.edu>

        * src/acmkdir.sh: Now acmkdir will put better default content to the
        files README, NEWS, AUTHORS, THANKS
Every entry contains all the changes you made within the period of a day. The most recent changes are listed at the top, the older changes slowly scroll to the bottom. Changes are sorted together in groups, per day of work.
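If you do not use Emacs, the header line of a new entry is easy to produce by hand. The following sketch prints one with today's date in the YYYY-MM-DD form; the function name changelog_header and the sample name and address are placeholders invented for this example.

```shell
#!/bin/sh
# Print the header line of a ChangeLog entry for today.
# "changelog_header" is a hypothetical helper used only in this sketch.

# changelog_header NAME EMAIL: print "YYYY-MM-DD  NAME  <EMAIL>",
# taking the date from the system clock.
changelog_header() {
    printf '%s  %s  <%s>\n' "$(date +%Y-%m-%d)" "$1" "$2"
}

# Placeholder maintainer details; substitute your own.
changelog_header "Random J. Hacker" "rjh@example.org"
```

You would then add the individual changes for the day underneath this line, each one starting with an asterisk and the name of the file that changed.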
Copyright is one of the many legal concerns that you need to be aware of if you develop free software. See Legal issues with Free Software, for more details. The philosophy of the GNU project, that software should be free, is very important to the future of our community. See Philosophical issues, to read Richard Stallman's essays on this topic.
If your program is very small, you can place all your files in the top-level directory, like we did in the Hello World example (see Hello world example with Autoconf and Automake). Such packages are called shallow.
In general, it is preferred to organize your package as a deep package. In a deep package, the documentation files
README, INSTALL, AUTHORS, THANKS, ChangeLog, COPYING
as well as the build cruft are placed at the top-level directory, and the rest of the files are placed in subdirectories. It is standard practice to use the following subdirectories:
The General Public License (GPL) is the legal implementation of the idea that the program, to which it is applied, belongs to the public. It means that the public is free to use it, free to modify it and redistribute it. And it also means that no-one can steal it from the public and use it to create a derived work that is not free. This is different from public domain, where anyone can take a work, make a few changes, slap a copyright notice on it, and forbid the public to use the resulting work without a proprietary license. The idea, that a work is owned by the public in this sense, is often called copyleft.
Unfortunately our legal system does not recognize this idea properly. Every work must have an owner, and that person or entity is the only one that can enforce the copyright. As a result, when a work is distributed under the GPL, with the spirit that it belongs to the public, only the nominal “owner” has the right to sue hoarders that use the work to create proprietary products. Unfortunately, the law does not extend that right to the public. Despite this shortcoming, the GPL has proven to be a very effective way to distribute free software. Almost all of the components of the GNU system are distributed under the GPL.
To apply the GPL to your programs you need to do the following things:
Copyright (C) (years) (Your Name) <your@email.address>

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
If you have assigned your copyright to an organization, like the Free Software Foundation, then you should probably fashion your copyright notice like this:
Copyright (C) (years) Free Software Foundation
(your name) <your@email.address> (initial year)
etc...
This legal notice works like a subroutine. By invoking it, you invoke the full text of the GNU General Public License which is too lengthy to include in every source file. Where you see ‘(years)’ you need to list all the years in which you finished preparing a version that was actually released, and which was an ancestor to the current version. This list is not the list of years in which versions were released. It is a list of years in which versions, later released, were completed. If you finish a version on Dec 31, 1997 and release it on Jan 1, 1998, you need to include 1997, but you do not need to include 1998. This rule is complicated, but it is dictated by international copyright law.
Some developers don't like inserting a proper legal notice to every file in their source code, because they don't want to do the typing. However, it is not sufficient to just say something like “this file is GPLed”. You have to make an unambiguous and exact statement, and you have to include the entire boilerplate text to do that. Fortunately, you can save typing by having Emacs insert copyright notices for you. See Inserting copyright notices with Emacs, for more details.
--version
command-line flag.
The number ‘0.1’ in the filename hello-0.1.tar.gz is called the version number of the source code distribution. The purpose of version numbers is to label the various releases of a source code distribution so that its development can be tracked. If you use the GNU build system, then the name of the package and the version number are specified in the line that invokes the ‘AM_INIT_AUTOMAKE’ macro. In the hello world example (see Hello world example with Autoconf and Automake) we used the following line to set the version number equal to 0.1:
AM_INIT_AUTOMAKE(hello,0.1)
Whenever you publish a new version of your program, you must increase the version number. We also recommend that you note the release of the new version in the ChangeLog. This way, when someone inspects your ChangeLog, that person will be able to determine what changes happened between any two specific versions.
To release the current version of your source code, run
% make distcheck
to build the distribution and apply the test suite to validate it. Once you get this to work, change your version number in configure.in, record an entry in ChangeLog saying that you are cutting the new version, and update the NEWS file. Without making any other changes, do
% make dist
to rebuild the distribution without having to wait for the test suite to run all over again.
Most packages declare their version with two integers: a major number and a minor number that are separated by a dot in the middle. In our example above, the major number is 0 and the minor number is 1. The minor number should be increased when you release a version that contains new features and improvements over the old version that you want your users to upgrade to. The major number should be increased when the incremental improvements bring your program into a new level of maturity and stability. A major number of 0 indicates that your software is still experimental and not ready for prime time. When you release version 1.0, you are telling people that your software has developed to the point that you recommend it for general use. Releasing version 2.0 means that your software has significantly matured from user feedback.
Before releasing a new major version, it is a good idea to publish a prerelease to your beta-testers. In general, the prerelease for version 1.0 is labeled 0.90 regardless of what the previous minor number was. When releasing a 0.90 version, development should freeze, and you should only be fixing bugs. If the prerelease turns out to be stable, it becomes the stable version. If not, you may need to release further bug-fixing prereleases: 0.91, 0.92, etc.
Many maintainers like to publish working versions of their code, so that contributors can donate code against the most recent version of the source code. These unofficial versions should only be used by people who are interested in contributing to the project, and not by end users. It is useful to use a third integer for writing the version numbers for these “unofficial” releases. Please use only two integers for official releases so that it is easy to distinguish them from unofficial releases. A possible succession of version numbers might look like this:
0.1, 0.1.1, 0.1.2, ... , 0.2, ..., 0.3, ..., 0.90, ..., 1.0
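The numbering conventions above are regular enough to be checked mechanically. Here is a small shell sketch that labels a version number according to them. The function name release_kind and the labels it prints are inventions of this example, and treating every minor number from 90 to 99 as a prerelease is an assumption that generalizes the 0.90, 0.91, 0.92 pattern.

```shell
#!/bin/sh
# Label a version number according to the conventions described above.
# "release_kind" and its output labels are hypothetical.

# release_kind VERSION: print official, prerelease, unofficial or
# unrecognized.
release_kind() {
    case $1 in
        *[!0-9.]*|.*|*.|*..*)  echo "unrecognized" ;;  # stray characters or empty fields
        *.*.*.*)               echo "unrecognized" ;;  # more than three integers
        *.*.*)                 echo "unofficial" ;;    # three integers, e.g. 0.1.2
        *.9[0-9])              echo "prerelease" ;;    # e.g. 0.90, 0.91 (assumed range)
        *.*)                   echo "official" ;;      # two integers, e.g. 0.2
        *)                     echo "unrecognized" ;;  # a single integer or nothing
    esac
}

# Walk through a succession like the one shown above.
for v in 0.1 0.1.1 0.1.2 0.2 0.90 1.0; do
    printf '%s is %s\n' "$v" "$(release_kind "$v")"
done
```

The order of the patterns matters: the three-integer test has to come before the two-integer test, because a pattern such as ‘*.*’ would otherwise also match 0.1.2.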
It is always a good idea to maintain an archive of at least all the official releases that you ever publish.
Whenever you start out a new programming project, there is quite a bit of overhead setup that you need to do in order to use the GNU build system. You need to install the documentation files described in Maintaining the documentation files, and set up the directory structure described in Organizing your project in subdirectories. In the quest for never-ending automation, you can do these tasks automatically with the ‘acmkdir’ utility.
Start by typing the following command on the shell:
% acmkdir hello
‘acmkdir’ will ask you to enter the name of your program, your name and your email address. When you are done, ‘acmkdir’ will ask you if you really want to go for it. Say ‘y’. Then, ‘acmkdir’ will do the following routine work for you:
AC_INIT
AM_CONFIG_HEADER(config.h)
AM_INIT_AUTOMAKE(test,0.1)
AC_PROG_CC
AC_PROG_CXX
AC_PROG_RANLIB
AC_OUTPUT(Makefile doc/Makefile m4/Makefile src/Makefile)
By default, both the C and C++ compilers are initialized, but you can take out ‘AC_PROG_CXX’ if you don't plan to use C++. You can edit and customize this file to your needs. More specifically, you will need to update the version number in ‘AM_INIT_AUTOMAKE’ every time you cut a new distribution (see Handling version numbers). You should also make sure to list all the subdirectories that have a Makefile.am in ‘AC_OUTPUT’.
EXTRA_DIST = reconf configure
SUBDIRS = m4 doc src
The ones in the src and doc subdirectories are empty. The one in m4 contains a template Makefile.am which you should edit if you want to add new Autoconf macros.
% rm -f config.cache
% rm -f acconfig.h
% aclocal -I m4
% autoconf
% acconfig
% autoheader
% automake -a
Before ‘acmkdir’ exits, it will call the ‘reconf’ script for you once to set things up.
At this point you can already run
% ./configure
% make
but nothing interesting will happen because the package is still empty.
To add a simple hello world program, all you need to do is create the following two files:
bin_PROGRAMS = hello
hello_SOURCES = hello.c

#if HAVE_CONFIG_H
# include <config.h>
#endif
#include <stdio.h>
int
main ()
{
  printf ("Hello world\n");
}
Then run
% ./reconf
% ./configure
% make
% make distcheck
to compile the hello world program and build a distribution. It's that simple!
In general, to develop simple programs with the GNU build system, you set up the project directory tree with ‘acmkdir’, write your source code, put together the necessary Makefile.am, update configure.in, and you are set to go. In fact, at this point you practically know all you need to know to develop source code distributions that compile and install simple C programs. All you need to do is write the source code and list the source files in ‘*_SOURCES’.
In the following chapters we will explain in more detail how to use the GNU build system to develop software that conforms to the GNU coding standards.
To begin, let's review the simplest example, the hello world program:
hello.c:
#include <stdio.h>
int main(void)
{
    printf("Howdy, world!\n");
    return 0;
}
Makefile.am:
bin_PROGRAMS = hello
hello_SOURCES = hello.c
configure.ac:
AC_INIT([Hello Program], [1.0], [Author Of The Program <aotp@zxcv.com>], [hello])
AM_INIT_AUTOMAKE
AC_PROG_CC
AC_PROG_INSTALL
AC_CONFIG_FILES([Makefile])
AC_OUTPUT
The language of Makefile.am is a logic language. There is no explicit statement of execution. Only a statement of relations from which execution is inferred. On the other hand, the language of configure.ac is procedural. Each line of configure.ac is a command that is executed.
Seen in this light, here's what the configure.ac commands shown do:
The AC_INIT command initializes the configure script. Its arguments are the name of the package, the version number of the package, the name of the author of the program and his e-mail, and the name of the tar-file, if it is different from the first argument.
AM_INIT_AUTOMAKE performs some further initializations that are related to the fact that we are using ‘automake’. If you are writing your Makefile.in by hand, then you don't need to call this command.
AC_PROG_CC checks to see which C compiler you have.
AC_PROG_INSTALL checks to see whether your system has a BSD compatible install utility. If not, then it uses install-sh, which automake will install at the root of your package directory if it's not there yet.
AC_CONFIG_FILES tells configure which files must be generated from templates. In this case, Makefile will be generated from the template Makefile.in. Remember that if we use ‘automake’, the file Makefile.in is generated from Makefile.am. But other files could have been specified in the parameter of this macro.
AC_OUTPUT tells the configure script to generate the files specified in the list of AC_CONFIG_FILES from their templates (the *.in files).
The Makefile.am is more obvious. The first line specifies the name of the program we are building. The second line specifies the source files that compose the program.
For now, as far as configure.ac is concerned you need to know the following additional facts:
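If you are building a library, then your configure script must determine how to handle ‘ranlib’. To do that, add the AC_PROG_RANLIB command.
If your makefiles call ‘make’ recursively in subdirectories, add the AC_PROG_MAKE_SET command.
If you have makefiles in subdirectories, you must also list them in the AC_CONFIG_FILES statement like this:
AC_CONFIG_FILES([
Makefile
dir1/Makefile
dir2/Makefile])
Do not put the closing parenthesis on a separate line from the closing bracket; doing so causes the macro to misbehave.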
As we explained before to build this package you need to execute the following commands:
% aclocal
% autoconf
% touch README AUTHORS NEWS ChangeLog
% automake -a
configure.ac: installing `./install-sh'
configure.ac: installing `./mkinstalldirs'
configure.ac: installing `./missing'
Makefile.am: installing `./INSTALL'
Makefile.am: installing `./COPYING'
Makefile.am: installing `./depcomp'
% ./configure
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking for gcc... gcc
checking for C compiler default output... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ANSI C... none needed
checking for style of include used by make... GNU
checking dependency style of gcc... gcc3
checking for a BSD-compatible install... /usr/bin/install -c
configure: creating ./config.status
config.status: creating Makefile
config.status: executing depfiles commands
% make
source='hello.c' object='hello.o' libtool=no \
depfile='.deps/hello.Po' tmpdepfile='.deps/hello.TPo' \
depmode=gcc3 /bin/sh ./depcomp \
gcc -DPACKAGE_NAME=\"Hello\ Program\" -DPACKAGE_TARNAME=\"hello\" -DPACKAGE_VERSION=\"1.0\" -DPACKAGE_STRING=\"Hello\ Program\ 1.0\" -DPACKAGE_BUGREPORT=\"Author\ Of\ The\ Program\ \<aotp@zxcv.com\>\" -DPACKAGE=\"hello\" -DVERSION=\"1.0\" -I. -I. -g -O2 -c `test -f 'hello.c' || echo './'`hello.c
gcc -g -O2 -o hello hello.o
The first four commands are for the maintainer only. When the user unpacks a distribution, he should be able to start from ‘configure’ and move on.
The configure.ac file shown above uses the AM_INIT_AUTOMAKE macro, which is not part of the standard ‘autoconf’ macros. For this reason, its definition needs to be placed in aclocal.m4. If you call ‘aclocal’ with no arguments, then it will generate the appropriate aclocal.m4 file. Later we will show you how to use ‘aclocal’ to also install your own ‘autoconf’ macros.
If you are curious you can take a look at the generated Makefile. It looks like gorilla spit but it will give you an idea of how one gets there from the Makefile.am.
The ‘configure’ script is an information gatherer. It finds out things about your system. That information is given to you in two ways. One way is through defining C preprocessor macros that you can test for directly in your source code with preprocessor directives. This is done by passing -D flags to the compiler. The other way is by making certain variables defined at the Makefile.am level. This way you can, for example, have the configure script find out how a certain library is linked, export it as a Makefile.am variable and use that variable in your Makefile.am. Also, through certain special variables, configure can control how the compiler is invoked by the Makefile.
Notice: This section is somewhat obsolete. The ‘acconfig’ program distributed with Autotoolset is no longer needed, because the file acconfig.h is no longer used by Autoconf/Automake. This section is kept here only for historical reasons and may soon be removed or rewritten. Nevertheless, the description of the behaviour of ‘autoheader’ has been updated and is correct. That said...
As you may have noticed, the ‘configure’ script in the previous example defines two preprocessor macros that you can use in your code: PACKAGE and VERSION. As you become a power-user of ‘autoconf’ you will get to define even more such macros. If you inspect the output of ‘make’ during compilation, you will see that these macros get defined by passing ‘-D’ flags to the compiler, one for each macro. When there are too many of these flags getting passed around, this can cause two problems: it can make the ‘make’ output hard to read, and more importantly it can hit the buffer limits of various braindead implementations of ‘make’. To work around this problem, an alternative approach is to define all these macros in a special header file and include it in all the sources.
A hello world program using this technique looks like this:
configure.ac:
AC_INIT([Hello Program], [1.0], [Author Of The Program <aotp@zxcv.com>], [hello])
AM_CONFIG_HEADER(config.h)
AM_INIT_AUTOMAKE
AC_PROG_CC
AC_PROG_INSTALL
AC_CONFIG_FILES([Makefile])
AC_OUTPUT
Makefile.am:
bin_PROGRAMS = hello
hello_SOURCES = hello.c
hello.c:
#ifdef HAVE_CONFIG_H
#include <config.h>
#endif
#include <stdio.h>
int main(void)
{
    printf("Howdy, partner!\n");
    return 0;
}
Note that configure.ac now calls a new macro: AM_CONFIG_HEADER. Also we include the configuration file conditionally with the following three lines:
#ifdef HAVE_CONFIG_H
#include <config.h>
#endif
It is important to make sure that the config.h file is the first thing that gets included. Now do the usual routine:
% aclocal
% autoconf
% touch NEWS README AUTHORS ChangeLog
% automake -a
configure.ac: installing `./install-sh'
configure.ac: installing `./mkinstalldirs'
configure.ac: installing `./missing'
Makefile.am: installing `./INSTALL'
Makefile.am: installing `./COPYING'
configure.ac:5: required file `./config.h.in' not found
Makefile.am: installing `./depcomp'
Automake will give you an error message saying that it needs a file called ‘config.h.in’. You can generate such a file with the ‘autoheader’ program. So run:
% autoheader
% ./configure
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking for gcc... gcc
checking for C compiler default output... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ANSI C... none needed
checking for style of include used by make... GNU
checking dependency style of gcc... gcc3
checking for a BSD-compatible install... /usr/bin/install -c
configure: creating ./config.status
config.status: creating Makefile
config.status: creating config.h
config.status: executing depfiles commands
% make
make all-am
make[1]: Entering directory `/home/mroberto/programs/autotoolset/hello5'
source='hello.c' object='hello.o' libtool=no \
depfile='.deps/hello.Po' tmpdepfile='.deps/hello.TPo' \
depmode=gcc3 /bin/sh ./depcomp \
gcc -DHAVE_CONFIG_H -I. -I. -I. -g -O2 -c `test -f 'hello.c' || echo './'`hello.c
gcc -g -O2 -o hello hello.o
make[1]: Leaving directory `/home/mroberto/programs/autotoolset/hello5'
In older versions of the automake/autoconf package, you would get error messages. The problem was that autoheader was bundled with the autoconf distribution, not the automake distribution, and consequently didn't know how to deal with the PACKAGE and VERSION macros. This problem is now solved with the use of the new syntax of the macro AC_INIT. But we choose to keep this discussion here because (a) it is still useful, (b) someone may be using the old syntax that was kept for compatibility, or (c) you may have old versions of the automake/autoconf packages.
Of course, if ‘configure’ defines a macro, there's nothing to know. On the other hand, when a macro is not defined then there are at least two possible defaults:
#undef PACKAGE
#define PACKAGE 0
The autoheader program here used to complain that it didn't know the defaults for the PACKAGE and VERSION macros. To provide the defaults, we would create a new file acconfig.h:
#undef PACKAGE
#undef VERSION
and run autoheader again:
% autoheader
At this point you would run autoconf again, so that it took into account the presence of acconfig.h:
% aclocal
% autoconf
Now you would go ahead and build the program:
% ./configure
% make
Computing dependencies for hello.c...
echo > .deps/.P
gcc -DHAVE_CONFIG_H -I. -I. -I. -g -O2 -c hello.c
gcc -g -O2 -o hello hello.o
Note that now instead of multiple -D flags, there is only one such flag passed: -DHAVE_CONFIG_H. Also, appropriate -I flags are passed to make sure that hello.c can find and include config.h.
To test the distribution, type:
% make distcheck
......
========================
hello-1.0.tar.gz is ready for distribution
========================
and it should all work out.
The config.h files go a long way back in history. In the past, there used to be packages where you would have to manually edit config.h files and adjust the macros you wanted defined by hand. This made these packages very difficult to install because they required intimate knowledge of your operating system. For example, it was not unusual to see a comment saying “if your system has a broken vfork, then define this macro”. How the hell are you supposed to know if your system's vfork is broken? With auto-configuring packages all of these details are taken care of automatically, shifting the burden from the user to the developer where it belongs.
Note: the use of the file acconfig.h is deprecated, but the discussion is kept here for the reasons explained above.
Normally in the acconfig.h file you would put statements like
#undef MACRO
#define MACRO default
These values were copied over to config.h.in and were supplemented with additional defaults for C preprocessor macros that got defined by native autoconf macros like AC_CHECK_HEADERS, AC_CHECK_FUNCS, AC_CHECK_SIZEOF, AC_CHECK_LIB.
If the file acconfig.h contained the string @TOP@ then all the lines before the string would be included verbatim in config.h before the custom definitions. Also, if the file acconfig.h contained the string @BOTTOM@ then all the lines after the string would be included verbatim in config.h after the custom definitions. This allowed you to include further preprocessor directives that are related to configuration. Some of these directives may use the custom definitions to conditionally issue further preprocessor directives. Due to a bug in some versions of autoheader, if the strings @TOP@ and @BOTTOM@ do appear in your acconfig.h file, then you must make sure that there is at least one line appearing before @TOP@ and one line after @BOTTOM@, even if it has to be a comment. Otherwise, autoheader may not work correctly.
With ‘autotoolset’ we distribute a utility called acconfig which will build acconfig.h automatically. By default it will always make sure that
#undef PACKAGE
#undef VERSION
are there. Additionally, if you install macros that are acconfig friendly
then ‘acconfig’ will also install entries for these macros.
The acconfig program may be revised in the future and perhaps it might be eliminated (note: indeed...). There is an unofficial patch to Autoconf that will automate the maintenance of acconfig.h, eliminating the need for a separate program. I am not yet certain if that patch will be part of the official next version of Autoconf, but I very much expect it to (note: I think it has been included). Until then, if you are interested, see:
http://www.clark.net/pub/dickey/autoconf/autoconf.html
This situation creates a bit of a dilemma about whether I should document and encourage acconfig in this tutorial or not. I believe that the Autoconf patch is a superior solution. However since I am not the one maintaining Autoconf, my hands are tied. For now let's say that if you confine yourself to using only the macros provided by autoconf, automake, and autotoolset then acconfig.h will be completely taken care of for you by acconfig. In the future, I hope that acconfig.h will be generated by configure and be the sole responsibility of Autoconf.
You may be wondering whether it is worth using config.h files in the programs you develop if there aren't all that many macros being defined. My personal recommendation is yes. Use config.h files because perhaps in the future your configure might need to define even more macros. So get started on the right foot from the beginning. Also, it is nice to just have a config.h file lying around because you can have all your configuration specific C preprocessor directives in one place. In fact, if you are one of these people writing peculiar system software where you get to #include 20 header files in every single source file you write, you can just have them all thrown into config.h once and for all.
In the next chapter we will tell you about the LF macros that get distributed with autotoolset and this tutorial. These macros do require you to use the config.h file. The bottom line is: config.h is your friend; trust the config.h.
FIXME: write about VPATH builds and how to modify optimization
In software engineering, people start from a precise, well-designed specification and proceed to implementation. In research, the specification is fluid and immaterial and the goal is to be able to solve a slightly different problem every day. To have the flexibility to go from variation to variation with the least amount of fuss is the name of the game. By fuss, we refer to debugging, testing and validation. Once you have a code that you know gives the right answer to a specific set of problems, you want to be able to move on to a different set of similar problems with reinventing, debugging and testing as little as possible. These are the two distinct situations that computer programmers get to confront in their lives.
Software engineers can take good care of themselves in both situations. It's part of their training. However, people whose specialty is the scientific problem and not software engineering must confront the harder of the two cases, the second one, with very little training in software engineering. As a result they develop code that's clumsy in implementation, clumsy in usage, and whose only redeeming quality is the fact that it gives the right answer. This way, they do get the work of the day done, but they leave behind them no legacy to do the work of tomorrow. No general-purpose tools, no documentation, no reusable code.
The key to better software engineering is to focus away from developing monolithic applications that do only one job, and focus on developing libraries. One way to think of libraries is as a program with multiple entry points. Every library you write becomes a legacy that you can pass on to other developers. Just like in mathematics you develop little theorems and use the little theorems to hide the complexity in proving bigger theorems, in software engineering you develop libraries to take care of low-level details once and for all so that they are out of the way every time you make a different implementation for a variation of the problem.
On a higher level you still don't create just one application. You create many little applications that work together. The centralized all-in-one approach in my experience is far less flexible than the decentralized approach in which a set of applications work together as a team to accomplish the goal. In fact this is the fundamental principle behind the design of the Unix operating system. Of course, it is still important to glue together the various components to do the job. This you can do either with scripting or with actually building a suite of specialized monolithic applications derived from the underlying tools.
The name of the game is like this:
Break the program down into parts, and the parts into smaller parts, until you get down to simple subproblems that can be easily tested, and from which you can construct variations of the original problem. Implement each one of these as a library, write test code for each library and make sure that the library works. It is very important for your library to have a complete test suite, a collection of programs that are supposed to run silently and return normally (exit(0);) if they execute successfully, and return abnormally (assert(false); exit(1);) if they fail. The purpose of the test suite is to detect bugs in the library, and to convince you, the developer, that the library works. The best time to write a test program is as soon as it is possible! Don't be lazy. Don't just keep throwing in code after code after code. The minute there is enough new code in there to put together some kind of test program, just do it! I cannot emphasize that enough. When you write new code you have the illusion that you are producing work, only to find out tomorrow that you need an entire week to debug it. As a rule, internalize the reality that you know you have produced new work every time you write a working test program for the new features, and not a minute before.
Another time when you should definitely write a test suite is when you
find a bug while ordinarily using the library. Then, before you even
fix the bug, write a test program that detects the bug. Then go fix it.
This way, as you add new features to your libraries you have insurance that
they won't reawaken old bugs.
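To make the idea of a silent test program concrete, here is a minimal sketch. The routine count_words is a hypothetical stand-in for a library function, and the "bug report" mentioned in the comment is imagined for illustration; neither comes from any real package.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical library routine under test:
   counts whitespace-separated words in s. */
int count_words(const char *s)
{
    int count = 0, in_word = 0;

    for (; *s != '\0'; s++) {
        if (isspace((unsigned char) *s)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            count++;
        }
    }
    return count;
}

/* Runs silently. The return value is meant to become the exit status
   of the test program: EXIT_SUCCESS if every check passes. */
int run_tests(void)
{
    if (count_words("") != 0)
        return EXIT_FAILURE;
    if (count_words("one two  three") != 3)
        return EXIT_FAILURE;
    /* Regression check, written before fixing an imagined bug report
       about a trailing blank being counted as an extra word. */
    if (count_words("trailing blank ") != 2)
        return EXIT_FAILURE;
    return EXIT_SUCCESS;
}
```

A real test program would end with a main function that simply returns run_tests(), so that a failing check produces a nonzero exit status.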
Please keep documentation up to date as you go. The best time to write documentation is right after you get a few new test programs working. You might feel that you are too busy to write documentation, but the truth of the matter is that you will always be too busy. After long hours debugging these segfaults, think of it as a celebration of triumph to fire up the editor and document your brand-spanking new cool features.
Please make sure that computational code is completely separated from I/O code so that someone else can reuse your computational code without being forced to also follow your I/O model. Then write programs that invoke your collection of libraries to solve various problems. By dividing and conquering the problem library by library with a test suite for each step along the way, you can write good and robust code. Also, if you are developing numerical software, please don't expect that other users of your code will be getting a high while entering data for your input files. Instead write an interactive utility that will allow users to configure input files in a user friendly way. Granted, this is too much work in Fortran. Then again, you do know more powerful languages, don't you?
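The separation of computation from I/O can be sketched as follows, assuming a small numerical routine. All the names here (trapezoid, square, report) are made up for the illustration.

```c
#include <stdio.h>

/* Pure computation: composite trapezoid rule for f on [a, b] with
   n panels. It performs no I/O, so any program can reuse it. */
double trapezoid(double (*f)(double), double a, double b, int n)
{
    double h = (b - a) / n;
    double sum = 0.5 * (f(a) + f(b));
    int i;

    for (i = 1; i < n; i++)
        sum += f(a + i * h);
    return sum * h;
}

/* Illustrative integrand. */
double square(double x)
{
    return x * x;
}

/* The I/O model lives in its own routine. Someone who wants a
   different output format replaces this and leaves trapezoid alone. */
void report(FILE *out, const char *label, double value)
{
    fprintf(out, "%s = %.6f\n", label, value);
}
```

An application would call report(stdout, "integral", trapezoid(square, 0.0, 1.0, 1000)), while a test program would call trapezoid directly and never touch the I/O routine.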
Examples of useful libraries are things like linear algebra libraries, general ODE solvers, interpolation algorithms, and so on. As a result you end up with two packages. A package of libraries complete with a test suite, and a package of applications that invoke the libraries. The package of libraries is well-tested code that can be passed down to future developers. It is code that won't have to be rewritten if it's treated with respect. The package of applications is something that each developer will probably rewrite since different people will probably want to solve different problems. The effect of having a package of libraries is that C++ is elevated to a Very High Level Language that's closer to the problems you are solving. In fact a good rule of thumb is to make the libraries sufficiently sophisticated so that each executable that you produce can be expressed in one source file. All this may sound like common sense, but you will be surprised at how many scientific developers maintain just one does-everything-program that they perpetually hack until it becomes impossible to maintain. And then you will be even more surprised when you find that some professors don't understand why a “simple mathematical modification” of someone else's code is taking you so long.
Every library must have its own directory and Makefile. So a library package will have many subdirectories, each directory being one library. And perhaps if you have too many of them, you might want to group them even further down. Then, there are the applications. If you've done everything right, there should be enough stuff in your libraries to enable you to have one source file per application. Which means that all the source files can probably go down under the same directory.
Very often you will come to a situation where there's something that your libraries to-date can't do, so you implement it and stick it along in your source file for the application. If you find yourself cutting and pasting that implementation to other source files, then this means that you have to put this in a library somewhere. And if it doesn't belong to any library you've written so far, maybe to a new library. When you are in a deadline crunch, there's a tendency not to do this since it's easier to cut and paste. The problem is that if you don't take action right then, eventually your code will degenerate to a hard-to-use mess. Keeping the entropy down is something that must be done on a daily basis.
Finally, a word about the age-old issue of language-choice. The GNU coding standards encourage you to program in C and avoid using languages other than C, such as C++ or Fortran. The main advantage of C over C++ and Fortran is that it produces object files that can be linked by any C or C++ compiler. In contrast, C++ object files can only be linked by the compiler that produced them. As for Fortran, aside from the fact that Fortran 90 and 95 have no free compilers, it is not trivial to mix Fortran 77 with C/C++, so it makes no sense to invite all that trouble without a compelling reason. Nevertheless, my suggestion is to code in C++. The main benefit you get with C++ is robustness. Having constructors and destructors and references can go a long way towards helping you to avoid memory errors, if you know how to make them work for you.
Now we get into the gory details of software organization. I'll tell you one way to do it. This is advice, not divine will. It's simply a way that works well in general, and a way that works well with autoconf and automake in particular.
The first principle is to maintain the package of libraries separate from the package of applications. This is not an iron-clad rule. In software engineering, where you have a crystal clear specification, it makes no sense to keep these two separate. I found from experience that it makes a lot more sense in research. Each of these two packages must have a toplevel directory under which live all of its guts. Now what do the guts look like?
First of all you have the traditional set of information files that we described in Chapter 1:
README, AUTHORS, NEWS, ChangeLog, INSTALL, COPYING
You also have the following subdirectories:
In one of these subdirectories, have the configure script link all public header files from all the subdirectories under src. This way it will only be necessary to pass one -I flag to test suites that want to access the include files of other libraries in the distribution. We will discuss this later.
#!/bin/sh
rm -f config.cache
aclocal -I m4
autoconf
autoheader
automake -a
exit
This will generate configure and Makefile.in and needs to be called whenever you change a Makefile.am or a configure.ac as well as when you change something under the m4 directory. It will also call ‘autoheader’ to make config.h.in.
At the toplevel directory, you need to put a Makefile.am that will tell the computer that all the source code is under the src directory. The way to do it is to put the following lines in Makefile.am:
EXTRA_DIST = reconf
SUBDIRS = m4 doc src
The first line tells automake that the reconf script is part of the distribution and must be included when you do make dist. The second line tells automake that the rest of the distribution is in the subdirectories m4, doc and src. It instructs make to recursively call itself in these subdirectories. It is important to include the doc and m4 subdirectories here and to give each of them a Makefile.am so that make dist includes them in the distribution.
If you are also using a lib subdirectory, then it should be built before src:
EXTRA_DIST = reconf
SUBDIRS = m4 doc lib src
The lib subdirectory should build a static library that is linked by your executables in src. There should be no need to install that library.
At the toplevel directory you also need to put the configure.ac file. That should look like this:
AC_INIT(packagename,versionnumber)
AM_INIT_AUTOMAKE
[...put your tests here...]
AC_CONFIG_FILES([Makefile
                 doc/Makefile
                 m4/Makefile
                 src/Makefile
                 src/dir1/Makefile
                 src/dir2/Makefile
                 src/dir3/Makefile
                 ............
                 src/dir1/foo1/Makefile])
AC_OUTPUT
You will not need another configure.ac file. However, every directory level in your tree must have a Makefile.am. When you call automake in the top-level directory, it looks at ‘AC_CONFIG_FILES’ in your configure.ac to decide what other directories have a Makefile.am that needs parsing. As you can see from above, a Makefile.am file is needed even under the doc and m4 directories. How to set that up is up to you. If you aren't building anything, but just have files and directories hanging around, you must declare these files and directories in the Makefile.am like this:
SUBDIRS = dir1 dir2 dir3
EXTRA_DIST = file1 file2 file3
Doing that will cause make dist to include these files and directories in the package distribution.
This tedious setup work needs to be done every time that you create a new package. If you create enough packages to get sick of it, then you want to look into the ‘acmkdir’ utility that is distributed by Autotoolset. We will describe it in the next chapter.
Next we explain how to develop Makefile.am files for the source code directory levels. A Makefile.am is a set of assignments. These assignments imply the Makefile, a set of targets, dependencies and rules, and the Makefile implies the execution of building.
The first set of assignments going at the beginning look like this:
INCLUDES = -I/dir1 -I/dir2 -I/dir3 ....
LDFLAGS = -L/dir1 -L/dir2 -L/dir3 ....
LDADD = -llib1 -llib2 -llib3 ...
‘INCLUDES’ lists all the -I flags that you need to pass to your compiler. If the stuff in this directory is dependent on a library in another directory of the same package, then the -I flag must point to that directory.
‘LDFLAGS’ lists all the -L flags that are needed by the compiler when it links all the object files to an executable.
‘LDADD’ lists the libraries to link with. Use the -l flag only for installed libraries. You can list libraries that have been built but not installed yet as well, but do this only by providing the full path to these libraries.
INCLUDES = ... -I$(top_srcdir)/src/libfoo ...
LDFLAGS = ... -L$(top_builddir)/src/libfoo ...
in the Makefile.am of every directory level that wants access to these libraries. Also, you must make sure that the libraries are built before the directory level is built. To guarantee that, list the library directories in ‘SUBDIRS’ before the directory levels that depend on them. One way to do this is to put all the library directories under a lib directory and all the executable directories under a bin directory, and in the Makefile.am for the directory level that contains lib and bin list them as:
SUBDIRS = lib bin
This will guarantee that all the libraries are available before building any executables. Alternatively, you can simply order your directories in such a way so that the library directories are built first.
Next we list the things that are to be built in this directory level:
bin_PROGRAMS = prog1 prog2 prog3 ....
lib_LIBRARIES = libfoo1.a libfoo2.a libfoo3.a ....
check_PROGRAMS = test1 test2 test3 ....
TESTS = $(check_PROGRAMS)
include_HEADERS = header1.h header2.h ....
‘bin_PROGRAMS’ lists the programs that are built with make and installed with make install under /prefix/bin, where prefix is usually /usr/local.
‘lib_LIBRARIES’ lists the libraries that are built with make and installed with make install under /prefix/lib.
‘check_PROGRAMS’ lists the programs that are not built with make but only with make check. These programs serve as tests that you, the user, can use to test the library.
‘TESTS’ lists the programs that are run by make check. These programs constitute the test suite and they are indispensable when you develop a library. It is common to just set
TESTS = $(check_PROGRAMS)
This way by commenting the line in and out, you can modify the behaviour of make check. While debugging your test suite, you will want to comment out this line so that make check doesn't run it. However, in the end product, you will want to comment it back in.
‘include_HEADERS’ lists the public header files that are installed under /prefix/include. You must list a header file here if you want to cause it to be installed. You can also list it under libfoo_a_SOURCES for the library that it belongs to, but it is imperative to list public headers here so that they can be installed.
For each of these types of targets, we must state information that will allow automake and make to infer the building process.
prog1_SOURCES = foo1.cc foo2.cc ... header1.h header2.h ....
prog1_LDADD = -lbar1 -lbar2 -lbar3
prog1_LDFLAGS = -L/dir1 -L/dir2 -L/dir3 ...
prog1_DEPENDENCIES = dep1 dep2 dep3 ...
In each assignment substitute ‘prog1’ with the name of the program that you are building as it appeared in ‘bin_PROGRAMS’ or ‘check_PROGRAMS’.
‘prog1_SOURCES’ lists the source files of the program, including any header files, so that they are distributed with make dist. To cause header files to be installed you must also put them in ‘include_HEADERS’.
‘prog1_LDADD’ lists the -l flags for linking whatever libraries are needed by your code. You may also list object files, which have been compiled in an exotic way, as well as paths to libraries that are not installed yet.
‘prog1_LDFLAGS’ lists the -L flags that are needed to resolve the libraries you passed in ‘prog1_LDADD’. Certain flags that need to be passed to every program can be expressed on a global basis by assigning them to ‘LDFLAGS’.
lib_LIBRARIES = ... libfoo1.a ...
libfoo1_a_SOURCES = foo1.cc foo2.cc private1.h private2.h ...
libfoo1_a_LIBADD = obj1.o obj2.o obj3.o
libfoo1_a_DEPENDENCIES = dep1 dep2 dep3 ...
Note that if the name of the library is libfoo1.a the prefix that appears in the variables that are related to that library is ‘libfoo1_a_’.
If you have already listed public header files under ‘include_HEADERS’ it is not required to repeat them a second time here.
In the previous section we described how to use Automake to compile programs, libraries and test suites. To exploit the full power of Automake however, it is important to understand the fundamental ideas behind it.
The simplest way to look at a Makefile.am is as a collection of assignments which infer a set of Makefile rules, which in turn infer the building process. There are three types of such assignments:
bindir = $(prefix)/bin
libdir = $(prefix)/lib
includedir = $(prefix)/include
These are the directories where you install executables, libraries and public header files. You can override the defaults by inserting different assignments in your Makefile.am, but please don't do that. Instead define new assignments. For example, if you do
foodir = $(prefix)/foo
then you can use ‘foo_PROGRAMS’, ‘foo_LIBRARIES’, etc. to list programs and libraries that you want installed in /prefix/foo. The symbols ‘check’ and ‘noinst’ have special meanings and you should not ever try to assign to ‘checkdir’ and ‘noinstdir’.
bin_PROGRAMS = hello
this means that you can then say:
hello_SOURCES = ...
hello_LDADD = ...
and so on. The ‘SOURCES’ and ‘LDADD’ are properties of ‘hello’ which is a ‘PROGRAMS’ primitive.
In addition to assignments, it is also possible to include ordinary targets and abstract targets in a Makefile.am just as you would in an ordinary Makefile. This can be particularly useful in situations like the following:
Ordinary rules simply build things. Abstract rules however have a special meaning to Automake. If you define an abstract rule that compiles files with an arbitrary suffix into a *.o object file, then files with such a suffix can appear in the ‘*_SOURCES’ of programs and libraries. You must however write the abstract rule early enough in your Makefile.am for Automake to parse it before encountering a sources assignment in which such files appear. You must also mention all the additional suffixes by assigning the variable ‘SUFFIXES’. Automake will use the value of that variable to put together the .SUFFIXES construct (see More about Makefiles).
If you need to write your own rules or abstract rules, then check at some point that your distribution builds properly with ‘make distcheck’. It is very important, when you define your own rules, to follow the following guidelines:
Make sure that your rules work when the build directory is separate from the source directory. Otherwise make distcheck, which attempts to do a VPATH build, fails.
Use $(top_srcdir) for files which you write (and your compiler tools read) and $(top_builddir) for files which the compiler tools write.
Invoke only the following programs directly in your rules:
ar cat chmod cmp cp diff echo egrep expr false grep ls mkdir mv pwd rm rmdir sed sleep sort tar test touch true
Any other programs that you want to use must be invoked through make variables. That includes programs such as these:
awk bash bison cc flex install latex ld ldconfig lex ln make makeinfo perl ranlib shar texi2dvi yacc
The make variables can be defined through Autoconf in your configure.ac. For special-purpose tools, use the AC_PATH_PROGS macro. For example:
AC_PATH_PROGS(BASH, bash)
AC_PATH_PROGS(PERL, perl perl5)
Some special tools have their own autoconf macros:
AC_PROG_MAKE_SET ==> $(MAKE)   ==> make
AC_PROG_RANLIB   ==> $(RANLIB) ==> ranlib | (do-nothing)
AC_PROG_AWK      ==> $(AWK)    ==> mawk | gawk | nawk | awk
AC_PROG_LEX      ==> $(LEX)    ==> flex | lex
AC_PROG_YACC     ==> $(YACC)   ==> 'bison -y' | byacc | yacc
AC_PROG_LN_S     ==> $(LN_S)   ==> ln -s
See the Autoconf manual for more information.
A real life example of a Makefile.am for libraries is the one I use to build the Blas-1 library. It looks like this:
• blas1/Makefile.am
SUFFIXES = .f
.f.o:
	$(F77) $(FFLAGS) -c $<
lib_LIBRARIES = libblas1.a
libblas1_a_SOURCES = f2c.h caxpy.f ccopy.f cdotc.f cdotu.f crotg.f cscal.f \
 csrot.f csscal.f cswap.f dasum.f daxpy.f dcabs1.f dcopy.f ddot.f dnrm2.f \
 drot.f drotg.f drotm.f drotmg.f dscal.f dswap.f dzasum.f dznrm2.f icamax.f \
 idamax.f isamax.f izamax.f sasum.f saxpy.f scasum.f scnrm2.f scopy.f \
 sdot.f snrm2.f srot.f srotg.f srotm.f srotmg.f sscal.f sswap.f zaxpy.f \
 zcopy.f zdotc.f zdotu.f zdrot.f zdscal.f zrotg.f zscal.f zswap.f
Because the Blas library is written in Fortran, I need to declare the Fortran suffix at the beginning of the Makefile.am with the ‘SUFFIXES’ assignment and then insert an implicit rule for building object files from Fortran files. The variables ‘F77’ and ‘FFLAGS’ are defined by Autoconf, by using the Fortran support provided by Autotoolset. For C or C++ files there is no need to include implicit rules. We discuss Fortran support in a later chapter.
Another important thing to note is the use of the symbol ‘$<’. We introduced these symbols in Chapter 2; in a suffix rule such as this one, ‘$<’ stands for the source file from which the target is built. If you've been paying attention you may be wondering why we didn't say ‘$(srcdir)/$<’ instead. The reason is that for VPATH builds, ‘make’ is sufficiently intelligent to substitute ‘$<’ with the Right Thing.
Now consider the Makefile.am for building a library for solving linear systems of equations in a nearby directory:
• lin/Makefile.am
SUFFIXES = .f
.f.o:
	$(F77) $(FFLAGS) -c $<
INCLUDES = -I../blas1 -I../mathutil
lib_LIBRARIES = liblin.a
include_HEADERS = lin.h
liblin_a_SOURCES = dgeco.f dgefa.f dgesl.f f2c.h f77-fcn.h lin.h lin.cc
check_PROGRAMS = test1 test2 test3
TESTS = $(check_PROGRAMS)
LDADD = liblin.a ../blas1/libblas1.a ../mathutil/libmathutil.a $(FLIBS) -lm
test1_SOURCES = test1.cc f2c-main.cc
test2_SOURCES = test2.cc f2c-main.cc
test3_SOURCES = test3.cc f2c-main.cc
In this case, we have a library that contains mixed Fortran and C++ code. We also have an example of a test suite, which in this case contains three test programs. What's new here is that in order to link the test suite properly we need to link in libraries that have been built already in other directories but haven't been installed yet. Because every test program needs to be linked against the same libraries, we set these libraries globally with an ‘LDADD’ assignment for all executables. Because the libraries have not been installed yet we specify them by the path to the library file. This will allow Automake to track dependencies correctly; if libblas1.a is modified, it will cause the test suite to be rebuilt. Also the variable ‘INCLUDES’ is globally assigned to make the header files of the other two libraries accessible to the source code in this directory. The variable ‘$(FLIBS)’ is assigned by Autoconf to link the run-time Fortran libraries, and then we link the installed libm.a library. Because that library is installed, it must be linked with the ‘-l’ flag. Another peculiarity in this example is the file f2c-main.cc which is shared by all three executables. As we will explain later, when you link executables that are derived from mixed Fortran and C or C++ code, you need to link this kludge file into the executable.
The test-suite files for numerical code will usually invoke the library to perform a computation for which an exact result is known and then verify that the result is true. For non-numerical code, the library will need to be tested in different ways depending on what it does.
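As a sketch of what such a numerical test might look like, the following compares a computed answer against a known exact one within a tolerance. The routine solve2x2 is a hypothetical stand-in for a library call (it is not part of liblin.a), and the tolerance of 1e-12 is an arbitrary choice for the illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for a library routine: solves the 2x2 system
   a*x = b by Cramer's rule. Returns 0 on success, 1 if singular. */
int solve2x2(const double a[2][2], const double b[2], double x[2])
{
    double det = a[0][0] * a[1][1] - a[0][1] * a[1][0];

    if (det == 0.0)
        return 1;
    x[0] = (b[0] * a[1][1] - a[0][1] * b[1]) / det;
    x[1] = (a[0][0] * b[1] - b[0] * a[1][0]) / det;
    return 0;
}

/* Nonzero if got is within tol of want. */
static int close_enough(double got, double want, double tol)
{
    double diff = got - want;

    if (diff < 0.0)
        diff = -diff;
    return diff <= tol;
}

/* Returns 0 if the computed solution matches the exact one, 1 otherwise. */
int test_solve2x2(void)
{
    const double a[2][2] = { { 2.0, 1.0 }, { 1.0, 3.0 } };
    const double b[2] = { 5.0, 10.0 };   /* exact solution: x = (1, 3) */
    double x[2];

    if (solve2x2(a, b, x) != 0)
        return 1;
    if (!close_enough(x[0], 1.0, 1e-12))
        return 1;
    if (!close_enough(x[1], 3.0, 1e-12))
        return 1;
    return 0;
}
```

Comparing within a tolerance rather than with == keeps the test from failing on harmless rounding differences between compilers and machines.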
A good example, and all about how libraries should be tested and documented. Needs thinking.
In some complicated packages, you want to generate part of their source code by executing a program at compile time. For example, in one of the packages that I wrote for an assignment, I had to generate a file incidence.out that contained a lot of hairy matrix definitions that were too ugly to just compute and write by hand. That file was then included by fem.cc which was part of a library that I wrote to solve simple finite element problems, with a preprocessor statement:
#include "incidence.out"
All source code files that are to be generated during compile time should be listed in the global definition of ‘BUILT_SOURCES’. This will make sure that these files get built before anything else. In our example, the file incidence.out is computed by running a program called incidence which of course also needs to be compiled before it is run. So the Makefile.am that we used looked like this:
noinst_PROGRAMS = incidence
lib_LIBRARIES = libpmf.a
incidence_SOURCES = incidence.cc mathutil.h
incidence_LDADD = -lm
incidence.out: incidence
	./incidence > incidence.out
BUILT_SOURCES = incidence.out
libpmf_a_SOURCES = laplace.cc laplace.h fem.cc fem.h mathutil.h
check_PROGRAMS = test1 test2
TESTS = $(check_PROGRAMS)
test1_SOURCES = test1.cc
test1_LDADD = libpmf.a -lm
test2_SOURCES = test2.cc
test2_LDADD = libpmf.a -lm
Note that because the executable incidence has been created at compile time, the correct path is ./incidence. Always keep in mind, that the correct path to source files, such as incidence.cc is $(srcdir)/incidence.cc. Because the incidence program is used temporarily only for the purposes of building the libpmf.a library, there is no reason to install it. So, we use the ‘noinst’ prefix to instruct Automake not to install it.
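For readers who have not written a generator program before, here is a minimal sketch of the idea. It is not the actual incidence program; the function emit_squares and the table it produces are invented for illustration. It formats a C initializer that another source file could pull in with #include.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical generator: writes a C initializer such as "{0, 1, 4}\n",
   holding the squares of 0 .. count-1, into buf. Returns the number of
   characters written (not counting the terminating NUL), or -1 if buf
   is too small. */
int emit_squares(char *buf, size_t size, int count)
{
    size_t used;
    int i, n;

    n = snprintf(buf, size, "{");
    if (n < 0 || (size_t) n >= size)
        return -1;
    used = (size_t) n;

    for (i = 0; i < count; i++) {
        /* Separate entries with ", " except before the first one. */
        n = snprintf(buf + used, size - used, "%s%d", i ? ", " : "", i * i);
        if (n < 0 || (size_t) n >= size - used)
            return -1;
        used += (size_t) n;
    }

    n = snprintf(buf + used, size - used, "}\n");
    if (n < 0 || (size_t) n >= size - used)
        return -1;
    used += (size_t) n;
    return (int) used;
}
```

A generator program built this way would print the buffer to standard output, and the Makefile rule would redirect that output into the generated file, just as the rule for incidence.out does.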
In some cases, we want to embed text in the executable file of an application. This may be on-line help pages, or it may be a script of some sort that we intend to execute with an interpreter library that we are linking with, like Guile or Tcl. Whatever the reason, if we want to compile the application as a stand-alone executable, it is necessary to embed the text in the source code. Autotoolset provides the build tools necessary to do this painlessly.
As a tutorial example, we will write a simple program that prints the contents of the GNU General Public License. First create the directory tree for the program:
% acmkdir foo
Enter the directory and create a copy of the txtc compiler:
% cd foo-0.1
% mktxtc
Then edit the file configure.ac and add a call to the LF_PROG_TXTC macro. This macro depends on
AC_PROG_CC
AC_PROG_AWK
so make sure that these are invoked also. Finally add txtc.sh to your AC_OUTPUT.
The end-result should look like this:
AC_INIT(reconf)
AM_CONFIG_HEADER(config.h)
AM_INIT_AUTOMAKE(foo,0.1)
AC_PROG_CC
AC_PROG_RANLIB
AC_PROG_AWK
LF_PROG_TXTC
AC_OUTPUT(Makefile txtc.sh doc/Makefile m4/Makefile src/Makefile)
In the src directory use Emacs to create a file src/text.txt containing some random text. The text.txt file is the text that we want to print. You can substitute it with any text you want. This file will be compiled into text.o during the build process. The text.h file is a header file that gives access to the symbols defined by text.o. The file foo.c is where the main will be written.
Next, create the following files with Emacs:
• src/text.h
extern int text_txt_length;
extern char *text_txt[];
• src/foo.c
#if HAVE_CONFIG_H
# include <config.h>
#endif
#include <stdio.h>
#include "text.h"
int
main (void)
{
  int i;
  /* Print every line of the embedded text. */
  for (i = 1; i <= text_txt_length; i++)
    printf ("%s\n", text_txt[i]);
  return 0;
}
• src/Makefile.am
SUFFIXES = .txt
.txt.o:
	$(TXTC) $<
bin_PROGRAMS = foo
foo_SOURCES = foo.c text.h text.txt
$ cd ..
$ reconf
$ configure
$ make
$ src/foo | less
To verify that this works properly, do the following:
$ cd src
$ foo > foo.out
$ diff text.txt foo.out
The two files should be identical. Finally, convince yourself that you can make a distribution:
$ make distcheck
and there you are.
Note that in general the text file, as encoded by the text compiler, will not always be identical to the original. Exactly one modification is made: if a line has blank spaces at the end, they are trimmed off. This feature was introduced to deal with a bug in the Tcl interpreter, and it is in general a good idea: it conserves a few bytes, it never hurts, and additional whitespace at the end of a line shouldn't really be there.
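The trimming rule can be illustrated with a few lines of C. This is only a sketch of the rule as described above, not the code used inside the text compiler; the function name trim_trailing_blanks is made up for this example.

```c
#include <string.h>

/* Remove blank spaces from the end of LINE, in place.
   Illustration only: this is not the code used inside txtc. */
void trim_trailing_blanks(char *line)
{
    size_t len = strlen(line);

    while (len > 0 && line[len - 1] == ' ')
        len--;
    line[len] = '\0';
}
```

Leading and interior blanks are left alone, which is why a diff against the original only differs on lines that ended in blanks.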
This magic is put together from many different directions. It begins with
the LF_PROG_TXTC
macro:
This macro will define the variable TXTC to point to a Text-to-C compiler. To create a copy of the compiler at the toplevel directory of your source code, use the mktxtc command:
    % mktxtc
The compiler is implemented as a shell script, and it depends on sed, awk and the C compiler, so you should call the following two macros before invoking LF_PROG_TXTC:
    AC_PROG_CC
    AC_PROG_AWK
The compiler is intended to be used as follows:
    $(TXTC) text1.txt text2.txt text3.txt ...
such that given the files text1.txt, text2.txt, etc., object files text1.o, text2.o, etc. are generated that contain the text from these files.
From the Automake point of view, you need to add the following lines to your Makefile.am:
    SUFFIXES = .txt
    .txt.o:
    <TAB> $(TXTC) $<
assuming that your text files will end in the .txt
suffix. The first
line informs Automake that there exist source files using non-standard
suffixes. Then we describe, in terms of an abstract Makefile rule, how to
build an object file from these non-standard suffixes. Recall the use of
the symbol $<
. Also note that it is not necessary
to use $(srcdir)
on $<
for VPATH builds.
If you embed more than one type of file, then you may want to use more than one suffix. For example, you may have .hlp files containing online help and .scm files containing Guile code. Then you want to write a rule for each suffix as follows:
    SUFFIXES = .hlp .scm
    .hlp.o:
    <TAB> $(TXTC) $<
    .scm.o:
    <TAB> $(TXTC) $<
It is important to put these lines before mentioning any SOURCES assignments. Automake is smart enough to parse these abstract makefile rules and recognize that files ending in these suffixes are valid source code that can be built to object code. This allows you to simply list text.txt with the other source files in the SOURCES assignment:
foo_SOURCES = foo.c text.h text.txt
In order for this to work however, Automake must be able to see your abstract rules first.
When you “compile” a text file foo.txt this makes an object file that defines the following two symbols:
    int foo_txt_length;
    char *foo_txt[];
Note that the dot characters are converted into underscores. To make these symbols accessible, you need to define an appropriate header file with the following general form:
    extern int foo_txt_length;
    extern char *foo_txt[];
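The naming rule is easy to mimic in C if you ever need to derive the symbol name from a file name, for example in a diagnostic message. The helper below is a hypothetical illustration (the name txt_symbol_name is not part of Autotoolset) and it handles only the dot-to-underscore conversion mentioned above.

```c
#include <stddef.h>

/* Copy FILENAME into SYMBOL, replacing every '.' with '_'.
   Returns 0 on success, -1 if SYMBOL (of SIZE bytes) is too small. */
int txt_symbol_name(const char *filename, char *symbol, size_t size)
{
    size_t i;

    for (i = 0; filename[i] != '\0'; i++) {
        if (i + 1 >= size)
            return -1;
        symbol[i] = (filename[i] == '.') ? '_' : filename[i];
    }
    if (size == 0)
        return -1;
    symbol[i] = '\0';
    return 0;
}
```

Given "foo.txt" it produces "foo_txt", to which the compiler adds the "_length" suffix for the integer symbol.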
When you include this header file in your other C or C++ files, the following entries are available. The entry
    foo_txt[0]
is not a line of the text; you can use it to print diagnostic messages. The lines of the text start at index 1:
    foo_txt[1]  ==> first line
    foo_txt[2]  ==> second line
    ...
The integer foo_txt_length is defined such that
    foo_txt[foo_txt_length+1] == NULL
The last line of the text is:
    foo_txt[foo_txt_length]
You can use a for loop (or the loop macro defined by LF_CPP_PORTABILITY) together with foo_txt_length to loop over the entire text, or you can exploit the fact that the entry after the last line is NULL and do a while loop.
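Both ways of walking over the text can be sketched as follows. The array demo_txt is a hand-written stand-in with the layout described above (entry 0 is not a line, the lines run from 1 to the length, and a NULL entry follows); in a real project the two symbols would come from the object file built by $(TXTC). The function names are made up for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hand-written stand-in for a compiled text file. */
int demo_txt_length = 3;
char *demo_txt[] = {
    "(reserved)",    /* index 0: not a line of the text (placeholder) */
    "first line",
    "second line",
    "third line",
    NULL             /* index demo_txt_length + 1 */
};

/* Count characters with a for loop bounded by the length symbol. */
size_t count_chars(char *text[], int length)
{
    size_t total = 0;
    int i;

    for (i = 1; i <= length; i++)
        total += strlen(text[i]);
    return total;
}

/* Count lines with a while loop that stops at the NULL entry. */
int count_lines(char *text[])
{
    int n = 0;

    while (text[n + 1] != NULL)
        n++;
    return n;
}
```

The while-loop form does not need the length symbol at all, which is convenient when only the array is passed around.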
Previously, we mentioned that the symbols ‘bin’, ‘lib’ and ‘include’ refer to installation locations that are defined respectively by the variables ‘bindir’, ‘libdir’ and ‘includedir’. For completeness, we will now list the installation locations available by default by Automake and describe their purpose.
All installation locations are placed under one of the following directories:

prefix
    The root for machine-independent files. It defaults to /usr/local and is set at configure time with the ‘--prefix’ option:
        configure --prefix=/home/lf
exec_prefix
    The root for machine-dependent files. It defaults to the value of ‘prefix’ and is set at configure time with the ‘--exec-prefix’ option:
        configure --prefix=/home/lf --exec-prefix=/home/lf/gnulinux

Machine-dependent files get a separate location so that the software can be installed on a networked file server and shared by machines with different architectures. To do that there must be separate copies of all the machine-dependent files for each architecture in use.
The GNU coding standards describe in detail the standard directories in which you should install your files. All of these standard locations are supported by Automake. So, for example, you can write things like
    sbin_PROGRAMS = prog ...
    sharedstate_DATA = foo ...
    ....
without having to define the variables ‘sbindir’, ‘sharedstatedir’ and so on.
bindir = $(exec_prefix)/bin
sbindir = $(exec_prefix)/sbin
libexecdir = $(exec_prefix)/libexec
libdir = $(exec_prefix)/lib
includedir = $(prefix)/include
datadir = $(prefix)/share
sysconfdir = $(prefix)/etc
sharedstatedir = $(prefix)/com
localstatedir = $(prefix)/var
lispdir = $(datadir)/emacs/site-lisp
With the default prefix this is /usr/local/share/emacs/site-lisp. This directory is not automatically defined by Automake. To define it, you must invoke AM_PATH_LISPDIR from Autoconf. See Emacs Lisp with Automake.
m4dir = $(datadir)/aclocal
This directory is not defined by Automake either. Add the line
    m4dir = $(datadir)/aclocal
to your Makefile.am to define it yourself. See Writing Autoconf macros.
infodir = $(prefix)/info
mandir = $(prefix)/man
man1dir = $(mandir)/man1
man2dir = $(mandir)/man2
pkglibdir = $(libdir)/@PACKAGE@
pkgincludedir = $(includedir)/@PACKAGE@
pkgdatadir = $(datadir)/@PACKAGE@
These subdirectories are useful for separating the files of your package from other packages. Of these three, you are most likely to want to use pkgincludedir to segregate public header files, as we discussed in Dealing with header files. For similar reasons you might like to segregate your data files.
The only reason for using pkglibdir
is to
install dynamic libraries that are meant to be loaded only at run-time
while an application is running.
You should not use a subdirectory for libraries that are linked to
programs, even dynamically, while the programs are being compiled, because
that will make it more difficult to compile your programs. However,
things like plug-ins, widget themes and so on should have their own
directory.
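To see how the nested definitions compose, here is a small C sketch that expands pkgdatadir = $(datadir)/@PACKAGE@ with datadir = $(prefix)/share for a given prefix and package name. The function names are made up for this illustration; in a real package the expansion is done by make and configure, not by your program.

```c
#include <stdio.h>

/* Build "<base>/<leaf>" in OUT; returns 0 on success, -1 on truncation. */
int join_dir(char *out, size_t size, const char *base, const char *leaf)
{
    int n = snprintf(out, size, "%s/%s", base, leaf);

    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Expand pkgdatadir = $(datadir)/@PACKAGE@ with datadir = $(prefix)/share. */
int expand_pkgdatadir(char *out, size_t size,
                      const char *prefix, const char *package)
{
    char datadir[256];

    if (join_dir(datadir, sizeof datadir, prefix, "share") != 0)
        return -1;
    return join_dir(out, size, datadir, package);
}
```

With the default prefix /usr/local and a package called hello, the result is /usr/local/share/hello.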
Sometimes you may feel the need to implement some of your programs in a scripting language like Bash or Perl. For example, the ‘autotoolset’ package is exclusively a collection of shell scripts. Theoretically, a script does not need to be compiled. However, there are still issues pertaining to scripts such as:
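Scripts must be installed with make install, uninstalled with make uninstall and distributed with make dist.
Scripts must get the path in the #! line right.
Scripts are listed under the ‘SCRIPTS’ primitive. Automake does not clean them by default, so add the following line to your Makefile.am:
    CLEANFILES = $(bin_SCRIPTS)
You also need to write your own targets for building the script by hand. For example: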
hello1.sh:

    # -* bash *-
    echo "Howdy, world!"
    exit 0

hello2.pl:

    # -* perl *-
    print "Howdy, world!\n";
    exit(0);

Makefile.am:

    bin_SCRIPTS = hello1 hello2
    CLEANFILES = $(bin_SCRIPTS)
    EXTRA_DIST = hello1.sh hello2.pl

    hello1: $(srcdir)/hello1.sh
    <TAB> rm -f hello1
    <TAB> echo "#! " $(BASH) > hello1
    <TAB> cat $(srcdir)/hello1.sh >> hello1
    <TAB> chmod ugo+x hello1

    hello2: $(srcdir)/hello2.pl
    <TAB> $(PERL) -c $(srcdir)/hello2.pl
    <TAB> rm -f hello2
    <TAB> echo "#! " $(PERL) > hello2
    <TAB> cat $(srcdir)/hello2.pl >> hello2
    <TAB> chmod ugo+x hello2

configure.ac:

    AC_INIT
    AM_INIT_AUTOMAKE(hello,0.1)
    AC_PATH_PROGS(BASH, bash sh)
    AC_PATH_PROGS(PERL, perl perl5.004 perl5.003 perl5.002 perl5.001 perl5)
    AC_OUTPUT(Makefile)
Note that the script sources do not start with the usual interpreter lines:
    #!/bin/bash
    #!/usr/bin/perl
Instead we let Autoconf pick up the correct path, and then we insert it
during make
. Since we omit the #!
line, we leave a comment
instead that indicates what kind of file this is.
In the special case of perl
we also invoke
perl -c hello2.pl
This checks the perl script for correct syntax. If your scripting language
supports this feature I suggest that you use it to catch errors at
“compile” time.
The AC_PATH_PROGS
macro looks for a specific utility and returns
the full path.
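The search that AC_PATH_PROGS performs can be pictured with the following C sketch: each candidate name is tried in turn against every directory of a colon-separated path list, and the first executable match wins. This is only an illustration of the search order; the real macro expands to shell code in configure, and the function name find_program is made up here. Empty path elements are not given any special meaning in this sketch.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>   /* access(), X_OK: POSIX, not ISO C */

/* Search the colon-separated directory list PATH for the first name in
   the NULL-terminated array NAMES that is executable.  On success, write
   its full path to OUT and return 0; otherwise return -1. */
int find_program(const char *const names[], const char *path,
                 char *out, size_t size)
{
    size_t k;

    for (k = 0; names[k] != NULL; k++) {
        const char *dir = path;

        while (*dir != '\0') {
            size_t len = strcspn(dir, ":");
            int n = snprintf(out, size, "%.*s/%s", (int)len, dir, names[k]);

            if (n > 0 && (size_t)n < size && access(out, X_OK) == 0)
                return 0;
            dir += len;
            if (*dir == ':')
                dir++;
        }
    }
    return -1;
}
```

Listing several names, as in the BASH and PERL checks above, simply means that a later name is used only when no earlier one is found anywhere on the path.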
If you wish to conform to the GNU coding standards, you may want your script
to support the --help
and --version
flags, and you may want
--version
to pick up the version number from
AM_INIT_AUTOMAKE
.
Here are enhanced hello world scripts:

version.sh.in:

    VERSION=@VERSION@

version.pl.in:

    $VERSION="@VERSION@";

hello1.sh:

    # -* bash *-
    function usage
    {
    cat << EOF
    Usage:
      % hello [OPTION]

    Options:
      --help     Print this message
      --version  Print version information

    Bug reports to: monica@whitehouse.gov
    EOF
    }

    function version
    {
    cat << EOF
    hello $VERSION - The friendly hello world program
    Copyright (C) 1997 Monica Lewinsky <monica@whitehouse.gov>
    This is free software, and you are welcome to redistribute it and modify it
    under certain conditions. There is ABSOLUTELY NO WARRANTY for this software.
    For legal details see the GNU General Public License.
    EOF
    }

    function invalid
    {
      echo "Invalid usage. For help:"
      echo "% hello --help"
    }

    # -------------------------
    if test $# -ne 0
    then
      case $1 in
      --help)
        usage
        exit
        ;;
      --version)
        version
        exit
        ;;
      *)
        invalid
        exit 1
        ;;
      esac
    fi

    # ------------------------
    echo "Howdy world"
    exit

hello2.pl:

    # -* perl *-
    sub usage
    {
      # The @ is escaped so that Perl does not interpolate an array.
      print <<"EOF";
    Usage:
      % hello [OPTION]

    Options:
      --help     Print this message
      --version  Print version information

    Bug reports to: monica\@whitehouse.gov
    EOF
      exit(0);
    }

    sub version
    {
      print <<"EOF";
    hello $VERSION - The friendly hello world program
    Copyright (C) 1997 Monica Lewinsky <monica\@whitehouse.gov>
    This is free software, and you are welcome to redistribute it and modify it
    under certain conditions. There is ABSOLUTELY NO WARRANTY for this software.
    For legal details see the GNU General Public License.
    EOF
      exit(0);
    }

    sub invalid
    {
      print "Invalid usage. For help:\n";
      print "% hello --help\n";
      exit(1);
    }

    # --------------------------
    if ($#ARGV == 0)
    {
      version() if ($ARGV[0] eq "--version");
      usage() if ($ARGV[0] eq "--help");
      invalid();
    }

    # --------------------------
    print "Howdy world\n";
    exit(0);

Makefile.am (version.sh and version.pl are generated by configure, so they live in the build directory):

    bin_SCRIPTS = hello1 hello2
    CLEANFILES = $(bin_SCRIPTS)
    EXTRA_DIST = hello1.sh hello2.pl

    hello1: $(srcdir)/hello1.sh version.sh
    <TAB> rm -f hello1
    <TAB> echo "#! " $(BASH) > hello1
    <TAB> cat version.sh $(srcdir)/hello1.sh >> hello1
    <TAB> chmod ugo+x hello1

    hello2: $(srcdir)/hello2.pl version.pl
    <TAB> $(PERL) -c $(srcdir)/hello2.pl
    <TAB> rm -f hello2
    <TAB> echo "#! " $(PERL) > hello2
    <TAB> cat version.pl $(srcdir)/hello2.pl >> hello2
    <TAB> chmod ugo+x hello2

configure.ac:

    AC_INIT(hello,0.1)
    AM_INIT_AUTOMAKE
    AC_PATH_PROGS(BASH, bash sh)
    AC_PATH_PROGS(PERL, perl perl5.004 perl5.003 perl5.002 perl5.001 perl5)
    AC_CONFIG_FILES([Makefile version.sh version.pl])
    AC_OUTPUT
Basically the idea with this approach is that when configure
calls
AC_OUTPUT
it will substitute the files version.sh
and
version.pl
with the correct version information. Then, during
building, the version files are merged with the scripts. The scripts
themselves need some standard boilerplate code to handle the options.
I've included that code here as a sample implementation, which I hereby
place in the public domain.
This approach can be easily generalized with other scripting languages as well, like Python and Guile.
If your package requires you to edit a certain type of files, you might want to write an Emacs editing mode for it. Emacs modes are written in Emacs LISP, and Emacs LISP source code is written in files that are suffixed with ‘*.el’. Automake can byte-compile and install Emacs LISP files using Emacs for you.
To handle Emacs LISP, you need to invoke the
AM_PATH_LISPDIR
macro in your configure.ac. In the directory containing the Emacs LISP files, you must add the following line in your Makefile.am:
lisp_LISP = file1.el file2.el ...
where ‘$(lispdir)’ is initialized by ‘AM_PATH_LISPDIR’. The ‘LISP’ primitive also accepts the ‘noinst’ location.
Most Emacs LISP files are meant to be simply compiled and installed. Then the user is supposed to add certain invocations in their .emacs to use the features that they provide. However, because Emacs LISP is a full programming language you might like to write full programs in Emacs LISP, just like you would in any other language, and have these programs be accessible from the shell. If the installed file is called foo.el and it defines a function main as an entry point, then you can run it with:
% emacs --batch -l foo -f main
In that case, it may be useful to install a wrapper shell script containing
    #!/bin/sh
    emacs --batch -l foo -f main
so that the user has a more natural interface to the program. For more details on handling shell scripts, see Scripts with Automake. Note that it's not necessary for the wrapper program to be a shell script. You can have it be a C program, if it should be written in C for some reason.
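For the C variant, a wrapper could look roughly like the sketch below. It builds the same command line as the shell script and replaces the current process with Emacs through the POSIX execvp call. The function names are made up for this example, and error handling is kept to a minimum.

```c
#include <stdio.h>
#include <unistd.h>   /* execvp(): POSIX */

/* Fill ARGV (room for at least 7 entries) with the command line
   "emacs --batch -l LIBRARY -f FUNCTION". */
void build_emacs_argv(char *argv[], char *library, char *function)
{
    argv[0] = "emacs";
    argv[1] = "--batch";
    argv[2] = "-l";
    argv[3] = library;
    argv[4] = "-f";
    argv[5] = function;
    argv[6] = NULL;
}

/* Replace the current process with Emacs running FUNCTION from LIBRARY.
   Returns only if Emacs could not be started. */
int run_emacs_program(char *library, char *function)
{
    char *argv[7];

    build_emacs_argv(argv, library, function);
    execvp(argv[0], argv);
    perror("emacs");
    return 1;
}
```

The main function of such a wrapper would then consist of a single statement that returns run_emacs_program("foo", "main").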
Here's a tutorial example of how that's done. Start by creating a directory:
    % mkdir hello-0.1
    % cd hello-0.1
Then create the following files:
configure.ac:

    AC_INIT
    AM_INIT_AUTOMAKE(hello,0.1)
    AM_PATH_LISPDIR
    AC_OUTPUT(Makefile)

hello.el:

    (defun main ()
      "Hello world program"
      (princ "Hello world\n"))

hello.sh:

    #!/bin/sh
    emacs --batch -l hello.el -f main
    exit

Makefile.am:

    lisp_LISP = hello.el
    EXTRA_DIST = hello.el hello.sh
    bin_SCRIPTS = hello
    CLEANFILES = $(bin_SCRIPTS)

    hello: $(srcdir)/hello.sh
    <TAB> rm -f hello
    <TAB> cp $(srcdir)/hello.sh hello
    <TAB> chmod ugo+x hello

Then bootstrap, build and test the package:

    % touch NEWS README AUTHORS ChangeLog
    % aclocal
    % autoconf
    % automake -a
    % ./configure
    % make
    % make distcheck
    # make install
FIXME: Discussion
FIXME: Do you want to volunteer for this section?
To install data files, you should use the ‘DATA’ primitive instead of ‘SCRIPTS’. The main difference is that ‘DATA’ will allow you to install files in data installation locations, whereas ‘SCRIPTS’ will only allow you to install files in executable installation locations.
Normally it is assumed that the files listed in ‘DATA’ are written by you and are not generated by a program, therefore they are not cleaned by default. If you want your data to be generated by a program, you must provide a target for building the data, and you must also mention the data file in ‘CLEANFILES’ so that it's cleaned when you type ‘make clean’. You should of course include the source for the program and the appropriate lines in Makefile.am for building the program. For example:
    noinst_PROGRAMS = mkdata
    mkdata_SOURCES = mkdata.cc
    pkgdata_DATA = thedata
    CLEANFILES = $(pkgdata_DATA)

    thedata: mkdata
    <TAB> ./mkdata > thedata

Note that because the data generation program is a one-time-use program, we don't want to install it, so we list it in ‘noinst_*’.
If your data files are written by hand, then all you need to do is list them in the ‘DATA’ assignment:
pkgdata_DATA = foo1.dat foo2.dat foo3.dat
In general, you should install data files in ‘pkgdata’. However, if your data files are configuration files or files that the program modifies as it runs, they should be installed in other directories. For more details, see Installation standard directories.
FIXME: Needs to be written
At the moment Autotoolset distributes the following additional utilities:
LF
macros which introduce mainly support for C++, Fortran and
embedded text.
LF
macros and the ‘acmkdir’
utility but we will postpone our discussion of Fortran support until the
next chapter.
LF macros

In the last chapter we explained that a minimal configure.in file looks like this:

    AC_INIT(package,version)
    AM_CONFIG_HEADER(config.h)
    AM_INIT_AUTOMAKE
    AC_PROG_CXX
    AC_PROG_RANLIB
    AC_CONFIG_FILES([Makefile ...])
    AC_OUTPUT
If you are not building libraries, you can omit AC_PROG_RANLIB
.
Alternatively you can use the following macros that are distributed with Autotoolset, and made accessible through the ‘aclocal’ utility. All of them are prefixed with ‘LF’ to distinguish them from the standard macros:

LF_CONFIGURE_CC
    This macro invokes
        AC_PROG_CC
        AC_PROG_CPP
        AC_AIX
        AC_ISC_POSIX
        AC_MINIX
        AC_HEADER_STDC
    which is a traditional Autoconf idiom for setting up the C compiler.
LF_CONFIGURE_CXX
    This macro invokes
        AC_PROG_CXX
        AC_PROG_CXXCPP
    and then invokes the portability macro:
        LF_CPP_PORTABILITY
    This is the recommended way for configuring your C++ compiler.
#include <config.h>
In the past it used to be necessary to have to include a file called
cpp.h. I've sent this file straight to hell.
LF_SET_WARNINGS
    This macro lets the installer turn on compiler warnings at configure time:
        $ configure ... --with-warnings ...
    Warnings can help you find many bugs, as well as help you improve your coding habits. On the other hand, many of these warnings are false alarms, which is why the default behaviour of the compiler is to not show them to you. You are probably interested in warnings if you are the developer, or a paranoid end-user.
    AC_INIT(package,version)
    AM_CONFIG_HEADER(config.h)
    AM_INIT_AUTOMAKE
    LF_CONFIGURE_CXX
    AC_PROG_RANLIB
    AC_CONFIG_FILES([Makefile ...])
    AC_OUTPUT
A full-blown configure.in file for projects that mix Fortran and C++ (and may need the C compiler also if using ‘f2c’) invokes all of the above macros:
    AC_INIT(package,version)
    AM_INIT_AUTOMAKE
    LF_CANONICAL_HOST
    LF_CONFIGURE_CC
    LF_CONFIGURE_CXX
    LF_CONFIGURE_FORTRAN
    LF_SET_WARNINGS
    AC_PROG_RANLIB
    AC_CONFIG_SUBDIRS(fortran/f2c fortran/libf2c)
    AC_CONFIG_FILES([Makefile ...])
    AC_OUTPUT
In order for LF_CPP_PORTABILITY
to work correctly you need to append
certain things at the bottom of your acconfig.h. This is done for you
automatically by acmkdir
.
When the LF_CPP_PORTABILITY
macro is invoked by configure.in
then the following portability problems are checked:
CXX_HAS_NO_BOOL
. It is possible to emulate bool
with the
following C preprocessor directives:
    #ifdef CXX_HAS_NO_BOOL
    #define bool int
    #define true 1
    #define false 0
    #endif
To make your code portable to compilers that don't support
bool, through this workaround, you must follow one rule: never
overload your functions in a way in which the only distinguishing feature is
bool
vs int
.
This workaround is included in the default acconfig.h after
@BOTTOM@
that gets installed by acmkdir
.
    #include <iostream>

    int main()
    {
      for (int i=0;i<10;i++) { }
      for (int i=0;i<10;i++) { }
    }
This is legal C++ and the variable i
is supposed to have scope only
inside the for-loop braces and the parentheses. Unfortunately, most C++
compilers use an obsolete version of the standard's draft in which the
scope of i
is the entire main
in this example.
The workaround we use is as follows:
    #ifdef CXX_HAS_BUGGY_FOR_LOOPS
    #define for if(1) for
    #endif
By nesting the for loop inside an if-statement, the variable i is assigned the correct scope. Now if your if-statement scoping is also broken then you really need to get another compiler.
The macro CXX_HAS_BUGGY_FOR_LOOPS
is defined for you if appropriate,
and the code for the work-around is included with the
default acconfig.h
.
In addition to these workarounds, the following additional features are
introduced at the end of the default acconfig.h
. The features are
enabled only if your configure.in calls LF_CPP_PORTABILITY
.
loop is defined such that
    loop(i,a,b)
is equivalent to
    for (int i = a; i <= b; i++)
This is syntactic sugar that makes it easier on the hand to write nested loops like:
    int Ni,Nj,Nk;
    loop(i,0,Ni)
    loop(j,0,Nj)
    loop(k,0,Nk)
    {
      ...
    }
minimizing the probability of making a spelling bug. If you need to do more unusual looping you can use one of the following macros:
    inverse_loop(i,a,b)   <-->  for (int i = a; i >= b; i--)
    integer_loop(i,a,b,s) <-->  for (int i = a; i <= b; i += s)
This feature depends on having correct scoping in for which fortunately is easily taken care of.
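To make the equivalences concrete, here is a self-contained sketch of the three macros together with small functions that use them. The macro bodies are written from the equivalences stated above and are not copied from the acconfig.h that Autotoolset installs; the sketch is in C99, where a variable declared in the for statement has the same scope as in C++. Note that both bounds are inclusive.

```c
/* Sketches of the three loop macros; the bounds are inclusive. */
#define loop(i, a, b)            for (int i = (a); i <= (b); i++)
#define inverse_loop(i, a, b)    for (int i = (a); i >= (b); i--)
#define integer_loop(i, a, b, s) for (int i = (a); i <= (b); i += (s))

/* Sum of row*col over an inclusive grid, using nested loop() calls. */
int grid_sum(int rows, int cols)
{
    int total = 0;

    loop(i, 1, rows)
        loop(j, 1, cols)
            total += i * j;
    return total;
}

/* Sum of the even numbers from 0 through LIMIT. */
int even_sum(int limit)
{
    int total = 0;

    integer_loop(i, 0, limit, 2)
        total += i;
    return total;
}

/* Count down from FROM to 1 and return how many steps were taken. */
int countdown_steps(int from)
{
    int steps = 0;

    inverse_loop(i, from, 1)
        steps++;
    return steps;
}
```

Because the upper bound is included, loop(i,0,N) runs N+1 times; use loop(i,0,N-1) or loop(i,1,N) when you want exactly N iterations.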
    #define pub public:
    #define pro protected:
    #define pri private:
Now you can declare a class prototype in a Java-like style like this:
    class foo
    {
      pri double a,b;
      pub double c,d;

      pub foo();
      pub virtual ~foo();

      pri void method1(void);
      pub void method2(void);
    };
Personally I find this notation more lucid than the standard C++ syntax because this way I can see the protection level of each variable and method without having to possibly scroll up to see what it is. Also, it is less bug-prone this way.
const double pi = 3.14159265358979324;
The idea behind assert is simple. Suppose that at a certain point in your code, you expect two variables to be equal. If this expectation is a precondition that must be satisfied in order for the subsequent code to execute correctly, you must assert it with a statement like this:
    assert(var1 == var2);
In general assert takes as argument a boolean expression. If the boolean expression is true, execution continues. Otherwise the ‘abort’ library function is called and the program execution is stopped. If a bug prevents the precondition from being true, then you can trace the bug at the point where the precondition breaks down instead of further down in execution or not at all. The ‘assert’ call is implemented as a C preprocessor macro, so it can be enabled or disabled at will.
One way to enable assertions is to include assert.h.
#include <assert.h>
Then it's possible to disable them by defining the ‘NDEBUG’ macro. Alternatively, because it is easy to provide our own assert, if your configure.in invokes ‘LF_CPP_PORTABILITY’ then ‘assert’ will be conditionally defined for you in the config.h file. By default, the ‘configure’ script will enable assertions. You can disable assertions at configure-time like this:
% configure ... --disable-assert ...
During debugging and testing it is a good idea to leave assertions enabled. However, for production runs it's best to disable them.
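As a small illustration of asserting a precondition, consider the sketch below. It uses the standard assert from assert.h rather than the one provided through config.h, and the function average is made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Average of N values.  The precondition n > 0 is asserted, so a caller
   bug stops the program here instead of causing a division by zero. */
double average(const double *values, size_t n)
{
    double sum = 0.0;
    size_t i;

    assert(values != NULL);
    assert(n > 0);

    for (i = 0; i < n; i++)
        sum += values[i];
    return sum / (double)n;
}
```

With the standard header, compiling with the ‘NDEBUG’ macro defined removes both checks, so the production build pays nothing for them.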
If your program crashes at an assertion, then the first thing you should do is to find out where the error happens. To do this, run the program under the gdb debugger. First invoke the debugger:
    % gdb
    ...copyright notice...
Then load the executable and set a breakpoint at the ‘abort’ function:
    (gdb) file "executable"
    (gdb) break abort
Now run the program:
(gdb) run
Instead of crashing, under the debugger the program will be paused when the ‘abort’ function is called, and you will get back the debugger prompt. Now type:
(gdb) where
to see where the crash happened. You can use the ‘print’ command to look at the contents of variables and you can use the ‘up’ and ‘down’ commands to navigate the stack. For more information, see the GDB documentation or type ‘help’ at the prompt of gdb.
Another suggestion is to never call the abort function directly. Instead, please do this:
    assert(false);
    exit(1);
This way if assertions are enabled, the program will stop and the stack will be retained. Otherwise the program will simply exit.
The C++ language has been standardized very recently. As a result, not all
compilers fully support all the features that the ANSI C++ standard requires,
including the g++
compiler itself. Some of the problems commonly
encountered, such as incorrect scoping in for-loops and lack of the
bool
data type can be easily worked around. In this section we
give some tips for avoiding more portability problems. I welcome people on
the net reading this to email me their tips, to be included in this
tutorial.
    int n = 10;
    double **foo;
    foo = new (double *)[n];
The g++ compiler will parse this and do the right thing, but other compilers are more picky. The correct way to do it is:
    int n = 10;
    double **foo;
    foo = new double * [n];
g++
.
FIXME: I need to add some stuff here.
Putting all of this together, we will now show you how to create a super
Hello World package, using the LF
macros and the utilities that
are distributed with the ‘autotoolset’ distribution.
The first step is to build a directory tree for the new project. Instead of doing it by hand, use the ‘acmkdir’ utility. Type:
% acmkdir hello
‘acmkdir’ prompts you with the current directory pathname. Make sure that this is indeed the directory where you want to install the directory tree for the new package. You will be prompted for some information about the newly created package. When you are done, ‘acmkdir’ will ask you if you really want to go for it. Say ‘y’. Then ‘acmkdir’ will do the following:
    AC_INIT
    AM_CONFIG_HEADER(config.h)
    AM_INIT_AUTOMAKE(test,0.1)
    LF_HOST_TYPE
    LF_CONFIGURE_CXX
    LF_SET_WARNINGS
    AC_PROG_RANLIB
    AC_OUTPUT(Makefile doc/Makefile m4/Makefile src/Makefile)
You can edit this and customize it to your needs. More specifically, you will need to update the version number here every time you cut a new distribution.
    EXTRA_DIST = reconf configure
    SUBDIRS = m4 doc src
The ones in the src
and doc
subdirectories are empty. The
one in m4 contains a template Makefile.am which you should
edit if you want to add new macros.
    #!/bin/sh
    rm -f config.cache
    rm -f acconfig.h
    aclocal -I m4
    autoconf
    acconfig
    autoheader
    automake -a
    exit
This makes sure that all the utilities are invoked, and in the right order. Before ‘acmkdir’ exits, it will call the ‘reconf’ script for you once to set things up.
Now enter the directory hello-0.1/src and start coding:
    % cd hello-0.1/src
    % gpl -cc hello.cc
    % vi hello.cc
    % vi Makefile.am
This time we will use the following modified hello world program:
    #ifdef HAVE_CONFIG_H
    #include <config.h>
    #endif
    #include <iostream>

    int main()
    {
      std::cout << "Welcome to " << PACKAGE << " version " << VERSION;
      std::cout << " for " << YOUR_OS << std::endl;
      std::cout << "Hello World!" << std::endl;
    }
and for Makefile.am the same old thing:
    bin_PROGRAMS = hello
    hello_SOURCES = hello.cc
Now back to the toplevel directory:
    % cd ..
    % reconf
    % configure
    % make
    % src/hello
    Welcome to test version 0.1 for i486-pc-linux-gnulibc1
    Hello World!
Note that by using the special macros PACKAGE
, VERSION
,
YOUR_OS
the program can identify itself, its version number and the
operating system for which it was compiled. The PACKAGE
and
VERSION
are defined by AM_INIT_AUTOMAKE
and
YOUR_OS
by LF_HOST_TYPE
.
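The same identification banner can be produced from C. The sketch below supplies fallback definitions so that it compiles on its own; in a configured package the three macros come from config.h, and the fallback values shown here are made up for the illustration. The function name format_banner is likewise hypothetical.

```c
#include <stdio.h>

/* Fallbacks so that this sketch compiles on its own.  In a configured
   package these come from config.h, and the values below are made up. */
#ifndef PACKAGE
#define PACKAGE "hello"
#endif
#ifndef VERSION
#define VERSION "0.1"
#endif
#ifndef YOUR_OS
#define YOUR_OS "unknown"
#endif

/* Write the identification banner into OUT; returns the snprintf result. */
int format_banner(char *out, size_t size)
{
    return snprintf(out, size, "Welcome to %s version %s for %s",
                    PACKAGE, VERSION, YOUR_OS);
}
```

Writing the banner into a buffer instead of printing it directly makes it easy to reuse the same text for a --version option or a log file header.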
Now you can experiment with the various options that configure offers. You can do:
% make distclean
and reconfigure the package with one of the following variations in options:
    % configure --disable-assert
    % configure --with-warnings
or a combination of the above. You can also build a distribution of your hello world and feel cool about yourself:
% make distcheck
The important thing is that you can write extensive programs like this and stay focused on writing code instead of maintaining stupid header files, scripts, makefiles and all that.
The ‘acmkdir’ utility can be invoked in the simple manner that we showed in the last chapter to prepare the directory tree for writing C++ code. Alternatively, it can be instructed to create directory trees for Fortran/C++ code as well as documentation directories.
In general, you invoke ‘acmkdir’ in the following manner:
% acmkdir [OPTIONS] "dirname"
If you are creating a toplevel directory, then everything will appear under ‘dirname-0.1’. Otherwise, the name ‘dirname’ will be used instead.
‘acmkdir’ supports the following options:
-latex
    Create a latex documentation directory (see Writing documentation with LaTeX).
If your package will have more than one documentation text, you usually want to invoke this under the ‘doc’ subdirectory:
    % cd doc
    % acmkdir -latex tutorial
    % acmkdir -latex manual
Of course, the Makefile.am under the doc directory will need
to refer to these subdirectories with a SUBDIRS
entry:
SUBDIRS = tutorial manual
Alternatively, if you decide to use the doc directory itself for documentation (and you are massively sure about this), then you can
    % rm -rf doc
    % acmkdir -latex doc
You should use this feature if you wish to typeset your documentation
using LaTeX instead of Texinfo.
The disadvantage of using ‘latex’ for your documentation is that you can only produce a printed book; you cannot also generate on-line documentation. The advantage is that you can typeset very complex mathematics, something which you cannot do under Texinfo since it only uses plain TeX. If you are documenting mathematical software, you may prefer to write the documentation in LaTeX. Autotoolset will provide you with LaTeX macros to make your printed documentation look like Texinfo printed documentation.
TYPE
.
The types available are: default
, traditional
,
fortran
. Eventually I may implement two additional types:
f77
, f90
.
Now, a brief description of these toplevel types:
LF
macros installed by Autotoolset.
The acconfig.h file is automagically generated and a custom
INSTALL file is installed. The defaults reflect my own personal
habits.
#undef PACKAGE #undef VERSION
which are required by Automake.
f2c
translator. The software is configured such that if a Fortran
compiler is not available, f2c
is built instead, and then used
to compile the Fortran code. We will explain all about Fortran in the
next chapter.
In some cases, we want to embed text in the executable file of an application. This may be on-line help pages, or it may be a script of some sort that we intend to execute with an interpreter library that we are linking with, like Guile or Tcl. Whatever the reason, if we want to compile the application as a stand-alone executable, it is necessary to embed the text in the source code. Autotoolset provides the build tools necessary to do this painlessly.
As a tutorial example, we will write a simple program that prints the contents of the GNU General Public License. First create the directory tree for the program:
% acmkdir copyleft
Enter the directory and create a copy of the txtc
compiler:
    % cd copyleft-0.1
    % mktxtc
Then edit the file configure.in and add a call to the
LF_PROG_TXTC
macro. This macro depends on
AC_PROG_CC AC_PROG_AWK
so make sure that these are invoked also. Finally add txtc.sh to
your AC_OUTPUT
.
The end-result should look like this:
    AC_INIT(reconf)
    AM_CONFIG_HEADER(config.h)
    AM_INIT_AUTOMAKE(copyleft,0.1)
    LF_HOST_TYPE
    LF_CONFIGURE_CC
    LF_CONFIGURE_CXX
    LF_SET_OPTIMIZATION
    LF_SET_WARNINGS
    AC_PROG_RANLIB
    AC_PROG_AWK
    LF_PROG_TXTC
    AC_OUTPUT(Makefile txtc.sh doc/Makefile m4/Makefile src/Makefile)
Then, enter the src directory and create the following files:
    % cd src
    % gpl -l gpl.txt
    % gpl -cc gpl.h
    % gpl -cc copyleft.cc
The gpl.txt file is the text that we want to print. You can substitute
it with any text you want. This file will be compiled into gpl.o
during the build process. The gpl.h file is a header file that gives
access to the symbols defined by gpl.o. The file copyleft.cc
is where the main
will be written.
Next, add content to these files as follows:
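gpl.h:

    extern int gpl_txt_length;
    extern char *gpl_txt[];

copyleft.cc:

    #ifdef HAVE_CONFIG_H
    #include <config.h>
    #endif
    #include <iostream>
    #include "gpl.h"

    int main()
    {
      // loop() comes from config.h; lines are stored at indices 1..length.
      loop(i,1,gpl_txt_length) { std::cout << gpl_txt[i] << std::endl; }
    }

Makefile.am:

    SUFFIXES = .txt
    .txt.o:
    <TAB> $(TXTC) $<
    bin_PROGRAMS = copyleft
    copyleft_SOURCES = copyleft.cc gpl.h gpl.txt

Then return to the toplevel directory, build the program and run it:

    $ cd ..
    $ reconf
    $ configure
    $ make
    $ src/copyleft | less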
To verify that this works properly, do the following:
    $ cd src
    $ copyleft > copyleft.out
    $ diff gpl.txt copyleft.out
The two files should be identical. Finally, convince yourself that you can make a distribution:
$ make distcheck
and there you are.
Note that in general the text file, as encoded by the text compiler, will not always be identical to the original. Exactly one modification is made: if a line has blank spaces at the end, they are trimmed off. This feature was introduced to deal with a bug in the Tcl interpreter, and it is in general a good idea: it conserves a few bytes, it never hurts, and additional whitespace at the end of a line shouldn't really be there.
This magic is put together from many different directions. It begins with
the LF_PROG_TXTC
macro:
This macro will define the variable TXTC to point to a Text-to-C compiler. To create a copy of the compiler at the toplevel directory of your source code, use the mktxtc command:
    % mktxtc
The compiler is implemented as a shell script, and it depends on sed, awk and the C compiler, so you should call the following two macros before invoking LF_PROG_TXTC:
    AC_PROG_CC
    AC_PROG_AWK
The compiler is intended to be used as follows:
    $(TXTC) text1.txt text2.txt text3.txt ...
such that given the files text1.txt, text2.txt, etc., object files text1.o, text2.o, etc. are generated that contain the text from these files.
From the Automake point of view, you need to add the following lines to your Makefile.am:
    SUFFIXES = .txt
    .txt.o:
    <TAB> $(TXTC) $<
assuming that your text files will end in the .txt
suffix. The first
line informs Automake that there exist source files using non-standard
suffixes. Then we describe, in terms of an abstract Makefile rule, how to
build an object file from these non-standard suffixes. Recall the use of
the symbol $<
. Also note that it is not necessary
to use $(srcdir)
on $<
for VPATH builds.
If you embed more than one type of file, then you may want to use more than one suffix. For example, you may have .hlp files containing online help and .scm files containing Guile code. Then you want to write a rule for each suffix as follows:
    SUFFIXES = .hlp .scm
    .hlp.o:
    <TAB> $(TXTC) $<
    .scm.o:
    <TAB> $(TXTC) $<
It is important to put these lines before mentioning any SOURCES assignments. Automake is smart enough to parse these abstract makefile rules and recognize that files ending in these suffixes are valid source code that can be built to object code. This allows you to simply list gpl.txt with the other source files in the SOURCES assignment:
copyleft_SOURCES = copyleft.cc gpl.h gpl.txt
In order for this to work however, Automake must be able to see your abstract rules first.
When you “compile” a text file foo.txt this makes an object file that defines the following two symbols:
int foo_txt_length;
char *foo_txt[];
Note that the dot characters are converted into underscores. To make these symbols accessible, you need to define an appropriate header file with the following general form:
extern int foo_txt_length;
extern char *foo_txt[];
When you include this header file into your other C or C++ files, then:
You can refer to
foo_txt[0];
and use it to print diagnostic messages.
The text itself is available line by line:
char *foo_txt[1]; -> first line
char *foo_txt[2]; -> second line
...
The symbol foo_txt_length is defined such that
char *foo_txt[foo_txt_length+1] == NULL
The last line of the text is:
char *foo_txt[foo_txt_length];
You can use a for loop (or the loop macro defined by LF_CPP_PORTABILITY) together with foo_txt_length to loop over the entire text, or you can exploit the fact that the entry after the last line is NULL and do a while loop.
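Both ways of walking over the text can be sketched as follows. The array defined here is only a stand-in for what the text compiler would generate from a file foo.txt; its contents are made up for illustration, and the assumption that entry 0 holds the file name is ours, not something the text above states. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the generated symbols.  Entry 0 is assumed to hold the
   file name; the text occupies entries 1..foo_txt_length, followed by NULL. */
int foo_txt_length = 3;
char *foo_txt[] = {
    "foo.txt",
    "first line",
    "second line",
    "third line",
    NULL
};

/* Count the characters of the text with a for loop over foo_txt_length. */
size_t count_chars_for(void)
{
    size_t total = 0;
    int i;

    assert(foo_txt[foo_txt_length + 1] == NULL);
    for (i = 1; i <= foo_txt_length; i++)
        total += strlen(foo_txt[i]);
    return total;
}

/* Same traversal with a while loop that stops at the NULL sentinel. */
size_t count_chars_while(void)
{
    size_t total = 0;
    char **line = &foo_txt[1];

    while (*line != NULL)
        total += strlen(*line++);
    return total;
}
```

The while form does not need foo_txt_length at all, which is convenient when only the array is passed to a helper function.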
When making a package, you can organize it as a flat package or a deep package. In a flat package, all the source files are placed under src without any subdirectory structure. In a deep package, libraries and groups of executables are separated by a subdirectory structure. The perennial problem with deep packages is dealing with interdirectory dependencies. What do you do if to compile one library you need header files from another library in another directory? What do you do if to compile the test suite of your library you need to link in another library that has just been compiled in a different directory?
One approach is to just put all these interdependent things in the same directory. This is not unreasonable, since the Makefile.am can document quite thoroughly where each file belongs, in case you need to split them up in the future. On the other hand, this solution becomes less and less preferable as your project grows. You may not want to clutter a directory with source code for too many different things. What do you do then?
The second approach is to be careful about these dependencies and just invoke the necessary features of Automake to make everything work out.
For *.a files (library binaries), the recommended thing to do is to link them by giving the full relative pathname. Doing that allows Automake to work out the dependencies correctly across multiple directories. It also allows you to easily upgrade to shared libraries with Libtool. To retain some flexibility it may be best to list these interdirectory link sequences in variables and then use these variables. This way, when you move things around you minimize the amount of editing you have to do. In fact, if all you need these library binaries for is to build a test suite, you can simply assign them to LDFLAGS. To make these assignments more uniform, you may want to start your pathnames with $(top_builddir).
For *.h files (header files), you can include an
INCLUDES = -I../dir1 -I../dir2 -I../dir3 ...
assignment in every Makefile.am of every directory level, listing the directories that contain include files that you want to use. If your directory tree is very complicated, you may want to make these assignments more uniform by starting your pathnames from $(top_srcdir). In your source code, you should use the syntax
#include "foo.h"
for include files in the current directory and
#include <foo.h>
for include files in other directories.
There is a better third approach, provided by Autotoolset, but it only applies to include files. There is nothing more that can be done with library binaries; you simply have to give the path. But with header files, it is possible to arrange at configure-time that all header files are symlinked under the directory $(top_builddir)/include. Then you will only need to list one directory instead of many.
Autotoolset provides two Autoconf macros, LF_LINK_HEADERS and LF_SET_INCLUDES, to handle this symlinking.
LF_LINK_HEADERS(src/dir1 src/dir2 src/dir3 ... src/dirN)
When this macro is invoked for the first time, the directory $(top_srcdir)/include is erased. Then for each directory src/dirK listed, we look for the file src/dirK/Headers and link the public header files mentioned in that file under $(top_srcdir)/include. The link will be either symbolic or hard, depending on the capabilities of your operating system. If possible, a symbolic link will be preferred.
You can invoke the same macro by passing an optional argument that specifies a directory name. For example:
LF_LINK_HEADERS(src/dir1 src/dir2 ... src/dirN , foo)
Then the symlinks will be created under the $(top_srcdir)/include/foo directory instead. This can be very useful if you have many header files to install and you'd like to include them as something like:
#include <foo/file1.h>
The LF_SET_INCLUDES macro arranges for the variable $(default_includes) to contain the correct collection of -I flags, such that the include files are accessible during compilation. If you invoke it with no arguments as
LF_SET_INCLUDES
then the following assignment will take place:
default_includes = -I$(prefix) -I$(top_srcdir)/include
If you invoke it with arguments:
LF_SET_INCLUDES(dir1 dir2 ... dirN)
then the following assignment will take place instead:
default_includes = -I$(prefix) -I$(top_srcdir)/include/dir1 \
                   -I$(top_srcdir)/include/dir2 ... \
                   -I$(top_srcdir)/include/dirN
You may use this variable as part of your INCLUDES assignment in your Makefile.am like this:
INCLUDES = $(default_includes)
If your distribution has a lib directory, in which you install various codelets and header files, then a path to that directory is added to default_includes also. In that case, you have one of the following:
default_includes = -I$(prefix) -I$(top_srcdir)/lib -I$(top_srcdir)/include
or
default_includes = -I$(prefix) -I$(top_srcdir)/lib \
                   -I$(top_srcdir)/include/dir1 ... \
                   -I$(top_srcdir)/include/dirN
A typical use of this system involves invoking
LF_LINK_HEADERS(src/dir1 src/dir2 ... src/dirN)
LF_SET_INCLUDES
in your configure.in and adding the following two lines in your Makefile.am:
INCLUDES = $(default_includes)
EXTRA_DIST = Headers
The variable $(default_includes) will be assigned by the configure script to point to the Right Thing. You will also need to include a file called Headers in every directory level that you mention in LF_LINK_HEADERS, containing the public header files that you wish to symlink. The filenames need to be separated by newlines in the Headers file. You also need to mention these public header files in an
include_HEADERS = foo1.h foo2.h ...
assignment, in your Makefile.am, to make sure that they are installed.
With this usage, other programs can access the installed header files as:
#include <foo1.h>
Other directories within the same package can access the header files, even though they are not yet installed, in exactly the same manner. Finally, in the same directory you should access the header files as
#include "foo1.h"
This forces the header file in the current directory to be used, even when there is a similar header file already installed. This is very important when you are rebuilding a new version of an already installed library. Otherwise, the build might get confused if your code tries to include the already installed, and out-of-date, header files from the older version.
Alternatively, you can categorize the header files under a directory, by invoking
LF_LINK_HEADERS(src/dir1 src/dir2 , name1)
LF_LINK_HEADERS(src/dir3 src/dir4 , name2)
LF_SET_INCLUDES(name1 name2)
in your configure.in. In your Makefile.am files you still add the same two lines:
INCLUDES = $(default_includes)
EXTRA_DIST = Headers
and maintain the Headers file as before. However, now the header files will be symlinked to subdirectories of $(top_srcdir)/include. This means that although uninstalled header files in all directories must be included by code in the same directory as:
#include "header.h"
code in other directories must access these uninstalled header files as
#include <name1/header.h>
if the header file is under src/dir1 or src/dir2 or as
#include <name2/header.h>
if the header file is under src/dir3 or src/dir4. It follows that you probably intend for these header files to be installed correspondingly in such a manner so that other programs can also include them the same way. To accomplish that, under src/dir1 and src/dir2 you should list the header files in your Makefile.am like this:
name1dir = $(includedir)/name1
name1_HEADERS = header.h ...
and under src/dir3 and src/dir4 like this:
name2dir = $(includedir)/name2
name2_HEADERS = header.h
One disadvantage of this approach is that the source tree is modified during configure-time, even during a VPATH build. Some may not like that, but it suits me just fine.
Unfortunately, because Automake requires the GNU compiler to compute dependencies, the header files need to be placed in a constant location with respect to the rest of the source code. If a mkdep utility were to be distributed by Automake to compute dependencies when the installer installs the software and not when the developer builds a source code distribution, then it would be possible to allow the location of the header files to be dynamic. If that development ever takes place in Automake, Autotoolset will immediately follow. If you really don't like this, then don't use this feature.
Usually, if you are installing one or two header files per library you want them to be installed under $(includedir) and be includeable with
#include <foo.h>
On the other hand, there are many applications that install a lot of header files, just for one library. In that case, you should put them under a prefix and let them be included as:
#include <prefix/foo.h>
Examples of libraries doing this are X11 and Mesa.
This mechanism for tracking include files is most useful for very large projects. You may not want to bother for simple homework-like throwaway hacks. When a project starts to grow, it is very easy to switch.
In this chapter I will discuss in extreme detail the portability issues with C++. Most of this work will be based on bzconfig, which I will eventually adapt for inclusion in Autotoolset. I don't know the structure of this chapter yet.
This chapter is devoted to Fortran. We will show you how to build programs that combine Fortran and C or C++ code in a portable manner. The main reason for wanting to do this is because there is a lot of free software written in Fortran. If you browse ‘http://www.netlib.org/’ you will find a repository of lots of old, archaic, but very reliable free sources. These programs encapsulate a lot of experience in numerical analysis research over the last couple of decades, which is crucial to getting work done. All of these sources have been written in Fortran. As a developer today, if you know other programming languages, it is unlikely that you will want to write original code in Fortran. You may need, however, to use legacy Fortran code, or the code of a neighbour who still writes in Fortran.
The most portable way to mix Fortran with your C/C++ programs is to translate the Fortran code to C with the ‘f2c’ compiler and compile everything with a C/C++ compiler. The ‘f2c’ compiler is available at ‘http://www.netlib.org/’ but as we will soon explain, it is also distributed with the ‘autotools’ package. Another alternative is to use the GNU Fortran compiler ‘g77’ with ‘g++’ and ‘gcc’. This compiler is portable among many platforms, so if you want to use a native Fortran compiler without sacrificing portability, this is one way to do it. Another way is to use your OS's native Fortran compiler, which is usually called ‘f77’, if it is compatible with ‘g77’ and ‘f2c’. Because performance is also very important in numerical codes, a good strategy is to prefer the native compiler if it is compatible, and to support ‘g77’ as a fall-back option. Because many sysadmins don't install ‘g77’, supporting ‘f2c’ as a third fall-back is also a good idea.
Autotools provides support for configuring and building source code written in part or in whole in Fortran. The implementation is based on the build system used by GNU Octave, which has been generalized for use by any program.
The traditional Hello world program in Fortran looks like this:
c....:++++++++++++++=
      PROGRAM MAIN
      PRINT*,'Hello World!'
      END
All lines that begin with ‘c’ are comments. The first line after the comment is the equivalent of main() in C. The second line says hello, and the third line indicates the end of the code. It is important that all statements begin in column 7 or later, that is, that they are indented by at least 6 spaces; otherwise the compiler will issue a syntax error. Also, if you want to be ANSI compliant, you must write your code all in caps. Nowadays most compilers don't care, but some may still do.
To compile this with ‘g77’ (or ‘f77’) you do something like:
% g77 -o hello hello.f
% hello
To compile it with the f2c translator:
% f2c hello.f
% gcc -o hello hello.c -lf2c -lm
where ‘-lf2c’ links in the translator's system library.
In order for this to work, you will have to make sure that the header file f2c.h is present, since the translated code in hello.c includes it with a statement like
#include "f2c.h"
which explicitly requires it to be present in the current working directory.
In this case, the ‘main’ is written in Fortran. However most of the Fortran you will be using will actually be subroutines and functions. A subroutine looks like this:
c....:++++++++++++++
      SUBROUTINE FHELLO (C)
      CHARACTER *(*) C
      PRINT*,'From Fortran: ',C
      RETURN
      END
This is the analog of a ‘void’ function in C, because it takes arguments but doesn't return anything. The prototype declaration is K&R style: you list all the arguments in parentheses, separated by commas, and you declare the types of the variables in the subsequent lines.
Suppose that this subroutine is saved as fhello.f. To call it from C you need to know what it looks like from the point of view of the C compiler. To find out, type:
% f2c -P fhello.f
% cat fhello.P
You will find that this subroutine has the following prototype declaration:
extern int fhello_(char *c__, ftnlen c_len);
It may come as a surprise, and this is a moment of revelation, but although in Fortran it appears that the subroutine takes one argument, in C it appears that it takes two! This is what makes it difficult to link code in a portable manner between C and Fortran. In C, everything is what it appears to be. If a function takes two arguments, then down to the machine language level there are two arguments being passed around. In Fortran, things are hidden from you and done in a magic fashion. The Fortran programmer thinks that he is passing one argument, but the compiler generates code that actually passes two. In this particular case, the reason is that the argument you are passing is a string. In Fortran, strings are not null-terminated, so the ‘f2c’ compiler passes the length of the string as an extra hidden argument. This is called the linkage method of the compiler. Unfortunately, linkage in Fortran is not standard, and there exist compilers that handle strings differently. For example, some compilers will prepend the string with a few bytes containing the length and pass a pointer to the whole thing. This problem is not limited to strings; it happens in many other instances. The ‘f2c’ and ‘g77’ compilers follow compatible linkage, and we will use this linkage as the ad-hoc standard. A few proprietary Fortran compilers like the Dec Alpha ‘f77’ and the Irix ‘f77’ are also ‘f2c’-compatible. The reason for this is that most of the compiler developers derived their code from ‘f2c’. So although a standard was not really intended, there we have one anyway.
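To make the hidden length argument concrete, here is a minimal sketch of what the receiving side of this linkage looks like when written in C. It is a stand-in, not code generated by ‘f2c’: the local typedef of ftnlen replaces the one normally supplied by f2c.h (its exact underlying type there is an assumption), and the buffer fhello_last and the wrapper call_fhello are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <string.h>

/* Assumption: f2c.h declares ftnlen as an integer type.  A local typedef
   keeps this sketch independent of that header. */
typedef long ftnlen;

/* Holds a NUL-terminated copy of the most recent argument. */
char fhello_last[64];

/* C stand-in with the prototype reported for SUBROUTINE FHELLO.  The
   string arrives as a pointer plus a length, with no terminating NUL. */
int fhello_(char *c, ftnlen c_len)
{
    size_t n;

    assert(c != NULL && c_len >= 0);
    n = (size_t)c_len;
    if (n >= sizeof fhello_last)
        n = sizeof fhello_last - 1;
    memcpy(fhello_last, c, n);
    fhello_last[n] = '\0';
    return 0;
}

/* Hypothetical wrapper that hides the extra argument from C callers. */
const char *call_fhello(const char *s)
{
    fhello_((char *)s, (ftnlen)strlen(s));
    return fhello_last;
}
```

Note that the callee never looks for a terminating NUL; it trusts the length it is given, so passing a shorter length simply passes a shorter string.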
A few things to note about the above prototype declaration: the symbol ‘fhello’ is in lower-case, even though in Fortran we write everything in upper-case, and it has an underscore appended. On some platforms, the proprietary Fortran compiler deviates from the ‘f2c’ standard either by forcing the name to be in upper-case or by omitting the underscore. Fortunately, these cases can be detected with Autoconf and can be worked around with conditional compilation. However, beyond this, other portability problems, such as the strings issue, are too involved to deal with and it is best in these cases that you fall back to ‘f2c’ or ‘g77’. A final thing to note is that although ‘fhello’ doesn't return anything, it has return type ‘int’ and not ‘void’. The reason for this is that ‘int’ is the default return type for functions that are not declared. Therefore, to prevent compilation problems in case the user forgets to declare a Fortran function, ‘f2c’ uses ‘int’ as the return type for subroutines.
In Fortran parlance, a subroutine is what we'd call a ‘void’ function. To Fortran programmers in order for something to be a function it has to return something back. This reflects on the syntax. For example, here's a function that adds two numbers and returns the result:
c....:++++++++++++++++
      DOUBLE PRECISION FUNCTION ADD(A,B)
      DOUBLE PRECISION A,B
      ADD = A + B
      RETURN
      END
The name of the function is also the name of the return variable. If you run this one through ‘f2c -P’ you will find that the C prototype is:
extern doublereal add_(doublereal *a, doublereal *b);
There are plenty of things to note here:
integer       -> int
real          -> float
doublereal    -> double
complex       -> struct { real r,i; };
doublecomplex -> struct { doublereal r,i; };
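One consequence of the prototype for ‘add_’ is that every argument is passed by address, even plain numbers. The following sketch shows a C stand-in with that prototype, together with a by-value wrapper. The local typedef mirrors the mapping listed above instead of including f2c.h, and the wrapper name add is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: doublereal corresponds to double, as in the mapping above. */
typedef double doublereal;

/* C stand-in with the prototype reported for FUNCTION ADD.
   Both arguments arrive as pointers. */
doublereal add_(doublereal *a, doublereal *b)
{
    assert(a != NULL && b != NULL);
    return *a + *b;
}

/* Hypothetical by-value wrapper: the parameters are local copies, so
   taking their addresses is safe for the duration of the call. */
double add(double a, double b)
{
    return add_(&a, &b);
}
```

Wrappers of this kind let the rest of a C program call the routine with constants and expressions, which cannot be passed by address directly.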
A more interesting case is when we deal with complex numbers. Consider a function that multiplies two complex numbers:
c....:++++++++++++++++++++++++++++++
      COMPLEX*16 FUNCTION MULT(A,B)
      COMPLEX*16 A,B
      MULT = A*B
      RETURN
      END
As it turns out, the prototype for this function is:
extern Z_f mult_(doublecomplex *ret_val, doublecomplex *a, doublecomplex *b);
Because complex numbers are not a native type in C, they cannot be returned efficiently without going through at least one copy. Therefore, for this special case the return value is placed as the first argument in the prototype! Actually, despite many people's feelings that Fortran must die, it is still the best tool to use to write optimized functions that are heavy on complex arithmetic.
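The convention of returning a complex result through the first argument can be sketched in C like this. The code is a hand-written stand-in with the same argument list as the prototype above, not output of ‘f2c’. The type definitions follow the mapping given earlier instead of including f2c.h, and declaring the function as void in place of Z_f is an assumption made to keep the sketch self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Local definitions following the type mapping given earlier. */
typedef double doublereal;
typedef struct { doublereal r, i; } doublecomplex;

/* Stand-in for mult_: the product is written through ret_val rather than
   returned.  Temporaries are used so that ret_val may alias a or b. */
void mult_(doublecomplex *ret_val, doublecomplex *a, doublecomplex *b)
{
    doublereal re, im;

    assert(ret_val != NULL && a != NULL && b != NULL);
    re = a->r * b->r - a->i * b->i;
    im = a->r * b->i + a->i * b->r;
    ret_val->r = re;
    ret_val->i = im;
}
```

The caller owns the storage for the result, so no structure has to be copied on return.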
Now that we have brought up some of the issues about Fortran linkage, let's show you how to work around them. We will write a simple Fortran function, and a C program that calls it, and then show you how to turn these two into a GNU-like package, enhanced with a configure script and the works. This discussion assumes that you have installed the utilities in ‘autotools’, the package with which this tutorial is being distributed.
First, begin by building a directory for your new package. Because this project will involve Fortran, you need to pass the ‘-t fortran’ flag to ‘acmkdir’:
% acmkdir -t fortran foo
The ‘-t’ flag directs ‘acmkdir’ to unpack a copy of the ‘f2c’ translator and to build proper toplevel ‘configure.in’ and ‘Makefile.am’ files. This will take a while, so relax and stretch a little bit.
Now enter the foo-0.1 directory and look around:
% cd foo-0.1
% cat configure.in
AC_INIT(hello,0.1)
AM_CONFIG_HEADER(config.h)
AM_INIT_AUTOMAKE
LF_CONFIGURE_CC
LF_CONFIGURE_CXX
AC_PROG_RANLIB
LF_HOST_TYPE
LF_PROG_F77_PREFER_F2C_COMPATIBILITY
dnl LF_PROG_F77_PREFER_NATIVE_VERSION
LF_PROG_F77
LF_SET_WARNINGS
AC_CONFIG_SUBDIRS(fortran/f2c fortran/libf2c)
AC_CONFIG_FILES([Makefile fortran/Makefile f2c_comp
                 doc/Makefile m4/Makefile src/Makefile ])
% cat Makefile.am
EXTRA_DIST = reconf configure
SUBDIRS = fortran m4 doc src
There are some new macros in configure.in and a new subdirectory: fortran. There is also a file that looks like a shell script called f2c_comp.in. We will discuss the gory details about all this in the next section. Now let's write the code. Enter the src directory and type:
$ cd src
$ mkf2c
This creates the following files:
#ifdef __cplusplus
extern "C" {
#endif

#if defined (sun)
int MAIN_ () { return 0; }
#elif defined (linux) && defined(__ELF__)
int MAIN__ () { return 0; }
#endif

#ifdef __cplusplus
}
#endif
$ vi fhello.f
$ vi hello.cc
with fhello.f containing
c....:++++++++++++++++++++++++++++++
      SUBROUTINE FHELLO (C)
      CHARACTER *(*) C
      PRINT*,'From Fortran: ',C
      RETURN
      END
and hello.cc containing
#ifdef HAVE_CONFIG_H
#include <config.h>
#endif
#include <string.h>
#include "f2c.h"
#include "f77-fcn.h"

extern "C" {
extern int f77func(fhello,FHELLO)(char *c__, ftnlen c_len);
}

int main()
{
  char s[30];
  strcpy(s,"Hello world!");
  f77func(fhello,FHELLO)(s,ftnlen(strlen(s)));
  return 0;
}
The f77func macro is included in acconfig.h automatically for you if the LF_CONFIGURE_FORTRAN macro is included in your configure.in. The definition is as follows:
#ifndef f77func
#if defined (F77_APPEND_UNDERSCORE)
#  if defined (F77_UPPERCASE_NAMES)
#    define f77func(f, F) F##_
#  else
#    define f77func(f, F) f##_
#  endif
#else
#  if defined (F77_UPPERCASE_NAMES)
#    define f77func(f, F) F
#  else
#    define f77func(f, F) f
#  endif
#endif
#endif
Recall that we said that the issue of whether to add an underscore and whether to capitalize the name of the routine can be dealt with by conditional compilation. This macro is where this conditional compilation happens. The LF_PROG_F77 macro will define
F77_APPEND_UNDERSCORE
F77_UPPERCASE_NAMES
appropriately so that f77func does the right thing.
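To see what the macro expands to, the following sketch defines F77_APPEND_UNDERSCORE by hand, as if configure had detected an ‘f2c’-style compiler, and then makes the resulting symbol name visible as a string. In a real package these symbols come from the configure script; the helper macros F77_STR and F77_STR2, the function fhello_symbol, and the simplified stand-in routine are hypothetical names used only here.

```c
#include <assert.h>
#include <string.h>

/* Pretend configure detected lowercase names with an appended underscore. */
#define F77_APPEND_UNDERSCORE 1

#ifndef f77func
#if defined (F77_APPEND_UNDERSCORE)
#  if defined (F77_UPPERCASE_NAMES)
#    define f77func(f, F) F##_
#  else
#    define f77func(f, F) f##_
#  endif
#else
#  if defined (F77_UPPERCASE_NAMES)
#    define f77func(f, F) F
#  else
#    define f77func(f, F) f
#  endif
#endif
#endif

/* Two-level stringification, so that the argument is expanded first. */
#define F77_STR2(x) #x
#define F77_STR(x) F77_STR2(x)

/* Returns the linker-level name that f77func selects for FHELLO. */
const char *fhello_symbol(void)
{
    return F77_STR(f77func(fhello, FHELLO));
}

/* A stand-in routine defined through the macro.  It returns the length
   it was given, so that a call can be checked. */
int f77func(fhello, FHELLO)(char *c, long c_len)
{
    assert(c != NULL);
    return (int)c_len;
}
```

With this configuration the function defined at the end is reachable under the name fhello_, which is exactly the symbol an ‘f2c’-compatible compiler would emit for FHELLO.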
To compile this, create a Makefile.am as follows:
SUFFIXES = .f
.f.o:
	$(F77) -c $<

bin_PROGRAMS = hello
hello_SOURCES = hello.cc fhello.f f2c.h f2c-main.c
hello_LDADD = $(FLIBS)
Note that the above Makefile.am is only compatible with version 1.3 of Automake, or newer versions. The previous versions don't grok Fortran filenames on the hello_SOURCES so you may want to upgrade.
Now you can compile and run the program:
$ cd ..
$ reconf
$ configure
$ make
$ src/hello
From Fortran: Hello world!
If a native ‘f77’ compiler or the portable ‘g77’ compiler was used, you missed out on the coolness of using ‘f2c’. To check that out, do:
$ make distclean
$ configure --with-f2c
$ make
and witness the beauty! The package will begin by building an f2c binary for your system. Then it will build the Fortran libraries. And finally, it will build the hello world program which you can run as before:
$ src/hello
It may seem an overkill to carry around a Fortran compiler. On the other hand you will find it very convenient, and the ‘f2c’ compiler isn't really that big. If you are spoiled on a system that is well equipped and with a good system administrator, you may find it a nasty surprise one day when you discover that the rest of the world is not necessarily like that.
If you download a real Fortran package from Netlib you might find it very annoying having to enter the filenames for all the Fortran files in ‘*_SOURCES’. A work-around is to put all these files in their own directory and then do this awk trick:
% ls *.f | awk '{ printf("%s ", $1) }' > tmp
The awk filter will line up the output of ls in one line. You can use your editor to insert its contents into your Makefile.am. Eventually I may get around to writing a utility for doing this automagically.
The best way to get started is by building the initial directory tree with ‘acmkdir’ like this:
% acmkdir -t fortran <directory-filename>
This will install all the standard stuff. It will also install a directory called fortran containing a copy of the f2c compiler, and f2c_comp, a shell script that invokes the compiler in a way that looks the same as invoking a real compiler.
The file configure.in uses the following special macros:
LF_PROG_F77_PREFER_F2C_COMPATIBILITY
This macro states a preference for f2c compatibility over performance. In general Fortran programmers are willing to sacrifice everything for the sake of performance. However, if you want to use Fortran code with C and C++ code, you will have many reasons to also give importance to f2c compatibility. Use this macro to state this preference. The effect is that if the installer's platform has a native Fortran compiler installed, it will be used only if it is f2c compatible. This macro must be invoked before invoking LF_PROG_F77.
LF_PROG_F77_PREFER_NATIVE_VERSION
This macro states the opposite preference: performance over f2c compatibility. You may want to invoke this instead if your entire program is written in Fortran. This macro must be invoked before invoking LF_PROG_F77.
LF_PROG_F77
This macro defines the following symbols appropriately for the Fortran compiler in use:
F77_APPEND_UNDERSCORE
F77_UPPERCASE_NAMES
To see the C prototype of a Fortran subroutine, run
% f2c -P foo.f
on the file containing the subroutine and examine the file foo.P. In order for this macro to work properly you must precede it with calls to
AC_PROG_CC
AC_PROG_RANLIB
LF_HOST_TYPE
You also need to call one of the two *_PREFER_* macros. The default is to prefer f2c compatibility.
AC_CONFIG_SUBDIRS(fortran/f2c fortran/libf2c)
AC_OUTPUT([Makefile fortran/Makefile f2c_comp
           doc/Makefile m4/Makefile src/Makefile])
The AC_CONFIG_SUBDIRS macro directs configure to execute the configure scripts in fortran/f2c and fortran/libf2c. The entries in AC_OUTPUT that are important to Fortran support are fortran/Makefile and f2c_comp. Because f2c_comp is mentioned in AC_OUTPUT, Automake will automagically bundle it when you build a source code distribution.
If you originally set up your directory tree for a C or C++ only project and later realize that you also need to use Fortran, you can upgrade your directory tree to Fortran as follows:
Set up the fortran directory by invoking
% mkfortran
and the f2c_comp script by invoking
% mkf2c_comp
both at the toplevel directory level.
Make sure that configure.in invokes the following macros:
AC_PROG_CC
AC_PROG_RANLIB
LF_HOST_TYPE
LF_PROG_F77_PREFER_F2C_COMPATIBILITY
LF_PROG_F77
If you have invoked LF_CONFIGURE_CC then there is no need to invoke AC_PROG_CC again.
Add the following invocation before AC_OUTPUT:
AC_CONFIG_SUBDIRS([fortran/f2c fortran/libf2c])
and add the following files to AC_OUTPUT:
fortran/Makefile f2c_comp
Then rebuild the package:
% make distclean
% ./reconf
% ./configure
% make
It is important to call reconf for the changes to take effect.
If a directory level contains Fortran source code, then it is important to let Automake know about it by adding the following lines at the beginning of its Makefile.am:
SUFFIXES = .f
.f.o:
	$(F77) -c $<
This is pretty much the same idea as with the embedded text compiler. You can list the Fortran source code filenames in the SOURCES assignments together with your C and C++ code. To link executables, you must add $(FLIBS) to LDADD and link against f2c-main.c just as in the hello world example. Please do not include f2c-main.c in any libraries, however.
Now consider the file hello.cc line by line. First we include the standard configuration stuff:
#ifdef HAVE_CONFIG_H
#include <config.h>
#endif
#include <string.h>
Then we include the Fortran related header files:
#include "f2c.h"
Then we declare the prototypes for the Fortran subroutine:
extern "C" {
extern int f77func(fhello,FHELLO)(char *c__, ftnlen c_len);
}
There are a few things to note here:
The prototype is enclosed in
extern "C" { }
The C++ language uses name mangling to support function overloading. This means that if you have two C++ functions called:
int foo(double x);
int foo(double x,double y);
the C++ compiler internally assigns them different names in an intelligent fashion to avoid conflict. Just like the Fortran compiler does things behind your back, so does the C++ compiler to support some of its special features. Any code written inside an ‘extern "C"’ block is compiled with name mangling disabled. This is necessary for the Fortran declarations because we don't want the names of the Fortran subroutines to be mangled.
In the call, the string length is cast explicitly to ftnlen:
f77func(fhello,FHELLO)(s,ftnlen(strlen(s)));
This may seem pedantic but it is necessary for the C++ compiler, and it is a good habit even for C programmers. Since Fortran routines are supposed to be wrapped, this is not too much to ask.
For integer arguments, use the type integer explicitly. Unfortunately the standard header file distributed with f2c defines integer as long int to account for 16-bit machines. That's a bad idea, and on the 64-bit Dec Alpha it is a bug. The header file distributed with mkf2c does the right thing.
Mention these files in the SOURCES assignments in your Makefile.am to make sure that they are included in the source code distribution.
Fortran is infested with portability problems. There exist two important Fortran standards: one that was written in 1966 and one that was written in 1977. The 1977 standard is considered to be the standard Fortran. Most of the Fortran code is written by scientists who have never had any formal training in computer programming. As a result, they often write code that is dependent on vendor extensions to the standard, and not necessarily easy to port. The standard itself is to blame as well, since it is sorely lacking in many aspects. For example, even though standard Fortran has both REAL and DOUBLE PRECISION data types (corresponding to float and double) the standard only supports single precision complex numbers (COMPLEX). Since many people will also want double precision complex numbers, many vendors provided extensions. Most commonly, the double precision complex number is called COMPLEX*16 but you might also see it called DOUBLE COMPLEX. Other such vendor extensions include providing a flush operation of some sort for file I/O, and other such esoteric things.
To make things worse (or better) now there are two more standards out there: the 1990 standard and the 1995 standard. A 2000 standard is also in the works. Fortran 90 and its successors try to make Fortran more like C and C++, and even though there are no free compilers for either variant, they are becoming alarmingly popular with the scientific community. In fact, I think that the main reason why these variants of Fortran are being developed is to make more business for proprietary compiler developers. So far as I know, Fortran 90 does not provide any features that C++ cannot support with a class library extension. Moreover, Fortran 90 does not have the comprehensive foundation that allows C++ to be a self-extensible language. This makes it less worthwhile to invest effort in Fortran 90, because it means that eventually people will want features that can only be implemented by redefining the language and rewriting the compilers for it. In C++, by contrast, you can add features to the language simply by writing C++ code, because it has enough core features to allow virtually unlimited self-extensibility.
If your primary interest is portability and free software, you should stay away from Fortran 90 as well as Fortran 95, until someone writes a free compiler for them. You will be better off developing in C++ and only migrating to Fortran 77 the parts that are performance critical. This way you get the best of both worlds.
On the flip side, if you limit your Fortran code just to number-crunching,
then it becomes much easier to write portable code. There are still a few
things you should take into account however.
Some Fortran code has been written in the archaic 1966 style. An example of such code is the fftpack package from netlib. The main problems with such code are the following:
Implicit types: variables whose names begin with I,J,...,N are type INTEGER. All others are REAL. To compile this code with modern compilers it is necessary to add the following line to every source file:
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
This instructs the compiler to do the right thing, which is to implicitly assume that all variables starting with A-H and O-Z are double precision and all other variables are integers. Alternatively you can say
      IMPLICIT REAL (A-H,O-Z)
but you will rarely want to go with single precision.
Occasionally, you may find that the programmer breaks the rules. For example,
in fftpack
the array IFAC
is supposed to be a double
even though implicitly it is suggested to be an int
. Such inconstancies
will probably show up in compiler errors. To fix them, declare the type
of these variables explicitly. If it's an array then you do it like this:
DOUBLE PRECISION IFAC(*)
If the variable also appears in a DIMENSION declaration, then you should remove it from the declaration since the two can't coexist in some compilers.
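As a rough aid for the step of adding an IMPLICIT line to every source file, the following shell sketch lists the Fortran files that contain no IMPLICIT statement outside of comment lines. The function name `files_missing_implicit` is hypothetical, and the check assumes fixed-form source in which a comment line starts with C, c, * or ! in the first column. It only looks for the keyword, so it does not verify that every subroutine inside a file has its own statement.

```shell
# Hypothetical helper: print each file that has no IMPLICIT statement.
# Assumes fixed-form Fortran, where a comment line starts with C, c, * or !.
files_missing_implicit () {
    for f in "$@"; do
        # Look for IMPLICIT only on lines whose first character is not a
        # comment marker; print the file name when no such line exists.
        grep -i -q '^[^Cc*!].*IMPLICIT' "$f" || printf '%s\n' "$f"
    done
}
```

A typical invocation would be `files_missing_implicit *.f`; every file it prints still needs the IMPLICIT line added by hand.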
DIMENSION C(1) means that C has an unknown length, instead of meaning that it has length 1. In modern Fortran, this is an unacceptable notation and modern compilers do get confused over it. So all such instances must be replaced with the correct form which is:
DIMENSION C(*)
Such “arrays” in reality are just pointers. The user can reference the array as far as he likes, but of course, if he takes it too far, the program will either do the Wrong Thing or crash with a segmentation fault.
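The replacement of C(1) by C(*) can be partly automated. The sketch below defines a hypothetical `fix_assumed_size` function that rewrites a last dimension of 1 into * on DIMENSION lines only and writes the result to standard output. It assumes that every such 1 is the archaic "unknown length" idiom, so an array that really has length 1 would be rewritten too. It also does not touch continuation lines or arrays declared in type statements such as REAL C(1), so review the output before replacing the original file.

```shell
# Hypothetical helper: on DIMENSION lines, turn a last dimension of 1
# into *, for example C(1) -> C(*) and A(N,1) -> A(N,*).
# Reads the named files, or standard input when no file is given.
fix_assumed_size () {
    sed -e '/^[ ]*[Dd][Ii][Mm][Ee][Nn][Ss][Ii][Oo][Nn][ ]/s/\([(,]\)1)/\1*)/g' "$@"
}
```

For example, `fix_assumed_size old.f > new.f` leaves `old.f` untouched so that the two files can be compared with `diff`.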
A constant such as ‘1’ is always type INTEGER, and ‘9.435784839284958’ is always type REAL (even if the additional precision specified is lost, and even when used in a ‘DOUBLE PRECISION’ context such as being assigned to a ‘DOUBLE PRECISION’ variable!). On the other hand, 1E0 is always REAL and 1D0 is always ‘DOUBLE PRECISION’.
If you want your code to be exclusively double precision, then you should scan the entire source for constants, and make sure that they all have the D0 suffix at the end. Many compilers will tolerate this omission, while others will not, and will go ahead and introduce single precision error into your computations, leading to hard-to-find bugs.
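Scanning the source for such constants can be done with grep. The sketch below is a heuristic, not a Fortran parser: the hypothetical function `list_single_constants` prints, with line numbers, every non-comment line of one file that contains a constant with a decimal point and no D exponent. It assumes fixed-form source with comment markers in the first column, and it can still report constants that appear inside character strings.

```shell
# Hypothetical helper: list lines of one Fortran file that contain a real
# constant without a D exponent (for example 2.5 or 3.0E0, but not 1.0D0).
list_single_constants () {
    # First drop comment lines (C, c, * or ! in column 1) while numbering
    # the remaining lines, then keep lines with a digits-dot-digits constant
    # that is neither part of a name nor followed by a D exponent.
    grep -n -v '^[Cc*!]' "$1" |
    grep -E '(^|[^0-9A-Za-z_])[0-9]+\.[0-9]*([Ee][+-]?[0-9]+)?([^0-9A-Za-z_]|$)'
}
```

Running `list_single_constants foo.f` prints lines of the form `number:source line`; each reported constant can then be given a D0 suffix by hand. Relational operators such as `I.EQ.1` are not reported, because the digit there is not followed by a decimal point that belongs to a number.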
In general the code in http://www.netlib.org/ is very reliable and portable, but you do need to keep your eyes open for little problems like the above.
There are many variants of Fortran like Fortran 90, and HPF. Fortran 90 attempts, quite miserably, to make Fortran 77 more like C++. HPF allows engineers to write numerical code that runs on parallel computers. These variants should be avoided for two reasons:
FIXME: New section. Needs to be written
FIXME: Needs to be written
FIXME: Advice on how to write a good manual General stuff. Reference manual vs user manual. When to write a manual. How to structure a manual. Texinfo vs. Latex Copyright issues.
FIXME: Needs to be written
FIXME: Needs to be written
The appendices
If you want to give your programs to other people or use programs that were written by other people, then you need to worry about copyright. The main reason why autoconf and automake were developed was to make sharing software easier. So, if you want to use these tools to develop free software, it is important to understand copyright. In this chapter we will address the legal issues involved with releasing software to the public. See Philosophical issues, for a discussion of the philosophical issues involved.
When you create an original work, like a computer program, or a novel, and so on, the government automatically grants you a set of legal rights called copyright. Copyright is the right to obstruct others from using, modifying and redistributing your work. Anyone that would like to use, modify or redistribute your work needs to enter an agreement with you. By granting you this monopoly, the government limits the freedom of the public to express themselves in ways that involve infringing your copyright. The government justifies copyright by claiming that it is a bargain that benefits the public because it encourages the creation of more works. The holder of the copyright, called the “owner”, is the only person that can enforce per copyright.
Copyright ownership can be transferred to another person or organization. When a work is being developed by a team, it makes legal sense to transfer the copyright to a single organization that can then coordinate enforcement of the copyright. In the free software community, some people assign their software to the Free Software Foundation. The arrangement is that copyright is transferred to the FSF. The FSF then grants you all the rights back in the form of a license agreement, and commits itself legally to distributing the work only as free software. If you want to do this, you should contact the FSF for more information. It is not a good idea to assign your copyright to anyone else, unless you know what you are getting into. By assigning your rights to someone and not getting any of those rights back in the form of an agreement, you may place yourself in a position where you are not allowed to use your own work. Unfortunately, if you are employed or a student at a university you have probably already signed many of your rights away. Universities as well as companies like to lay as much claim on any copyrightable work you produce as possible, even work that you do as a hobby that has little to do with them.
Because copyright does not allow your users to do much with your software, other than have a copy, you need to give them permissions that allow them to freely use, modify and redistribute it. In the free software community, we standardize on using a legal document, the GNU General Public License to grant such permissions. See Applying the GPL, for more details on how to use the GPL.
Copyright covers mainly original works. However, it also introduces the concept of derived works. In general, if someone copies a portion of your work into per work, then it becomes a derived work of your work, and both you and person share copyright interest on per work.
If the only information that you give an impartial observer is a copy of your work and a copy of per work, the observer has no deterministic way of deciding whether or not per work is legally derived from your work. The legal term derived work refers to the process with which person created per work, rather than an actual inherent property of the end-result of the effort. Your copyright interest is established by the fact that part of that process involved copying some of your work into per work (and then perhaps modifying it, but that is not relevant to whether or not you have copyright interest).
So, if you and someone write two very similar programs, because the programs are simple, then you don't have copyright interest in each other's work, because you both worked independently. If, however, the reason for the similarity is that person copied your work, then you have copyright interest on per work. When that happens, person can only distribute the resulting program (i.e. source code, or the executable) under terms that are consistent with the terms with which person was allowed to have a copy of your work and use it in per program.
The law is less clear about what happens if person refers to your work without actually doing any copying. A judge will have to decide this if it goes to court. This is why when you work on a free software project, the only way to avoid liabilities like this is by not referring to anyone else's work, unless per work is also free software. This is one of the many ways that copyright obstructs cooperation between citizens.
Fortunately there is a legal precedent with derived work and user interfaces. The courts have decided that user interfaces, such as the application programming interface (API) that a software library is exporting to the programs that link to it, can not be copyrighted. So, if you want to clone a library, while it is not a good idea to refer to the actual source code of the library, it is okay to refer to a description of the interface that the library defines. It is best to do this by reading the documentation, but if no documentation is available, reading the header files is the next best thing.
The concept of derived work is very slippery ground and has many gray areas, especially when it pertains to linking libraries that other people have written to your programs. See The GPL and libraries, for more discussion on this issue.
In addition to copyright law, there is another legal beast: the patent law. Unlike copyright, which you own automatically by the act of creating the work, you don't get a patent unless you file an application for it. If approved, the work is published but others must pay you royalties in order to use it in any way.
The problem with patents is that they cover algorithms, and if an algorithm is patented you can neither write nor use an implementation for it, without a license. What makes it worse is that it is very difficult and expensive to find out whether the algorithms that you use are patented or will be patented in the future. What makes it insane is that the patent office, in its infinite stupidity, has patented algorithms that are very trivial with nothing innovative about them. For example, the use of backing store in a multiprocessing window system, like X11, is covered by patent 4,555,775. In the spring of 1991, the owner of the patent, AT&T, threatened to sue every member of the X Consortium including MIT. Backing store is the idea that the windowing system save the contents of all windows at all times. This way, when a window is covered by another window and then exposed again, it is redrawn by the windowing system, and not the code responsible for the application. Other insane patents include the IBM patent 4,674,040 which covers “cut and paste between files” in a text editor. Recently, a stupid corporation called “Wang” tried to take Netscape to court over a patent that covered “bookmarks” and lost.
Even though this situation is ridiculous, software patents are a very serious problem because they are taken very seriously by the judicial system. Unfortunately they are not taken equally seriously by the patent office (also called PTO) itself. The more patents the PTO approves, the more income the PTO makes. Therefore, the PTO is very eager to let dubious patents through. After all, they figure that if the patent is invalid, someone will knock it down in court eventually.
It is not necessary for someone to have a solid case to get you into trouble. The cost of litigation is often sufficient extortion to force small businesses, non-profit organizations and individual software developers to settle, even when there is no solid case. The only defense against a patent attack is to prove that there is “prior art”; in other words, you need to show that what is described in the patent had already been invented before the date on which the application for that patent was filed. Unfortunately, this is costly, not guaranteed to work, and the burden of proof rests with the victim of the attack. Another defense is to make sure you don't have a lot of money. If you are poor, lawyers are less likely to waste money suing you.
Companies like to use software patents as strategic weapons for applying extortion, which is unfortunately sanctioned by the law. They build an arsenal of software patents by trying to pass whatever can get through the Patent Office. Then years later, when they feel like it, they can go through their patent arsenal and find someone to sue and extort some cash.
There have actually been patent attacks aimed directly against the free software community. The GNU system does not include the Unix ‘compress’ utility because it infringes a patent, and the patent owner has specifically targeted the volunteer that wrote a ‘compress’ program for the GNU project. There may be more patent attacks in the future. In November of 1998 two internal memos were leaked from Microsoft about our community. According to these memos, Microsoft perceives the free software community as a competitor and they seem to consider a patent-based attack among other things. It is important to note however that when an algorithm is patented, and, worse, when that patent is asserted by the owner, this is an attack on everyone that writes software, not only on the free software community. This is why it is not important who is being targeted in each specific incident. Patents hurt all of us.
An additional legal burden to both copyrights and patents is governmental boneheadedness over encryption algorithms. According to the US government, a computer program implementing an encryption algorithm is considered a munition, therefore export-control laws on munitions apply. What is not allowed under these laws is to export the software outside the borders of the US. The government is pushing the issue by claiming that making encryption software available on the internet is the same thing as exporting it. Zimmermann, the author of a popular encryption program, was put under criminal investigation by the government based on this interpretation of the law. However the government's position was not tested in court because the government decided to drop the case without filing charges, after dragging it out for a few years, long enough to send a message of terror to the internet community. The current wisdom seems to be that it is okay to make encryption software available on the net provided that you take strong measures that will prevent foreigners from downloading your work. It should be noted however that doing so still means taking a legal risk that could land you in federal prison in the company of international smugglers of TOW missiles and M1 Abrams tanks.
The government's attitude towards encryption is unconstitutional because it violates our inalienable right to freedom of speech. It is the current policy of the government that publishing a book containing the source code for encryption software is legal, but publishing the exact same content in digital form is illegal. As the internet increasingly becomes the library of the future, part of our freedom will be lost. The government maintains such a strange position today because in the past they tried to assert that publishing encryption software both digitally and in books is illegal. When the RSA algorithm was discovered, the National Security Agency (also known as NSA – No Such Agency) attempted to prevent the inventors from publishing their discovery in journals and presenting it at conferences. Judges understand books and conferences and the government had to give up fighting that battle. They still haven't given up on the electronic front however.
Other countries also have restrictive laws against encryption. In certain places, like France, you are not even allowed to run such programs. The reason why governments are so paranoid about encryption is that it is the key to a wide array of technologies that have the potential to empower individual citizens to an extent that makes governments uncomfortable. Encryption is routinely used now by human rights activists operating in totalitarian countries. Encryption can also be used to create an unsanctioned para-economy based on digital cash, and allow individuals to carry out transactions and contracts completely anonymously. These prospects are not good news for Big Brother.
The Free Software Foundation is fighting the US government export restrictions very effectively by asking volunteers in a free country to develop free encryption software. The GNU Privacy Guard is now very stable, and is already being used by software developers. For more information, see http://www.gnupg.org/.
Both copyright and patent laws are being used mainly to destroy our freedom to cooperate with our fellow hackers. By freedom we refer to three things: the freedom to use software, the freedom to modify it and improve it, and the freedom to redistribute it with the modifications and improvements so that the whole community benefits. Combined with the possible default assignment of your rights to an employer or university, the laws can actually interfere even with your ability to write computer programs for a hobby and cooperate with other hackers on that basis!
To defend our freedoms from those who would like to take them from us, the free software community uses the General Public License, also known as the GPL. In broad strokes, the GPL does the following:
The purpose of the GPL is to use the copyright law to encourage a world in which software is not copyrighted. If copyright didn't cover software, then we would all be free to use, modify and redistribute software, and we would not be able to restrict others from enjoying these freedoms because there would be no law giving anyone such power. One way to grant the freedoms to the users of your software is to revoke your copyright on the software completely. This is called putting your work in the public domain. The problem with this is that it only grants the freedoms. It does not create the reality in which no-one can take these freedoms away from derived works. In fact the copyright law covers by default derived works regardless of whether the original was public domain or copyrighted. By distributing your work under the GPL, you grant the same freedoms, and at the same time you protect these freedoms from hoarders.
The GNU GPL is a legal instrument that has been designed to create a safe haven in which software can be written free from copyright law encumbrance. It allows developers to freely share their work with a friendly community that is also willing to share theirs, and at the same time protect them from being exploited by publishers of proprietary software. Many developers would not contribute to our community without this protection.
To apply the GPL to your programs you need to do the following things:
// Copyright (C) (years) (Your Name) <your@email.address>
//
// This program is free software; you can redistribute it and/or
// modify it under the terms of the GNU General Public License
// as published by the Free Software Foundation; either
// version 2 of the License, or (at your option) any later
// version.
//
// This program is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU General Public License
// along with this program; if not, write to the Free Software
// Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
If you have assigned your copyright to an organization, like the Free Software Foundation, then you should probably fashion your copyright notice like this:
// Copyright (C) (years) Free Software Foundation
// (your name) <your@email.address> (initial year)
// etc...
This legal notice works like a subroutine. By invoking it, you invoke the full text of the GNU General Public License which is too lengthy to include in every source file. Where you see ‘(years)’ you need to list all the years in which you finished preparing a version that was actually released, and which was an ancestor to the current version. This list is not the list of years in which versions were released. It is a list of years in which versions, later released, were completed. If you finish a version on Dec 31, 1997 and release it on Jan 1, 1998, you need to include 1997, but you do not need to include 1998. This rule is complicated, but it is dictated by international copyright law.
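Because the notice has to appear in every source file, it is easy to miss one. The following shell sketch, built around a hypothetical function named `files_missing_gpl_notice`, prints the files that do not mention the license at all. It assumes the notice was copied in the wording shown above, and it only searches for the phrase "GNU General Public License"; it does not check the copyright line or the list of years.

```shell
# Hypothetical helper: print each file that never mentions the GPL by name.
files_missing_gpl_notice () {
    for f in "$@"; do
        # grep -q prints nothing and only reports, through its exit
        # status, whether the phrase was found in the file.
        grep -q 'GNU General Public License' "$f" || printf '%s\n' "$f"
    done
}
```

For example, `files_missing_gpl_notice src/*.cc src/*.h` (the paths are only an illustration) lists the files that still need the notice.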
--version command-line flag. For details please read the GPL.
If you are unfamiliar with all this legalese you may find it surprising; you might even find it stupid. This is a very natural reaction. Until 1980, software copyright was not taken seriously in the US. In fact copyrights then had to be registered in order to be valid, and it was very natural for people to just copy software around, even though they knew it was illegal. It took significant amounts of lobbying and propaganda by proprietary publishers to cultivate the current litigious paranoia over copyrights and “convince” the public that helping out their neighbour by giving them an unauthorized copy is not only illegal, but also “morally wrong”. Even though copyright laws are international, through treaties, there are many countries in the world where this brainwashing hasn't yet taken place, and where people still make unauthorized copies of software for their friends with no second thoughts. Such people are being described with smear words like “pirates” by publishers and their lawyers, but it is not true that these people do what they do because of a malicious intent. These people do what they do because it is natural for them to be nice and help their friends.
One problem with this attitude is that many of us don't want to disobey the law, because copyright is an indiscriminate weapon that cuts both ways. We prefer therefore to beat the hoarders at their own game. This means that we can not use, modify or distribute programs that are not distributed with a copyright notice and appropriate permissions, because the default status of such programs is that no permissions are granted whatsoever. If you write a program that you want to share with other people, then please apply the terms of the GPL to the copies that you distribute, so that your friends can use, modify and share the program with their friends, without breaking any laws and to protect your contribution to our community from the hoarders. Please do not violate copyright law. Instead, say no to proprietary software and use free software on the free GNU/Linux operating system.
The GNU development tools were written primarily to aid the development and distribution of free software in the form of source code distributions. The philosophy of the GNU project, that software should be free, is very important to the future of our community. Now that free software systems, like GNU/Linux, have been noticed by the mainstream media, our community will have to face many challenges to our freedom. We may have a free operating system today, but if we fail to deal with these challenges, we will not have one tomorrow. What are these challenges? Three that we have already had to face are: secret hardware, non-free libraries, and software patents. Who knows what else we might have to face tomorrow. Will we respond to these challenges and protect our freedom? That depends on our philosophy.
In this appendix we include a few articles written by Richard Stallman that discuss the philosophical concerns that led to the free software movement. The text of these articles is included here with permission under the following terms:
Copying Notice
Copyright © 1998 Free Software Foundation Inc 59 Temple Place, Suite 330, Boston, MA 02111, USA Verbatim copying and distribution is permitted in any medium, provided this notice is preserved.
All of these articles, and others are distributed on the web at:
http://www.gnu.org/philosophy/index.html
This article appeared in the February 1997 issue of Communications of the ACM (Volume 40, Number 2).
(from "The Road To Tycho", a collection of articles about the antecedents of the Lunarian Revolution, published in Luna City in 2096)
For Dan Halbert, the road to Tycho began in college when Lissa Lenz asked to borrow his computer. Hers had broken down, and unless she could borrow another, she would fail her midterm project. There was no one she dared ask, except Dan.
This put Dan in a dilemma. He had to help her, but if he lent her his computer, she might read his books. Aside from the fact that you could go to prison for many years for letting someone else read your books, the very idea shocked him at first. Like everyone, he had been taught since elementary school that sharing books was nasty and wrong, something that only pirates would do.
And there wasn't much chance that the SPA, the Software Protection Authority, would fail to catch him. In his software class, Dan had learned that each book had a copyright monitor that reported when and where it was read, and by whom, to Central Licensing. (They used this information to catch reading pirates, but also to sell personal interest profiles to retailers.) The next time his computer was networked, Central Licensing would find out. He, as computer owner, would receive the harshest punishment, for not taking pains to prevent the crime.
Of course, Lissa did not necessarily intend to read his books. She might want the computer only to write her midterm. But Dan knew she came from a middle-class family and could hardly afford the tuition, let alone her reading fees. Reading his books might be the only way she could graduate. He understood this situation; he himself had had to borrow to pay for all the research papers he read. (10% of those fees went to the researchers who wrote the papers; since Dan aimed for an academic career, he could hope that his own research papers, if frequently referenced, would bring in enough to repay this loan.)
Later on, Dan would learn there was a time when anyone could go to the library and read journal articles, and even books, without having to pay. There were independent scholars who read thousands of pages without government library grants. But in the 1990s, both commercial and nonprofit journal publishers had begun charging fees for access. By 2047, libraries offering free public access to scholarly literature were a dim memory.
There were ways, of course, to get around the SPA and Central Licensing. They were themselves illegal. Dan had had a classmate in software, Frank Martucci, who had obtained an illicit debugging tool, and used it to skip over the copyright monitor code when reading books. But he had told too many friends about it, and one of them turned him in to the SPA for a reward (students deep in debt were easily tempted into betrayal). In 2047, Frank was in prison, not for pirate reading, but for possessing a debugger.
Dan would later learn that there was a time when anyone could have debugging tools. There were even free debugging tools available on CD or downloadable over the net. But ordinary users started using them to bypass copyright monitors, and eventually a judge ruled that this had become their principal use in actual practice. This meant they were illegal; the debuggers' developers were sent to prison.
Programmers still needed debugging tools, of course, but debugger vendors in 2047 distributed numbered copies only, and only to officially licensed and bonded programmers. The debugger Dan used in software class was kept behind a special firewall so that it could be used only for class exercises.
It was also possible to bypass the copyright monitors by installing a modified system kernel. Dan would eventually find out about the free kernels, even entire free operating systems, that had existed around the turn of the century. But not only were they illegal, like debuggers; you could not install one if you had one, without knowing your computer's root password. And neither the FBI nor Microsoft Support would tell you that.
Dan concluded that he couldn't simply lend Lissa his computer. But he couldn't refuse to help her, because he loved her. Every chance to speak with her filled him with delight. And that she chose him to ask for help, that could mean she loved him too.
Dan resolved the dilemma by doing something even more unthinkable–he lent her the computer, and told her his password. This way, if Lissa read his books, Central Licensing would think he was reading them. It was still a crime, but the SPA would not automatically find out about it. They would only find out if Lissa reported him.
Of course, if the school ever found out that he had given Lissa his own password, it would be curtains for both of them as students, regardless of what she had used it for. School policy was that any interference with their means of monitoring students' computer use was grounds for disciplinary action. It didn't matter whether you did anything harmful. The offense was making it hard for the administrators to check on you. They assumed this meant you were doing something else forbidden, and they did not need to know what it was.
Students were not usually expelled for this, not directly. Instead they were banned from the school computer systems, and would inevitably fail all their classes.
Later, Dan would learn that this kind of university policy started only in the 1980s, when university students in large numbers began using computers. Previously, universities maintained a different approach to student discipline; they punished activities that were harmful, not those that merely raised suspicion.
Lissa did not report Dan to the SPA. His decision to help her led to their marriage, and also led them to question what they had been taught about piracy as children. The couple began reading about the history of copyright, about the Soviet Union and its restrictions on copying, and even the original United States Constitution. They moved to Luna, where they found others who had likewise gravitated away from the long arm of the SPA. When the Tycho Uprising began in 2062, the universal right to read soon became one of its central aims.
Author's Note
The right to read is a battle being fought today. Although it may take 50 years for our present way of life to fade into obscurity, most of the specific laws and practices described above have already been proposed, either by the Clinton Administration or by publishers.
There is one exception: the idea that the FBI and Microsoft will keep the root passwords for personal computers. This is an extrapolation from the Clipper chip and similar Clinton Administration key-escrow proposals, together with a long-term trend: computer systems are increasingly set up to give absentee operators control over the people actually using the computer system.
The SPA, which actually stands for Software Publisher's Association, is not today an official police force. Unofficially, it acts like one. It invites people to inform on their coworkers and friends. Like the Clinton Administration, it advocates a policy of collective responsibility whereby computer owners must actively enforce copyright or be punished.
The SPA is currently threatening small Internet service providers, demanding they permit the SPA to monitor all users. Most ISPs surrender when threatened, because they cannot afford to fight back in court. (Atlanta Journal-Constitution, 1 Oct 96, D3.) At least one ISP, Community ConneXion in Oakland CA, refused the demand and was actually sued. The SPA is said to have dropped this suit recently, but they are sure to continue the campaign in various other ways.
The university security policies described above are not imaginary. For example, a computer at one Chicago-area university prints this message when you log in (quotation marks are in the original):
“This system is for the use of authorized users only. Individuals using this computer system without authority or in the excess of their authority are subject to having all their activities on this system monitored and recorded by system personnel. In the course of monitoring individuals improperly using this system or in the course of system maintenance, the activities of authorized user may also be monitored. Anyone using this system expressly consents to such monitoring and is advised that if such monitoring reveals possible evidence of illegal activity or violation of University regulations system personnel may provide the evidence of such monitoring to University authorities and/or law enforcement officials.”
This is an interesting approach to the Fourth Amendment: pressure most everyone to agree, in advance, to waive their rights under it.
References
Free software is a matter of liberty, not price. To understand the concept, you should think of free speech, not free beer.
Free software refers to the users' freedom to run, copy, distribute, study, change and improve the software. More precisely, it refers to three levels of freedom:
You may have paid money to get copies of GNU software, or you may have obtained copies at no charge. But regardless of how you got your copies, you always have the freedom to copy and change the software. In the GNU project, we use copyleft to protect these freedoms legally for everyone.
See Categories of software, for a description of how “free software,” “copylefted software” and other categories of software relate to each other.
When talking about free software, it is best to avoid using terms like “give away” or “for free”, because those terms imply that the issue is about price, not freedom. Some common terms such as “piracy” embody opinions we hope you won't endorse. See Confusing words, for a discussion of these terms.
Digital information technology contributes to the world by making it easier to copy and modify information. Computers promise to make this easier for all of us.
Not everyone wants it to be easier. The system of copyright gives software programs “owners”, most of whom aim to withhold software's potential benefit from the rest of the public. They would like to be the only ones who can copy and modify the software that we use.
The copyright system grew up with printing—a technology for mass production copying. Copyright fit in well with this technology because it restricted only the mass producers of copies. It did not take freedom away from readers of books. An ordinary reader, who did not own a printing press, could copy books only with pen and ink, and few readers were sued for that.
Digital technology is more flexible than the printing press: when information has digital form, you can easily copy it to share it with others. This very flexibility makes a bad fit with a system like copyright. That's the reason for the increasingly nasty and draconian measures now used to enforce software copyright. Consider these four practices of the Software Publishers Association (SPA):
All four practices resemble those used in the former Soviet Union, where every copying machine had a guard to prevent forbidden copying, and where individuals had to copy information secretly and pass it from hand to hand as “samizdat”. There is of course a difference: the motive for information control in the Soviet Union was political; in the US the motive is profit. But it is the actions that affect us, not the motive. Any attempt to block the sharing of information, no matter why, leads to the same methods and the same harshness.
Owners make several kinds of arguments for giving them the power to control how we use information:
Our ideas and intuitions about property for material objects are about whether it is right to take an object away from someone else. They don't directly apply to making a copy of something. But the owners ask us to apply them anyway.
A little thought shows that most such people would not have bought copies. Yet the owners compute their “losses” as if each and every one would have bought a copy. That is exaggeration—to put it kindly.
This line of persuasion isn't designed to stand up to critical thinking; it's intended to reinforce a habitual mental pathway.
It's elementary that laws don't decide right and wrong. Every American should know that, forty years ago, it was against the law in many states for a black person to sit in the front of a bus; but only racists would say sitting there was wrong.
To those who propose this as an ethical axiom—the author is more important than you—I can only say that I, a notable software author myself, call it bunk.
But people in general are only likely to feel any sympathy with the natural rights claims for two reasons.
One reason is an overstretched analogy with material objects. When I cook spaghetti, I do object if someone else eats it, because then I cannot eat it. His action hurts me exactly as much as it benefits him; only one of us can eat the spaghetti, so the question is, which? The smallest distinction between us is enough to tip the ethical balance.
But whether you run or change a program I wrote affects you directly and me only indirectly. Whether you give a copy to your friend affects you and your friend much more than it affects me. I shouldn't have the power to tell you not to do these things. No one should.
The second reason is that people have been told that natural rights for authors is the accepted and unquestioned tradition of our society.
As a matter of history, the opposite is true. The idea of natural rights of authors was proposed and decisively rejected when the US Constitution was drawn up. That's why the Constitution only permits a system of copyright and does not require one; that's why it says that copyright must be temporary. It also states that the purpose of copyright is to promote progress, not to reward authors. Copyright does reward authors somewhat, and publishers more, but that is intended as a means of modifying their behavior.
The real established tradition of our society is that copyright cuts into the natural rights of the public—and that this can only be justified for the public's sake.
Unlike the others, this argument at least takes a legitimate approach to the subject. It is based on a valid goal—satisfying the users of software. And it is empirically clear that people will produce more of something if they are well paid for doing so.
But the economic argument has a flaw: it is based on the assumption that the difference is only a matter of how much money we have to pay. It assumes that “production of software” is what we want, whether the software has owners or not.
People readily accept this assumption because it accords with our experiences with material objects. Consider a sandwich, for instance. You might well be able to get an equivalent sandwich either free or for a price. If so, the amount you pay is the only difference. Whether or not you have to buy it, the sandwich has the same taste, the same nutritional value, and in either case you can only eat it once. Whether you get the sandwich from an owner or not cannot directly affect anything but the amount of money you have afterwards.
This is true for any kind of material object—whether or not it has an owner does not directly affect what it is, or what you can do with it if you acquire it.
But if a program has an owner, this very much affects what it is, and what you can do with a copy if you buy one. The difference is not just a matter of money. The system of owners of software encourages software owners to produce something—but not what society really needs. And it causes intangible ethical pollution that affects us all.
What does society need? It needs information that is truly available to its citizens—for example, programs that people can read, fix, adapt, and improve, not just operate. But what software owners typically deliver is a black box that we can't study or change.
Society also needs freedom. When a program has an owner, the users lose freedom to control part of their own lives.
And above all society needs to encourage the spirit of voluntary cooperation in its citizens. When software owners tell us that helping our neighbors in a natural way is “piracy”, they pollute our society's civic spirit.
This is why we say that free software is a matter of freedom, not price.
The economic argument for owners is erroneous, but the economic issue is real. Some people write useful software for the pleasure of writing it or for admiration and love; but if we want more software than those people write, we need to raise funds.
For ten years now, free software developers have tried various methods of finding funds, with some success. There's no need to make anyone rich; the median US family income, around $35k, proves to be enough incentive for many jobs that are less satisfying than programming.
For years, until a fellowship made it unnecessary, I made a living from custom enhancements of the free software I had written. Each enhancement was added to the standard released version and thus eventually became available to the general public. Clients paid me so that I would work on the enhancements they wanted, rather than on the features I would otherwise have considered highest priority.
The Free Software Foundation (FSF), a tax-exempt charity for free software development, raises funds by selling GNU CD-ROMs, T-shirts, manuals, and deluxe distributions (all of which users are free to copy and change), as well as from donations. It now has a staff of five programmers, plus three employees who handle mail orders.
Some free software developers make money by selling support services. Cygnus Support, with around 50 employees [when this article was written], estimates that about 15 per cent of its staff activity is free software development—a respectable percentage for a software company.
Companies including Intel, Motorola, Texas Instruments and Analog Devices have combined to fund the continued development of the free GNU compiler for the language C. Meanwhile, the GNU compiler for the Ada language is being funded by the US Air Force, which believes this is the most cost-effective way to get a high quality compiler. [Air Force funding ended some time ago; the GNU Ada Compiler is now in service, and its maintenance is funded commercially.]
All these examples are small; the free software movement is still small, and still young. But the example of listener-supported radio in this country [the US] shows it's possible to support a large activity without forcing each user to pay.
As a computer user today, you may find yourself using a proprietary program. If your friend asks to make a copy, it would be wrong to refuse. Cooperation is more important than copyright. But underground, closet cooperation does not make for a good society. A person should aspire to live an upright life openly with pride, and this means saying “No” to proprietary software.
You deserve to be able to cooperate openly and freely with other people who use software. You deserve to be able to learn how the software works, and to teach your students with it. You deserve to be able to hire your favorite programmer to fix it when it breaks.
You deserve free software.
The biggest deficiency in free operating systems is not in the software–it is the lack of good free manuals that we can include in these systems. Many of our most important programs do not come with full manuals. Documentation is an essential part of any software package; when an important free software package does not come with a free manual, that is a major gap. We have many such gaps today.
Once upon a time, many years ago, I thought I would learn Perl. I got a copy of a free manual, but I found it hard to read. When I asked Perl users about alternatives, they told me that there were better introductory manuals–but those were not free.
Why was this? The authors of the good manuals had written them for O'Reilly Associates, which published them with restrictive terms–no copying, no modification, source files not available–which exclude them from the free software community.
That wasn't the first time this sort of thing happened, and (to our community's great loss) it was far from the last. Proprietary manual publishers have enticed a great many authors to restrict their manuals since then. Many times I have heard a GNU user eagerly tell me about a manual that he is writing, with which he expects to help the GNU project–and then had my hopes dashed, as he proceeded to explain that he had signed a contract with a publisher that would restrict it so that we cannot use it.
Given that writing good English is a rare skill among programmers, we can ill afford to lose manuals this way.
Free documentation, like free software, is a matter of freedom, not price. The problem with these manuals was not that O'Reilly Associates charged a price for printed copies–that in itself is fine. (The Free Software Foundation sells printed copies of free GNU manuals, too.) But GNU manuals are available in source code form, while these manuals are available only on paper. GNU manuals come with permission to copy and modify; the Perl manuals do not. These restrictions are the problems.
The criterion for a free manual is pretty much the same as for free software: it is a matter of giving all users certain freedoms. Redistribution (including commercial redistribution) must be permitted, so that the manual can accompany every copy of the program, on-line or on paper. Permission for modification is crucial too.
As a general rule, I don't believe that it is essential for people to have permission to modify all sorts of articles and books. The issues for writings are not necessarily the same as those for software. For example, I don't think you or I are obliged to give permission to modify articles like this one, which describe our actions and our views.
But there is a particular reason why the freedom to modify is crucial for documentation for free software. When people exercise their right to modify the software, and add or change its features, if they are conscientious they will change the manual too–so they can provide accurate and usable documentation with the modified program. A manual which forbids programmers to be conscientious and finish the job, or more precisely requires them to write a new manual from scratch if they change the program, does not fill our community's needs.
While a blanket prohibition on modification is unacceptable, some kinds of limits on the method of modification pose no problem. For example, requirements to preserve the original author's copyright notice, the distribution terms, or the list of authors, are ok. It is also no problem to require modified versions to include notice that they were modified, even to have entire sections that may not be deleted or changed, as long as these sections deal with nontechnical topics. (Some GNU manuals have them.)
These kinds of restrictions are not a problem because, as a practical matter, they don't stop the conscientious programmer from adapting the manual to fit the modified program. In other words, they don't block the free software community from doing its thing with the program and the manual together.
However, it must be possible to modify all the technical content of the manual; otherwise, the restrictions do block the community, the manual is not free, and so we need another manual.
Unfortunately, it is often hard to find someone to write another manual when a proprietary manual exists. The obstacle is that many users think that a proprietary manual is good enough–so they don't see the need to write a free manual. They do not see that the free operating system has a gap that needs filling.
Why do users think that proprietary manuals are good enough? Some have not considered the issue. I hope this article will do something to change that.
Other users consider proprietary manuals acceptable for the same reason so many people consider proprietary software acceptable: they judge in purely practical terms, not using freedom as a criterion. These people are entitled to their opinions, but since those opinions spring from values which do not include freedom, they are no guide for those of us who do value freedom.
Please spread the word about this issue. We continue to lose manuals to proprietary publishing. If we spread the word that proprietary manuals are not sufficient, perhaps the next person who wants to help GNU by writing documentation will realize, before it is too late, that he must above all make it free.
We can also encourage commercial publishers to sell free, copylefted manuals instead of proprietary ones. One way you can help this is to check the distribution terms of a manual before you buy it, and prefer copylefted manuals to non-copylefted ones.
Every decision a person makes stems from the person's values and goals. People can have many different goals and values; fame, profit, love, survival, fun, and freedom are just some of the goals that a good person might have. When the goal is to help others as well as oneself, we call that idealism.
My work on free software is motivated by an idealistic goal: spreading freedom and cooperation. I want to encourage free software to spread, replacing proprietary software which forbids cooperation, and thus make our society better.
That's the basic reason why the GNU General Public License is written the way it is–as a copyleft. All code added to a GPL-covered program must be free software, even if it is put in a separate file. I make my code available for use in free software, and not for use in proprietary software, in order to encourage other people who write software to make it free as well. I figure that since proprietary software developers use copyright to stop us from sharing, we cooperators can use copyright to give other cooperators an advantage of their own: they can use our code.
Not everyone who uses the GNU GPL has this goal. Many years ago, a friend of mine was asked to rerelease a copylefted program under non-copyleft terms, and he responded more or less like this:
Sometimes I work on free software, and sometimes I work on proprietary software–but when I work on proprietary software, I expect to get paid.
He was willing to share his work with a community that shares software, but saw no reason to give a handout to a business. His goal was different from mine, but he decided that the GNU GPL was useful for his goal too.
If you want to accomplish something in the world, idealism is not enough–you need to choose a method which works to achieve the goal. In other words, you need to be “pragmatic.” Is the GPL pragmatic? Let's look at its results.
Consider GNU C++. Why do we have a free C++ compiler? Only because the GNU GPL said it had to be free. GNU C++ was developed by an industry consortium, MCC, starting from the GNU C compiler. MCC normally makes its work as proprietary as can be. But they made the C++ front end free software, because the GNU GPL said that was the only way they could release it. The C++ front end included many new files, but since they were meant to be linked with GCC, the GPL did apply to them. The benefit to our community is evident.
Consider GNU Objective C. NeXT initially wanted to make this front end proprietary; they proposed to release it as .o files, and let users link them with the rest of GCC, thinking this might be a way around the GPL's requirements. But our lawyer said that this would not evade the requirements, that it was not allowed. And so they made the Objective C front end free software.
Those examples happened years ago, but the GNU GPL continues to bring us more free software.
Many GNU libraries are covered by the GNU Library General Public License, but not all. One GNU library which is covered by the ordinary GNU GPL is Readline, which implements command-line editing. A month ago, I found out about a non-free program which was designed to use Readline, and told the developer this was not allowed. He could have taken command-line editing out of the program, but what he actually did was rerelease it under the GPL. Now it is free software.
The programmers who write improvements to GCC (or Emacs, or Bash, or Linux, or any GPL-covered program) are often employed by companies or universities. When the programmer wants to return his improvements to the community, and see his code in the next release, the boss may say, “Hold on there–your code belongs to us! We don't want to share it; we have decided to turn your improved version into a proprietary software product.”
Here the GNU GPL comes to the rescue. The programmer shows the boss that this proprietary software product would be copyright infringement, and the boss realizes that he has only two choices: release the new code as free software, or not at all. Almost always he lets the programmer do as he intended all along, and the code goes into the next release.
The GNU GPL is not Mr. Nice Guy. It says “no” to some of the things that people sometimes want to do. There are users who say that this is a bad thing–that the GPL “excludes” some proprietary software developers who “need to be brought into the free software community”.
But we are not excluding them from our community; they are choosing not to enter. Their decision to make software proprietary is a decision to stay out of our community. Being in our community means joining in cooperation with us; we cannot “bring them into our community” if they don't want to join.
What we can do is offer them an inducement to join. The GNU GPL is designed to make an inducement from our existing software: “If you will make your software free, you can use this code.” Of course, it won't win 'em all, but it wins some of the time.
Proprietary software development does not contribute to our community, but its developers often want handouts from us. Free software users can offer free software developers strokes for the ego–recognition and gratitude–but it can be very tempting when a business tells you, “Just let us put your package in our proprietary program, and your program will be used by many thousands of people!” The temptation can be powerful, but in the long run we are all better off if we resist it.
The temptation and pressure are harder to recognize when they come indirectly, through free software organizations that have adopted a policy of catering to proprietary software. The X Consortium (and its successor, the Open Group) offers an example: funded by companies that made proprietary software, they have strived for a decade to persuade programmers not to use copyleft. Now that the Open Group has made X11R6.4 non-free software, those of us who resisted that pressure are glad that we did.
Pragmatically speaking, thinking about greater long-term goals will strengthen your will to resist this pressure. If you focus your mind on the freedom and community that you can build by staying firm, you will find the strength to do it. “Stand for something, or you will fall for nothing.”
And if cynics ridicule freedom, ridicule community...if “hard nosed realists” say that profit is the only ideal...just ignore them, and use copyleft all the same.
To copyleft or not to copyleft? That is one of the major controversies in the free software community. The idea of copyleft is that we should fight fire with fire–that we should use copyright to make sure our code stays free. The GNU GPL is one example of a copyleft license.
Some free software developers prefer non-copyleft distribution. Non-copyleft licenses such as the XFree86 and BSD licenses are based on the idea of never saying no to anyone–not even to someone who seeks to use your work as the basis for restricting other people. Non-copyleft licensing does nothing wrong, but it misses the opportunity to actively protect our freedom to change and redistribute software. For that, we need copyleft.
For many years, the X Consortium was the chief opponent of copyleft. It exerted both moral suasion and pressure to discourage free software developers from copylefting their programs. It used moral suasion by suggesting that it is not nice to say no. It used pressure through its rule that copylefted software could not be in the X Distribution.
Why did the X Consortium adopt this policy? It had to do with their definition of success. The X Consortium defined success as popularity–specifically, getting computer companies to use X Windows. This definition put the computer companies in the driver's seat. Whatever they wanted, the X Consortium had to help them get it.
Computer companies normally distribute proprietary software. They wanted free software developers to donate their work for such use. If they had asked for this directly, people would have laughed. But the X Consortium, fronting for them, could present this request as an unselfish one. "Join us in donating our work to proprietary software developers," they said, suggesting that this is a noble form of self-sacrifice. "Join us in achieving popularity", they said, suggesting that it was not even a sacrifice.
But self-sacrifice is not the issue: tossing away the defenses of copyleft, which protect the freedom of everyone in the community, is sacrificing more than yourself. Those who granted the X Consortium's request entrusted the community's future to the good will of the X Consortium.
This trust was misplaced. In its last year, the X Consortium made a plan to restrict the forthcoming X11R6.4 release so that it would not be free software. They decided to start saying no, not only to proprietary software developers, but to our community as well.
There is an irony here. If you said yes when the X Consortium asked you not to use copyleft, you put the X Consortium in a position to license and restrict its version of your program, along with its own code.
The X Consortium did not carry out this plan. Instead it closed down and transferred X development to the Open Group, whose staff are now carrying out a similar plan. To give them credit, when I asked them to release X11R6.4 under the GNU GPL in parallel with their planned restrictive license, they were willing to consider the idea. (They were firmly against staying with the old X11 distribution terms.) Before they said yes or no to this proposal, it had already failed for another reason: the XFree86 group follows the X Consortium's old policy, and will not accept copylefted software.
Even if the X Consortium and the Open Group had never planned to restrict X, someone else could have done it. Non-copylefted software is vulnerable from all directions; it lets anyone make a non-free version dominant, if he will invest sufficient resources to add some important feature using proprietary code. Users who choose software based on technical characteristics, rather than on freedom, could easily be lured to the non-free version for short term convenience.
The X Consortium and Open Group can no longer exert moral suasion by saying that it is wrong to say no. This will make it easier to decide to copyleft your X-related software.
When you work on the core of X, on programs such as the X server, Xlib, and Xt, there is a practical reason not to use copyleft. The XFree86 group does an important job for the community in maintaining these programs, and the benefit of copylefting our changes would be less than the harm done by a fork in development. So it is better to work with the XFree86 group and not copyleft our changes on these programs. Likewise for utilities such as xset and xrdb, which are close to the core of X, and which do not need major improvements. At least we know that the XFree86 group has a firm commitment to developing these programs as free software.
The issue is different for programs outside the core of X: applications, window managers, and additional libraries and widgets. There is no reason not to copyleft them, and we should copyleft them.
In case anyone feels the pressure exerted by the criteria for inclusion in X Distributions, the GNU project will undertake to publicize copylefted packages that work with X. If you would like to copyleft something, and you worry that its omission from X Distributions will impede its popularity, please ask us to help.
At the same time, it is better if we do not feel too much need for popularity. When a businessman tempts you with "more popularity", he may try to convince you that his use of your program is crucial to its success. Don't believe it! If your program is good, it will find many users anyway; you don't need to feel desperate for any particular users, and you will be stronger if you do not. You can get an indescribable sense of joy and freedom by responding, "Take it or leave it–that's no skin off my back." Often the businessman will turn around and accept the program with copyleft, once you call the bluff.
Friends, free software developers, don't repeat a mistake. If we do not copyleft our software, we put its future at the mercy of anyone equipped with more resources than scruples. With copyleft, we can defend freedom, not just for ourselves, but for our whole community.
Here is a glossary of various categories of software that are often mentioned in discussions of free software. It explains which categories overlap or are part of other categories.
If a program is free, then it can potentially be included in a free operating system such as GNU, or free GNU/Linux systems.
There are many different ways to make a program free—many questions of detail, which could be decided in more than one way and still make the program free. Some of the possible variations are described below.
Free software is a matter of freedom, not price. But proprietary software companies sometimes use the term “free software” to refer to price. Sometimes they mean that you can obtain a binary copy at no charge; sometimes they mean that a copy is included on a computer that you are buying. This has nothing to do with what we mean by free software in the GNU project.
Because of this potential confusion, when a software company says its product is free software, always check the actual distribution terms to see whether users really have all the freedoms that free software implies. Sometimes it really is free software; sometimes it isn't.
Many languages have two separate words for “free” as in freedom and “free” as in zero price. For example, French has “libre” and “gratuit”. English has a word “gratis” that refers unambiguously to price, but no common adjective that refers unambiguously to freedom. This is unfortunate, because such a word would be useful here.
Free software is often more reliable than non-free software.
Sometimes people use the term “public domain” in a loose fashion to mean “free” or “available gratis.” However, “public domain” is a legal term and means, precisely, “not copyrighted”. For clarity, we recommend using “public domain” for that meaning only, and using other terms to convey the other meanings.
In the GNU Project, we copyleft almost all the software we write, because our goal is to give every user the freedoms implied by the term “free software.” See Copylefted for more explanation of how copyleft works and why we use it.
Copyleft is a general concept; to actually copyleft a program, you need to use a specific set of distribution terms. There are many possible ways to write copyleft distribution terms.
If a program is free but not copylefted, then some copies or modified versions may not be free at all. A software company can compile the program, with or without modifications, and distribute the executable file as a proprietary software product.
The X Window System illustrates this. The X Consortium releases X11 with distribution terms that make it non-copylefted free software. If you wish, you can get a copy which has those distribution terms and is free. However, there are non-free versions as well, and there are popular workstations and PC graphics boards for which non-free versions are the only ones that work. If you are using this hardware, X11 is not free software for you.
A Unix-like operating system consists of many programs. We have been accumulating components for this system since 1984; the first test release of a “complete GNU system” was in 1996. We hope that in a year or so this system will be mature enough to recommend it for ordinary users.
The GNU system includes all the GNU software, as well as many other packages such as the X Window System and TeX which are not GNU software.
Since the purpose of GNU is to be free, every single component in the GNU system has to be free software. They don't all have to be copylefted, however; any kind of free software is legally suitable to include if it helps meet technical goals. We can and do use non-copylefted free software such as the X Window System.
Some GNU software is written by staff of the Free Software Foundation, but most GNU software is contributed by volunteers. Some contributed software is copyrighted by the Free Software Foundation; some is copyrighted by the contributors who wrote it.
Semi-free software is much better than proprietary software, but it still poses problems, and we cannot use it in a free operating system.
The restrictions of copyleft are designed to protect the essential freedoms for all users. For us, the only justification for any substantive restriction on using a program is to prevent other people from adding other restrictions. Semi-free programs have additional restrictions, motivated by purely selfish goals.
It is impossible to include semi-free software in a free operating system. This is because the distribution terms for the operating system as a whole are the conjunction of the distribution terms for all the programs in it. Adding one semi-free program to the system would make the system as a whole just semi-free. There are two reasons we do not want that to happen:
If there is a job that needs doing with software, then until we have a free program to do the job, the GNU system has a gap. We have to tell volunteers, “We don't have a program yet to do this job in GNU, so we hope you will write one.” If we ourselves used a semi-free program to do the job, that would undermine what we say; it would take away the impetus (on us, and on others who might listen to our views) to write a free replacement. So we don't do that.
The Free Software Foundation follows the rule that we cannot install any proprietary program on our computers except temporarily for the specific purpose of writing a free replacement for that very program. Aside from that, we feel there is no possible excuse for installing a proprietary program.
For example, we felt justified in installing Unix on our computer in the 1980s, because we were using it to write a free replacement for Unix. Nowadays, since free operating systems are available, the excuse is no longer applicable; we have eliminated all our non-free operating systems, and any new computer we install must run a completely free operating system.
We don't insist that users of GNU, or contributors to GNU, have to live by this rule. It is a rule we made for ourselves. But we hope you will decide to follow it too.
Shareware is not free software, or even semi-free. There are two reasons it is not:
For example, GNU Ada is always distributed under the terms of the GNU GPL, and every copy is free software; but its developers sell support contracts. When their salesmen speak to prospective customers, sometimes the customers say, “We would feel safer with a commercial compiler.” The salesmen reply, “GNU Ada is a commercial compiler; it happens to be free software.”
For the GNU Project, the emphasis is in the other order: the important thing is that GNU Ada is free software; whether it is commercial is not a crucial question. However, the additional development of GNU Ada that results from the business that supports it is definitely beneficial.
There are a number of words and phrases which we recommend avoiding, either because they are ambiguous or because they imply an opinion that we hope you may not entirely agree with.
Free software is often available for free–for example, on many FTP servers. But free software copies are also available for a price on CD-ROMs, and proprietary software copies may occasionally be available for free.
But this analogy overlooks the crucial difference between material objects and information: information can be copied and shared almost effortlessly, while material objects can't be. Basing your thinking on this analogy is tantamount to ignoring that difference.
Even the US legal system does not entirely accept this analogy, since it does not treat copyrights just like physical object property rights.
If you don't want to limit yourself to this way of thinking, it is best to avoid using the term “intellectual property” in your words and thoughts.
A suggestion has been made to use the term “intellectual policy” instead of “intellectual property.”
If you don't believe that illegal copying is just like kidnaping and murder, you might prefer not to use the word “piracy” to describe it. Neutral terms such as “prohibited copying” or “illegal copying” are available for use instead. Some of us might even prefer to use a positive term such as “sharing information with your neighbor.”
It is easy to avoid “protection” and use neutral terms instead. For example, instead of “Copyright protection lasts a very long time,” you can say, “Copyright lasts a very long time.”
So it is pertinent to mention that the legal system–at least in the US–rejects the idea that copyright infringement is “theft”. Copyright advocates who use terms like “stolen” are misrepresenting the authority that they appeal to.
The idea that laws decide what is right or wrong is mistaken in general. Laws are, at their best, an attempt to achieve justice; to say that laws define justice or ethical conduct is turning things upside down.
The following articles by Richard Stallman describe how we license free software in our community. The text of these articles is included here with permission under the following terms:
Copying Notice
Copyright © 1998 Free Software Foundation, Inc.
59 Temple Place, Suite 330, Boston, MA 02111, USA
Verbatim copying and distribution is permitted in any medium, provided this notice is preserved.
An exception is the article in Why you should use the GPL. This article was written by Eleftherios Gkioulekas to make this appendix more self-contained, and you may copy it under the following terms:
Copying Notice
Copyright © 1998 Eleftherios Gkioulekas
Verbatim copying and distribution is permitted in any medium, provided this notice is preserved.
The simplest way to make a program free is to put it in the public domain, uncopyrighted. This allows people to share the program and their improvements, if they are so minded. But it also allows uncooperative people to convert the program into proprietary software. They can make changes, many or few, and distribute the result as a proprietary product. People who receive the program in that modified form do not have the freedom that the original author gave them; the middleman has stripped it away.
In the GNU project, our aim is to give all users the freedom to redistribute and change GNU software. If middlemen could strip off the freedom, we might have many users, but those users would not have freedom. So instead of putting GNU software in the public domain, we copyleft it. Copyleft says that anyone who redistributes the software, with or without changes, must pass along the freedom to further copy and change it. Copyleft guarantees that every user has freedom.
Copyleft also provides an incentive for other programmers to add to free software. Important free programs such as the GNU C++ compiler exist only because of this.
Copyleft also helps programmers who want to contribute improvements to free software get permission to do that. These programmers often work for companies or universities that would do almost anything to get more money. A programmer may want to contribute her changes to the community, but her employer may want to turn the changes into a proprietary software product.
When we explain to the employer that it is illegal to distribute the improved version except as free software, the employer usually decides to release it as free software rather than throw it away.
To copyleft a program, first we copyright it; then we add distribution terms, which are a legal instrument that gives everyone the rights to use, modify, and redistribute the program's code or any program derived from it but only if the distribution terms are unchanged. Thus, the code and the freedoms become legally inseparable.
Proprietary software developers use copyright to take away the users' freedom; we use copyright to guarantee their freedom. That's why we reverse the name, changing “copyright” into “copyleft.”
Copyleft is a general concept; there are many ways to fill in the details. In the GNU Project, the specific distribution terms that we use are contained in the GNU General Public License (GNU GPL). An alternate form, the GNU Library General Public License (GNU LGPL), applies to a few (but not all) GNU libraries. The license permits linking the libraries into proprietary executables under certain conditions.
The appropriate license is included in many manuals and in each GNU source code distribution (usually in files named COPYING and COPYING.LIB).
The GNU GPL is designed so that you can easily apply it to your own program if you are the copyright holder. You don't have to modify the GNU GPL to do this, just add notices to your program which refer properly to the GNU GPL.
If you would like to copyleft your program with the GNU GPL, please see the instructions at the end of the GPL text. If you would like to copyleft your library with the GNU LGPL, please see the instructions at the end of the LGPL text (note you can also use the ordinary GPL for libraries).
Using the same distribution terms for many different programs makes it easy to copy code between various different programs. Since they all have the same distribution terms, there is no need to think about whether the terms are compatible. The Library GPL includes a provision that lets you alter the distribution terms to the ordinary GPL, so that you can copy code into another program covered by the GPL.
The GPL is not the only way to implement copyleft. However, as a practical matter, it is convenient to standardize on using the GPL to copyleft software, because that allows you to copy source code from copylefted programs and use it in other copylefted programs without worrying about license compatibility.
If you want your program to be free, then the GPL grants all the permissions that are necessary to make it free. Some people do not like the GPL because they feel it gives too many permissions. In that case, these people do not really want their program to be free. When they choose to use a more restrictive license, they are effectively choosing not to be part of the free software community.
One very common restriction is to allow free use only for “non-commercial” purposes. The idea behind such a restriction is to prevent anyone from making any money without giving you a cut of their profit. Copyleft actually also serves this goal, but from a different angle. The angle is that making money is only one of the many benefits that one can derive from using a computer program, and it should not be singled out for discrimination among all the other benefits. Copyleft, however, does prevent others from making money by modifying your program and distributing it as proprietary software with restrictive licensing. If someone wants to distribute the program, they also have to distribute the source code, in which case you benefit by having access to their modifications, or they have to negotiate with you for special terms.
Another peculiar restriction that often comes up is allowing use and modification but requiring the redistribution of any modified versions. This restriction is peculiar because, at first sight, it doesn't sound that bad; it does sound like free software. The advocates of this idea explain that there are certain situations where it is very anti-social to make a useful modification on a free program, use the program and benefit from it, and not release it. However, if you legally require your users to release any modifications they make, then this creates another problem, especially when this requirement conflicts with privacy rights. The public should be free to redistribute your program, but they should also be free to choose not to redistribute the program at all. The fundamental idea behind copylefted works is that they are owned by the public. But “the public” is the individual as much as it is the entire community. Copyleft protects the community by forbidding hoarding, but the individual also deserves an equivalent protection: the protection of both their privacy and their freedom.
Some developers, who do want to be part of our community, use licenses that do not restrict any of our freedoms but which ask for a “favor” from the user. An example of such a favor is to request that you change the name of the program if you modify it, or to not use the name of some organization in advertising. There is nothing ethically wrong with asking for such favors. Requiring them legally however creates a serious problem; it makes their terms incompatible with the terms of the GPL. It is very inefficient to inflict the price of such an incompatibility on our community for the sake of a favor. Instead, in almost all cases, it is just as good an idea to ask for such favors in the documentation distributed with the program, where there is more latitude in what restrictions you can impose (see Why free software needs free documentation).
Some people complain that the GPL is “too restrictive” because it says no to software hoarding. They say that this makes the program “less free”. They say that “free flow of ideas” means that you should not say no to anyone. If you would like to give your users more permissions than the GPL provides, all you need to do is append the text of these permissions to the copyright notices that you attach to every file; there is no need to write a new license from scratch. You can do this if you are the original author of the file. For files that were written by others, you need their permission. In general, however, doing this is not a good idea.
The GPL has been very carefully thought out to give only permissions that give freedom to the users, without allowing any permissions that would give power to some users to take freedom from all of the other users. As a result, even though the terms say no to certain things, doing so guarantees that the program remains free for all the users in our community. The US Constitution guarantees some of our rights by making them inalienable. This means that no one, not even the person entitled to the rights, is allowed to waive them. For example, you can't waive your right to freedom and sell yourself as a slave. While this can be seen as a restriction in terms of what you are allowed to do, the effect is that this restriction gives you more freedom. The restriction is not really targeting you, but all the people who have power over you and might have an interest in taking your freedom away.
In many countries other than the US, copyright law is not strictly enforced. As a result, the citizens in these countries can afford not to care about copyright. However, the free software community transcends nations and borders, and many of us do not have the same latitude. So, if you write a program that you want to share with other people, please be clear about the copyright terms. The easiest way to do this is by applying the terms of the GPL.
The GNU Project has two principal licenses to use for libraries. One is the GNU Library GPL; the other is the ordinary GNU GPL. The choice of license makes a big difference: using the Library GPL permits use of the library in proprietary programs; using the ordinary GPL for a library makes it available only for free programs.
Which license is best for a given library is a matter of strategy, and it depends on the details of the situation. At present, most GNU libraries are covered by the Library GPL, and that means we are using only one of these two strategies, neglecting the other. So we are now seeking more libraries to release under the ordinary GPL.
Proprietary software developers have the advantage of money; free software developers need to make advantages for each other. Using the ordinary GPL for a library gives free software developers an advantage over proprietary developers: a library that they can use, while proprietary developers cannot use it.
Using the ordinary GPL is not advantageous for every library. There are reasons that can make it better to use the Library GPL in certain cases. The most common case is when a free library's features are readily available for proprietary software through other alternative libraries. In that case, the library cannot give free software any particular advantage, so it is better to use the Library GPL for that library.
This is why we used the Library GPL for the GNU C library. After all, there are plenty of other C libraries; using the GPL for ours would have driven proprietary software developers to use another–no problem for them, only for us.
However, when a library provides a significant unique capability, like GNU Readline, that's a horse of a different color. The Readline library implements input editing and history for interactive programs, and that's a facility not generally available elsewhere. Releasing it under the GPL and limiting its use to free programs gives our community a real boost. At least one application program is free software today specifically because that was necessary for using Readline.
If we amass a collection of powerful GPL-covered libraries that have no parallel available to proprietary software, they will provide a range of useful modules to serve as building blocks in new free programs. This will be a significant advantage for further free software development, and some projects will decide to make software free in order to use these libraries. University projects can easily be influenced; nowadays, as companies begin to consider making software free, even some commercial projects can be influenced in this way.
Proprietary software developers, seeking to deny the free competition an important advantage, will try to convince authors not to contribute libraries to the GPL-covered collection. For example, they may appeal to the ego, promising “more users for this library” if we let them use the code in proprietary software products. Popularity is tempting, and it is easy for a library developer to rationalize the idea that boosting the popularity of that one library is what the community needs above all.
But we should not listen to these temptations, because we can achieve much more if we stand together. We free software developers should support one another. By releasing libraries that are limited to free software only, we can help each other's free software packages outdo the proprietary alternatives. The whole free software movement will have more popularity, because free software as a whole will stack up better against the competition.
Since the name “Library GPL” conveys the wrong idea about this question, we are planning to change the name to “Lesser GPL.” Actually implementing the name change may take some time, but you don't have to wait–you can release GPL-covered libraries now.
Copyright © 1989, 1991 Free Software Foundation, Inc.
59 Temple Place - Suite 330, Boston, MA 02111-1307, USA
Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.
The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software—to make sure the software is free for all its users. This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it. (Some other Free Software Foundation software is covered by the GNU Library General Public License instead.) You can apply it to your programs, too.
When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things.
To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it.
For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights.
We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software.
Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations.
Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all.
The precise terms and conditions for copying, distribution and modification follow.
Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running the Program is not restricted, and the output from the Program is covered only if its contents constitute a work based on the Program (independent of having been made by running the Program). Whether that is true depends on what the Program does.
You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee.
These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it.
Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Program.
In addition, mere aggregation of another work not based on the Program with the Program (or with a work based on the Program) on a volume of a storage or distribution medium does not bring the other work under the scope of this License.
The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable.
If distribution of executable or object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place counts as distribution of the source code, even though third parties are not compelled to copy the source along with the object code.
If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply and the section as a whole is intended to apply in other circumstances.
It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system, which is implemented by public license practices. Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice.
This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License.
Each version is given a distinguishing version number. If the Program specifies a version number of this License which applies to it and “any later version”, you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of this License, you may choose any version ever published by the Free Software Foundation.
If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the “copyright” line and a pointer to where the full notice is found.
one line to give the program's name and a brief idea of what it does.
Copyright (C) 19yy name of author
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
Also add information on how to contact you by electronic and paper mail.
If the program is interactive, make it output a short notice like this when it starts in an interactive mode:
Gnomovision version 69, Copyright (C) 19yy name of author
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details.
The hypothetical commands ‘show w’ and ‘show c’ should show the appropriate parts of the General Public License. Of course, the commands you use may be called something other than ‘show w’ and ‘show c’; they could even be mouse-clicks or menu items—whatever suits your program.
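As an illustration only, the startup notice and the two hypothetical commands might be wired into a C program along the following lines. This is a minimal sketch, not part of the license: the function names print_startup_notice and handle_license_command are invented for this example, the program name and version are taken from the sample notice above, and the handlers merely point the user at the COPYING file instead of printing the relevant sections of the License, which a real program should do.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical program name and version, borrowed from the sample notice. */
#define PROGRAM_NAME    "Gnomovision"
#define PROGRAM_VERSION "69"

/* Print the short notice once, when the program starts interactively. */
void print_startup_notice(FILE *out)
{
    assert(out != NULL);
    fprintf(out,
            "%s version %s, Copyright (C) 19yy name of author\n"
            "%s comes with ABSOLUTELY NO WARRANTY; for details type `show w'.\n"
            "This is free software, and you are welcome to redistribute it\n"
            "under certain conditions; type `show c' for details.\n",
            PROGRAM_NAME, PROGRAM_VERSION, PROGRAM_NAME);
}

/* Handle one line of user input (without its trailing newline).
   Returns 1 if the line was one of the license commands, 0 otherwise.
   Simplification: a real program would print the appropriate parts of
   the General Public License here rather than a pointer to COPYING. */
int handle_license_command(const char *line, FILE *out)
{
    assert(line != NULL && out != NULL);
    if (strcmp(line, "show w") == 0) {
        fputs("See the NO WARRANTY sections of the file COPYING.\n", out);
        return 1;
    }
    if (strcmp(line, "show c") == 0) {
        fputs("See the terms and conditions in the file COPYING.\n", out);
        return 1;
    }
    return 0;
}
```

A program would call print_startup_notice(stdout) once before entering its command loop, and pass each input line to handle_license_command before trying its own commands.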
You should also get your employer (if you work as a programmer) or your school, if any, to sign a “copyright disclaimer” for the program, if necessary. Here is a sample; alter the names:
Yoyodyne, Inc., hereby disclaims all copyright interest in the program `Gnomovision' (which makes passes at compilers) written by James Hacker.
signature of Ty Coon, 1 April 1989
Ty Coon, President of Vice
This General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Library General Public License instead of this License.
[1] GUI is an abbreviation for graphical user interface
[2] The author is also a former “vi” user who has found much happiness and bliss in Emacs
[3] Note that in Emacs parlance a window is not an X window. A frame is an X window. A window is a region within the frame.
[4] M-x means <ALT>-x. If you do not have an <ALT> key, then use <ESC> x instead.
[5] Many individuals refer to Microsoft Windows 95 as Win95. In hacker terminology, a win is something that is good. We do not believe that Microsoft Windows 95 is a good operating system, therefore we call it Lose95
[6] Note that in Emacs lingo a window does not correspond to an X window. It is the frame that corresponds to an X window. A window is merely a region within the frame. And the same Emacs process can actually be responsible for more than one frame
[7] Proposed Federal censorship regulations may prohibit us from giving you information about the possibility of aborting Emacs functions. We would be required to say that this is not an acceptable way of terminating an unwanted function
[8] In the event that the minor number has already grown larger than 90, I guess you can call your prerelease 0.900
[9] The Free Software Foundation and many others however believe that the current policies fall short of this justification and need to be re-evaluated
[10] The laws in France are now changing and they might be completely different by the time you read this book