The CGI Process
The basic principle of the Common Gateway Interface (CGI) is that a Web server passes client request information to CGI programs in system environment variables (and in some cases through standard input or command line arguments), and all standard output of CGI programs is returned to Web clients.
Most CGI programs include the following three stages:
- Parsing CGI input
- Processing the request
- Generating the response
Throughout this topic there are references to conversion modes, which determine how data is presented to a CGI program and how data that is returned by the CGI program is processed by the HTTP Server. To learn more about conversion modes, see CGI data conversions.
Parsing CGI input
When the environment variables have been set by the HTTP Server, it starts the CGI program. (For a complete list of environment variables set by the HTTP Server, see Environment variables set by HTTP Server.) It is then up to the CGI program to find out where to get the information needed to fulfill the request.
The two most common ways a CGI program may be called from an HTML document are:
- By using an HTML form and the request method (environment variable REQUEST_METHOD) POST.
- By using an HTML anchor tag to specify the URL for the CGI program and adding the variables to this URL. This would be interpreted as REQUEST_METHOD=GET.
The CGI script has to perform the following tasks in order to retrieve the necessary information:
- Find out the REQUEST_METHOD used by the client.
- If the REQUEST_METHOD used was the GET method, the CGI program knows that all additional values may be retrieved from the QUERY_STRING environment variable.
- If the REQUEST_METHOD used was POST, the CGI program knows that additional information was passed using STDIN. It then has to query the CONTENT_LENGTH environment variable to know how much information it has to read from STDIN.
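The steps above can be sketched as follows. This is a minimal illustration in Python rather than an ILE language. The function name read_cgi_input is hypothetical, and the sketch assumes the POST body is ASCII-encoded. The environment and input stream are passed as parameters so the function can be tried without a server.

```python
import io
import os
import sys


def read_cgi_input(environ=os.environ, stdin=None):
    """Return the raw, still URL-encoded request data as a string."""
    method = environ.get("REQUEST_METHOD", "").upper()
    if method == "GET":
        # GET: all additional values arrive in QUERY_STRING.
        return environ.get("QUERY_STRING", "")
    if method == "POST":
        # POST: CONTENT_LENGTH says how many bytes to read from STDIN.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        stream = stdin if stdin is not None else sys.stdin.buffer
        return stream.read(length).decode("ascii")
    raise ValueError("unsupported REQUEST_METHOD: " + repr(method))


# Usage with a simulated POST request instead of a real server.
sample = read_cgi_input(
    {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "7"},
    io.BytesIO(b"A=1&B=2"),
)
```

Reading exactly CONTENT_LENGTH bytes, rather than reading until end of file, matters because the server is not required to close STDIN after the request data.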
An example of data read in the QUERY_STRING variable (%%MIXED%% mode):
NAME=Eugene+T%2E+Fox&ADDR=etfox%40ibm.net&INTEREST=RCO
- A plus sign (+) represents a space.
- A percent sign (%) that is followed by the American National Standard Code for Information Interchange (ASCII) hexadecimal equivalent of the symbol represents special characters, such as a period (.) or slash (/).
- An ampersand (&) separates fields, and also separates multiple values that are sent for a single field, such as check boxes.
Parsing breaks the fields at the ampersands and decodes the ASCII hexadecimal characters. The results look like this:
NAME=Eugene T. Fox
ADDR=etfox@ibm.net
INTEREST=RCO
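As an illustration of this parsing step, the following sketch decodes the sample query string with the Python standard library. It is not the QtmhCvtDb() API, and it assumes the data is plain ASCII, so no EBCDIC conversion is involved.

```python
from urllib.parse import parse_qsl

raw = "NAME=Eugene+T%2E+Fox&ADDR=etfox%40ibm.net&INTEREST=RCO"

# parse_qsl splits at "&" and "=", turns "+" into a space and decodes "%xx".
# A list of pairs (not a dict) keeps repeated fields such as check boxes.
fields = parse_qsl(raw, keep_blank_values=True)

for name, value in fields:
    print(f"{name}={value}")
```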
You can use the QtmhCvtDb() API to parse the information into a structure. The CGI program can then refer to the structure fields. If you use %%MIXED%% input mode, the “%xx” encoding values are in ASCII and must be converted into the “%xx” EBCDIC encoding values before calling QtmhCvtDb(). If you use %%EBCDIC%% mode, the server does this conversion for you: it converts the ASCII “%xx” first to the ASCII character and then to the EBCDIC character, and finally re-encodes that EBCDIC character as “%xx” in the EBCDIC CCSID.
The main advantage of using the GET method is that you can access the CGI program with a query without using a form.
The main advantage of the POST method is that the query length can be unlimited, so you do not have to worry about the client or server truncating data. The query string of the GET method cannot exceed 8 KB.
Processing the request
Processing the request is the second stage of a CGI program. In this stage, the program takes the parsed data and performs the appropriate action. For example, a CGI program designed to process an application form might perform the following functions:
- Take the input from the parsing stage
- Convert abbreviations into more meaningful information
- Plug the information into an e-mail template
- Use SNDDST to send the e-mail.
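The processing stage for such a form could be sketched as follows. Everything here is hypothetical: the lookup table, the template wording, and the function name build_mail are placeholders for illustration only. Sending the message (for example with SNDDST) is left out.

```python
from string import Template

# Hypothetical lookup table; the expansion text is a placeholder.
INTERESTS = {"RCO": "interest code RCO (expanded description goes here)"}

# Hypothetical e-mail template with $-placeholders for the parsed fields.
MAIL_TEMPLATE = Template(
    "To: $addr\n"
    "Subject: Your application\n"
    "\n"
    "Dear $name,\n"
    "Thank you for your interest in: $interest\n"
)


def build_mail(fields):
    """Turn parsed form fields into the text of an e-mail message."""
    code = fields["INTEREST"]
    return MAIL_TEMPLATE.substitute(
        name=fields["NAME"],
        addr=fields["ADDR"],
        # Fall back to the raw abbreviation when no expansion is known.
        interest=INTERESTS.get(code, code),
    )
```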
Generating the response
When the CGI program has finished processing, it has to send its result back to the HTTP server that invoked the program. In this way, the output is sent indirectly to the client that initially requested the information.
Because the CGI program issues its result through STDOUT, the HTTP server has to read the information from there and interpret what to do.
A CGI program writes a CGI header, followed by an entity body, to standard output. The CGI header is the information that describes the data in the entity body. The entity body is the data that the server sends to the client. Each header line ends with a newline character, and a line that contains only a newline character always ends the CGI header. The newline character for ILE C is \n. For ILE RPG or ILE COBOL, it is hexadecimal '15'. The following are some examples of Content-Type headers:
Content-Type: text/html\n\n
Content-Type: text/html; charset=iso-8859-2\n\n
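A minimal sketch of how a program might assemble such a response is shown here in Python. The function name build_html_response is hypothetical; the program would write the returned string to standard output.

```python
def build_html_response(body, charset=None):
    """Return the CGI header, the terminating newline, and the entity body."""
    content_type = "text/html"
    if charset:
        content_type += "; charset=" + charset
    # The first "\n" ends the header line; the second ends the CGI header.
    return "Content-Type: " + content_type + "\n\n" + body
```

For example, sys.stdout.write(build_html_response("<html><body>Hello</body></html>")) would send the header and the page in one write.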
If the response is a static document, the CGI program returns either the URL of the document, using the CGI Location header, or a Status header. The CGI program does not have an entity body when using the Location header. If the host name is the local host, HTTP Server retrieves the document that the CGI program specified and sends a copy to the Web client. If the host name is not the local host, HTTP Server processes it as a redirect to the Web client. For example:
Location: http://www.acme.com/products.html\n\n
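A Location response has no entity body, so building one is short. The sketch below is in Python, and the function name build_redirect is hypothetical. The check for line breaks is an added precaution, not something the server requires.

```python
def build_redirect(url):
    """Return a CGI response that consists only of a Location header."""
    if "\n" in url or "\r" in url:
        # A line break would let the URL inject extra header lines.
        raise ValueError("URL must not contain line breaks")
    # No entity body follows the newline that ends the CGI header.
    return "Location: " + url + "\n\n"
```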
A CGI program that returns a Status header should have both a Content-Type and a Status in the CGI header. When Status is in the CGI header, an entity body should be sent with the data to be returned by the server. The entity body data contains information that the CGI program provides to a client for error processing. The Status line consists of the word Status, a 3-digit HTTP status code, and a string of alphanumeric characters (A-Z, a-z, 0-9, and space). The HTTP status code must be a valid 3-digit number from the HTTP/1.1 specification. For example:
Content-Type: text/html\n
Status: 400 Invalid data\n
\n
<html><head><title>Invalid data</title>
</head><body>
<h1>Invalid data typed</h1>
<br><pre>
The data entered must be valid numeric digits for id number
<br></pre>
</body></html>
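The rules for the Status line can be checked before the response is written. The following Python sketch uses the hypothetical function name build_status_response. It only verifies that the code has three digits and that the text uses the allowed characters; it does not check whether the code is actually defined by HTTP/1.1.

```python
import re


def build_status_response(code, reason, body):
    """Return a CGI response with Content-Type and Status headers."""
    if not 100 <= code <= 999:
        raise ValueError("status code must have exactly 3 digits")
    if not re.fullmatch(r"[A-Za-z0-9 ]+", reason):
        raise ValueError("status text allows only A-Z, a-z, 0-9 and space")
    return (
        "Content-Type: text/html\n"
        f"Status: {code} {reason}\n"
        "\n"  # this newline ends the CGI header
        + body
    )
```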