Apache Server Unleashed by Richard Bowen, PDF, EPUB, 0672318083

Apache Server Unleashed by Richard Bowen

  • Print Length: 656 Pages
  • Publisher: Sams
  • Publication Date: March 9, 2000
  • Language: English
  • ISBN-10: 0672318083
  • ISBN-13: 978-0672318085
  • File Format: PDF, EPUB

 

”Preview”
According to Netcraft ( http://www.netcraft.com/ ), the Apache Web server is used
more than all other Web servers combined. Of the approximately 7 million Web sites on
the World Wide Web, about 4 million of them (55 percent) are running Apache. If you
also count server software based on the Apache code, this figure is closer to 60 percent.
In this chapter, you’ll see how Apache came to be and why it has become so popular.
Introducing Apache
PART I
Note
Netcraft has been surveying the Web since July 1995, when it registered 18,957
sites on the Web. The company updates its survey monthly, showing the growth
or decline of each major player, and offers commentary on these trends. You
can see the survey at http://www.netcraft.net/survey/ . Netcraft is an Internet
research company, offering surveys like this one, as well as security consultancy
and various Web and Internet services.
Figure 1.1 shows a graph of the most popular Web servers and how many Web sites are
using those servers.
FIGURE 1.1 Distribution of Web servers in use. (Chart: share of Web sites, on a scale of 0% to 60%, from August 1996 through 1999, for Apache, Microsoft, Netscape, NCSA, and Other.)
Note
If you’re really interested in hunting down the origins of the World Wide Web,
you may want to find a copy of the paper titled “As We May Think,” by
Vannevar Bush. This paper, written in 1945 (no, that’s not a typo), talks about
ways to organize information. His ideas look a lot like hypertext. You can read
this article online at
http://www.theatlantic.com/unbound/flashbks/computer/bushf.htm .
In the Beginning
The Web is still a very young phenomenon. Tim Berners-Lee invented the Web in late
1990 while working at CERN, the European Laboratory for Particle Physics. He developed
it so that physicists working at various universities around the world could have
instantaneous access to information, to enable their collaboration on a variety of projects.
Tim defined URLs, HTTP, and HTML and, with Robert Cailliau, wrote the first Web
server and the first Web client software, which was later dubbed a browser.
Just a few years ago, it would have been necessary to explain what these concepts meant
to all but the most technically aware audience. Now, there are few people (at least in
developed nations) who are unaware of the WWW.
Shortly after Tim’s initial work, a group at the National Center for Supercomputing
Applications (NCSA) at the University of Illinois at Urbana-Champaign (UIUC) developed
the NCSA HTTPd Web server and the NCSA Mosaic graphical Web browser. Mosaic
wasn’t the first graphical Web browser, although it’s almost universally remembered as
such. That honor rightfully belongs to Viola, written by Pei Wei and available before
Mosaic. But Mosaic quickly stole the spotlight (and most users), becoming the most
widely used Web browser sometime in 1993.
NCSA HTTPd was the server most used on the Web for the first several years of its existence.
However, in 1994, Rob McCool, who had developed NCSA HTTPd, left NCSA,
and the project fizzled. There was no longer any central organization collecting fixes,
developing new features, and distributing a functional product.
Since the source code of the server was publicly available, many people using it had
developed their own bug fixes and additional features that they needed for their own
sites. These patches were shared rather haphazardly via Usenet, but there wasn’t a centralized
mechanism for collecting and distributing these patches.
Thus, Apache—like the World Wide Web—was put together largely by volunteers.
Although the demise of the NCSA HTTPd project left developers with a product that
didn’t work very well at the time and no one to complain to, a far superior product
resulted in the long run.
Who’s Responsible?
In February 1995, Brian Behlendorf and Cliff Skolnick put together a mailing list, got
some space on a machine, and got bandwidth donated by HotWired. Brian set up a CVS
(Concurrent Versions System) tree, so that anyone who wanted to could contribute new
Apache’s History and Lore
CHAPTER 1
features and bug fixes. This way, a group of developers could collect their code modifications
in one place and produce a combined product. Starting with NCSA HTTPd 1.3,
they started applying these patches. The first release of this product—named Apache,
because it was “a patchy” server—was version 0.6.2, released in April 1995.
The eight original core members of the Apache Group were Behlendorf, Skolnick, Roy
T. Fielding, Rob Hartill, David Robinson, Randy Terbush, Robert S. Thau, and Andrew
Wilson.
Shortly after the initial release, Thau designed a completely new architecture. Starting
with version 0.8.8 in August 1995, Apache was switched to this new code base.
Netcraft shows Apache passing NCSA as the leading HTTP server sometime in early 1996.
Note
NCSA’s HTTPd project started and stopped a few times over the years and is
currently stopped. As a student-run project, it was really at the mercy of which
current students were interested, and whether there was funding. While it was
active, the NCSA HTTPd project traded expertise and code with the Apache
Group, and there was never really a feeling that they were in competition. They
were just colleagues working toward a common goal.
You can learn more about the NCSA HTTPd project at
http://hoohoo.ncsa.uiuc.edu/ . Although much of the documentation there
hasn’t been updated in several years, it still has some of the best available tutorials
on such subjects as CGI and HTML forms.
Recent Happenings
Suddenly, organizations such as The Wall Street Journal and Forbes are using the term
open source in front-page articles.
This seems a little strange to folks who have been familiar with the concept for a few
decades and are used to it being ignored, or actively snubbed, by people in the commercial
software industry.
In May 1997, Eric Raymond gave a talk, “The Cathedral and the Bazaar,” at the Linux
Kongress in Würzburg, Germany (see http://www.linux-kongress.de/1997/ ). This
started a chain of events, not the least of which was Netscape’s decision to release the
source code for its Web browser. The software world was no longer able to ignore the
“free software” movement, which renamed itself Open Source to shed some of the negative
associations surrounding the movement. Eric was already well known in the free
software movement and had contributed to a substantial number of important software projects,
including GNU Emacs, NetHack, ncurses, and fetchmail. He wrote fetchmail, at
least in part, as research into the mystery of why the Open Source software development
model worked at all, when traditional capitalistic common sense says that it should not.
You can find the full text of his Linux Kongress talk and subsequent talks on his Web site
at http://www.tuxedo.org/~esr/writings/cathedral-bazaar/ .
In June 1998, the Apache Group announced that it was entering an agreement with IBM
for continued development of the Apache server so that IBM could include that code in
its WebSphere product. This was one of the first examples of a major software company
endorsing an existing Open Source project and was one of the linchpins in making the
Open Source movement appear viable to the rest of the software world. The endorsement
and financial support of the world’s largest software company told other companies that
the Open Source movement was not just a bunch of long-haired rebels intent on undermining
the commercial software industry, but that it was a proven method of producing
quality products.
Before the IBM deal, there had been a number of attempts to make Apache work on
Windows, but there were some substantial technical difficulties and very few skilled
Windows programmers interested in the project. With the funding and resources that
came with the IBM agreement, they could make Apache run on Windows and do it well.
Apache on Windows is a great alternative to IIS, particularly for those people already
familiar with Unix but who have to use Windows. The modular approach taken by
Apache is a welcome relief when compared to IIS, which installs an enormous monolithic
application that does everything, including a wide variety of things that you probably
are not interested in it doing.
Apache is lightweight, but any feature you want can be added by loading another module.
Apache is easy to configure and manage and allows you to configure settings that IIS
doesn’t even let you think about. And if you just have to have a graphical configuration
utility, Comanche provides this without taking away any of your power as a server
administrator.
Note
The Apache Group warns that Apache on Windows shouldn’t be considered as
reliable as Apache on Unix and Unix-like platforms (such as Linux), but improvements
are being made. Having a solid, reliable server for Windows is one of the
primary goals for the Apache 2.0 release, expected some time in 2000.
Why Apache Works So Well
Apache is just a fantastic product. It does everything you want it to do, and none of the
stuff that you don’t want it to do. It’s fast, reliable, and inexpensive. What more could
you want from a piece of software?
Apache can be all these things because it is open source. That means that everyone who
uses the product has access to the source code. If you have an idea of something that
would be useful, you can write and submit the code for that feature to the Apache Group
for possible inclusion in the product. This means that features that make it into Apache
are features that real people are actually using on real Web sites, not features that someone
suggested in a marketing meeting after conducting a focus group.
Also, when bugs are found, the many people who have access to the code can determine
what’s breaking and suggest fixes for the problem. (Or, to quote Eric Raymond, “Given
enough eyeballs, all bugs are shallow.”) Hence, bug fixes usually follow closely on the
heels of bug discoveries. Contrast this to closed-source software products where, if you
report a bug, you are at the mercy of someone else’s schedule for a bug fix—if, in fact,
you ever get one at all.
Tip
You can read Apache’s official history on the Apache Web site at
http://www.apache.org/ABOUT_APACHE.html .
Summary
Apache was developed by actual users who needed to fix problems with, and add features
to, the Web server software available in the World Wide Web’s early days. As such,
it’s a server that does things that real Web sites need. Apache and its derivatives are used
on about 60 percent of the Web sites today—more than all other Web servers combined.

HTTP—the Hypertext Transfer Protocol—is the language that Web browsers and Web
servers use to speak to one another. This chapter discusses the component parts of that
language, and what a typical HTTP conversation looks like.
Most of this conversation takes place completely outside your notice. But
it’s very useful to know what’s going on behind the scenes so that you have more insight
into what’s happening when something goes wrong.
The HTTP specification defines the underlying framework on which all Web traffic sits.
URLs, HTML, and other components of using the Web are defined in separate specifications.
They are kept apart so that they can evolve more freely than if they were tied
together in one specification.
You can see all the related Web specifications at the W3C (World Wide Web Consortium)
Web site at http://w3.org/ .
HTTP Headers
Much of the information exchanged between the client and the server is in the form of
HTTP headers. An HTTP header is of the form:
HeaderName: Data
When the client connects to the server, it sends several HTTP headers across the wire,
telling the server who it is and what it wants. The server will send back a number of
response headers, describing the data that’s being returned or explaining why no data is
being returned.
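As a rough illustration of the HeaderName: Data form, the following Python sketch splits header lines into names and values. The function name parse_header_line and the sample lines are made up for this example; real HTTP libraries also handle details (such as repeated headers) that are ignored here.

```python
def parse_header_line(line):
    """Split one 'HeaderName: Data' line into a (name, value) pair."""
    # Only the first colon separates name from data; values such as
    # 'www.mk.net:80' may contain colons of their own.
    name, sep, value = line.partition(":")
    if not sep:
        raise ValueError("not a header line: %r" % line)
    return name.strip(), value.strip()


# Sample header lines, chosen only for illustration.
raw_headers = [
    "Host: www.mk.net:80",
    "Connection: Keep-Alive",
    "Content-Language: en",
]

# Header names are case-insensitive, so store them in lowercase.
headers = {}
for line in raw_headers:
    name, value = parse_header_line(line)
    headers[name.lower()] = value

print(headers["host"])
print(headers["content-language"])
```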
Although users are most interested in the body of the message (the actual Web page or
other resource that they wanted to see), this is the least interesting part of the HTTP conversation.
The HTTP specification defines a large number of headers that can be used. Section 14
of the HTTP/1.1 specification, Header Field Definitions, is 50 pages long. In addition to
these headers, the client and server can make up their own headers if they like.
Note
You can get a copy of the HTTP/1.1 specification at
http://www.ietf.org/rfc/rfc2616.txt . There’s also a copy of this document on the CD-ROM that
accompanies this book.
Table 2.1 shows general HTTP headers, which can be used by either the server or the
client. Headers specific to the client request or to the server response are listed in related
sections below.
TABLE 2.1 General HTTP Headers
Header Syntax Meaning
Cache-Control: directives Different directives are available, depending on
whether this header is being sent by the server or
by the client. See Table 2.3 for directives that can
be used by the client (request) with this header.
See Table 2.5 for directives that can be used by the
server (response) with this header.
Connection: type Specifies the type of connection, such as Keep-Alive or Close .
1 Content-Language: language
Used by either the client or the server to indicate
what (human) language the resource is in. These
are the standard two-letter codes to indicate various
languages. For example, English is represented
as en , German as de , French as fr , and so
on. These codes are used in content negotiation, if
a client requires a document in a particular language.
Example: Content-Language: en
Content-Length: number_of_bytes When data is being sent by either the client or the
server, this header indicates the size in bytes of
that data.
Content-Location: URI Provides a URI (uniform resource identifier)
where the content is available if it’s different from
the requested URI.
Content-MD5: MD5 digest Contains the MD5 digest of the request or
response body.
Content-Range: unit range/content_length Accompanies a partial body and indicates
which part of the full content it represents. In a response, this means that only
part of the content is being returned. (A client asks for part of the content with the
Range request header.) Example: Content-Range: bytes 0-300/2402
HTTP
CHAPTER 2
1 The following Content-* headers would be used by the client when POSTing or PUTting data to
the server. They would be used by the server when returning a document to the client.
Content-Type: type/subtype Indicates the MIME type of the data being passed
in the message body. Example: Content-Type: text/html
Date: date The date and time on which the transaction
occurred. HTTP dates are expressed in GMT.
Example: Date: Thu, 23 Sep 1999 22:58:27 GMT
Expires: date Indicates when the data in the body should be considered
stale. Example: Expires: Sat, 03 Dec 2016 22:13:00 GMT
Last-Modified: date Indicates when the data in the body was last
modified.
Pragma: directive Can be used to include implementation directives.
Example: Pragma: no-cache
Transfer-Encoding: encoding_type Indicates what encoding was performed to transfer
the message across the HTTP connection.
Upgrade: protocol/version Lets the sender of the message suggest to the
recipient that communication would be better handled
in some other protocol. This allows communication
to be initiated in an older protocol, but for
the client and server to negotiate a newer protocol.
Example: Upgrade: HTTP/2.0
Via: server One or more Via headers can be put on a message
to show that it got to its destination through one or
more proxy servers. Example: Via: 1.1
proxy.com (Apache 1.3.7)
Warning: warning-code message Conveys additional information about the request
or response. The defined warning messages are
as follows:
• 110 Response is stale indicates that the
response is stale.
• 111 Revalidation failed indicates that an
attempt to revalidate failed.
• 112 Disconnected operation indicates that
the cache was intentionally disconnected from
the rest of the network.
• 113 Heuristic expiration indicates that
the cache heuristically chose a freshness lifetime of more than
24 hours and the response’s age is greater than 24 hours.
• 199 Miscellaneous warning may include
arbitrary information, which will be passed to
the user.
• 214 Transformation applied indicates
that the cache or proxy applied some change
to the content-encoding.
• 299 Miscellaneous persistent warning may
include arbitrary information, which will be
passed to the user.
Note
MIME (Multipurpose Internet Mail Extensions) is a way of indicating the type of a
document. A MIME type consists of the type and the subtype. The type indicates,
in very broad terms, what type the document is. This can be something
like text , audio , or application . The subtype is much more specific and indicates
exactly what file format the data is encoded with. Subtypes might be
something like html , wav , or msword . Put together, the type and the subtype
very specifically define the file type, such as text/html , audio/wav , or
application/msword .
The MIME type tells the Web client (browser) what to do with the document
that it’s receiving. A browser knows, for example, that when it receives a document
of type text/html , it should format it and display it in the browser window.
When it receives a document of type audio/mpeg (MP3 audio), however, it may launch
an external program to play the audio content. See Chapter 7, “MIME Types,”
for more information.
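To make the type/subtype idea concrete, here is a small Python sketch that uses the standard mimetypes module to guess a MIME type from a filename and then splits it into its two parts. The filenames and the helper split_mime are invented for this example. Servers and browsers use their own mapping tables, so treat this as an illustration only.

```python
import mimetypes


def split_mime(mime_type):
    """Split 'type/subtype' into a (type, subtype) pair."""
    major, _, minor = mime_type.partition("/")
    return major, minor


# Hypothetical filenames; the mapping comes from Python's built-in table.
for filename in ("index.html", "logo.gif", "notes.txt"):
    guessed, _encoding = mimetypes.guess_type(filename)
    major, minor = split_mime(guessed)
    print("%s -> %s (type=%s, subtype=%s)" % (filename, guessed, major, minor))
```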
The HTTP Conversation
Each HTTP transaction is handled as a separate conversation, without memory of previous
conversations. For this reason, we say that HTTP is stateless: it doesn’t remember
the state that it was in at the end of the last conversation.
The HTTP conversation consists of several parts, each of which is covered in a separate
section of this chapter. The structure of the conversation is as follows:
• Client request The client (usually a Web browser) initiates the conversation by
connecting to the server and requesting a URI.
Note
Throughout this book, you may see URI and URL used somewhat interchangeably.
Although this practice is a little sloppy, it’s pretty common. URL (uniform
resource locator) is a subset of URI (uniform resource identifier). However, at
this time, it’s the only subset in widespread use (the other, URN, is rarely seen),
so the terms really are fairly interchangeable.
• Request headers In addition to the request, the client will send some additional
headers.
• Request body The request body can contain additional data.
• Server status As the first part of the response, the server returns a status code,
indicating whether the request was successful and, if not, what went wrong.
• Response headers The server can then return any number of response headers.
• Requested data If the request was successful, the requested data will then be
returned to the client.
• Disconnect The conversation is now over, so the server will disconnect from the
client and wait for another request. A possible exception to this is if Keep-Alive is
enabled, in which case the connection will stay open for the next request from the
same client.
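The server's half of this conversation can be pictured with a short Python sketch that takes a canned response and separates it into the status, the response headers, and the returned data. The response bytes below are invented for illustration, and split_response is a hypothetical helper, not part of any library.

```python
# A made-up response, as it might arrive over the wire.
RAW_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 13\r\n"
    b"Connection: close\r\n"
    b"\r\n"
    b"<p>Hello</p>\n"
)


def split_response(raw):
    """Return (status_code, reason, headers, body) from raw response bytes."""
    # A blank line separates the status line and headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers, body


status, reason, headers, body = split_response(RAW_RESPONSE)
print("Server status:", status, reason)
print("Response headers:", headers)
print("Requested data:", body)
```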
Client Request
The client (Web browser or other HTTP client) initiates the connection to the server and
makes a request. This request consists of three parts: the method, the resource being
requested, and the HTTP version number. The method is usually GET , POST , or HEAD .
Although other methods are permitted by the HTTP specification, most are seldom used.
The HTTP/1.1 specification defines the request methods in Table 2.2.
TABLE 2.2 HTTP Request Methods
Method Meaning
OPTIONS A request for information about the communication options available for
the specified URI.
GET Requests a document from the server.
HEAD Like GET , except only the headers are returned.
POST Sends data to some handler indicated by the URI.
PUT Requests that the data in the body section be stored at the specified URI.
DELETE Requests that the specified resource be deleted.
TRACE For debugging purposes; lets the client see what’s being received on the
other end.
CONNECT Reserved for use with a proxy that can switch to acting as a tunnel (for
example, for SSL).
The following sections cover just the three most commonly used methods. For more
information on the other methods, consult the HTTP specification in RFC2616, which
you can obtain from the W3C Web site at http://w3.org/ . The document is also located
at http://www.ietf.org/rfc/rfc2616.txt , and is on the CD-ROM that accompanies
this book.
GET
The GET method requests a particular URI from the server. That URI can be a document,
such as an HTML document, a GIF image, or an MP3 audio file, or it can be a process,
such as a CGI program, that produces output to be displayed by the client.
A GET request will look something like this:
GET /fish/salmon.html HTTP/1.0
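The three parts of that request line (method, resource, and HTTP version) can be pulled apart with a few lines of Python. This is only a sketch; parse_request_line is a name invented for this example, and a real server validates much more than this.

```python
def parse_request_line(line):
    """Split a request line into its method, URI, and HTTP version."""
    parts = line.split()
    if len(parts) != 3:
        raise ValueError("malformed request line: %r" % line)
    method, uri, version = parts
    return {"method": method, "uri": uri, "version": version}


request = parse_request_line("GET /fish/salmon.html HTTP/1.0")
print(request["method"], request["uri"], request["version"])
```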
The actual file location of the URI is determined by the server. This determination can be
made in several ways. The server checks for Alias directives that match the requested
URI. It also checks the location obtained by appending the URI to the server’s
DocumentRoot . Other HTTP servers have other methods of determining what’s to be
returned to the client. The resource isn’t necessarily a file but might be a dynamically
generated document. If the URI refers to an executable program and the server is configured
to consider it a CGI program, it will execute it and return the results to the client.
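The lookup described above can be modeled, very roughly, in Python. This sketch is not Apache's implementation: the function resolve_uri, the directory paths, and the rule that the first matching alias wins are all assumptions made for illustration, and the code does none of the safety checks (such as rejecting ".." segments) that a real server needs.

```python
import posixpath


def resolve_uri(uri, document_root, aliases):
    """Map a request URI to a filesystem path (simplified model)."""
    # Check alias prefixes first, in the order given (assumed behavior).
    for prefix, target in aliases:
        if uri.startswith(prefix):
            return posixpath.join(target, uri[len(prefix):].lstrip("/"))
    # Otherwise append the URI to the document root.
    return posixpath.join(document_root, uri.lstrip("/"))


# Hypothetical configuration values.
DOCUMENT_ROOT = "/usr/local/apache/htdocs"
ALIASES = [("/icons/", "/usr/local/apache/icons/")]

print(resolve_uri("/fish/salmon.html", DOCUMENT_ROOT, ALIASES))
print(resolve_uri("/icons/apache.gif", DOCUMENT_ROOT, ALIASES))
```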
Note
Apache can be configured with other handlers for different types of URIs. See
Chapter 14, “Handlers,” for more information.
A GET request can be made conditional with any of the If-* request headers listed later
in Table 2.3. Table 2.3 also describes the Range request header, which you can use to
make a partial GET request.
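As one example of a conditional GET, the sketch below shows how a server might choose between a 200 and a 304 status when the client sends an If-Modified-Since header. The function conditional_get_status and the dates are invented for illustration; a real server also handles malformed dates and the other If-* headers.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_get_status(if_modified_since, last_modified):
    """Return 304 if the document is unchanged since the given date, else 200."""
    since = parsedate_to_datetime(if_modified_since)
    if last_modified <= since:
        return 304  # Not Modified: the client can use its cached copy
    return 200      # OK: send the full document


header_value = "Thu, 23 Sep 1999 22:58:27 GMT"
older_doc = datetime(1999, 9, 20, 12, 0, 0, tzinfo=timezone.utc)
newer_doc = datetime(1999, 9, 24, 8, 30, 0, tzinfo=timezone.utc)

print(conditional_get_status(header_value, older_doc))
print(conditional_get_status(header_value, newer_doc))
```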
HEAD
A HEAD request is similar to a GET request, except that the server should return only the
headers that it would have returned for a GET request but not return the data portion.
Using HEAD requests is useful for determining if the document has been modified since
the last time it was requested. If not, the client can conserve time and bandwidth by
using a local cached copy.
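The difference between HEAD and GET can be seen with a small self-contained Python experiment. It starts a throwaway server on the local machine (using the standard http.server module, not Apache) and sends it both kinds of request. The handler, page content, and URI are made up for this sketch.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body>salmon</body></html>"


class DemoHandler(BaseHTTPRequestHandler):
    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()

    def do_HEAD(self):
        self._send_headers()      # headers only, no body

    def do_GET(self):
        self._send_headers()
        self.wfile.write(PAGE)    # headers followed by the body

    def log_message(self, *args):
        pass                      # keep the demo quiet


def fetch(method, port):
    """Send one request and return (status, Content-Length, body)."""
    conn = http.client.HTTPConnection("127.0.0.1", port)
    conn.request(method, "/fish/salmon.html")
    response = conn.getresponse()
    body = response.read()
    length = response.getheader("Content-Length")
    conn.close()
    return response.status, length, body


server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

head_result = fetch("HEAD", port)
get_result = fetch("GET", port)
server.shutdown()
server.server_close()

print("HEAD:", head_result)
print("GET: ", get_result)
```

Both responses carry the same headers, but only the GET response includes the document itself.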
POST
For a POST request, the data contained in the body of the request should be sent to the
specified URI. The URI should refer to a handler that can process the data in some fashion.
This might be a CGI program.
Request Headers
After the request, the client can send additional headers to the server, providing additional
information about itself or about the request. For example, a typical HTTP request,
with the additional request headers, might look something like the following:
GET /index.html HTTP/1.0
Connection: Keep-Alive
User-Agent: Mozilla/4.5 (WinNT; U)
Host: www.rcbowen.com
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*
Apache can limit the number of headers to be accepted from the client with the
LimitRequestFields configuration directive. By default, this is set to 100 .
See Chapter 5, “Server Configuration Files,” for more information on LimitRequestFields .
The HTTP specification at http://www.ietf.org/rfc/rfc2616.txt lists the various
headers and their meanings. Table 2.3 lists the defined headers that can be sent with a
request, in addition to the general headers listed in Table 2.1.
Introducing Apache
P ART I
22
TABLE 2.3 Request Headers
Header/Syntax Meaning
Accept: type/subtype, type/subtype Lists the document types that the client prefers to
receive.
Accept-Charset: charset Indicates the acceptable character set(s) in a
response. Example: Accept-Charset: iso-8859-5
Accept-Encoding: encoding-type Indicates the acceptable encoding type(s) in a
response. Example: Accept-Encoding: gzip
Accept-Language: language Indicates the acceptable language(s) in a response.
Example: Accept-Language: en, de
Authorization: credentials Permits the client to pass authentication credentials
to the server, in order to enter a protected
area. See Chapter 16, “Authentication,” for more
details.
Cache-Control: directives Different directives are available according to
whether this header is being sent by the server or
by the client. See Table 2.5 for directives that can
be used by the server (response) with this header.
The directives that can be used in the request
are as follows:
• no-cache : The request must not be satisfied from
a cache without first revalidating with the origin server.
• no-store : The cache must not store any part
of the response. Useful for protecting
sensitive data.
• max-age = seconds : The client isn’t willing
to accept a response that’s older than the specified
number of seconds.
• max-stale = seconds : The client is willing
to accept a cached response that has exceeded
its expiration date by a maximum of the specified
number of seconds. If no number of seconds
is specified, the client is willing to
accept a stale response of any age.
• min-fresh = seconds : The client will accept
a response that will still be fresh the specified
number of seconds into the future. That is, the
data’s expiration date is later than the current
time plus the specified number of seconds.
• no-transform : Some proxy servers, in order
to save space or for whatever other reason,
occasionally convert data from one format to
another. For example, they might convert PCX
image files to the less wasteful JPEG format
to save cache space. This directive indicates
that the client isn’t willing to accept data that
has been converted to another format and is
willing to accept it only in its original form.
• only-if-cached : The client is willing to
accept data only if it comes from a cache.
This may be used for reasons of poor network
connectivity, for example.
Expect: expectation Indicates that the client is expecting a particular
behavior by the server. Example: Expect: 100-continue
From: email_address Indicates the email address of the user operating
the browser. This isn’t sent without the user’s
approval, and so is almost never actually sent in
practice. Example: From: rbowen@rcbowen.com
Host: hostname:port The hostname and (optionally) the port number of
the host from which the URI is being requested.
This is the header that allows name-based virtual
hosts to work, because it lets a server know to
which virtual host the request was directed.
Example: Host: www.mk.net:80
If-Match: search_string(s) Makes the request conditional. The server should
return the requested document only if the search
string matches the value of the ETag response
header field.
If-Modified-Since: date Makes a request conditional. The server should
return a 304 (not modified) status if the document
hasn’t been modified since the specified
date. Example: If-Modified-Since: Thu, 23
Sep 1999 22:58:27 GMT
If-None-Match: search_string(s) Makes the request conditional. The server should
return the requested document only if the search
string doesn’t match the value of the ETag
response header field.
If-Range: date Makes a Range request conditional. It means
that if the document hasn’t been changed since the
specified date, send just the requested range;
otherwise, send the whole document.
If-Unmodified-Since: date Makes the request conditional. The server should
return the document if it hasn’t been modified
since the specified date. Example:
If-Unmodified-Since: Thu, 23 Sep 1999 22:58:27 GMT
Max-Forwards: number Limits the number of times the request will be
forwarded. The proxy server should decrement
the number before forwarding the request and, if
the number reaches 0, it must respond as the final
recipient. Example: Max-Forwards: 5
Proxy-Authorization: credentials Allows the client to pass authentication credentials
to a proxy that requires them.
Range: bytes=range Allows the client to request just a portion of the
document. Example: Range: bytes=0-500
Referer: URL Indicates the URL of the document from which
the link to the current document was taken. That
document is called the referrer. The header, however,
is spelled Referer . Example: Referer:
http://www.mk.net/index.html
TE: transfer_codings Indicates what transfer codings the client is willing
to receive.
User-Agent: agent_name/version Indicates what user agent (browser or Web client)
software is requesting the document. Example:
User-Agent: Mozilla/4.5 (WinNT; U)
Request Body
In addition to the request itself and the request headers, the client might send additional
data to the server in the request body. This is generally used for sending data to a CGI
process with a POST request, but it might be used for a number of other purposes, such as
publishing a document to the server with a PUT request.
The end of the headers is indicated by a single blank line; everything after this blank line
is considered to be the body of the request.
When there is data in the body, the client must send information about that content in the
headers, with such headers as Content-Type and Content-Length , as described earlier.
This content is passed on to the handling process over standard input ( STDIN ).
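Putting these pieces together, the following Python sketch builds a POST request with a form-encoded body, sets Content-Type and Content-Length to describe it, and separates headers from body with a blank line. The URI /cgi-bin/order.cgi and the form fields are hypothetical, and build_post_request is a helper written only for this example.

```python
from urllib.parse import urlencode


def build_post_request(uri, host, form_fields):
    """Return the raw bytes of a simple form-encoded POST request."""
    body = urlencode(form_fields).encode("ascii")
    head = (
        "POST %s HTTP/1.0\r\n"
        "Host: %s\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        "Content-Length: %d\r\n"
        "\r\n"  # the blank line that ends the headers
    ) % (uri, host, len(body))
    return head.encode("ascii") + body


request = build_post_request(
    "/cgi-bin/order.cgi",            # hypothetical CGI program
    "www.rcbowen.com",
    {"fish": "salmon", "qty": "2"},
)

# Everything after the first blank line is the request body.
head, _, body = request.partition(b"\r\n\r\n")
print(head.decode("ascii"))
print()
print(body.decode("ascii"))
```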
Server Status Codes
Having received the full request from the client, the server will first return a status code
and response headers before sending the actual response.
The messages are in five groups, each representing a different type of status condition:
• 100-series messages are informational.
• 200-series messages indicate that a client request was completed successfully.
• 300-series messages indicate that the request was redirected for some reason.
• 400-series messages indicate that there was an error on the client end.
• 500-series messages indicate that there was an error on the server end.
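Because the group is determined by the first digit of the code, a client can classify any status with simple integer arithmetic. The short Python sketch below shows one way to do that; the function name and the group labels are choices made for this example.

```python
STATUS_GROUPS = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_group(code):
    """Return the name of the group that a status code belongs to."""
    group = STATUS_GROUPS.get(code // 100)
    if group is None:
        raise ValueError("not an HTTP status code: %r" % code)
    return group


for code in (100, 200, 304, 404, 500):
    print(code, status_group(code))
```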
TABLE 2.4 Server Status Codes
Code Message Meaning
100 Continue The client may continue with its request.
101 Switching Protocols The server is switching to another protocol, as
requested by the client by way of an Upgrade
header.
200 OK The client request was successful, and the server
returned the requested information.
201 Created A new URI was created. A Location header will
be returned by the server, indicating the location of
that new URI.
202 Accepted The request was accepted but not actually acted on.
The server may or may not act on the request at a
later time, and the body of the response may contain
additional information.
203 Non-Authoritative Information The information isn’t from the original server but
from a local cache or a third-party copy.
204 No Content The response body contains no content. The
browser shouldn’t attempt to repaint its page view.
This response can be returned from a CGI process
that doesn’t want to have the client move off the
current page, for example.
205 Reset Content The browser should clear all content from the
HTML form contained on the page.
206 Partial Content The server is returning a partial response. This can
be used in response to a Range header, which
requests only a portion of the page.
300 Multiple Choices The requested URI might be ambiguous and could
refer to any one of several pages. This may be
used, for example, if a page is available in several
different languages.
301 Moved Permanently The URI is no longer available on this server. The
new location for the document is provided in a
Location header. All future requests should be
made to the new location.
302 Found The document has moved temporarily. (HTTP/1.0 called this
status Moved Temporarily.) The new location is
indicated with a Location header. However, future
requests should still be made to the old URI.
303 See Other The requested URI can be found at another URI,
which is indicated by the Location header.
304 Not Modified This will be returned only if the client passed an
If-Modified-Since header and the document hasn’t
been modified since that time. The client should
use whatever copy it has cached locally. This can
be used by a proxy to determine whether to serve a
cached copy or get the copy from the server.
305 Use Proxy The requested document should be accessed
through a proxy. The location of the proxy is
returned in a Location header.
306 Unused The 306 code was used in a previous version of the
specification but is no longer used.
307 Temporary Redirect The requested document is temporarily under a different
URI. If the request used a method other than GET or
HEAD, the user must confirm the redirect.
400 Bad Request There was a syntax error in the client request.
401 Unauthorized The client didn’t provide the correct authentication
to access the requested document. This response
code triggers the password dialog on most
browsers.
402 Payment Required This status code shows that the authors of HTTP
were either thinking ahead or had a sense of
humor. This code isn’t actually used by any servers
at this time.
403 Forbidden The client isn’t permitted to have the URI that it
requested.
404 Not Found Perhaps the most common error status code that
you will encounter on the Web. It indicates that the
document requested isn’t available. Either it has
moved or the client simply requested a document
that doesn’t exist.
405 Method Not Allowed The method used by the client isn’t permitted for
that particular URI.
406 Not Acceptable The URI exists but isn’t available in the format
requested by the client. This usually occurs when
the client asks for a document in a particular lan-
guage or encoding method.
407 Proxy Authentication Required This message is returned by a proxy server,
indicating that it needs to authorize the request
before passing it on to the destination server.
408 Request Time-out The client didn’t complete the request within a
specified time, and the server is terminating the
connection.
409 Conflict The request conflicts in some way with the server
configuration or with another request.
410 Gone The URI has been permanently removed and has
left no forwarding address.
411 Length Required The request didn’t provide a Content-Length
header, and one is needed.
412 Precondition Failed A condition specified in one of the If-* headers
was false.
413 Request Entity Too Large The body of the request was larger than the
server was configured to permit.
414 Request-URI Too Long The request URI was longer than the server is con-
figured to permit.
415 Unsupported Media Type The body of the request was of a media type that
the server doesn’t know how to handle.
416 Requested Range Not Satisfiable The range requested is out of range for the
resource requested. For example, the range
started after the end of the file being requested.
417 Expectation Failed An expectation given in the Expect header wasn’t
met.
500 Internal Server Error This catch-all error message indicates that some-
thing on the server (usually, a CGI program) has
failed.
501 Not Implemented The server doesn’t support the functionality needed
to fulfill the request (for example, it doesn’t
recognize the request method).
502 Bad Gateway The server, while trying to act as a gateway or
proxy, received an invalid response from another
server further up the chain.
503 Service Unavailable The server isn’t available, due to overloading or
maintenance. The server may indicate the expected
length of the delay in a Retry-After response
header.
504 Gateway Timeout The server, acting as a proxy, didn’t receive a
response from the next server up the chain before
the timeout period expired.
505 HTTP Version Not Supported The HTTP version number specified by the client
isn’t supported by the server.
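If you'd rather look these codes up programmatically than in a table, Python's standard library carries its own list of them. The sketch below assumes Python 3.5 or later, where `http.HTTPStatus` is available; the helper name `describe` is made up for this example. Phrases for a few codes in current Python releases follow newer specifications and may differ slightly from the table above.

```python
from http import HTTPStatus


def describe(code: int) -> str:
    """Return 'code phrase' using the standard library's list of status codes."""
    status = HTTPStatus(code)  # raises ValueError for a code the library doesn't know
    return f"{status.value} {status.phrase}"


for code in (200, 404, 503):
    print(describe(code))
```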
Response Headers
Following the status code come one or more response headers. Table 2.5 shows the
possible response headers that can be used, in addition to the headers listed in Table 2.1.
TABLE 2.5 Server Response Headers
Header Syntax Meaning
Accept-Ranges: bytes_or_none Informs the client whether the server is willing to
send partial document ranges.
Age: seconds Indicates the age, in seconds, of the response.
This implies that the response isn’t being served
firsthand but is being served from cache.
Cache-Control: directives Different directives are available, depending on
whether this header is being sent by the server or
by the client. See Table 2.3 for directives that can
be used by the client (request) with this header.
The server (response) directives are as follows:
• public —The information is public and so may
be stored in any cache.
• private —The information is intended for a
single user and may not be stored in a shared
cache. This specifies only where the content may
be cached and doesn’t guarantee any kind of data
privacy.
• no-cache —Don’t reuse a cached copy of this
response without first revalidating it with the
server.
• no-store —Don’t store any part of this
response. This is meant to protect sensitive
material but shouldn’t be considered a guarantee
of data security.
• no-transform —Instructs caches and proxies not
to transform the data being sent (for example, by
changing its content encoding or media type).
• must-revalidate —The cache must revalidate a
stale copy with the server before using it.
• proxy-revalidate —Similar to the must-
revalidate directive but applies only to shared
caches.
• max-age —The maximum age that this data
should be allowed to attain before it’s removed
from the cache.
• s-maxage —Similar to the max-age directive but
applicable to a shared public cache.
ETag: etag value Provides the current value of the entity tag of the
requested variant.
Location: URI Redirects the client to a new location. Example:
Location: http://www.mk.net/
Proxy-Authenticate: challenge Included as part of a 407 ( Proxy Authentication
Required ) response.
Retry-After: date or seconds Can be used with a 503 ( Service Unavailable )
response to indicate when the service will again
be available.
Server: software version comment Indicates the server software that’s serving the
request, the version number, and any other com-
ment about that software. Example: Apache/1.3.9
(Unix) mod_perl/1.21
WWW-Authenticate: challenge Must be included with a 401 ( Unauthorized )
response. See Chapter 16 for more details.
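As an illustration of the Cache-Control entry in Table 2.5, here is a minimal Python sketch that splits a header value into its directives. The function name `parse_cache_control` is hypothetical. The parser is deliberately naive: it splits on commas, so it would mishandle a quoted argument that itself contains a comma.

```python
def parse_cache_control(value: str) -> dict:
    """Split a Cache-Control header value into a {directive: argument} dict.

    Directives without an argument (such as no-store) map to None.
    """
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        # partition() leaves sep empty when the directive has no "=argument"
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


print(parse_cache_control("public, max-age=3600, must-revalidate"))
```

Directive names are lowercased because they are case-insensitive; arguments are kept as strings, so a caller that needs the max-age in seconds would convert it with `int()`.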
Requested Data
The end of the response headers is indicated by a single blank line. Everything following
this is the response body.
The returned data can be the contents of a file or the response from a CGI process.
Although this is what the user is actually interested in, this is the least interesting part of
the entire transaction.
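The blank-line rule makes a response easy to take apart. The following Python sketch splits a raw response into its status line, headers, and body. The sample response is invented for illustration, and `split_response` is a hypothetical helper that ignores details such as repeated or folded headers.

```python
def split_response(raw: bytes):
    """Split a raw HTTP response into (status_line, headers, body)."""
    # The first blank line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


# A made-up response, used only to exercise the function.
raw = (b"HTTP/1.1 200 OK\r\n"
       b"Server: Apache/1.3.9 (Unix)\r\n"
       b"Content-Type: text/html\r\n"
       b"\r\n"
       b"<html>Hello</html>")
status_line, headers, body = split_response(raw)
print(status_line)
print(headers)
print(body)
```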
Disconnect or Keep-Alive
At this point, the HTTP conversation is complete. The data has been sent to the user.
The server will either terminate the connection or, if it’s a Keep-Alive connection, it will
hold it open until it receives another request over the connection or until the connection
times out, whichever happens first.
An Example HTTP Conversation
To see what an HTTP conversation looks like, it is useful to try it yourself. You don’t
have to have a Web browser to connect to a Web server—a simple Telnet client will do.
At your command prompt (shell or DOS, as appropriate), enter
telnet www.apacheunleashed.com 80
If you’re using Windows, you’ll probably have a new window launched. If on a Unix
machine, you’ll see something like
Trying 204.146.167.214…
Connected to www.rcbowen.com.
Escape character is '^]'.
In either case, you’ll see nothing more—just a cursor waiting for input. You’ve connected
to a Web server as an ordinary Web client. The server is waiting for a request. At the
prompt, you can type an HTTP request:
GET /index.html
Note
Remember that case matters— GET must be uppercase.
Once you press Enter or Return, you’ll see some HTML scroll past. This is either the
page you requested or a page telling you that something was wrong with your request.
In either case, you have completed the simplest HTTP conversation with an Apache Web
server. Experiment with different requests and headers to see what sort of responses you
get.
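The same conversation can be scripted instead of typed. The Python sketch below mirrors the Telnet session using the standard `socket` module: it connects, sends the bare request line, and reads until the server closes the connection. The function names `build_simple_request` and `fetch` are made up for this example, and the host you pass in is up to you. Some servers reject a request this minimal and answer with an error page instead.

```python
import socket


def build_simple_request(path: str) -> bytes:
    """Build the bare request line used in the Telnet session."""
    # The method must be uppercase; CRLF plays the role of pressing Enter.
    return f"GET {path}\r\n".encode("ascii")


def fetch(host: str, path: str = "/index.html", port: int = 80,
          timeout: float = 10.0) -> bytes:
    """Connect, send one request line, and read until the server disconnects."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_simple_request(path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # an empty read means the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


# Example call (needs network access), with a host of your choosing:
# print(fetch("www.apacheunleashed.com").decode("iso-8859-1"))
```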
Summary
HTTP, the Hypertext Transfer Protocol, is the language that the client and server use to
communicate with one another and is the basis for traffic on the Web. The HTTP conver-
sation consists of a request, headers, and possibly a body being sent from the client, and
a status, headers, and a body being returned by the server.
For more details about HTTP, see http://www.ietf.org/rfc/rfc2616.txt . A copy of
this document is also provided on the CD-ROM that accompanies this book.

 

Apache Server Unleashed is designed for both the Apache Web developer and system administrator. Apache Server Unleashed teaches you how to extend the base server through CGI scripts and modules, with extensive advanced coverage on modules. This book teaches the system administrator how to fine-tune the server for specific traffic use. Some topics include how to start, stop and restart the server in order to update, retune, and address any general or disaster recovery issues. Learn underlying architecture of Apache Server, specifically security and authentication, as well as newer topics such as running Apache on Windows NT.

