httpd.rewrite
Contents
- 0.1.0 Location
- 0.2.0 Function
- 0.3.0 Case study
- 0.4.0 URI
- 0.5.0 /sys/lib/httpd.rewrite
- 0.6.0 Httpd root for real and virtual hosts
- 0.7.0 /usr/bob/www/etc/rewrite
- 1.0.0 Format of /sys/lib/httpd.rewrite
- 1.1.0 General rule
- 1.2.0 Meta symbols
- 1.3.0 Format of the first field
- 1.4.0 Format of second field
2003/02/11 Update
Location
/sys/lib/httpd.rewrite
Function
This file has two roles:
- specify the httpd root of the real host and of each virtual host
- define URI redirection
Case study
In the explanation below, we assume that our domain name is:
pegasus.goodwill.com
and that the following users appear:
alice bob carol david emily frank
alice
is a regular user who wants to have her web page on pegasus, that is, she wants to have the following URI:
http://pegasus.goodwill.com/~alice
bob
is the owner of the documents on the real host. The URI is:
http://pegasus.goodwill.com
He thinks https should be used for access to http://pegasus.goodwill.com/private.
carol
hopes to have the virtual host car.goodwill.com on this server and wishes to use the host for sales. She plans to use two protocols, http and https, and wants a separate directory for each protocol.
david
is the system administrator of goodwill.com.
emily
was a user of this server but is no longer here. david wants to tell clients who access her page that she has been removed from the server.
frank
was a user of this server and now has his home page at http://www.eecs.harvard.edu/~frank. He wants to redirect clients to his new address.
URI
First of all, I would like to explain the URI (URL) briefly. According to RFC2068, the URI of HTTP is defined as follows:
scheme://host[:port][/path][;params][?query]
[ ]
is a meta symbol denoting that what is inside it can be omitted.
scheme
is http or https.
host
is the domain name of the server. The name must be registered in DNS.
port
is a decimal number. If it is omitted, 80 is assumed for http and 443 is assumed for https.
path
is the relative path from the document root.
params
are defined in HTTP but are not used by major web servers such as Apache. Pegasus uses params as arguments for CGI. On the other hand, query is passed to CGI as an environment parameter. Here you don't need to know the format of params and query, except that they must not contain space characters.
In the newer format, the URI is:
scheme://host[:port][/path][?query]
The "[;params]" part has disappeared, but ';' remains a special character. Its semantics are unclear.
The old format works in regular situations, but it might cause a problem if the URI is in a directory that requires authentication. For example, Mac/Safari fails to authenticate. Therefore it is wise not to use ';'.
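As a rough illustration of the RFC2068-style split described above, the following Python sketch uses the standard library function urllib.parse.urlparse. It is not how Pegasus itself parses a URI. The function name split_uri and the dictionary keys are chosen here for illustration only.

```python
from urllib.parse import urlparse

# Default ports named in the text: 80 for http, 443 for https.
DEFAULT_PORTS = {"http": 80, "https": 443}

def split_uri(uri):
    """Split scheme://host[:port][/path][;params][?query] into its parts."""
    parts = urlparse(uri)
    if parts.scheme not in DEFAULT_PORTS:
        raise ValueError("scheme must be http or https: %r" % uri)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        # An omitted port falls back to the default for the scheme.
        "port": parts.port or DEFAULT_PORTS[parts.scheme],
        "path": parts.path,
        "params": parts.params,   # the text after ';' in the last path segment
        "query": parts.query,     # the text after '?'
    }

if __name__ == "__main__":
    print(split_uri("http://pegasus.goodwill.com:8080/bin/cgi;a;b?x=1"))
```

Note that urlparse only recognizes ';params' on the last path segment, which is enough for the format shown here.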
A user's documents are conventionally expressed using a path that begins with "~user" in the URI, where "user" is a user name on the system. That is, "~user" is expressed as a directory under the document root of the real host. However, an implementation need not keep "~user" under the document root. Rather, it is more convenient to keep the user's documents in the user's own directory.
/sys/lib/httpd.rewrite
The following is an example of "/sys/lib/httpd.rewrite" for our case study.
    # pattern       replacement
    http://car      */usr/carol/www
    https://car     */usr/carol/https
    /~frank         http://www.eecs.harvard.edu/~frank
    /~emily         */usr/david/www_removed
    /private        https://pegasus.goodwill.com/private
    /               */usr/bob/www
Each line in this file, except comment lines, consists of two fields. Fields are separated by spaces.
The first field is a pattern of the request. The request is limited to http and https.
host
is the name of the real host or of a virtual host. Note that host is not the full domain name.
Note that we are assuming that the domain name of carol's page is
car.goodwill.com
and that the server is
pegasus.goodwill.com
Then the trailing ".goodwill.com" must be removed from the first field.
If carol's page has another domain name, say car.bar.com, then the line
http://car.bar.com */usr/carol/www
must be written in httpd.rewrite.
If the first field begins with / then the request is regarded as a request to the real host or to a user. That is, we regard scheme://host[:port] as omitted, where host is the name of the real host.
In the second field:
- if it begins with '*', the httpd root is specified
- if it begins with scheme, the client is redirected to this URI
Note that a second field that begins with '/' is not described here. I would like to examine the real needs before I specify such a field.
If the requested URI matches the first field, then we have two cases:
- if the second field begins with '*', the portion of the path beyond the match is passed to the name space specified by the second field.
- if the second field does not begin with '*', the portion of the path beyond the match is appended to the second field and the client is redirected to the result.
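The two cases above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the Pegasus implementation: the function name apply_rule and the returned tuples are hypothetical, the pattern is assumed to be a "/path" pattern that the caller has already found to match, and how the extra portion is resolved inside the httpd root is left open.

```python
def apply_rule(pattern, replacement, path):
    """Apply one rewrite line to a request path that already matches it.

    pattern     -- first field, here a "/path" pattern
    replacement -- second field
    path        -- path part of the requested URI
    """
    # The portion of the request path beyond the matched pattern.
    extra = path if pattern == "/" else path[len(pattern):]
    if replacement.startswith("*"):
        # httpd root: the extra portion is handed to that name space.
        return ("root", replacement[1:], extra)
    # Redirection: the extra portion is appended to the new URI.
    return ("redirect", replacement + extra)

if __name__ == "__main__":
    print(apply_rule("/~frank", "http://www.eecs.harvard.edu/~frank",
                     "/~frank/pub/a.html"))
    print(apply_rule("/", "*/usr/bob/www", "/index.html"))
```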
The lines are classified as follows:
- lines beginning with http or https are for the real host or virtual hosts
- lines beginning with "/~" are for users
- lines beginning with "/" and not "/~" are for the real host
Matching proceeds as follows:
- first, lines beginning with scheme are compared with the request. If no line matches, then lines beginning with "/" are examined.
- comparison is done from the top of the file downward.
- comparison stops at the first line that matches.
- a "/path" that begins with "/~" does not match "/"
Therefore a line with "/" as the first field must be placed at the end.
If no line in "/sys/lib/httpd.rewrite" matches the URI, then Pegasus examines the possibility that the request is addressed to a user.
Httpd root for real and virtual hosts
Every document owner is permitted to have access control files and CGI files. Every system user who publishes web content has an httpd root (we refer to it as "$web") and can have "doc", "etc" and "bin" under that directory: "doc" is where documents are located, "etc" is where access control files such as "rewrite", "allow", "passwd" and "handler" are located, and "bin" is where CGI files are located.
Pegasus sets the httpd root in "/sys/lib/httpd.rewrite". For example, assume bob administers the documents of the real host and carol administers the documents of the virtual host car. Then the httpd root is specified in the second field using '*' followed by the directory path, like this:
    http://car      */usr/carol/www
    https://car     */usr/carol/https
    /               */usr/bob/www
In this case, carol is given two httpd roots, "/usr/carol/www" and "/usr/carol/https", for her virtual host "car".
A system user alice can have her personal home page at the URI:
http://pegasus.goodwill.com/~alice
without an entry in "/sys/lib/httpd.rewrite" if she has "$home/web/doc". Then "$home/web" is her httpd root.
Note: the name of a virtual host must be registered in DNS.
- If the parent domain name of the virtual host is equal to that of the server, you can omit the parent domain name. For example, virtual host "car.goodwill.com" can be written simply as "car" if it is on server "pegasus.goodwill.com".
- If the parent domain name of the virtual host is not equal to that of the server, you must write the full name. For example, virtual host "pegasus.goodwill.org" must be written as it is if it is on server "pegasus.goodwill.com".
- If a virtual host has its own IP, please add the IP as one of the hosts.
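The naming rule in the first two notes can be sketched as a small Python helper. The function name rewrite_host_name is hypothetical, and the sketch assumes that the "parent domain name" is everything after the first dot of a host name; Pegasus may decide this differently.

```python
def rewrite_host_name(virtual_host, server_name):
    """Return the host name to write in the first field of httpd.rewrite."""
    virtual_host = virtual_host.lower()
    server_name = server_name.lower()
    # Split "car.goodwill.com" into "car" and "goodwill.com".
    short_name, _, parent = virtual_host.partition(".")
    _, _, server_parent = server_name.partition(".")
    if parent and parent == server_parent:
        # Same parent domain as the server: the short name is enough.
        return short_name
    # Different parent domain: the full name must be written.
    return virtual_host

if __name__ == "__main__":
    server = "pegasus.goodwill.com"
    for name in ("car.goodwill.com", "pegasus.goodwill.org", "car.bar.com"):
        print(name, "->", rewrite_host_name(name, server))
```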
/usr/bob/www/etc/rewrite
The file "/usr/bob/www/etc/rewrite" is read after "/sys/lib/httpd.rewrite". The expression
/private https://pegasus.goodwill.com/private
will be considered in "$web/etc/rewrite".
Format of /sys/lib/httpd.rewrite
General rule
- Lines beginning with '#' are comments
- All lines except comments consist of two fields
- Fields are separated by spaces
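A minimal Python sketch of reading a file that follows these three rules is shown below. The function name parse_rewrite is hypothetical. Skipping blank lines and allowing leading white space before '#' are assumptions made here; the rules above do not mention either case.

```python
def parse_rewrite(text):
    """Return the (first field, second field) pairs of a rewrite file, in file order."""
    rules = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        # Assumption: blank lines are ignored like comment lines.
        if not stripped or stripped.startswith("#"):
            continue
        fields = stripped.split()   # fields are separated by spaces
        if len(fields) != 2:
            raise ValueError("line %d: expected two fields" % number)
        rules.append((fields[0], fields[1]))
    return rules

# The case study file from this document.
EXAMPLE = """\
# pattern       replacement
http://car      */usr/carol/www
https://car     */usr/carol/https
/~frank         http://www.eecs.harvard.edu/~frank
/~emily         */usr/david/www_removed
/private        https://pegasus.goodwill.com/private
/               */usr/bob/www
"""

if __name__ == "__main__":
    for pattern, replacement in parse_rewrite(EXAMPLE):
        print(pattern, replacement)
```

The list keeps the order of the file, because the order of lines matters when the lines are compared with a request.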
Meta symbols
- "[ ]" denotes that what is inside the symbol can be omitted
- "|" denotes selection
- ":=" denotes substitution
- "::" denotes that the left item is explained by the right
Format of the first field
The first field is one of the following:
1. scheme://host[:port]
2. /path
scheme := http | https
host :: host name | full host name | IP
port :: port number
path :: path to document
- use lower case for scheme and host
- omit port for 80 and 443
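The format of the first field can be checked with a regular expression, as in the Python sketch below. The function name check_first_field is hypothetical. The set of characters allowed in host is an assumption, and the sketch reads "omit port for 80 and 443" as "omit the default port of the scheme", which is one possible interpretation.

```python
import re

# Either scheme://host[:port] in lower case, or /path.
FIRST_FIELD = re.compile(
    r"(?:(?P<scheme>https?)://(?P<host>[a-z0-9.-]+)(?::(?P<port>[0-9]+))?"
    r"|(?P<path>/\S*))"
)

DEFAULT_PORT = {"http": "80", "https": "443"}

def check_first_field(field):
    """Return True if field looks like a valid first field."""
    m = FIRST_FIELD.fullmatch(field)
    if m is None:
        return False
    port = m.group("port")
    # The default port of the scheme should be omitted, not written out.
    return port is None or port != DEFAULT_PORT.get(m.group("scheme"))

if __name__ == "__main__":
    for field in ("http://car", "https://car:8443", "/~frank",
                  "HTTP://car", "http://car:80", "car"):
        print(field, check_first_field(field))
```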
Format of second field
The second field is one of the following:
1. scheme://hostdom[:port][/path][;params][?query]
2. */path
params := param[;params]
scheme :: any scheme
hostdom :: full name of the host (pegasus.goodwill.com for example)
port :: port number
path :: path to the document
params :: argument list that is passed to CGI
query :: query string