Concept

2002/11/29 Update

Encapsulation of name space based on document management

Traditional web server

Generally speaking, the "server root" of a web server has no essential meaning. For example, ServerRoot in the httpd.conf of the Apache web server is just the directory under which log files and some configuration files are located.

The "server root" of a traditional web server does nothing to restrict access to the name space in which the server operates. The problem becomes clear when CGI programs run under the server: every file on the system is visible to them. This is a potentially serious security problem. For this reason, users' CGI programs are either prohibited or allowed only under the control of the system administrator.

Fig.1 illustrates the relation among three name spaces: real space, service space, and document space.
Real space is the set of files that can be seen from the console, that is, the set of all files on the system. (Real space is shown by the shaded rectangle.)
Service space is the set of files from which the web server serves clients. It is also the set of files that CGI programs can access. In a traditional web server, service space is exactly equal to real space.

Fig.1: traditional web server
real space = service space

Document space is a set of documents that make up web pages. The document space of alice is the set of documents that can be accessed under the URI /~alice/. The document spaces of all users lie inside the service space, so any CGI program can access all of them equally.

Progress made by Plan 9 standard web server

An essential step forward was first made by the httpd of Plan 9 second edition. That server could encapsulate the service space under the server root (Fig.2). The technique relied on a special ability of Plan 9: per-process name spaces. The server root became "/" in the name space seen by the CGI programs of this server. That is, CGI programs were confined to the name space specified by the server root.
Fig.2 illustrates the encapsulation. The dark gray area is the set of files outside the service space; these files are essentially hidden from CGI programs.

Fig.2: Name space of Plan 9 standard web server
real space > service space
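
The difference between Fig.1 and Fig.2 can be sketched with plain sets. This is only a model of the idea, not how Plan 9 implements name spaces; the file names and the server root /usr/web are hypothetical.

```python
# Hypothetical real space: every file on the system.
REAL_SPACE = {
    "/adm/users",
    "/usr/alice/private/data",
    "/usr/web/index.html",
    "/usr/web/cgi/env",
}


def traditional_service_space(real_space):
    """Fig.1: a CGI program sees every file, so service space = real space."""
    return set(real_space)


def encapsulated_service_space(real_space, server_root):
    """Fig.2: only files under the server root stay visible to CGI programs."""
    prefix = server_root.rstrip("/") + "/"
    return {path for path in real_space if path.startswith(prefix)}


traditional = traditional_service_space(REAL_SPACE)
encapsulated = encapsulated_service_space(REAL_SPACE, "/usr/web")

print(traditional == REAL_SPACE)  # is the service space the whole real space?
print(encapsulated < REAL_SPACE)  # is it a proper subset of the real space?
print(sorted(encapsulated))
```

In the second case a CGI program has no name for /adm/users or for alice's private file, so it cannot open them at all.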

Name space made by Pegasus web server

Pegasus inherited this great idea from the Plan 9 standard web server and developed it further.
Generally speaking, a web server holds documents that are created and managed by various persons. Clients access these documents as shown below:
	http://some.dom.com/pathname	# document of real host
	http://some.dom.com/~alice/pathname	# document of user
	http://other.dome.com/pathname	# document of virtual host (its IP differs from that of the real host)
	http://virtual.dom.com/pathname # document of virtual host (its IP is the same as that of the real host)
where pathname is the path to the document from the document root.

One problem of traditional web servers (including that of Plan 9) is that the service space is shared among all these persons (see Fig.1 and Fig.2).
(Note that the problem resembles that of the address space of early personal computers, which did not support logical addresses.)
This means that one person's CGI program can read the documents of other persons. Therefore the persons who keep documents on the web server can interfere with one another.
The problem is solved if the web server serves each document in a name space allocated to that document. Pegasus was the first to do this.

Fig.3a Fig.3b

Pegasus gives each document administrator a name space of their own.
Fig.3a illustrates the service space when Pegasus is serving document {alice}.
Fig.3b illustrates the service space when Pegasus is serving document {bob}.

Name space reconfiguration by Pegasus

Let's assume three persons on our server: alice, bob and carol.
Then we simply configure the following in /sys/lib/http.rewrite:
http://car	*/usr/carol/www
/		*/usr/bob/www
A user such as alice can have a home page without configuring a web root. In that case
/~alice		*/usr/alice/web
is her default web root.
The system administrator, probably bob, configures in /lib/namespace.httpd the service space that constrains all hosts and users. Then, each time a client accesses the server, only the files owned by the person whose document is requested are merged into the name space configured in /lib/namespace.httpd.
As a result, we get the service space and document space shown in Fig.4 when alice's document is requested. (Fig.4 omits the name space configured in /lib/namespace.httpd.)

Fig.4. Name space of Pegasus
Independent name space is given to the document administrator

Documents owned by other persons are hidden. So we can say that alice, bob and carol are each given a shop of their own rather than a share of one hodgepodge shop. This is the basis for avoiding trouble with CGI.
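
The per-request merging described above can be sketched as follows. It is a simplified model under stated assumptions: the base name space and the file lists are hypothetical, only the path-based lines (/ and /~alice) are modeled, and the host-based line for carol is left out because its matching is not described here.

```python
# Web roots taken from the path-based configuration lines above.
WEB_ROOTS = {"alice": "/usr/alice/web", "bob": "/usr/bob/www"}
ROOT_OWNER = "bob"  # "/" maps to bob's web root

# Hypothetical stand-in for the name space configured in /lib/namespace.httpd.
BASE_NAMESPACE = {"/bin/rc", "/lib/namespace.httpd"}

# Hypothetical files owned by each person.
FILES = {
    "alice": {"/usr/alice/web/index.html", "/usr/alice/web/data"},
    "bob": {"/usr/bob/www/index.html"},
}


def split_request(url_path):
    """Return (document owner, path below that owner's web root)."""
    if url_path.startswith("/~"):
        owner, _, rest = url_path[2:].partition("/")
        return owner, "/" + rest
    return ROOT_OWNER, url_path


def document_path(url_path):
    """Map a request path to a file path under the owner's web root."""
    owner, rest = split_request(url_path)
    return WEB_ROOTS[owner] + rest


def service_space(url_path):
    """Base name space plus only the files of the requested document's owner."""
    owner, _ = split_request(url_path)
    return BASE_NAMESPACE | FILES.get(owner, set())


print(document_path("/~alice/index.html"))
print(sorted(service_space("/~alice/index.html")))
```

When alice's document is requested, bob's files are simply absent from the resulting set, which is the situation Fig.4 depicts.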

Protection of files from wrong access

There may be files that should be protected from improper access by users of the server, whether direct (through ftp or telnet) or indirect (through a CGI program). This has long been a headache. For example, alice might have a file data that should be readable only by herself and by her CGI programs.
On UNIX, the web server runs as nobody, so alice must permit nobody to read data. Then the CGI programs of other persons can also read the file. A Windows server runs as LocalSys; then what will happen? LocalSys has the same privileges as root on UNIX.
Pegasus resolves this problem as follows:
  1. Run the server as user web.
    web is not a real user; therefore web need not own any files.
  2. Add web as a group member of alice in /adm/users:
    alice:alice:web
  3. Set the permissions of data:
    --rw-r----- alice web .... data
Note: Why is the problem of access protection solved so simply? Because the service space of Pegasus is encapsulated for each user.
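
The effect of these permission bits can be sketched with a simplified owner/group/other check. This is a model, not the file server's actual algorithm. The listing is read here as owner alice and group web, which is an assumption about its columns; the second file object covers the other reading (group alice with member web). The user bob stands for any other person.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class File:
    owner: str
    group: str
    mode: int  # permission bits, e.g. 0o640 for rw-r-----


# Group membership. "alice" follows the /adm/users line above (member: web).
# Treating each user as a member of the group with the same name is an assumption.
GROUPS = {"alice": {"alice", "web"}, "web": {"web"}, "bob": {"bob"}}


def can_read(user, file):
    """Simplified check: owner bits, then group bits, then other bits."""
    if user == file.owner:
        return bool(file.mode & 0o400)
    if user in GROUPS.get(file.group, set()):
        return bool(file.mode & 0o040)
    return bool(file.mode & 0o004)


data = File(owner="alice", group="web", mode=0o640)
data_alt = File(owner="alice", group="alice", mode=0o640)

for user in ("alice", "web", "bob"):
    print(user, can_read(user, data), can_read(user, data_alt))
```

Under either reading, alice and the server user web can read data, while direct access by bob is refused. Access through another person's CGI program is prevented by the name space encapsulation, as the note above says, not by these bits.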

UNIX resolves this problem with a CGI wrapper (for example, see http://download.sourceforge.net/cgiwrap). That is, the CGI wrapper is installed setuid root, and httpd is forced to run CGI programs only through the wrapper.

Comparing the two methods, we can conclude that:
1. The Pegasus method is safer than a CGI wrapper. Under a CGI wrapper, all of a user's files are endangered if the user writes a faulty CGI program. Under Pegasus, only the files that grant write access to `web' are endangered.
2. The Pegasus method is much easier to administer. There is almost nothing to administer; the only thing to do is to run Pegasus as user `web'.

Virtual document environment with high freedom

The current goal of Pegasus is to offer CGI to users with both high security and a high degree of freedom. Pegasus realizes virtual documents using execution handlers. A so-called CGI file in Pegasus is just one special configuration of an execution handler. The configuration of execution handlers is left entirely to the document owners, so they can arrange their content according to their needs.

The term "execution handler" may need explanation, because it may be original to Pegasus. An execution handler is a program that processes files requested by clients. The user defines the relation between a path pattern of the request and the program that processes it. (The definition is written in $web/etc/handler.) We call that program the "handler" of the file. If the requested file is itself the handler, the file is a traditional CGI file.
Below is the current configuration of my server http://plan9.aichi-u.ac.jp:

# path     mimetype   unused  execpath arg ...
/netlib/*/index.html text/html 0 /bin/ftp2html
*.http        -          0     $target
*.html     text/html     1     $target
*.dx_html  text/html     0     /bin/dx $target
A special handler can be assigned to files with a particular suffix. Thus we can introduce Server Side Includes using an execution handler.
A special handler can also be assigned to specific directories. Thus we can introduce an auto-indexing mechanism for the directories of an FTP service.
Execution handlers can be applied to a wide range of applications, and I would like to emphasize that they are completely under the control of users (not of the system administrator).
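
How a request might be matched against such a handler table can be sketched as follows. The table rows are copied from the configuration above; everything else is an assumption made for illustration: that patterns are shell-style globs, that the first matching row wins, and that $target stands for the requested file. The real rules used by Pegasus may differ.

```python
import fnmatch

# (path pattern, mimetype, unused, command words), copied from the table above.
HANDLERS = [
    ("/netlib/*/index.html", "text/html", "0", ["/bin/ftp2html"]),
    ("*.http", "-", "0", ["$target"]),
    ("*.html", "text/html", "1", ["$target"]),
    ("*.dx_html", "text/html", "0", ["/bin/dx", "$target"]),
]


def find_handler(path, target):
    """Return (mimetype, argv) for the first matching row, or None.

    Assumptions: glob matching, first match wins, and "$target" is
    replaced by the requested file.
    """
    for pattern, mimetype, _unused, command in HANDLERS:
        if fnmatch.fnmatchcase(path, pattern):
            argv = [target if word == "$target" else word for word in command]
            return mimetype, argv
    return None


for request in ("/netlib/cmd/index.html", "/doc/page.dx_html", "/cgi/env.http"):
    print(request, find_handler(request, request))
```

In this model the *.http row makes the requested file its own handler, which corresponds to the traditional CGI case, while the /netlib row sends a whole directory tree to one indexing program.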