Basic Installation of Pegasus

2014/09/15

Here I will explain how to install Pegasus with assumptions:

Installing httpd and mon

  1. Get Pegasus 2.7 from
    http://plan9.aichi-u.ac.jp/netlib/pegasus/pegasus-2.7.tgz
  2. Unpack
    term% gunzip pegasus-2.7.tgz
    term% tar -xf pegasus-2.7.tar
    A directory "pegasus-2.7" is then created in the directory where you ran "tar".
    This directory is called "$pegasus" in the explanation below.
  3. Compile
    term% cd $pegasus/httpd
    term% mk
  4. Install
    The default installation directory is:
    /usr/local/bin/386
    Create the directory and execute:
    term% mk install
In the same way, you can install "mon".
	term% cd $pegasus/mon
	term% mk install

Virtual user "web"

Pegasus runs as user "web" and serves requests as that user.
Do not set a password for user "web".

/adm/users

  1. Add a user "web" to "/adm/users"
  2. Add a group "webu" and the group members (web,alice,bob) to "/adm/users"

/usr/web

The directory "/usr/web" is the default base directory on which Pegasus builds its name space.
(You can use another directory. See the section "/sys/lib/httpd.conf" of this page.)
	term% mkdir /usr/web
	term% chmod 775 /usr/web
The owner of "/usr/web/" must not be "web".

Create empty directories under "/usr/web/". The result should look like the following*.

d-rwxrwxr-x bob bob      ..... /usr/web/bin/386
d-rwxrwxr-x bob bob      ..... /usr/web/bin/rc
d-rwxrwxr-x bob bob      ..... /usr/web/dev
d-rwxrwxr-x bob bob      ..... /usr/web/env
d-rwxrwxr-x bob bob      ..... /usr/web/etc
d-rwxrwx--- bob web      ..... /usr/web/etc/nonce
d-rwxrwxr-x bob bob      ..... /usr/web/lib
d-rwxrwxr-x bob bob      ..... /usr/web/mnt
d-rwxrwxr-x bob bob      ..... /usr/web/proc
d-rwxrwxr-x bob bob      ..... /usr/web/rc/lib
d-rwxrwxr-x bob bob      ..... /usr/web/sys/lib
d-rwxr-xr-x bob bob      ..... /usr/web/tmp
where "bob" is your account name. Note the permission bits and the group of "/usr/web/etc/nonce/".

A replica of this layout is provided under "$pegasus/example/usr/web/", so you may simply copy it to "/usr/web".
The copy is easy if you use my tool "cpdir". You can get "cpdir" from http://plan9.aichi-u.ac.jp/netlib/cmd/.

	term% cpdir -mv $pegasus/example/usr/web /usr/web
and then
	term% chmod 770 /usr/web/etc/nonce
	term% chgrp webu /usr/web/etc/nonce

Note that:

Configurations

Pegasus uses some files in system directories. The templates are in "$pegasus/sample". Copy them to the appropriate place:
	term% cd $pegasus/sample
	term% cp sys/lib/httpd.conf /sys/lib/httpd.conf
	term% cp sys/lib/httpd.rewrite /sys/lib/httpd.rewrite
	term% cp lib/namespace.httpd /lib/namespace.httpd
Note that you already have the "/sys/lib/httpd.rewrite" and "/lib/namespace.httpd" of the official httpd. It is wise to back up these files before copying.

/sys/lib/httpd.conf

Take a look at "/sys/lib/httpd.conf". The contents will be:
maia% cat sys/lib/httpd.conf
#
#	Remove "#" to set your value if you want to change default vaue
#	Note that
#	1. second field is default value except "myname"
#	2. all unit of time is second
#

#
#	files for Pegasus
#
#   server name is taken from ndb
#
# base		/usr/web		# base directory for Pegasus
# namespace	/lib/namespace.httpd	# name space configuration
# rewrite	/sys/lib/httpd.rewrite	# system rewrite file

#
#	currently we have the following parameters that might be required tuning
#

# charset	utf-8  # HTTP header charset. The default is latin(iso-8859-1)

## allowbasic	0	# gone

# parsetimeout	15	# timeout to parse header
# waittimeout1	15	# wait timeout for an non-authenticated client
# waittimeout2	900 	# wait timeout for an authenticated client
# cgitimeout	5	# timeout for CGI
# posttimeout	900	# timeout to get POST data
## connectlimit	300	# gone
# maxpost1	10	# maximum post data size (in unit of MB) for unauthorized client
# maxpost2	100	# maximum post data size (in unit of MB) for authorized client
# maxconnect	50	# max connections by a single remote IP (default 50)
# maxconnect	0	# no restriction to max connections
# obstime	3	# observation time to detect burst access
## maxaccess	20	# gone
# lockouttime   180	# lockouttime for maxconnect and maxaccess  (in unit of sec)
# contmax	100	# max persistent continuation count for safty

You need not change the default values in this file. Tune them only after you have monitored the performance of the server.
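
The comment at the top of the file says that removing the "#" in front of a line makes that value active. The following Python sketch illustrates that reading of the format. It is not the parser Pegasus uses; the function name "active_settings" and the reduced table of defaults are my assumptions, with the default values copied from the listing above.

```python
# Hypothetical illustration of the httpd.conf format, not Pegasus code.
# Defaults copied from the second field of the listing above (subset only).
DEFAULTS = {"parsetimeout": "15", "cgitimeout": "5", "maxconnect": "50"}

def active_settings(text):
    """Return the key/value pairs of lines that are not commented out."""
    settings = {}
    for line in text.splitlines():
        # Anything after "#" is treated as a comment (assumption).
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2:
            settings[fields[0]] = fields[1]
    return settings

conf = """\
# parsetimeout	15	# timeout to parse header
cgitimeout	10	# timeout for CGI
"""

# Active lines override the defaults; commented lines leave them alone.
effective = {**DEFAULTS, **active_settings(conf)}
```

Here only "cgitimeout" is uncommented, so it is the only value that differs from the defaults.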

/lib/namespace.httpd

Take a look at "/lib/namespace.httpd". The contents will be
bind -a /usr/web/bin/$cputype /bin
bind -a /usr/web/bin/rc /bin
bind /sys/lib /usr/web/sys/lib
bind /lib /usr/web/lib
bind /bin /usr/web/bin
bind /rc/lib /usr/web/rc/lib
bind -c #e /usr/web/env
bind #c /usr/web/dev
bind /proc /usr/web/proc
Not all of these lines are required. Note that the line:
	bind /sys/lib /usr/web/sys/lib
makes all files under "/sys/lib" accessible via CGI. In particular, note that secret files such as "/sys/lib/ssh" and "/sys/lib/tls" might be there. They should be protected against reading by others.

The CGI environment configured in "/lib/namespace.httpd" is inherited by the real host, virtual hosts, and regular users. Therefore you should be careful.

Although this configuration exposes more than a regular CGI service needs, I think it is harmless.

/sys/lib/httpd.rewrite

Suppose you are bob, who administers the real host's documents, and you want to place the documents under "/usr/bob/www/doc". Then a single active line, as in the following file, is enough in most cases.
#

# 	syntax: prefix replacement
# 	parsed by splitting into fields separated by spaces and tabs.
# 	Anything following a # is ignored.
#
#	Pegasus extension for virtual host
#	`*' prefixed items will be bound to web root
#


# Home page for IP based virtual host. Don't foreget the IP of plan9
#http://car		*/usr/carol/www
#http://202.250.160.122	*/usr/carol/www

# Redirection to another site
#/~emili	http://plan9.bell-labs.com

# Httpd root of real host is "/usr/bob/www"
/	*/usr/bob/www

This configuration assumes that bob keeps the web documents for his real host in /usr/bob/www/doc/.

If you want to configure a more complicated hosting service, see "/sys/lib/httpd.rewrite".

/sys/log/http and /sys/log/blacklist

Before invoking Pegasus, create the log files.
The access log file of Pegasus is /sys/log/http. (NB: not /sys/log/httpd)
You will find the IP addresses of abusive clients in /sys/log/blacklist.
If the blacklist contains IPs that you do not consider abusive, you may loosen the parameter maxconnect in /sys/lib/httpd.conf.

Web Document

httpd root

The distinctive feature of Pegasus is that it runs in a confined name space.
Assume that user "alice" has her web documents in /usr/alice/web/doc; then Pegasus sees them in /doc. The path /usr/alice/web is the boundary of what Pegasus can see. Pegasus runs inside this boundary when accessing alice's documents. We call the boundary "httpd root", which is the key concept for understanding Pegasus.
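
The relation between a real path and the path seen inside the httpd root can be sketched in Python as follows. This only illustrates the idea; Pegasus achieves it with Plan 9 name space operations, not with path arithmetic, and the function name "seen_by_pegasus" is hypothetical.

```python
from pathlib import PurePosixPath

def seen_by_pegasus(real_path, httpd_root):
    """Map a real path to the path visible inside the httpd root.

    Returns None when the path lies outside the boundary.
    """
    try:
        rel = PurePosixPath(real_path).relative_to(httpd_root)
    except ValueError:
        # Outside the httpd root: not visible at all.
        return None
    return str(PurePosixPath("/") / rel)

inside = seen_by_pegasus("/usr/alice/web/doc", "/usr/alice/web")
outside = seen_by_pegasus("/usr/alice/lib/profile", "/usr/alice/web")
```

With alice's httpd root, "/usr/alice/web/doc" maps to "/doc", while a file elsewhere in her home directory maps to nothing.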

Web pages for regular users

The httpd root for regular user "alice" is /usr/alice/web/.
The permission should be

	d-rwxr-x--- M 73622 alice web    0 Aug 13  2007 /usr/alice/web
if you want to protect your files under the directory against other regular users.

Her document root is /usr/alice/web/doc/.

	d-rwxr-xr-x M 73622 alice alice  0 Aug 31  2007 /usr/alice/web/doc
Then you should be able to view her documents by accessing
	http://your.server.com/~alice
from your browser.

No other setting is needed for a regular user's web pages.
Having /usr/alice/web/doc/ is enough for the service. (Of course you need some contents in /usr/alice/web/doc/.)
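
The convention for user pages can be summarized by a small Python sketch. It assumes home directories live under /usr as in the examples on this page; the function name "resolve_user_url" is hypothetical and this is not how Pegasus is implemented.

```python
def resolve_user_url(path):
    """Split a "/~user/..." request path into (httpd root, path seen by Pegasus).

    Returns None for paths that are not user URLs.
    """
    if not path.startswith("/~"):
        return None
    user, _, rest = path[2:].partition("/")
    if not user:
        return None
    # Httpd root is /usr/<user>/web; documents appear under /doc inside it.
    return "/usr/" + user + "/web", "/doc/" + rest

root, doc_path = resolve_user_url("/~alice/index.html")
```

For the request path "/~alice/index.html" this gives the httpd root "/usr/alice/web" and the document path "/doc/index.html".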

Access control

For access control, alice needs to have control files in
	/usr/alice/web/etc/allow 	# for IP based control
	/usr/alice/web/etc/passwd	# password based control
Pegasus supports both Basic and Digest authentication.
See the Pegasus 2.2 manual for details.

CGI

For CGI service, alice must have

	/usr/alice/web/etc/handler
Two lines will be enough in most cases:
*.cgi       text/html    +       $target
*.html      text/html    0       $target
Then files with suffix ".cgi" or ".html" in alice's document space are treated as CGI programs if the exec bit is set.
Files with suffix ".cgi" are ordinary CGI files, as on other web servers such as Apache.
On the other hand, files with suffix ".html" allow a handier format.
See "handler" in the Pegasus 2.2 manual.
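
The rule "matching suffix and exec bit set" can be sketched in Python as below. I only use the first field of the two handler lines; the meaning of the other fields is described in the manual and is not modeled here. The name "is_cgi" is hypothetical, and the use of shell-style pattern matching is my assumption about how the first field is interpreted.

```python
from fnmatch import fnmatchcase

# First field of each handler line shown above.
PATTERNS = ["*.cgi", "*.html"]

def is_cgi(name, mode, patterns=PATTERNS):
    """Return True if name matches a handler pattern and any exec bit is set in mode."""
    matches = any(fnmatchcase(name, p) for p in patterns)
    return matches and (mode & 0o111) != 0
```

The mode argument is the permission bits of the file, for example the low bits of os.stat(path).st_mode. An ".html" file without the exec bit is therefore served as a plain document.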

Pegasus does not support a special directory such as "cgi-bin". You can place CGI programs anywhere in the document space.

Alice can add her own tools for CGI in directories:

	/usr/alice/web/bin/rc
	/usr/alice/web/bin/386

All CGIs running under Pegasus see the document root as /doc and run as user web.
This is true not only for regular users but also for the real host and virtual hosts.
That is, both Pegasus httpd and Pegasus CGI run in a confined name space (sandboxed name space) without using a CGI wrapper such as suEXEC, cgiwrap or SBOX in the unix world.

The directories listed below are probably enough for alice to do CGI.

d-rwxrwx--- M 73622 alice web    0 Aug 13  2007 /usr/alice/web
d-rwxr-xr-x M 73622 alice webu   0 Oct 23  2006 /usr/alice/web/bin
d-rwxrwxr-x M 73622 alice alice  0 Aug 31  2007 /usr/alice/web/doc
d-rwxrwxr-x M 73622 alice alice  0 Aug 31  2007 /usr/alice/web/etc
d-rwxrwxr-x M 73622 alice webu   0 Aug 13  2007 /usr/alice/web/log

Fig.1: Basic directory structure of Pegasus

You may need a log directory for debugging CGI. The directory is seen as /log in CGI.
A temporary working directory is automatically provided as a ram disk. The directory is seen as /tmp in CGI. The /tmp is private to the CGI and is removed automatically when the CGI finishes.

Real and virtual hosts

The basic directory structures and the roles of Pegasus are the same for the real host, virtual hosts, and regular users.
Hence, what I explained for regular users is also true for both the real host and virtual hosts.

The difference is that we must explicitly write the directories for the real host and for virtual hosts in the file /sys/lib/httpd.rewrite.

Here is my live example:

# syntax: prefix replacement
# parsed by splitting into fields separated by spaces and tabs.
# Anything following a # is ignored.
#
# prefix is a literal string match which is applied to each
# file prefix of each url. The most specific, ie longest
# pattern wins,  and is applied once (no rescanning).
# Leave off trailing slash if pattern is a directory.
#
# If replacemant is a url, a "Permanently moved" message is returned.
#
# Home page for virtual host. don't foreget IP of plan9
http://plan9		*/usr/arisawa/www
https://plan9		*/usr/arisawa/www
http://202.250.160.122	*/usr/arisawa/www
https://202.250.160.122	*/usr/arisawa/www
http://cpa		*/usr/cpa/www

# Redirection to another site
#/~carol	http://plan9.bell-labs.com
/	*/usr/arisawa/http

My machine named ar supports:

The document for real host is located in /usr/arisawa/http/doc which can be accessed by the URL

	http://ar.aichi-u.ac.jp
My Plan9 pages are in /usr/arisawa/www/doc. The URL is
	http://plan9.aichi-u.ac.jp
which is an IP based virtual host. The IP address is 202.250.160.122.
Hence, we should add a line
	http://202.250.160.122	*/usr/arisawa/www
so that clients can also access the site by its IP address
	http://202.250.160.122
For the real host, Pegasus takes care of the IP. Thus you can access ar by
	http://202.250.160.40
without listing that IP in /sys/lib/httpd.rewrite.
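
The matching rule quoted in the comments of the file ("the most specific, ie longest pattern wins, and is applied once") can be sketched in Python as follows. This is only an illustration of longest-prefix selection. How Pegasus combines host-prefixed rules such as "http://plan9" with path rules such as "/" is not shown on this page, so here the caller simply passes whichever string it wants to match. The function names are hypothetical.

```python
SAMPLE = """\
# Home page for virtual host
http://plan9		*/usr/arisawa/www
http://cpa		*/usr/cpa/www
#/~carol	http://plan9.bell-labs.com
/	*/usr/arisawa/http
"""

def parse_rules(text):
    """Return (prefix, replacement) pairs; anything after "#" is ignored."""
    rules = []
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) == 2:
            rules.append((fields[0], fields[1]))
    return rules

def best_rule(rules, target):
    """Pick the rule with the longest literal prefix of target, or None."""
    matches = [r for r in rules if target.startswith(r[0])]
    return max(matches, key=lambda r: len(r[0]), default=None)

rules = parse_rules(SAMPLE)
virtual = best_rule(rules, "http://plan9/index.html")
real = best_rule(rules, "/index.html")
```

The commented-out "/~carol" line is skipped by the parser, so a request path under "/" falls through to the real host rule.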

Virtual hosts cannot have user URLs; that is, a URL such as

	http://cpa.aichi-u.ac.jp/~foo
is intentionally disabled.

Pegasus supports https. It works uniformly for the real host, virtual hosts and regular users.
Likewise, Basic and Digest authentication work uniformly.
The uniformity comes from the uniform basic directory structure of the httpd space of Pegasus.

Run Pegasus

Run Pegasus using "mon"

Execute
	term% b=/usr/local/bin/$cputype
	term% $b/mon -du web $b/httpd -suM
and confirm with the "ps" command that "mon" and "httpd" are really running.
The process owners should be "bob" and "web" respectively.

To restart httpd, execute

	Kill httpd | rc
then mon will automatically restart the httpd.
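
The idea of a monitor that restarts a finished process can be sketched in Python as below. This is not the implementation of "mon"; it is a generic supervisor loop, bounded by a hypothetical "max_restarts" parameter so that the example terminates.

```python
import subprocess
import sys
import time

def supervise(cmd, max_restarts, delay=0.0):
    """Run cmd, and start it again each time it exits, up to max_restarts times.

    Returns the total number of launches.
    """
    launches = 0
    while launches <= max_restarts:
        subprocess.run(cmd)   # blocks until the child exits
        launches += 1
        time.sleep(delay)     # optional pause before restarting
    return launches

# A child that exits immediately stands in for a killed server.
count = supervise([sys.executable, "-c", "pass"], max_restarts=2)
```

A real monitor would loop without a limit and would typically log each restart.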

Confirm Pegasus does service

Try to access using a browser and take a look at "/sys/log/http".

Updating

You might find "pegasus-2.7a", "pegasus-2.7b", ... in
http://plan9.aichi-u.ac.jp/netlib/pegasus/
Those are bug fix versions of Pegasus 2.7.
They contain only the components updated since ver.2.7.