Last Updated: February 25, 1998
Simple Keyword Index and Search Engine
A convenient way to let clients retrieve information from a
web site is to build an index of the web pages and provide a search
capability. This may seem like a daunting task, but with a few tools
it becomes very easy.
Web indexing and searching normally requires a complex WAIS-based
solution that is hard to install and use. However, SWISH-E, a small yet
powerful program based on SWISH (originally written by Kevin Hughes),
can fulfill your web indexing and searching needs. The configuration of
web indexing and searching is presented in the three sections below.
- The SWISH-E index
- The HTML source for the search form
- Installing the CGI source code
The SWISH-E index
SWISH-E is an enhanced version of SWISH, which was originally written
by Kevin Hughes. SWISH-E stands for Simple Web Indexing System for
Humans - Enhanced.
With it, you can create searchable indexes of the files on your Web
server and let people browsing your site search the generated
indexes. Although SWISH-E is not intended to be a full-featured
indexing and search tool, it is quite easy to manage, is remarkably
powerful for something so simple, and was specifically created for use
with Web sites.
To create a SWISH-E index of your site:
- Telnet to your Virtual Server.
- At a command prompt, type:
cd
(then hit return; this puts you in your home directory)
- Untar the swish-e tar file BWSD has prepared for you. Type:
tar -xvf /usr/local/contrib/swish-e.tar
- Change your current working directory to your swish-e
directory. Type:
cd ~/usr/local/swish-e
- Create a SWISH configuration
file for your Virtual Server. Save this file as
"website.conf" (or another name of your choosing) in
your ~/usr/local/swish-e directory.
PLEASE NOTE:
You should be very familiar with the FileRules
section of the configuration file. By default, SWISH
will not index files in directories containing a ".htaccess"
file. If you have a directory that contains a ".htaccess"
file and would like it indexed, comment out that FileRules
line by placing a "#" as the first character in the
line.
- Run swish-e. Type:
cd ~/usr/local/swish-e
./swish-e -c CONFIG_FILE
where CONFIG_FILE is the name of the
SWISH configuration file you
created in the previous step.
After you run the swish-e executable, a SWISH-E
index file will be generated. The index file takes the name
you specified as the value of the IndexFile
variable in the SWISH configuration file.
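The configuration step above can be sketched as a small shell script. This is only an illustration, not the configuration file BWSD supplies: the index name "website.swish", the document root path, and the list of file suffixes are assumptions made for the example. The directive names (IndexDir, IndexFile, IndexOnly, FileRules) are those used in the sample configuration distributed with SWISH-E; check them against the copy in your own swish-e directory.

```shell
#!/bin/sh
# Sketch: write a minimal SWISH-E configuration file.
# SWISH_DIR defaults to the directory used in this guide; override it to
# experiment somewhere else.
SWISH_DIR="${SWISH_DIR:-$HOME/usr/local/swish-e}"
CONF="$SWISH_DIR/website.conf"
mkdir -p "$SWISH_DIR"

cat > "$CONF" <<EOF
# Directory tree to index (assumed location of your web pages)
IndexDir $HOME/usr/local/etc/httpd/htdocs
# Name of the index file swish-e will generate (hypothetical name)
IndexFile $SWISH_DIR/website.swish
# Index only files with these suffixes (assumed list)
IndexOnly .html .htm .txt
# Skip directories that contain a .htaccess file.
# Put "#" in front of the next line to index them anyway.
FileRules directory contains .htaccess
EOF

echo "wrote $CONF"
```

After writing the file, you would run "./swish-e -c website.conf" from the swish-e directory as described above.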
HTML Source for the Search Form
The HTML source below represents a sample search
form. This form can be customized for your Virtual Server by
simply changing the occurrence of SWISH_INDEX_FILE (shown in bold)
to the name of the swish-e index which you specified as the value for
the IndexFile variable
in the SWISH configuration file.
<html> <head> <title>Search Swish Index</title>
</head>
<body>
<h1>Search Swish Index</h1>
<form method="GET" action="/cgi-bin/library/searchindex/query.pl">
<!-- want to mimic "swish -f swishindex -w keywords -m maxresults" -->
<input type="hidden" name="swishindex"
value="/usr/local/swish-e/SWISH_INDEX_FILE">
<b>Search for the following keywords:</b><br>
<input name="keywords" size=40 maxlength=512>
<p>    
<input type=radio name=detail value=yes CHECKED>
<b>Verbose report</b>    
<input type=radio name=detail value=no>
<b>Simple report</b>
<p> <b>Maximum number of results:</b><br>
<input name="maxresults" size=5 value=40 maxlength=64>
<p> <input type="submit" value="Search">
<input type="reset" value="Reset">
<p> __________________________________________
<p>search example 1: john and doe or jane<br>
search example 2: john and (doe or jane)<br>
search example 3: not (john or jane) and doe<br>
search example 4: j* and doe<br> <p>
</form>
</body>
</html>
You may hide the "maxresults" edit field by using
type="hidden" on its input tag, for example
<input type="hidden" name="maxresults" value="40">.
If you are unfamiliar with the FORM HTML element, or would
like to learn more about forms, the following URL is an excellent
resource:
http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/Docs/fill-out-forms/overview.html
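Before wiring up the form, it can help to query the index directly from the command line. The comment in the form source notes that the CGI mimics "swish -f swishindex -w keywords -m maxresults", so each form field corresponds to one option. The sketch below assumes the index is named "website.swish" (a hypothetical name) and lives in your swish-e directory. It prints the equivalent command and runs it only if both the binary and the index are actually present.

```shell
#!/bin/sh
# Sketch: map the search form fields to a swish-e command line.
SWISH_DIR="${SWISH_DIR:-$HOME/usr/local/swish-e}"
INDEX="${INDEX:-$SWISH_DIR/website.swish}"   # form field "swishindex" (hypothetical name)
KEYWORDS='john and (doe or jane)'            # form field "keywords"
MAXRESULTS=40                                # form field "maxresults"

# Show the command line that corresponds to the form fields.
PREVIEW="swish-e -f $INDEX -w \"$KEYWORDS\" -m $MAXRESULTS"
echo "$PREVIEW"

# Run the query only when both the binary and the index exist.
if [ -x "$SWISH_DIR/swish-e" ] && [ -f "$INDEX" ]; then
    "$SWISH_DIR/swish-e" -f "$INDEX" -w "$KEYWORDS" -m "$MAXRESULTS"
fi
```

If the command-line query returns the results you expect, any remaining problem is more likely in the form or the CGI installation than in the index itself.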
Install the Search CGI
To install the search form on your Virtual Server you will need to
do the following:
- Download the Search Form
You will need to download the Search Form
HTML source and store it somewhere in your "usr/local/etc/httpd/htdocs"
directory structure. Feel free to customize the form, add graphics,
etc., but be sure not to alter the variable name of any
input field.
- Untar the Simple Search CGI source code
The CGI source code for the search form was
installed when you created your SWISH-E index file. To
extract only the CGI scripts from the swish-e tar archive, perform
the following commands:
- Telnet or SSH to your Virtual Server.
- Change directories to your home directory (type "cd"
and hit return).
- Type "tar -xvf /usr/local/contrib/swish-e.tar [path]",
where the value of "[path]" is "usr/local/etc/httpd/cgi-bin/library".
This will install only the query.pl and util.pl files into your "www/cgi-bin/library/searchindex"
directory.
- Customize the Appearance of the Search CGI
Two subroutines in the util.pl file are used to print out header
and footer information. These functions are print_header_info
and print_footer_info. Feel free to modify these
functions so that the CGI outputs pages that match the
look of the rest of your site.
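The partial extraction used in the untar step works because tar extracts only the members whose paths you name after the archive. The following self-contained sketch demonstrates this on a stand-in archive built in a temporary directory. The archive layout is an assumption modeled on the paths in this guide, not the actual contents of swish-e.tar.

```shell
#!/bin/sh
# Sketch: extract a single subtree from a tar archive.
set -e
WORK=$(mktemp -d)
cd "$WORK"

# Build a stand-in archive containing both the indexer and the CGI scripts.
CGI=usr/local/etc/httpd/cgi-bin/library
mkdir -p src/usr/local/swish-e "src/$CGI/searchindex"
echo "binary placeholder"  > src/usr/local/swish-e/swish-e
echo "# query placeholder" > "src/$CGI/searchindex/query.pl"
echo "# util placeholder"  > "src/$CGI/searchindex/util.pl"
(cd src && tar -cf ../demo.tar usr)

# Extract only the CGI subtree by naming its path after the archive.
mkdir home
cd home
tar -xf ../demo.tar "$CGI"

# List what was extracted: only query.pl and util.pl appear.
find . -type f | sort
```

On the real archive, running "tar -tf /usr/local/contrib/swish-e.tar" first lists the member paths, so you can confirm the value of [path] before extracting.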
Once you have completed the installation successfully, you will have
a working search form that you can test from your browser.