Last Updated: February 25, 1998

Simple Keyword Index and Search Engine

A convenient way to let clients retrieve information from a web site is to build an index of the web pages and provide search capability. This may seem like a daunting task, but with a few tools it becomes very easy.

Normally, web indexing and searching requires a complex WAIS-based solution that is hard to install and use. However, SWISH-E, a small yet powerful program based on Kevin Hughes's SWISH, can fulfill your web indexing and searching needs. The configuration of web indexing and searching is presented in the three sections below.

  1. The SWISH-E index
  2. The HTML source for the search form
  3. Installing the CGI source code


The SWISH-E index

SWISH-E is an enhanced version of SWISH, which was originally written by Kevin Hughes. SWISH-E stands for Simple Web Indexing System for Humans - Enhanced. With it, you create searchable indexes of files on your Web server--and let people browsing your site search the generated indexes. Although SWISH-E is not intended to be a full-featured indexing and search tool, it is quite easy to manage, is incredibly powerful for being so simple, and was specifically created for use with Web sites.

To create a SWISH-E index of your site:

  1. Telnet to your Virtual Server.
  2. At a command prompt type:

      cd (and hit return - this will put you in your home directory)

  3. Then untar the swish-e tar file BWSD has prepared for you. Type:

      tar -xvf /usr/local/contrib/swish-e.tar

  4. Then change your current working directory to your swish directory. Type:

      cd ~/usr/local/swish-e

  5. Create a SWISH configuration file for your Virtual Server. Save it as "website.conf" (or any other name you like) in your ~/usr/local/swish-e directory.
    PLEASE NOTE:
    Make sure you understand the FileRules section of the configuration file. By default, SWISH will not index files in directories containing a ".htaccess" file. If you have a directory that contains a ".htaccess" file and would like it indexed, comment out that FileRules line by placing a "#" as the first character in the line.

  6. Run swish-e. Type:

      cd ~/usr/local/swish-e
      ./swish-e -c CONFIG_FILE

    Where CONFIG_FILE is the name of the SWISH configuration file you created in the previous step.

    After you run the swish-e executable, a SWISH-E index file will be generated. Its name is the value you specified for the IndexFile variable in the SWISH configuration file.
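
Steps 5 and 6 can be sketched as a short shell script. This is only a minimal sketch, not the configuration BWSD ships. IndexFile and FileRules are the directives discussed above; the IndexDir and IndexOnly directive names, the index file name "site.swish", and the htdocs path are assumptions for illustration, so check them against the documentation that came with your copy of swish-e.

```shell
#!/bin/sh
# Sketch only: write a minimal SWISH-E configuration file, then build the index.
# IndexDir, IndexOnly, "site.swish", and the paths below are assumptions.

write_swish_conf() {
    # $1 = configuration file to create
    # $2 = directory tree to index
    # $3 = index file that swish-e should generate
    cat > "$1" <<EOF
# Directory tree to index (assumed directive name)
IndexDir $2
# Name of the index file that swish-e will generate
IndexFile $3
# Only index files with these suffixes (assumed directive name)
IndexOnly .html .htm .txt
# Skip directories that contain a .htaccess file.
# To index such directories, put a "#" at the start of the next line.
FileRules directory contains .htaccess
EOF
}

SWISH_DIR="${SWISH_DIR:-$HOME/usr/local/swish-e}"

# Run only when swish-e is installed and no configuration file exists yet,
# so an existing website.conf is never overwritten.
if [ -x "$SWISH_DIR/swish-e" ] && [ ! -e "$SWISH_DIR/website.conf" ]; then
    write_swish_conf "$SWISH_DIR/website.conf" \
        "$HOME/usr/local/etc/httpd/htdocs" \
        "$SWISH_DIR/site.swish"
    (cd "$SWISH_DIR" && ./swish-e -c website.conf)
fi
```

Keeping the file generation in a function makes it easy to regenerate the configuration for a different directory or index name before re-running the indexer.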


HTML Source for the Search Form

The HTML source below is a sample search form. You can customize this form for your Virtual Server by changing the occurrence of SWISH_INDEX_FILE to the name of the swish-e index file, which you specified as the value of the IndexFile variable in the SWISH configuration file.

<html> <head> <title>Search Swish Index</title> 
</head> 
<body> 
<h1>Search Swish Index</h1> 
<form method="GET" action="/cgi-bin/library/searchindex/query.pl"> 
<!-- want to mimic "swish -f swishindex -w keywords -m maxresults" --> 
<input type="hidden" name="swishindex"
 value="/usr/local/swish-e/SWISH_INDEX_FILE"> 
<b>Search for the following keywords:</b><br> 
<input name="keywords" size=40 maxlength=512> 
<p> &#160; &#160; 
<input type=radio name=detail value=yes CHECKED> 
<b>Verbose report</b> &#160; &#160; 
<input type=radio name=detail value=no> 
<b>Simple report</b> 
<p> <b>Maximum number of results:</b><br> 
<input name="maxresults" size=5 value=40 maxlength=64> 
<p> <input type="submit" value="Search"> 
<input type="reset" value="Reset"> 
<p> __________________________________________
<p>search example 1: john and doe or jane<br> 
search example 2: john and (doe or jane)<br> 
search example 3: not (john or jane) and doe<br> 
search example 4: j* and doe<br> <p>
</form> 
</body> 
</html> 

You may hide the "maxresults" edit field by using a "type=hidden" attribute on its input tag.
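
The comment in the form source says the form is meant to mimic a swish command line with the -f, -w, and -m options. The sketch below prints the command line that would correspond to one form submission, which can help when testing an index by hand from a telnet session. The function name swish_command is hypothetical, and this is not the actual logic of query.pl. Quoting the keywords keeps the shell from interpreting parentheses and wildcards in the query.

```shell
#!/bin/sh
# Sketch only: show how the form fields map onto swish-e options.
#   swishindex -> -f, keywords -> -w, maxresults -> -m
# The function name is hypothetical; query.pl may build its command differently.

swish_command() {
    # $1 = index file, $2 = keywords, $3 = maximum number of results
    printf "./swish-e -f '%s' -w '%s' -m %s\n" "$1" "$2" "$3"
}

# Print the command for "search example 2" with the form's default of 40 results.
swish_command "$HOME/usr/local/swish-e/SWISH_INDEX_FILE" "john and (doe or jane)" 40
```

Replace SWISH_INDEX_FILE with the name of your own index file before running the printed command from your swish-e directory.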

If you are unfamiliar with the FORM HTML element, or would like to learn more about forms, the following URL is an excellent resource:

http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/Docs/fill-out-forms/overview.html


Install the Search CGI

To install the search form on your Virtual Server you will need to do the following:

  1. Download the Search Form
    You will need to download the Search Form HTML source and store it somewhere in your "usr/local/etc/httpd/htdocs" directory structure. Feel free to customize the form, add graphics, etc., but be sure not to alter the variable name of any input field.
  2. Untar the Simple Search CGI source code

    The CGI source code for the search form was already installed if you untarred the swish-e archive while creating your SWISH-E index file. To extract only the CGI scripts from the swish-e tar archive, perform the following commands:

    1. Telnet or SSH to your Virtual Server.
    2. Change directories to your home directory (type "cd" and hit return).
    3. Type "tar -xvf /usr/local/contrib/swish-e.tar [path]"
      Where the value of "[path]" is "usr/local/etc/httpd/cgi-bin/library". This installs only the query.pl and util.pl files into your "www/cgi-bin/library/searchindex" directory.

  3. Customize the Appearance of the Search CGI
    Two subroutines in the util.pl file are used to print out header and footer information. These functions are print_header_info and print_footer_info. Feel free to modify these functions so that the CGI outputs pages that match the look of the rest of your site.
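
The extraction in step 2 can be sketched as a small shell function that first lists the matching archive members and then extracts only those. This is a minimal sketch: the function name extract_cgi is hypothetical, and it assumes your tar accepts a member path after the archive name, as the command in step 2 already does.

```shell
#!/bin/sh
# Sketch only: extract just the CGI scripts from the swish-e archive.
# The archive location and member path are the ones given in step 2;
# the function name is hypothetical.

ARCHIVE="${ARCHIVE:-/usr/local/contrib/swish-e.tar}"
CGI_PATH="usr/local/etc/httpd/cgi-bin/library"

extract_cgi() {
    # $1 = archive (absolute path), $2 = member path to extract,
    # $3 = directory to extract into
    # "tar -tf" lists the matching members; "tar -xf" then extracts them.
    (cd "$3" && tar -tf "$1" "$2" && tar -xf "$1" "$2")
}

# Run only when the archive is actually present on this machine.
if [ -r "$ARCHIVE" ]; then
    extract_cgi "$ARCHIVE" "$CGI_PATH" "$HOME"
fi
```

Listing before extracting shows exactly which files will be written, which is useful if you have already customized util.pl and do not want it replaced by accident.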

Once you have completed the installation successfully, you will have a working search form with a keywords field, a maximum number of results field, and a choice between a verbose report and a simple report. Some searches to try:

search example 1: virtual and server
search example 2: multiple and ds3 and connectivity
search example 3: individual and access and log
search example 4: unlimited and e-mail and (aliases or mailboxes)

   

