
4.7.5 Periodic gathering and realtime updates

     

The Gatherer program does not perform periodic updates automatically. When you run it, it processes the specified URLs, starts a gatherd daemon (if one isn't already running), and then exits. To update the data periodically (e.g., to capture new files as they are added to an FTP archive), use the UNIX cron facility to run the Gatherer program at a regular interval.

   

To set up periodic gathering via cron, use the RunGatherer script that RunHarvest creates. An example RunGatherer script follows:

       #!/bin/sh
       #
       #  RunGatherer - Runs the ATT 800 Gatherer (from cron)
       #
       HARVEST_HOME=/usr/local/harvest; export HARVEST_HOME
       PATH=${HARVEST_HOME}/bin:${HARVEST_HOME}/lib/gatherer:${HARVEST_HOME}/lib:$PATH
       export PATH
       cd ${HARVEST_HOME}/gatherers/att800
       exec Gatherer att800.cf
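
A crontab entry that runs this script might look like the following sketch. The schedule (daily at 03:00) is arbitrary, and the location of RunGatherer is an assumption based on the directory the script changes into; adjust both for your installation.

```
# Hypothetical crontab entry: run the att800 Gatherer every day at 03:00.
# The script path is assumed, not taken from the Harvest distribution.
# min hour day-of-month month day-of-week command
0 3 * * * /usr/local/harvest/gatherers/att800/RunGatherer >/dev/null 2>&1
```

Install it by running crontab -e as the user who owns the Gatherer's files. Dropping the output redirection makes cron mail any output from the script to that user, which can help when debugging a new Gatherer.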

Run the RunGatherd command from your system startup file (e.g., /etc/rc.local), so that the Gatherer's database is exported each time the machine reboots. An example RunGatherd script follows:

       #!/bin/sh
       #
       # RunGatherd - starts up the gatherd process (from /etc/rc.local)
       #
       HARVEST_HOME=/usr/local/harvest; export HARVEST_HOME
       PATH=${HARVEST_HOME}/lib/gatherer:$PATH; export PATH
       gatherd -dir ${HARVEST_HOME}/gatherers/att800/data 8001
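
The startup file could invoke the script with a fragment along these lines. This is a sketch only: the RUNGATHERD variable and the script's location are assumptions, and the trailing & is a precaution in case gatherd does not detach from the calling shell on your system.

```shell
# Hypothetical /etc/rc.local fragment: start gatherd for the att800 Gatherer.
# The path below is an assumption; point it at your own RunGatherd script.
HARVEST_HOME=/usr/local/harvest
RUNGATHERD=${HARVEST_HOME}/gatherers/att800/RunGatherd

# Start the script only if it exists and is executable, so a missing
# Harvest installation does not produce errors at boot.
if [ -x "$RUNGATHERD" ]; then
    "$RUNGATHERD" &
fi
```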



Duane Wessels
Wed Jan 31 23:46:21 PST 1996