Since the FTP mirror system uses anonymous FTP, you need to set up anonymous FTP access on the machine where you will run a Replicator. This is usually straightforward: create an account and home directory for the user "ftp". See the ftpd manual page for details.
To run a Replicator, retrieve the replica distribution (it is also included in the Harvest source distribution, in the src/replicator directory), and then unpack it:
% tar xf replica_distribution.tar
This will create the files mirrord.tar.gz and CreateReplica. Execute CreateReplica and follow the prompts. The default answers to the CreateReplica installation script will create a replica of Harvest's www-home-pages Broker. We suggest you start by replicating the www-home-pages Broker before you create your own replicated Broker.
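As a minimal sketch, the following shell function checks that the two files named above are present after unpacking, before you start the installation script. The function name check_replica_distribution is hypothetical and is not part of the Harvest distribution.

```shell
# check_replica_distribution [DIR]
# Hypothetical helper (not shipped with Harvest): verifies that DIR
# (default: the current directory) contains the two files that unpacking
# replica_distribution.tar is said to create.
# Returns 0 if both are present, 1 otherwise.
check_replica_distribution() {
    dist_dir=${1:-.}
    missing=0
    for f in mirrord.tar.gz CreateReplica; do
        if [ ! -f "$dist_dir/$f" ]; then
            echo "missing: $dist_dir/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

One possible use is `check_replica_distribution . && ./CreateReplica`, assuming CreateReplica is executable and is run from the directory where it was unpacked.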
When CreateReplica finishes, it gives you the URL of a page from which you can control and monitor the status of floodd and mirrord. For example, from this page you can force logical topology updates, check the time of the next synchronization between mirrord instances, and view replication group membership and bandwidth estimates.
Note that data does not start filling in your replica immediately: floodd first needs to run and compute bandwidths, and mirrord needs to run and generate the FTP mirror configuration file used to pull over the data. Typically data will start arriving 20-30 minutes after a new replica is first created. If you force an update before then (using the floodd control/status HTML page), data will start arriving sooner, but may be retrieved from a suboptimal neighbor (e.g., across a slow network link).
You can tell the replicator which files to replicate, so it can be used in more general settings than replicating Brokers. For Brokers, however, the files you want to replicate are admin/Registry and the objects/ directory. The other files are either set up locally by the site administrator, or regenerated from the replicated files (e.g., the index files are generated from the replicated objects).
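To see which files in an existing Broker directory fall under that rule, a small shell function along the following lines may help. It is only an illustrative sketch: the name list_broker_replica_files is hypothetical, and the function does not produce or read any replicator configuration. It simply prints admin/Registry and every regular file under objects/.

```shell
# list_broker_replica_files BROKER_DIR
# Hypothetical helper (not part of Harvest): prints, relative to BROKER_DIR,
# the files the text names for Broker replication: admin/Registry and
# every regular file under objects/. Everything else is skipped.
list_broker_replica_files() {
    broker_dir=$1
    if [ ! -d "$broker_dir" ]; then
        echo "not a directory: $broker_dir" >&2
        return 1
    fi
    (
        cd "$broker_dir" || exit 1
        if [ -f admin/Registry ]; then
            echo admin/Registry
        fi
        if [ -d objects ]; then
            find objects -type f | sort
        fi
    )
}
```

Running it against a Broker directory before creating a replica gives a rough idea of how much data will be pulled over.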
In a future version of Harvest we will integrate the replica creation mechanism into the RunHarvest command.