The Glimpse indexing system can be tuned in a variety of ways to suit your particular needs. The most noteworthy parameter is probably indexing granularity, for which Glimpse provides three options: a tiny index (2-3% of the total size of all files, though your mileage may vary), a small index (7-8%), and a medium-size index (20-30%). Larger indexes give faster searches at the cost of more disk space. You select one of these three granularities by changing the GlimpseIndex-Option in your Broker's broker.conf file. By default, GlimpseIndex-Option builds a medium-size index using the glimpseindex program.
Note also that with Glimpse 3.0 it is much faster to search with ``show matched lines'' turned off in the Broker query page.
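To get a feel for the disk space each granularity implies, the percentages quoted above can be turned into a rough estimate. The following Python sketch is only illustrative: the function name estimate_index_bytes and the 500-Mbyte corpus are made up for the example, and the ranges are simply the approximate figures given above, so actual index sizes will vary.

```python
# Approximate index size as a percentage of the total size of all files,
# (low, high), taken from the figures quoted in the text above.
INDEX_SIZE_PERCENT = {
    "tiny": (2, 3),
    "small": (7, 8),
    "medium": (20, 30),
}


def estimate_index_bytes(total_bytes, granularity):
    """Return a (low, high) estimate of the index size in bytes.

    Hypothetical helper for illustration; not part of Glimpse or Harvest.
    """
    low_pct, high_pct = INDEX_SIZE_PERCENT[granularity]
    return total_bytes * low_pct // 100, total_bytes * high_pct // 100


if __name__ == "__main__":
    corpus_bytes = 500 * 1_000_000  # hypothetical 500-Mbyte collection
    for name in INDEX_SIZE_PERCENT:
        low, high = estimate_index_bytes(corpus_bytes, name)
        print(f"{name:>6}: roughly {low:,} to {high:,} bytes")
```

Integer arithmetic is used so that the estimates are exact multiples of the quoted percentages rather than subject to floating-point rounding.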
Glimpse uses a ``stop list'' to avoid indexing very common words. This list is not fixed, but rather computed as the index is built. For a medium-size index (-b), the default is to put any word that appears at least 500 times per Mbyte (on average) in the stop list. For a small-size index (-o), the default is to put in any word that appears in at least 80% of all files (unless there are fewer than 256 files, in which case there is no stop list). Both defaults can be changed using the -S option, which should be followed by the new number: the average count per Mbyte when -b indexing is used, or the percentage of files when -o indexing is used. Tiny-size indexes do not maintain a stop list, since a stop list has minimal effect on an index of that granularity.
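The stop-list rules described above can be sketched in a few lines of Python. This is a simplified model of the thresholds only, not the glimpseindex implementation: the function name compute_stop_list, the whitespace tokenization, the lowercasing, and the choice of 10**6 bytes per Mbyte are all assumptions made for the illustration, and the threshold argument merely plays the role of the value given to -S.

```python
from collections import Counter

# Assumption: the text does not say whether an Mbyte is 10**6 or 2**20 bytes.
BYTES_PER_MBYTE = 1_000_000


def compute_stop_list(docs, granularity, threshold=None):
    """Return the set of words the rules above would put on the stop list.

    docs        -- list of (text, size_in_bytes) pairs
    granularity -- "tiny", "small", or "medium"
    threshold   -- optional override, standing in for the -S value
    Hypothetical helper for illustration; not part of Glimpse.
    """
    if granularity == "tiny":
        # Tiny indexes keep no stop list.
        return set()

    if granularity == "medium":
        # Default: words occurring at least 500 times per Mbyte on average.
        limit = 500 if threshold is None else threshold
        total_mbytes = sum(size for _, size in docs) / BYTES_PER_MBYTE
        if total_mbytes == 0:
            return set()
        counts = Counter()
        for text, _ in docs:
            counts.update(text.lower().split())
        return {w for w, n in counts.items() if n / total_mbytes >= limit}

    if granularity == "small":
        # No stop list for fewer than 256 files. Applying this rule even
        # when a threshold override is given is an assumption of the sketch.
        if len(docs) < 256:
            return set()
        # Default: words appearing in at least 80% of all files.
        limit = 80 if threshold is None else threshold
        doc_freq = Counter()
        for text, _ in docs:
            doc_freq.update(set(text.lower().split()))
        return {w for w, n in doc_freq.items() if 100 * n >= limit * len(docs)}

    raise ValueError(f"unknown granularity: {granularity!r}")
```

For example, calling compute_stop_list(docs, "small", threshold=90) models the effect of raising the small-index cutoff from 80% to 90% of files, so that fewer words are treated as stop words.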
The glimpseindex program includes a number of other options that may be of interest. You can find out more about these options (and about Glimpse in general) in the Glimpse manual pages. If you'd like to change how the Broker invokes the glimpseindex program, edit the src/broker/Glimpse/index.c file from the Harvest source distribution.