This is a prototype for using the Squid cache as an additional Gatherer for
Harvest. It scans the Squid cache object directories for HTML URLs and creates
a SOIF stream, which can be piped to template2db to create a database suitable
for export by gatherd.
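
As a rough illustration of the SOIF output step, the sketch below formats one
record. It assumes the usual SOIF layout: a "@TYPE { URL" header, one
"Name{size}:<TAB>value" line per attribute with the size given in bytes, and a
closing "}". The function name soif_record and the attribute names used here
are hypothetical and are not taken from the prototype itself.

```python
def soif_record(url, attributes, template_type="FILE"):
    """Format one SOIF record as a string (illustrative sketch only).

    url           -- URL of the cached object
    attributes    -- dict mapping attribute names to string values
    template_type -- SOIF template type; "FILE" is assumed here
    """
    lines = ["@%s { %s" % (template_type, url)]
    for name, value in attributes.items():
        # SOIF records the size of each value in bytes, not characters.
        size = len(value.encode("utf-8"))
        lines.append("%s{%d}:\t%s" % (name, size, value))
    lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical attribute names, for demonstration only.
    print(soif_record("http://www.example.com/", {"Type": "HTML"}), end="")
```

A stream of such records written to standard output could then be piped to
template2db as described above.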
Possible improvements would be:
 - Create Broker objects instead of creating a Gatherer database.
 - Only put objects into the Gatherer database when necessary instead of
   creating a SOIF stream.
 - Modify the object parser to read output from wget and larbin.
Harvest's gatherer creates news URLs like
but in some situations it may be useful to have news URLs with a hostname