The second new file format is the ``Abstract'' type, which is a file that contains only the text of a paper abstract (a format that is common in technical report FTP archives). To recognize that a file is written in this format, we'll use the naming convention that the filename for ``Abstract'' files ends in ``.abs''. So, we add that type recognition customization to the lib/byname.cf file as a regular expression:
Another way to write a summarizer is to write a program or script that takes a filename as the first argument on the command line, extracts the structured information, then outputs the results as a list of SOIF attribute-value pairs.
Summarizer programs are named TypeName.sum, so we call our new summarizer Abstract.sum. Remember to place the summarizer program in a directory that is in your path so that Gatherer can run it. You'll see below that Abstract.sum is a Bourne shell script that takes the first 50 lines of the file, wraps it as the ``Abstract'' attribute, and outputs it as a SOIF attribute-value pair.
#!/bin/sh # # Usage: Abstract.sum filename # head -50 "$1" | wrapit "Abstract"