Changi Expire Program Version 0.9p
==================================

This program is a prerelease and so is this document.

What is Expire?
---------------

This program is used in conjunction with a news database formatted in
the traditional way. 'Traditional' means that each article is stored
in a single file with the article number as its filename. In addition,
a file called history stores each article-ID plus some other
information, and another file called active contains a list of all
active newsgroups.

Expire checks the arrival, posting and expiration date of each
article and determines whether it should be removed or retained.
Finally, expire rebuilds the history and updates the active file.

This special version was written for Changi, an NNTP server for OS/2.
Users of Changi have now been waiting an almost unacceptably long time
for the release of a new version. Unacceptable because, among other
things, the expire program of the last version, 0.9n, is not what I
would call a smart piece of software.

A note for the NNTP freaks among you: this expire program contains
many functions usually done by separate programs. It will create
and/or maintain an overview database, and it will recreate lost
history files as well as lost active files.

This package contains three files:

    expire.exe    The main program
    exsort.exe    Called by expire to sort the history
    exp09p.txt    The one you're reading now

Expire and Changi are available via anonymous ftp from

    ftp.gigo.com:/pub/misc
    ftp.cam.org:/users/tomlins

As I write this, you need two files to get the current version of
Changi:

    chang09m.zip  contains version 0.9m
    chpat09n.zip  contains an upgrade to 0.9n

Thanks to Jason Fessler, the author of gigo (fido - internet/usenet
gateway), there is also a mailing list available. To subscribe, send
mail to listserv@gigo.com and type

    subscribe changi

in the first line of your message.
I'm sorry that I have not had much time to take part in this forum,
but there are many very helpful people there ready to help with your
expire problems.

Is This Software Free?
----------------------

If you like free software, then take this for free. However, if you'd
like to do me a favor, send me a postcard, preferably showing your
hometown. For Changi I received lots of postcards from all over the
world. And the more I get, the more I like them. Thank you very, very
much for all the nice postcards I received.

As Changi's new release isn't finished, I decided to distribute the
new expire on its own, just to keep people happy and postcards flowing
in. For the address, see my personal note at the end of this document.

How about the source code?
--------------------------

I'll make it available as soon as possible at the ftp sites mentioned
above.

History
-------

When I started to create my own NNTP server for private use, I didn't
worry about news expiration, because I used UUPC to retrieve the
articles from my provider and that package already includes a news
expiration program.

The trouble began with Chanx, a program which is able to retrieve news
in a simpler way than UUPC. I still prefer UUPC, but setting it up
requires special knowledge and not all providers support it. People
filled their hard disks with articles and then started crying for a
program to get rid of old articles. I programmed a quick hack,
stealing most of the code from UUPC's expire and adding some extras to
make it work more smoothly with Changi.

In the meantime I have developed, stolen and discarded many new ideas
to enhance Changi and its associated programs. The good news is that
you will find lots of enhancements compared with the previous release.

Now the bad news: this version of expire is not plug-in compatible.
Many parts of Changi version 0.9p (not yet released) have been
redesigned.
To make expire work with pre-0.9p versions of Changi, you need at
least to add option '-Uxs', which forces expire to generate a
compatible history file and to avoid pausing the server. Please read
the part of this document that describes the various program options
very carefully. Note too that you need to stop pre-0.9p versions of
the Changi server while running this version of expire.

Changes
-------

Expire configuration file
    A new configuration file offers the possibility to define
    different expiration periods for different newsgroups.

Rebuilding active file
    Option '-a' will scan the news directories for groups not found
    in the active file and add them. Option '-A' will create a new
    active file from scratch, overwriting any existing file.

Fast expiration without rebuilding history
    With option '-C' the program will remove expired articles and
    mark the corresponding message-ID as canceled.

New history file layout
    A new index file layout using hash keys in a binary tree
    structure significantly reduces disk space consumption and
    access time.

Syslog interface
    While previous versions created a logfile, this new release uses
    a local syslog daemon for greater flexibility. The syslog daemon
    is not included in this distribution, but is part of IBM's
    TCP/IP base product and is also available as freeware ported by
    Jochen Friedrich.

Ability to control Changi server
    Expire is able to flush, pause and restart the server if it
    needs exclusive access to the history and active files. However,
    this will not work with pre-0.9p versions of the server, in
    which case you have to stop the server during article
    expiration.

Changed name of sort program
    The name of the sort program has been changed to exsort to avoid
    conflicts with existing programs.

Summary file for statistics
    Expire will optionally append a summary record to a text file
    for later processing by statistics software.

Support for cross-post linking
    An article will not be removed as long as any other cross-posts
    of the article remain.
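
The cross-post rule in the last item can be pictured with a small
Python sketch. This is only an illustration of the rule as stated, not
expire's actual code: the function name, the data layout and the
assumption that all copies are removed together once every one of them
has expired are made up for this example.

```python
def copies_to_remove(crossposts, is_expired):
    """Return the copies that may be deleted, or [] if any copy must stay.

    crossposts: list of (newsgroup, article_number) pairs that share one
                message-ID (hypothetical layout for this sketch).
    is_expired: callable deciding whether a single copy has expired.
    """
    # Assumption: copies are only removed once every cross-post has expired.
    if all(is_expired(group, number) for group, number in crossposts):
        return list(crossposts)
    return []


if __name__ == "__main__":
    copies = [("comp.os.os2", 10), ("junk", 5)]
    # Only the junk copy has expired, so nothing is removed yet.
    print(copies_to_remove(copies, lambda group, number: group == "junk"))
```

The point of the sketch is only that the decision is made per
message-ID rather than per file.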

Installing expire
-----------------

You may copy expire and exsort into any directory. By default expire
expects the history and active files in the current directory and the
article files in subdirectories of the subdirectory news. For example,
articles of the group comp.os.os2 are expected in news\comp\os\os2.

If this behaviour doesn't fit, don't worry. Expire offers many command
line options and checks certain environment settings so that it can be
customized.

Calling Expire
--------------

The program will usually be called with several command line
parameters. The general format is

    expire [options] [newsgroups]

The program supports a large set of [options] to control its behaviour
and enhance its flexibility. They are described in detail in the
section on command line options.

The [newsgroups] command line parameter specifies a comma-separated
list of newsgroups with simple pattern matching. Only articles of
matching groups will be expired. Example:

    expire -e3 comp.os.os[29]*,!comp.os.os2.mail-news

This call will expire all articles older than three days in the groups
comp.os.os2 and comp.os.os9 and all their subgroups, except
comp.os.os2.mail-news.

Do not use this parameter together with option '-r' or '-h', because
this will create an incomplete history.

Note that this parameter is supported for compatibility. It is not
very useful anymore, since the expire configuration file offers a
better way to expire articles in different groups at different times.

Expire configuration file
-------------------------

This is surely the most useful enhancement of this release. You may
create this file with the name 'expire.conf', using any text editor,
in the directory where you run expire. The program will use it
automatically if it finds it.

The file may contain any number of empty lines anywhere. Lines with
human-readable comments are also accepted if they are marked with '#'
(number sign) in the first column. All other lines must be given in
the following format:

    newsgroups min-keep max-keep

The first field specifies the newsgroups this line is valid for. A
comma-separated list of newsgroups with simple pattern matching may be
given.

The second field, min-keep, specifies the minimum time to keep any
article. This is the normal age at which an article expires. Even if
an article contains an explicit expiration date in its header, it will
be kept for at least this period.

To define a time value, you may simply give a number, which will be
interpreted as the number of days you want to keep articles.
Furthermore, you may add a single letter representing one of these
units:

    h   hours
    d   days (default)
    w   weeks

Combinations are possible too, like 1d12h, which means one and a half
days. The value you use depends highly on your available disk space
and may also be influenced by your interest in, and the traffic of,
the specified newsgroups.

The third field, max-keep, defines the maximum time to keep any
article. This parameter is used to get rid of articles with an
explicit expiration date too far in the future. You should specify a
very long period, about two to three months, so that you do not
accidentally destroy periodic postings with long update intervals.

Note that the sequence of entries is important too. Lines further
down the file may override previous definitions. Example:

    # This is a sample expire configuration file
    # First entry specifies the default.
    *                1w      90d
    # This second entry keeps os2 related groups a bit longer
    *os2*            2w3d    90
    # I want junk removed soon
    junk,control*    12h     12h
    # Keep some local groups for a long period
    ping*            4w      90

Command line options
--------------------

This part of the document is the key to becoming a real expire guru.
All options are optional. Expire will run fine without any, but may
not do what you expected it to do and will certainly not do it in the
best way.

One approach I'd suggest is to use option '-C' for a daily run
immediately before retrieving new articles, and to run without '-C'
but with option '-E' weekly.
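
As a rough illustration of the expire configuration file format
described in the previous section, here is a minimal Python sketch of
how time values such as 1d12h and the "later lines override earlier
ones" rule could be interpreted. It is not taken from expire itself:
the function names are invented, Python's fnmatch stands in for the
"simple pattern matching" mentioned above, and negated patterns
(leading '!') are not handled.

```python
import re
from fnmatch import fnmatchcase

# Seconds per unit letter; a bare number counts as days (the documented default).
UNIT_SECONDS = {"h": 3600, "d": 86400, "w": 7 * 86400}


def parse_period(text):
    """Convert a time value such as '90', '1w' or '1d12h' to seconds."""
    parts = re.findall(r"(\d+)([hdw]?)", text)
    # Reject anything that is not made up entirely of number/unit pairs.
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"bad time value: {text!r}")
    return sum(int(n) * UNIT_SECONDS[u or "d"] for n, u in parts)


def parse_config(lines):
    """Turn configuration lines into (patterns, min_keep, max_keep) rules."""
    rules = []
    for line in lines:
        # Skip empty lines and comments marked with '#' in the first column.
        if not line.strip() or line.startswith("#"):
            continue
        groups, min_keep, max_keep = line.split()
        rules.append((groups.split(","),
                      parse_period(min_keep),
                      parse_period(max_keep)))
    return rules


def lookup(rules, newsgroup):
    """Return (min_keep, max_keep) in seconds; the last matching line wins."""
    result = None
    for patterns, min_keep, max_keep in rules:
        if any(fnmatchcase(newsgroup, p) for p in patterns):
            result = (min_keep, max_keep)
    return result


if __name__ == "__main__":
    sample = [
        "# First entry specifies the default.",
        "* 1w 90d",
        "*os2* 2w3d 90",
        "junk,control* 12h 12h",
    ]
    rules = parse_config(sample)
    for group in ("comp.os.os2", "junk", "alt.test"):
        print(group, lookup(rules, group))
```

Because the lookup walks the whole file and keeps the last match, a
general default such as '*' belongs at the top and more specific
entries below it, just as in the sample file above.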

-a  Rebuild active file

    Before starting expiration, and after reading in an existing
    active file, the program will scan the news directory. If it
    finds any subdirectory containing article files without a
    corresponding newsgroup in the active file, the new newsgroup
    will be added.

-A  Create active file

    This option will force expire to ignore any existing active
    file. It will scan the news directory for subdirectories
    containing article files and create all corresponding
    newsgroups. The list of groups will be written to a new active
    file. Be aware that this will silently overwrite any existing
    active file and remove newsgroups that currently have no stored
    articles.

-b  Remove bad articles

    When rebuilding the history, expire needs to read each article
    header. Articles with incomplete headers, or otherwise
    corrupted, are normally ignored. Setting this option will
    remove them.

-c  Name of the expire configuration file

    By default expire will look for a file named expire.conf in the
    current directory and will not worry if this file doesn't
    exist. This option may define a different filename and forces
    expire to terminate if it cannot read this file. See the
    previous chapter for further explanations on how to create an
    expire configuration file.

-C  If this option is given, expire will not remove entries from
    the history file. Instead, expired articles will be marked
    canceled, which saves expire from rebuilding the history index.
    This is a good option for daily expiration, but you should not
    forget to run the program from time to time without this option
    in order to get rid of outdated history entries.

-de Turns excessive logging on. See option -do.

-dt Lets the program run in test mode. All actions are performed as
    in normal mode, but no articles will be removed and no active
    file or history update takes place.

-do By default you need a syslog daemon to view and/or store logged
    messages. This option will output them on the screen instead,
    using the standard error channel.
    Here's an example of how to redirect this output to a file:
    'expire -do 2>mylog'.

-dp Adds the current program id to logged messages. See option -do.

-e