NeBinEx - News Binary Extractor v2.0 Beta 10


Contents:

  1. Introduction
  2. Version 2.0 Highlights
  3. How to use NeBinEx
  4. Joins
  5. Newsgroups
  6. Specialised DDE Servers
    1. Decoding servers
    2. Validate servers
  7. External DDE Post-process servers
  8. External Validation & Post-process programs
  9. How an article is decoded
  10. Some notes
  11. Troubleshooting
  12. Some ideas for future releases
  13. Why is it Freeware?
  14. Program support

Introduction

In the dark days before this program, I used to spend some hours each day getting the News binaries with a newsreader: selecting the correct headers, in the correct order, and then decoding them with the newsreader's built-in decoder. This is a daunting task, mainly because the interesting newsgroups in particular have a lot of traffic and a lot of garbage (replies to binaries and "me too" articles). I like those newsgroups a lot (you know, the alt.binaries.pictures...) but I don't have the time, so I started thinking about writing a program that does this automagically, with little or no user interaction, so that all I need to do is start it and collect the binaries after some time.


Version 2.0 Highlights


How to use NeBinEx

The first time you run the program you will be warned that there are "no news servers defined", and the preferences window will appear automatically. You only need to add a news server and then select newsgroups from the server's newsgroup list. Right-click in the servers list (in the white part) to make the popup menu appear. Use the "Add Server" option and fill in the data it needs (usually only the server). You can press the "Investigate Registry" button to make the program search for other news readers and their configuration. After you have pressed the Next and Finish buttons, the newsgroups list will be retrieved. When the program has retrieved it, select the newsgroups you want; these will be added to the newsgroups list. Later, you can add or remove newsgroups from the same popup menu.

Always check whether a control (list boxes, edit boxes, etc.) has a popup menu. To find out what each option in the preferences window does, use the context help system (the "What's This?" option in the popup menus).

There are now some instructions on usage in the help file. Check them.


Joins

A join is a set of newsgroups that is handled as a single newsgroup (with all the articles mixed). This way, if an article is no longer on the server in one newsgroup, it may still be available in another group of the join. This also helps if there is a lot of crossposting between the groups of a join, because only one copy is used. The properties of a join are the same as those of a newsgroup; see below. Remember that if a join does not contain any newsgroups its properties cannot be changed, because it is not an entity in itself but a set of newsgroup options.


Newsgroups

A newsgroup has some properties that you can change from the popup menu of the newsgroups list in the preferences window. Their meanings are:

Option                          Meaning
Join                            The join a newsgroup belongs to (not available on joins).
Server                          The server a newsgroup belongs to.
Disabled                        If the newsgroup is disabled, the program will not get binaries from it.
Moderated                       If a newsgroup is moderated, the program strips the first word of each subject, because it would interfere with the sorting process.
Update in the next connection   The newsgroup will be reset to the last article in the next connection.
Select binaries                 One of "No", "Yes, Default Checked" or "Yes, Default Not Checked". If yes, you will be prompted to select the binaries you want to get; they will be preselected or not, according to the default.
Binaries are Text files         Needed so that the program will not try to delete binaries that are not encoded (for example, in a stories newsgroup).
Subdirectory                    Directory under which the binaries of this newsgroup will be saved.

Specialised DDE Servers

<WARNING: FEATURE WITH NO FUTURE IF SERVERS DO NOT APPEAR SOON!!!>

You can use specialised DDE servers to decode and validate binaries. Those servers must conform to the protocol used by the program, so only programs supporting this protocol will work.

The protocol works this way:

After connecting to the configured DDE server and topic, NeBinEx sends the filename of the file to be processed as a macro. NeBinEx then uses the configured DDE item to get messages from the server. Those messages can be normal messages (which appear in the main window) or commands: when the server sends a line starting with "#", NeBinEx treats it as a command. The connection ends when the server sends a "." (full stop) as the only character on a line.
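
The way a client could interpret the server's lines can be sketched as follows. This is only an illustration in Python, not NeBinEx's actual code: the function name split_server_lines is made up, and it assumes the text of the DDE item has already been received as a list of lines.

```python
def split_server_lines(lines):
    """Separate plain messages from '#' commands, stopping at the lone '.'."""
    messages = []     # text that would be shown in the main window
    commands = []     # (name, argument) pairs, e.g. ("FILENAME", "decoded.gif")
    finished = False  # becomes True once the terminating "." has been seen
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == ".":
            finished = True
            break
        if line.startswith("#"):
            # Everything after the first space is the argument, so long
            # filenames containing spaces survive without quotes.
            name, _, arg = line[1:].partition(" ")
            commands.append((name, arg.strip()))
        else:
            messages.append(line)
    return messages, commands, finished


msgs, cmds, done = split_server_lines([
    "DDEServer v1.0: Decoding article.tmp",
    "#FILENAME decoded.gif",
    "#FORMAT foobar",
    ".",
    "anything after the terminator is ignored",
])
print(msgs, cmds, done)
```

Anything that follows the terminating "." is dropped here; whether NeBinEx does the same is an assumption.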

Decoding servers

The server should not send any message if the file cannot be decoded. If the file can be decoded, the server must delete it after decoding, and the new decoded file must be created in the same directory.

The format of the file to be decoded is this (fields inside square brackets are optional):

------8<------

[<here goes the original article headers>]
Xnbe_---_Start_of_Special_Data_---:
Xnbe_Version: [<program version>]
Xnbe_Filename: [<filename>]
Xnbe_Encoding: [<MIME Content-Transfer-Encoding>]
Xnbe_Type_SubType: [<MIME Content-Type>]
Xnbe_Subject: <subject>
Xnbe_---__End_of_Special_Data__---:

[<here goes the encoded data>]

------8<------
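
A decoding server has to split such a file into its three sections before it can do any work. The following Python sketch shows one way to do it. The function name parse_predecoded and all the sample values are invented for illustration, and it assumes the two marker lines appear exactly as shown above, with no trailing spaces.

```python
START = "Xnbe_---_Start_of_Special_Data_---:"
END = "Xnbe_---__End_of_Special_Data__---:"


def parse_predecoded(text):
    """Split a file into (original headers, Xnbe fields, encoded data)."""
    lines = text.splitlines()
    start = lines.index(START)        # raises ValueError if a marker is missing
    end = lines.index(END, start + 1)
    original_headers = lines[:start]
    fields = {}
    for line in lines[start + 1:end]:
        name, sep, value = line.partition(":")
        if sep and name.startswith("Xnbe_"):
            # Optional fields may be present with an empty value.
            fields[name[len("Xnbe_"):]] = value.strip()
    body = lines[end + 1:]
    if body and body[0] == "":
        body = body[1:]               # blank line between headers and data
    return original_headers, fields, "\n".join(body)


# Sample values below are made up for the example.
sample = "\n".join([
    "Subject: puppy.jpg (1/1)",
    START,
    "Xnbe_Version: 2.0",
    "Xnbe_Filename: puppy.jpg",
    "Xnbe_Encoding: base64",
    "Xnbe_Type_SubType: image/jpeg",
    "Xnbe_Subject: puppy.jpg (1/1)",
    END,
    "",
    "aGVsbG8=",
])
headers, fields, data = parse_predecoded(sample)
print(fields["Filename"], fields["Encoding"], data)
```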

The commands a decoding server can send are these:

Command & Parameters   Purpose
#FILENAME <filename>   Tells the client the name of the decoded file (required). It can be a long filename, without quotes. If no path is specified, the directory is the same as that of the original file.
#FORMAT <string>       Name of the encoding format (not used for now).

Validate servers

When the file is validated (or invalidated), the server should send one of these commands:

Command    Meaning
#OK        The file is valid.
#UNKNOWN   The server cannot validate the file.
#BAD       The file is invalid.

Sample DDE server code:

In Delphi, with a TDDEServerConv named DECODE (the topic) and a TDDEServerItem named MESSAGES (the item):

------8<------

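procedure TForm1.DECODEExecuteMacro(Sender: TObject; Msg: TStrings);
var
  Reply: TStringList;
begin
  { Msg.Strings[0] is the filename NeBinEx sent as a macro. }
  Reply := TStringList.Create;
  try
    { Plain lines are shown in the NeBinEx main window. }
    Reply.Add('DDEServer v1.0: Decoding ' + Msg.Strings[0]);
    Reply.Add('DDEServer v1.0: New name: decoded.gif');
    { Lines starting with "#" are commands. }
    Reply.Add('#FILENAME decoded.gif');
    Reply.Add('#FORMAT foobar');
    { A lone "." ends the connection. }
    Reply.Add('.');
    MESSAGES.Lines.Assign(Reply);
  finally
    Reply.Free;
  end;
end;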

------8<------


External Post-process DDE servers

You can use any DDE server to post-process a binary. The connection is completely configurable.


External Validation and Post-process programs

You can use any executable to validate and/or post-process a binary. You can configure how you call them and what return codes they have.


How an article is decoded

The program joins all the parts of a binary into one article, with the header of the first part. Then it checks the type and encoding of the article. An article can be MIME or non-MIME compliant. Non-MIME articles are treated as MIME with type text/plain; they are usually encoded with uuencode. If the type is not multipart, the article is passed to the decoding routine. If it is multipart, the decoding routine is called for each part. If a part is itself multipart, the procedure calls itself recursively.

The decoding routine tries to decode using the detected encoding and, if no encoding is detected, tries several formats. Usually only the MIME encodings (base64 and quoted-printable) can be detected from the headers; the others can be detected by searching the article text.

Most articles use uuencode or base64. If the article has MIME headers, it will probably be multipart with two parts: a description of the binary in the first part, and the binary itself in the second part, encoded with base64. When decoded, these articles produce two files: the description's text file, which can be deleted by the program, and the real binary file. The text file has no name and will probably be saved as KNOWN*.TXT because its MIME type is text/plain.
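
The flow described in this section can be sketched with Python's standard email package. This is not NeBinEx's code, only an approximation: decode_uu and decode_article are invented names, uuencode is the only format searched for in the text, and parts that cannot be decoded are returned as plain text.

```python
import binascii
import email
import re

UU_BEGIN = re.compile(r"^begin [0-7]{3,4} (.+)$")


def decode_uu(text):
    """Return (filename, data) for the first uuencoded block in text, or None."""
    name, chunks = None, []
    for line in text.splitlines():
        if name is None:
            m = UU_BEGIN.match(line)
            if m:
                name = m.group(1)
            continue
        if line == "end":
            return name, b"".join(chunks)
        if line:
            chunks.append(binascii.a2b_uu(line))
    return None  # no block found, or the block has no "end" line


def decode_article(msg, out=None):
    """Walk a message recursively; collect (filename, bytes) for each leaf part."""
    out = [] if out is None else out
    if msg.is_multipart():
        for part in msg.get_payload():   # a part may itself be multipart
            decode_article(part, out)
        return out
    encoding = (msg.get("Content-Transfer-Encoding") or "").lower()
    if encoding in ("base64", "quoted-printable"):
        # Encoding announced in the headers: the email package undoes it.
        out.append((msg.get_filename(), msg.get_payload(decode=True)))
    else:
        # Nothing usable in the headers: search the article text instead.
        text = msg.get_payload()
        found = decode_uu(text)
        out.append(found if found else (None, text.encode("utf-8", "replace")))
    return out


# A non-MIME article: the email package treats it as text/plain by default.
raw = "\n".join([
    "Subject: sample (1/1)",
    "",
    "begin 644 hello.txt",
    binascii.b2a_uu(b"hello").decode("ascii").rstrip("\n"),
    "`",
    "end",
])
print(decode_article(email.message_from_string(raw)))
```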


Some notes

In version 2.0, the *.GRP files can be in a different directory from the program's directory. So if you have several configuration files, it is better to use a different directory for the *.GRP files of each one, so that they will not interfere with each other.

Files decoded as KNOWN*.* are files with an unknown filename, but NeBinEx will try to use the correct extension using the MIME type.

The program will try to decode articles with more than one part index in the subject, using the last part index (for example, in the subject "girl series (1 of 5) [1/2]", the real part index is the last one).
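
Picking the last part index out of a subject could look like the Python sketch below. The pattern is a guess at common notations such as "(1 of 5)" and "[1/2]"; the notations NeBinEx actually accepts are not documented here, and last_part_index is an invented name.

```python
import re

# Matches "(1 of 5)", "[1/2]", "(03/10)" and similar forms (assumed notations).
PART_INDEX = re.compile(r"[(\[]\s*(\d+)\s*(?:/|of)\s*(\d+)\s*[)\]]", re.IGNORECASE)


def last_part_index(subject):
    """Return (part, total) taken from the last part index, or None."""
    matches = PART_INDEX.findall(subject)
    if not matches:
        return None
    part, total = matches[-1]
    return int(part), int(total)


print(last_part_index("girl series (1 of 5) [1/2]"))  # (1, 2)
```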

From version 2.0, the program gets all the headers in the same connection. There is no limit to the number of headers other than free memory and disk space. As some newsgroups have a lot of traffic, memory usage will be high when processing these groups. The new Joined Groups feature mixes headers from several groups, so the memory usage will be the sum of that of all the joined groups. Also, when saving the joined headers, this feature opens as many files at the same time as there are joined groups (the GRP files are not joined).

Replied binaries can now be detected and decoded, but with some limitations. The file must be complete, of course, and the reply characters must be the same for all the lines of the binary. If the lines are word-wrapped, the decoded file will be broken. There are no special requirements for the reply characters; any set of characters can be used. The only format that does not support this feature is base64, because it has no initialization string.
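
One way to detect the reply characters is to take whatever precedes the uuencode "begin" line and require every following line to start with the same text. The Python sketch below illustrates that idea under those assumptions. It is not the program's real algorithm, and strip_reply_prefix is an invented name. It also shows why a format with no initialization string cannot be handled this way: there is no known line from which to learn the prefix.

```python
import re

# The text before "begin <mode> <name>" is taken to be the reply prefix.
QUOTED_BEGIN = re.compile(r"^(?P<prefix>.*?)begin [0-7]{3,4} \S.*$")


def strip_reply_prefix(lines):
    """Return the uuencoded block with the reply prefix removed, or None."""
    for start, line in enumerate(lines):
        m = QUOTED_BEGIN.match(line)
        if m:
            prefix = m.group("prefix")
            break
    else:
        return None                  # no "begin" line at all
    block = []
    for line in lines[start:]:
        if not line.startswith(prefix):
            return None              # reply characters changed: give up
        stripped = line[len(prefix):]
        block.append(stripped)
        if stripped == "end":
            return block             # complete block recovered
    return None                      # "end" never seen: binary is incomplete


quoted = [
    "Someone wrote:",
    "> begin 644 hello.txt",
    "> %:&5L;&\\ ",
    "> `",
    "> end",
]
print(strip_reply_prefix(quoted))
```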

Because this version is multithreaded, retrieved articles could fill the hard disk before they get a chance to be decoded. The retrieve thread gets more priority than the decoder thread, so the temporary pre-decoded articles accumulate. Be warned that if the newsgroup or join you are retrieving from has a lot of binaries, you will need enough hard disk space (pre-decoded articles take up approximately 4/3 the size of decoded ones). You can use the "higher" priority setting for the decoding thread, but this will slow down the retrieve process.
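
The 4/3 figure comes from the encodings themselves: uuencode and base64 both write 4 characters for every 3 bytes of data. A small Python sketch of the arithmetic follows. The function name predecoded_size is invented, and the estimate ignores line breaks and article headers, which add a little more.

```python
import base64


def predecoded_size(decoded_bytes):
    """Estimate the encoded size: 4 characters for every 3 bytes, rounded up."""
    return (decoded_bytes + 2) // 3 * 4


payload = bytes(3000)  # stand-in for a 3000-byte binary
print(predecoded_size(len(payload)), len(base64.b64encode(payload)))
```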


Troubleshooting

If you see that binaries are being decoded incorrectly, enable the "Keep unknown & invalid binaries" option, update again, and email me any UNKNOWN* and/or INVALID* files. Check first that they look like encoded binaries and not text files. Sometimes files are incorrectly named and are encoded twice. For example, a binary decoded as puppy.jpg should be named puppy.uue because it is still encoded with uuencode. These binaries will not be validated correctly.

If there is an error, send me the generated *.DBG files.


Some Ideas for future releases


Why is it Freeware?

Well, I don't like the shareware concept. All my programs are Freeware or Public Domain, mainly because it is a problem to track the registered users for each program, get the money and "protect" the programs. I think a "little" amount of money is not worth the extra work, and a "big" amount will make users crack the program. And hey! Wouldn't it be great if everyone in the world made their programs free? :-)


Program Support

For any bug report, suggestion or comment, send email to "Antonio Cordero <acordero@ts.es>" or check out my web homepage at http://www.eui.upm.es/~ccanto