WebJavaCopy 0.10
================

Description:
------------
WebJavaCopy is a Java application designed to copy files from the internet to
your hard drive. You are in complete control of which web pages are scanned
and which pages (and other files) are simply downloaded.

Requirements:
-------------
Java 1.1 runtime environment

This is NOT an applet - your web browser cannot run this Java application.

Installation:
-------------
If you've unzipped this (keeping the directory structure), it's installed.
Just run "java wjcp.App" from this directory. If you add this directory to
your CLASSPATH, you can run it from any directory. There are currently no
parameters.

Note: To keep the directory structure, be sure to use Info-ZIP's unzip, or
pkunzip with the -d parameter. Other unzip programs (e.g., WinZip) have
similar options; be sure to use them.

Usage:
------
Type the URL you wish to scan into the URL entry field. Currently this has to
be a web page address; in the future, it should also be possible to go
straight to a download. To submit it, press Enter in this field.

The HTML Engine will immediately start downloading the page. As it finds
files to download, it will start putting them in the "Links Found" list.
This happens in the background, so you can type in another URL - if you type
in a new URL before the previous one is finished, it is simply queued up in
the HTML queue.

* Feature: If a link is found more than once, it will not be put in the
  Links Found list a second time. In fact, this works whether or not the
  original link is still in the list - if you have already removed, saved,
  or loaded that link, it will not reappear! The list of seen links is saved
  in the configuration file. Currently, the only way to allow a link to
  reappear is to shut down WebJavaCopy, delete the configuration file, and
  restart. (An illustrative sketch of this idea appears at the end of this
  file.)

When you select one or more links, you may choose to Load, Save, or Remove
them.

- Load will only be enabled if all selected links are themselves HTML, HTM,
  or TXT files. It removes each link from the Links Found list and places
  them, in the same order, in the HTML queue. If the engine is not active,
  this will activate it.

- Save will let you save each selected link. For each link being saved, a
  dialog will pop up asking which directory to save the link in and what
  filename to save it as. This defaults to the last used directory and the
  filename in the URL.

- Remove will simply remove the URL. As above, the URL will not reappear
  should it be found again later. (Useful for getting rid of banner
  advertisements and the like.)

* Feature: If there are any links showing in the "Links Found" list when you
  shut down WebJavaCopy, they will be restored on startup. The HTML and File
  Save queues are saved and restored as well.

NOTES (read: known BUGS):
-------------------------
- The Settings menu doesn't do anything yet. You can toggle its settings all
  you want - it won't make any difference. On the other hand, the "Crawl
  Other Sites" setting IS saved between runs.
- The configuration file will most likely NOT be upgradable to newer
  versions.

Future:
-------
- URL field should allow files as well as HTML.
- Settings menu will do something useful.
- # of engines will be configurable (minimum of 1 each).
- Let the user bring up a dialog box with the seen list, to scan items,
  save them, or merely remove them from the seen list.
- Command-line parameter to start downloading/scanning.
- Command-line parameter to shut down when done.
- Automatically determine a description for each link.
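
Appendix (illustrative sketch):
-------------------------------
For the curious, here is a minimal sketch of how a persistent seen-list like
the one described under Usage could work. The class name, file format, and
method names below are assumptions for illustration only - they are NOT the
actual wjcp classes or configuration format.

    // Illustrative only: class/file names are assumptions, not the real
    // wjcp implementation. Idea: a "seen" set, saved across runs, that
    // suppresses duplicate links even after they leave the Links Found list.
    import java.io.*;
    import java.util.Enumeration;
    import java.util.Hashtable;

    public class SeenLinks {
        private Hashtable seen = new Hashtable();
        private File store;

        // Load any previously seen URLs from the store file, one per line.
        public SeenLinks(File store) throws IOException {
            this.store = store;
            if (store.exists()) {
                BufferedReader in = new BufferedReader(new FileReader(store));
                String line;
                while ((line = in.readLine()) != null) {
                    seen.put(line, Boolean.TRUE);
                }
                in.close();
            }
        }

        // Returns true the first time a URL is offered; false on any
        // repeat, even if the link was already removed, saved, or loaded.
        public boolean offer(String url) {
            if (seen.containsKey(url)) {
                return false;
            }
            seen.put(url, Boolean.TRUE);
            return true;
        }

        // Write the whole seen list back out, e.g. at shutdown.
        public void save() throws IOException {
            PrintWriter out = new PrintWriter(new FileWriter(store));
            for (Enumeration e = seen.keys(); e.hasMoreElements(); ) {
                out.println((String) e.nextElement());
            }
            out.close();
        }
    }

Deleting the store file would clear the list, which matches the current
behavior of deleting the configuration file to let links reappear.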