Option panel : Spider
- Accept cookies
Accept cookies sent by the remote server.
If cookies are refused, pages that depend on a session cookie ("session-generated" pages) will not be retrieved.
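To illustrate why session pages need cookies, here is a minimal Python sketch of what "accepting" a cookie amounts to: the value received in a Set-Cookie header is stored and sent back in the Cookie header of later requests. The function name cookie_header and the sample cookie are hypothetical, and this is not the engine's actual code.

```python
from http.cookies import SimpleCookie

def cookie_header(set_cookie_values):
    """Build the Cookie request header from received Set-Cookie values."""
    jar = SimpleCookie()
    for value in set_cookie_values:
        # load() parses one Set-Cookie value, including attributes such as Path
        jar.load(value)
    # Only name=value pairs are sent back; attributes stay on the client side
    return "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())

# Hypothetical session cookie sent by a server
print(cookie_header(["sessionid=abc123; Path=/"]))  # sessionid=abc123
```

A server that issued this cookie would typically refuse or redirect later requests that do not carry it, which is why such pages are missing from the mirror when cookies are refused.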
- Check document type
Define when the engine checks the document type.
The engine must know the document type in order to give each saved file the correct extension. For example, if a link named /cgi-bin/gen_image.cgi generates a GIF image, the saved file will be named "gen_image.gif" rather than "gen_image.cgi".
Avoid "never": without this check, files can be saved under the wrong type and the local mirror may be broken.
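The renaming described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the MIME_TO_EXT table and the local_name function are hypothetical, and the real engine knows many more types and has its own rules.

```python
import posixpath

# Hypothetical, deliberately small MIME type table (assumption for this sketch)
MIME_TO_EXT = {
    "image/gif": ".gif",
    "image/jpeg": ".jpg",
    "image/png": ".png",
    "text/html": ".html",
}

def local_name(url_path, content_type):
    """Return the local file name for a resource, based on its document type."""
    base = posixpath.basename(url_path)
    stem, _old_ext = posixpath.splitext(base)
    # Drop parameters such as "; charset=utf-8" and normalize case
    mime = content_type.split(";", 1)[0].strip().lower()
    new_ext = MIME_TO_EXT.get(mime)
    if new_ext is None:
        # Unknown type: keep the original name unchanged
        return base
    return stem + new_ext

print(local_name("/cgi-bin/gen_image.cgi", "image/gif"))  # gen_image.gif
```

If the type is never checked, the function has nothing to go on except the URL, so the file would keep its ".cgi" name and might not open correctly in the local mirror.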
- Parse java files
Should the engine parse .java files (Java classes) to look for the file names they reference?
This option is checked by default.
- Spider
Should the engine follow the remote robots.txt rules when they exist?
The default is "follow".
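As a rough illustration of what following robots.txt rules means, the Python sketch below uses the standard urllib.robotparser module to decide whether a URL may be fetched. The robots.txt content, the user agent name "ExampleSpider" and the example.com URLs are assumptions made for this example; the engine's own rule handling may differ.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content served by a remote site
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

def allowed(robots_txt, user_agent, url):
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

print(allowed(ROBOTS_TXT, "ExampleSpider", "http://www.example.com/index.html"))      # True
print(allowed(ROBOTS_TXT, "ExampleSpider", "http://www.example.com/private/a.html"))  # False
```

With the "follow" setting, a spider behaving this way would skip everything under /private/ on that site.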