AddWeb 6 submitted to an engine and said it was successful. How does AddWeb 6 know it was a success, and what exactly does successful mean?
AddWeb 6 has a database full of information on the sites it submits to. Here is the honest answer to the question of how a submission program processes a typical submission:
1. Look at the engine's rules and see if you are violating them. If it is a violation, skip this engine. Otherwise, continue.
2. Look up the domain name with your Internet provider's DNS server. If AddWeb 6 finds it, continue. Otherwise, fail.
3. Connect to the server of the engine you are submitting to. If AddWeb 6 can connect, proceed. Otherwise, fail.
4. Try to submit your data to the program on the server that accepts submissions. Wait for a response.
5. If the server responds to AddWeb 6, AddWeb 6 reads the response and compares it to good and bad result criteria.
6. If the results show a server failure or a submission failure, fail the submission.
7. If the submission passes all the above criteria, consider it a success.
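The seven steps above can be sketched roughly as follows. This is a minimal illustration, not AddWeb 6's actual code: the `Engine` fields, the `submit` function, and the three caller-supplied functions (`resolve`, `connect`, `post`) are hypothetical names chosen for the example, and treating a missing response as a failure is an assumption that the steps do not spell out.

```python
from dataclasses import dataclass


@dataclass
class Engine:
    name: str
    host: str
    # Hypothetical rule: words this engine refuses to list.
    banned_words: tuple = ()
    # Hypothetical "bad result" criteria for this engine's response page.
    bad_markers: tuple = ("submission failed", "server error")


def submit(engine, site_text, resolve, connect, post):
    """Return (ok, reason). resolve, connect and post are supplied by the caller."""
    # Step 1: skip the engine if its rules would be violated.
    if any(word in site_text.lower() for word in engine.banned_words):
        return False, "skipped: rule violation"
    # Step 2: DNS lookup of the engine's domain name.
    if not resolve(engine.host):
        return False, "failed: DNS lookup"
    # Step 3: connect to the engine's server.
    if not connect(engine.host):
        return False, "failed: could not connect"
    # Step 4: send the data and wait for a response.
    response = post(engine, site_text)
    if response is None:
        # Assumption: no response at all counts as a failure.
        return False, "failed: no response"
    # Steps 5 and 6: compare the response to the bad result criteria.
    if any(marker in response.lower() for marker in engine.bad_markers):
        return False, "failed: engine reported an error"
    # Step 7: nothing went wrong that we could detect.
    return True, "success"
```

Note that step 7 reports success whenever no known failure was detected. A response such as "Your URL was rejected" would pass this sketch, because that wording is not in `bad_markers`.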
Did the submission actually succeed? Most of the time, yes. However, the possibility exists that the server may have responded to AddWeb 6 with an error that AddWeb 6 could not process.
Why AddWeb 6 doesn't always catch failures and how we address it:
There are ways to fine-tune the error checking so that 99.9% of the errors are caught, and they are built into AddWeb 6. They are as follows:
To catch more of these errors, AddWeb 6 would have to read the page that the server returns and scan it for words that indicate an error. Doing so would catch more errors, but it would also severely slow down the submission process, because AddWeb 6 would have to download each response page in full, scan it, and then report. Sometimes the response pages redirect to other pages, doubling or even tripling the amount downloaded. Some response pages are as large as 2 or 3 megabytes. Suppose AddWeb 6 had to download only a 20K page for each submission. On a submission run of 1,500 engines, it would download 30 megabytes of response pages. Even with all of the simultaneous submissions AddWeb 6 does, on a typical dialup connection this would add 2 hours to the submission process.
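The arithmetic behind these figures can be checked with a short calculation. The page size and engine count come from the paragraph above. The line speeds are assumptions (the text only says "typical dialup"), as is counting 1K as 1,000 bytes. The function name `download_minutes` is made up for this example.

```python
PAGE_KB = 20      # response page size per submission, from the text
ENGINES = 1500    # engines in one submission run, from the text

total_kb = PAGE_KB * ENGINES   # total data downloaded, in kilobytes
total_mb = total_kb / 1000     # the same amount in megabytes


def download_minutes(kilobytes, line_kbps):
    """Minutes needed to move `kilobytes` over a line of `line_kbps` kilobits/s."""
    kilobits = kilobytes * 8
    return kilobits / line_kbps / 60


print(f"Total download: {total_mb:.0f} MB")
for kbps in (33.6, 56.0):  # assumed dialup speeds, not given in the text
    print(f"At {kbps} kbit/s: about {download_minutes(total_kb, kbps):.0f} minutes")
```

At an assumed 33.6 kbit/s the total comes to roughly two hours, which is consistent with the estimate above. Simultaneous submissions do not shorten this, since they all share the same line.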
Even then, it would only catch errors that match the keywords being searched for. It could also cause false failures, because those words may appear on a successful response page. For example, if we scanned for the word 'Error', a response page that said something to the effect of 'No Errors were found. Thank You' would be reported as a failure. So in order to be more accurate, AddWeb 6 would have to save the responses to your hard disk, where you would have to check them manually. This is clearly not an option.
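A few lines of code are enough to reproduce both weaknesses of keyword scanning. This is only a sketch: the function `looks_like_failure` and its one-word keyword list are hypothetical, and the second sample page is invented for the example.

```python
def looks_like_failure(page, keywords=("error",)):
    """Flag a response page as a failure if it contains any keyword."""
    text = page.lower()
    return any(keyword in text for keyword in keywords)


# A successful page that happens to contain the keyword: a false failure.
false_failure = looks_like_failure("No Errors were found. Thank You")

# A real failure worded without the keyword: it slips through unnoticed.
missed_failure = looks_like_failure("Your URL could not be added.")

print(false_failure, missed_failure)
```

The first call returns `True` even though the submission succeeded, and the second returns `False` even though it did not. Adding more keywords reduces the second problem but makes the first one worse.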
What we do to keep our database as clean and up to date as possible:
Our staff manually checks the forms, servers and response pages. Here is how:
Cyberspace HQ has a sort of 'spider' that routinely goes through our engine database and does the following:
1. Our spider checks to see if the website still exists. If it does not, it places it into a 'To Check' list.
2. If the site exists, our spider looks at the submission page on the site to make sure we have the correct submission information. If we do not, it places it into the 'To Check' list.
3. The spider takes all of the engines that did not get sent to the 'To Check' list and does a submission to them every week. The submission is done with a very high timeout value and saves the response pages.
4. The spider scans all of the response pages for a list of keywords that indicate a possible failure. It places all possible failures into the 'To Check' list, and all of the rest into a 'Success' folder.
5. People on our staff regularly go through the 'To Check' list and manually view the problems. If they can be corrected, they are fixed and placed in the engine database for the next update. If not, they are deleted from the database.
6. Every few weeks, our staff manually views the response pages that the spider deemed successful. If a success turns out to be an error, the engine is checked for possible correction or deleted from the engine database.
7. While people are checking and fixing current database issues, we also have people adding new resources that we find or are submitted to us.
8. Every few weeks (or more often) we release a new engine database with all corrections, deletions and additions.
When we find an error in a major engine, databases are updated immediately with the fix.
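The automated part of this routine (spider steps 1 through 4) amounts to sorting engines into two piles. The sketch below illustrates that sorting only. It is not Cyberspace HQ's spider: the `Check` record, the `triage` function, and the keyword list are hypothetical, and the network checks are assumed to have been done already and recorded as simple true/false values.

```python
from dataclasses import dataclass

# Hypothetical keywords that suggest a possible failure.
FAILURE_KEYWORDS = ("error", "failed", "invalid")


@dataclass
class Check:
    name: str
    site_exists: bool         # step 1: does the website still exist?
    form_matches: bool        # step 2: is our submission information correct?
    response_page: str = ""   # step 3: page saved from the weekly test submission


def triage(checks):
    """Split engines into a 'To Check' list and a 'Success' list."""
    to_check, success = [], []
    for check in checks:
        # Steps 1 and 2: missing sites and changed forms need a human.
        if not check.site_exists or not check.form_matches:
            to_check.append(check.name)
            continue
        # Step 4: scan the saved response page for failure keywords.
        page = check.response_page.lower()
        if any(keyword in page for keyword in FAILURE_KEYWORDS):
            to_check.append(check.name)
        else:
            success.append(check.name)
    return to_check, success
```

Everything after that is manual: staff work through the 'To Check' list and periodically review the 'Success' pile as well, which is what catches keyword matches that are wrong in either direction.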