- Path: sparky!uunet!noc.near.net!transfer.stratus.com!transfer.stratus.com!usenet
- From: dan@az.stratus.com (Dan Danz)
- Newsgroups: comp.sys.stratus
- Subject: Re: loss of end-of-file mark problem (LONG)
- Followup-To: comp.sys.stratus
- Date: 12 Jan 1993 21:06:27 GMT
- Organization: Stratus Computer Inc, Marlboro MA
- Lines: 260
- Distribution: world
- Message-ID: <1ivbsjINN2f2@transfer.stratus.com>
- References: <1993Jan7.131833.15374@porthos.cc.bellcore.com>
- NNTP-Posting-Host: bittersprings.az.stratus.com
- Keywords: TPF end_of_file runout cache salvage recreate_index verify_end_of_file LPF ESD
-
- [dante writes]
- > In the event of a VOS crash, I have heard that it is possible that VOS
- > can lose a file's end of file mark, thereby corrupting the file and
- > losing access to the file on reboot. The solution to the problem is to
- > Transaction Protect(TP) the file. I was wondering if anyone has heard
- > of the problem and if so, then to discuss the problem and its possible
- > solutions.
-
- Before we can understand potential solutions (and there is more than
- one), we need to understand all the players, such as Disk Cache, Disk
- Salvage, the end of file pointer, etc.
-
- DISK CACHE
-
- VOS maintains a disk cache in main memory in which it holds recently
- accessed blocks of disk. When programs (including portions of the
- operating system) request access to disk data, the cache manager first
- attempts to locate the data in the disk cache. If the data can be
- found there, the resultant access time can be several orders of
- magnitude faster than if the disk itself has to be accessed. When the
- data in the cache is changed, VOS rewrites the changed blocks back to
- disk at its convenience and according to an algorithm that includes
- measures to avoid needless writes to disk when the block is constantly
- changing. Note that blocks belonging to directories and
- sub-directories of the file system also participate in the caching
- scheme.
-
- In the event of a system interruption (which can occur, for example,
- because of such diverse things as extended power failures,
- software-detected inconsistencies in system behavior, or an operator
- inadvertently pressing the power switch), some in-memory disk blocks
- might not be flushed to disk. Frequently, it happens that new data
- blocks for a file are written to disk but the directory information
- for the file has not yet been updated at the time of an interruption.
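-
- The exposure described above can be sketched with a toy write-back
- cache. This is only an illustration in Python, not the VOS cache
- manager; the class, its methods, and the block names are all
- hypothetical.
-
```python
class WriteBackCache:
    """Toy write-back block cache (illustrative, not the VOS cache manager)."""

    def __init__(self):
        self.disk = {}      # block name -> bytes that survive an interruption
        self.cache = {}     # block name -> most recent bytes, held in memory
        self.dirty = set()  # cached blocks not yet written back to disk

    def write(self, block, data):
        # A write changes only the in-memory copy; disk is updated later.
        self.cache[block] = data
        self.dirty.add(block)

    def read(self, block):
        # Serve from memory when possible, otherwise go to disk.
        if block in self.cache:
            return self.cache[block]
        data = self.disk[block]
        self.cache[block] = data
        return data

    def flush(self, block=None):
        # Write one dirty block, or every dirty block, back to disk.
        targets = [block] if block is not None else list(self.dirty)
        for name in targets:
            if name in self.dirty:
                self.disk[name] = self.cache[name]
                self.dirty.discard(name)

    def interrupt(self):
        # A system interruption discards everything held only in memory.
        self.cache.clear()
        self.dirty.clear()


cache = WriteBackCache()
cache.write("data-block-3", b"new records")
cache.write("directory-entry", b"block count = 3")
cache.flush("data-block-3")   # the new data block reaches disk
cache.interrupt()             # the directory update never does
```
-
- After the interruption the data block is on disk, but the directory
- entry that describes it is not, which is the inconsistency the
- salvager has to deal with.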
-
- Note that if the system is stopped using the shutdown command, then
- the disk cache is flushed to disk before the system is truly stopped,
- and the integrity of the file system is not impacted. Note also that
- the integrity of files protected by the VOS Transaction Protection
- Facility (TPF) will not be affected by system interruptions.
-
- DISK SALVAGE
-
- When the system is restarted following such an interruption, the
- "salvage_disk" command is automatically executed for each disk volume
- that is mounted. The salvage_disk operation is time-consuming and
- disk-intensive, and it must complete before any new operations on the
- files are allowed.
-
- One of the tasks of the salvager is to detect and repair directory
- information about files. It is capable of detecting when, for
- example, new data blocks have been allocated to a file and written to
- the disk but are not reflected in the count of the number of blocks in
- the directory entry for the file.
-
- END OF FILE POINTER
-
- The salvager, however, does not attempt to verify another directory
- item: last_record_number. This field is used differently by the
- various file types, but is colloquially referred to as the "end of
- file pointer". It indicates to the file system where to begin when
- writing new records and beyond which the application programs should
- not be allowed to read data. This value is typically updated on disk
- only when the file is closed or a runout is performed. In the interim,
- the information is kept only in VOS memory.
-
- Data can be lost when the value of this field does not reflect the
- true end of file. Consider the following example.
-
- 1. An application program opens a file for sequential output (or
- update or append) operations.
-
- 2. It writes 200 records to the file, which occupy 1.5 blocks of disk
- space.
-
- 3. The file is closed (or s$control RUNOUT_OPCODE is used to flush
- blocks of data and the directory entries to disk). The value of
- the last_record_number field at this point is that associated with
- a point in the file in the middle of the second block immediately
- beyond the 200th record.
-
- 4. The file is re-opened for update or append operations and 300 more
- records are written to the file. Two new blocks are allocated to
- the file to hold the records.
-
- 5. A system interruption occurs. The salvager is likely to report
- that the number of blocks (2) in the directory entry for the file
- is wrong and change it to the correct value (4). However, the
- last_record_number field retains the same value as in step #3 above
- (someplace in the middle of the second block).
-
- 6. The following scenarios are typical of what happens next:
-
- a.) An application program that sequentially reads the file
- receives e$end_of_file when attempting to access any
- record beyond the 200th. The dump_file command, however,
- displays all the blocks of the file, and records 201
- through (say) 465 can be seen. The first part of the
- next record 466 can be seen at the end of the last block.
- In some rare cases, the last block holding data will be
- followed by one or more blocks containing only bytes of
- FFx (this occurs when the block has been allocated but
- never written).
-
- b.) An application program that attempts to write new records
- to the file overwrites records beginning at record 201.
-
- c.) An application program that accesses the file using keyed
- reads via an index receives the error code
- e$invalid_record_number when it attempts to access
- records beyond the 200th, or the record returned is one
- of the records written by (b.) above.
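-
- The six steps above can be replayed in a small Python model. It is a
- sketch under stated assumptions, not VOS code: the class and field
- names are made up, RECORDS_PER_BLOCK is chosen only so that 200
- records occupy about 1.5 blocks, and all 500 records are assumed to
- have reached the disk before the interruption.
-
```python
import math

RECORDS_PER_BLOCK = 134   # assumed, so that 200 records occupy about 1.5 blocks


class ToyFile:
    def __init__(self):
        self.records_on_disk = []   # record data that reached the disk
        self.dir_block_count = 0    # directory entry: number of blocks
        self.dir_last_record = 0    # directory entry: last_record_number
        self.mem_last_record = 0    # current end of file, held only in memory

    def blocks_in_use(self):
        return math.ceil(len(self.records_on_disk) / RECORDS_PER_BLOCK)

    def write(self, record):
        # New records go at the end of file as the file system believes it.
        pos = self.mem_last_record
        if pos < len(self.records_on_disk):
            self.records_on_disk[pos] = record   # overwrites existing data
        else:
            self.records_on_disk.append(record)
        self.mem_last_record = pos + 1

    def close(self):
        # Close (or runout) brings the directory entry up to date.
        self.dir_last_record = self.mem_last_record
        self.dir_block_count = self.blocks_in_use()

    def interrupt_and_salvage(self):
        # The salvager corrects the block count from what is on disk ...
        self.dir_block_count = self.blocks_in_use()
        # ... but last_record_number keeps its stale on-disk value.
        self.mem_last_record = self.dir_last_record

    def read_sequentially(self):
        # Readers are not allowed past last_record_number.
        return self.records_on_disk[: self.dir_last_record]


f = ToyFile()
for n in range(1, 201):
    f.write(f"record {n}")           # steps 1-2
f.close()                            # step 3
blocks_after_close = f.dir_block_count
for n in range(201, 501):
    f.write(f"record {n}")           # step 4
f.interrupt_and_salvage()            # step 5
visible = f.read_sequentially()      # step 6a: stops after record 200
f.write("new record")                # step 6b: lands on top of record 201
```
-
- In this model the block count goes from 2 to 4 after the salvage,
- the reader still sees only 200 records, and the next write replaces
- what used to be record 201.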
-
- Posts (by Tom Moser, Dan Swartzendruber, Walt Mankowski) have offered
- some possible solutions, and I'd like to put the total set in
- perspective.
-
- EMERGENCY SHUTDOWN
-
- VOS Release 11.1 includes a feature called Emergency Shutdown (ESD);
- if the operating system is about to experience a system interruption
- because of software detected conditions, it first tries to invoke ESD,
- which attempts a sequence of events similar to an actual shutdown;
- that is, it tries to complete all outstanding I/O, including writing
- out the file partition bit maps on all disks, and updating the labels
- on all mounted disks. The reboot process does not need to salvage any
- disks that were shut down in this orderly fashion, thus eliminating
- the time it would normally take to salvage them. It also ensures that
- the end-of-file pointer information is recorded properly in the
- directory entry.
-
- ESD takes a conservative approach, however; before flushing any of the
- disk cache, it validates the integrity of the software and hardware
- that will be used. (It doesn't want to flush the cache if the reason
- for the crash is that the disk cache control structures have been
- corrupted, for example.)
-
- ESD is not (and cannot be) invoked for extended power failures,
- emergency power-off, the failure of a simplexed component, or in the
- rare case of failure of both partners of a duplexed component.
-
- RUNOUT
-
- The application program(s) writing the file can ensure that all the
- blocks of the file, its indexes, and the directory entry have been
- written to disk by calling s$control with the RUNOUT_OPCODE. This
- feature is available in all releases of VOS and can be done by any
- process having write access to the file, which suggests a user-written
- daemon could periodically flush some critical files, perhaps based on
- activity and/or time between flushes. This technique narrows the
- window in which the problem can occur, but does not close it entirely.
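-
- As a rough sketch of the policy such a daemon might apply, the Python
- below picks the files that are due for a runout based on activity
- and on time since the last flush. Everything here is hypothetical
- (the FileActivity record, the thresholds, the file names), and the
- runout itself is deliberately left out: on VOS it would be the
- s$control call with RUNOUT_OPCODE, whose calling sequence is not
- covered in this post.
-
```python
from dataclasses import dataclass


@dataclass
class FileActivity:
    name: str                     # hypothetical file identifier
    writes_since_runout: int      # records written since the last flush
    seconds_since_runout: float   # time elapsed since the last flush


def files_due_for_runout(files, max_writes=100, max_seconds=60.0):
    """Return the names of files whose exposure exceeds either threshold."""
    due = []
    for f in files:
        if f.writes_since_runout == 0:
            continue  # nothing new would be lost, so skip the flush
        if (f.writes_since_runout >= max_writes
                or f.seconds_since_runout >= max_seconds):
            due.append(f.name)
    return due


activity = [
    FileActivity("orders", writes_since_runout=250, seconds_since_runout=5.0),
    FileActivity("audit_log", writes_since_runout=3, seconds_since_runout=90.0),
    FileActivity("archive", writes_since_runout=0, seconds_since_runout=3600.0),
    FileActivity("scratch", writes_since_runout=10, seconds_since_runout=10.0),
]
due = files_due_for_runout(activity)
```
-
- Lower thresholds shrink the window of exposure at the cost of more
- disk writes; no setting removes the window altogether.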
-
- VERIFY_END_OF_FILE
-
- A tool called verify_end_of_file was developed to detect (and correct)
- many cases of incorrect end-of-file pointers and has been shipped to
- all sites since VOS release 6.3. Until recently, it was an
- undocumented tool and was shipped in the >system>maint_library.
- Beginning in release 11.6, verify_end_of_file is in
- >system>command_library and has been documented in Stratus manuals.
-
- The tool starts at the last recorded end-of-file position
- and attempts to read records beyond this point. If it's successful,
- it is capable of updating the directory information. For it to be
- totally successful, however, all data blocks in the disk cache must
- have been written to disk; if not, the tool attempts to recover as many
- whole records as it is able to read.
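-
- The general idea of scanning forward from the recorded end of file
- can be sketched in Python. This is not how verify_end_of_file is
- implemented, and the record layout is invented for the example: each
- record is assumed to be a 2-byte big-endian length followed by that
- many bytes, and never-written space is assumed to read back as FFx
- bytes.
-
```python
import struct


def record(payload: bytes) -> bytes:
    """Build one record in the assumed layout: 2-byte length, then payload."""
    return struct.pack(">H", len(payload)) + payload


def scan_past_eof(data: bytes, recorded_eof: int) -> int:
    """Return the byte offset just past the last whole record found."""
    pos = recorded_eof
    while pos + 2 <= len(data):
        (length,) = struct.unpack_from(">H", data, pos)
        if length == 0xFFFF:
            break                 # allocated but never written
        if pos + 2 + length > len(data):
            break                 # only part of this record reached the disk
        pos += 2 + length         # a whole record: move the end of file past it
    return pos


# What reached the disk: three whole records and the start of a fourth.
image = (record(b"r200") + record(b"r201") + record(b"r202")
         + record(b"r203-partial")[:5])
recorded_eof = len(record(b"r200"))         # stale pointer, just past r200
new_eof = scan_past_eof(image, recorded_eof)
```
-
- Here the scan moves the end of file past r201 and r202 and stops at
- the partial record, which mirrors the "as many whole records as it
- is able to read" behavior described above.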
-
- In addition, if the end-of-file pointer needs to be adjusted and the
- file has indexes, it's probable that the index contents and the data
- are not consistent, so any indexes on the file must be recreated.
-
- RECREATE_INDEX
-
- For embedded indexes (where the file contains all the information used
- in the index keys), a recreate_index tool has been provided (also
- since VOS release 6.3). It automates the manual steps of
- display_file_status (to get the index parameters), delete_index, and
- create_index that are required to rebuild an index.
-
- VOS had a limitation of only allowing one index to be created at a
- time; this restriction was removed in VOS release 10.3, and
- recreate_index was rewritten to spawn multiple processes, greatly
- shortening the time required to rebuild multiple indexes on large
- files.
-
- Starting with VOS release 11.6, recreate_index becomes a fully-
- documented user command in >system>command_library.
-
- LOG PROTECTED FILES (LPF)
-
- A future release of VOS (probably release 12) will contain a feature
- called Log Protected Files. Here's a preview of this feature.
-
- A log protected file is conceptually an intermediate step between
- ordinary VOS files and transaction protected files. It guarantees
- data consistency after a system interruption, but does not permit an
- application to combine multiple operations into a single transaction.
- Instead, each s$subroutine call is treated as a separate, atomic
- transaction.
-
- The protection provided to log protected files is automatic; existing
- applications can be changed to use log protection without
- redevelopment or internal changes by using a new command to enable the
- log protected attribute for the appropriate files. Such files are
- then called log protected files.
-
- Each s$subroutine call that modifies a log protected file makes sure
- that all the resulting changes are written to a special log file
- before any of the changes are written to the actual file. This
- includes changes to file indexes and control information such as
- end-of-file pointers. When VOS recovers from a system interruption,
- the logged modifications that are not marked as completed are
- re-applied to the appropriate files before the file system salvage and
- recovery operations are performed.
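-
- The log-first ordering described above can be illustrated with a
- small Python model. It is only a sketch of the general technique,
- not the LPF design: the class and method names are hypothetical, a
- file is reduced to a dictionary, and each log entry is assumed to
- reach disk the moment it is appended, which the real feature does
- not promise.
-
```python
class LogProtectedStore:
    """Toy model of log-before-write with recovery by re-applying the log."""

    def __init__(self):
        self.log = []     # durable log: entries with 'changes' and 'completed'
        self.files = {}   # durable file contents: item name -> value

    def modify(self, changes, interrupt_before_apply=False):
        # 1. Record every change of this call in the log first.
        entry = {"changes": dict(changes), "completed": False}
        self.log.append(entry)
        if interrupt_before_apply:
            return                       # simulated system interruption
        # 2. Only then change the actual file.
        self.files.update(changes)
        # 3. Mark the logged modification as completed.
        entry["completed"] = True

    def recover(self):
        # Re-apply logged modifications that were never marked completed.
        for entry in self.log:
            if not entry["completed"]:
                self.files.update(entry["changes"])
                entry["completed"] = True


store = LogProtectedStore()
store.modify({"record 200": "data", "last_record_number": 200})
store.modify({"record 201": "data", "last_record_number": 201},
             interrupt_before_apply=True)
before_recovery = dict(store.files)
store.recover()
```
-
- Because the end-of-file value travels in the same log entry as the
- record it belongs to, recovery restores both together or neither.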
-
- Log protection does not guarantee that all changes up to the very last
- s$subroutine call will be present after a system interruption, but it
- does guarantee that all changes up to and including some relatively
- recent s$subroutine call will be present, and that no modifications
- made after that call will be present. "Relatively recent" means that
- the time will vary dynamically depending upon the setting of the
- existing module tuning parameter "modified grace time". This
- parameter may range from 1 second to 1 hour, but is typically on the
- order of 1 to 4 minutes. If desired, the application can cause
- changes to be logged immediately by using the runout s$control
- operation.
-
-
- TRANSACTION PROTECTION (TP)
-
- Transaction Protection remains the ultimate in file integrity
- protection, especially when the application requires atomic operations
- on data that resides in multiple records and/or files. TP-protected
- files are not exposed to any of the dangers discussed above; however,
- the TP solution is the most expensive of all in terms of resource use
- and it requires active application program participation in the process.
-
-
- SUMMARY
-
- System failures, whatever the cause (hardware, software,
- environmental, operator error) can damage open output files. However,
- there are various levels of protection and correction, depending on
- the release of the operating system and the nature of the application.
-
- Stratus Systems Engineers (there's at least one assigned to most
- sites) can work with you to design your application, explain the
- tradeoffs involved, and help ensure that the integrity of your files
- is adequately protected.
-
- If you have a specific question about any of the tools or techniques
- involved, I urge you to contact your Customer Assistance Center.
-
- --
- L. W. "Dan" Danz (WA5SKM) VOS Mail: Dan_Danz@vos.stratus.com
- Sr Consulting Software SE NeXT Mail: dan@az.stratus.com
- Customer Assistance Center Voice Mail/Pager: (602) 852-3107
- Telecommunications Division Customer Service: (800) 828-8513
- Stratus Computer, Inc. 4455 E. Camelback #115-A, Phoenix AZ 85018
-