NetNews Usenet Archive 1993 #1


Path: sparky!uunet!news.claremont.edu!nntp-server.caltech.edu!SOL1.GPS.CALTECH.EDU!CARL
From: carl@SOL1.GPS.CALTECH.EDU (Carl J Lydick)
Newsgroups: comp.os.vms
Subject: Re: HELP!!! Security problem for gurus. [Directories]
Date: 10 Jan 1993 22:39:46 GMT
Organization: HST Wide Field/Planetary Camera
Lines: 75
Distribution: world
Message-ID: <1iq8jiINNb7n@gap.caltech.edu>
References: <1iou2eINN372@gap.caltech.edu>,<86009@zl2tnm.gen.nz>
Reply-To: carl@SOL1.GPS.CALTECH.EDU
NNTP-Posting-Host: sol1.gps.caltech.edu

In article <86009@zl2tnm.gen.nz>, don@zl2tnm.gen.nz (Don Stokes) writes:
>carl@SOL1.GPS.CALTECH.EDU (Carl J Lydick) writes:
>> Agreed. That's one of the many uses of standalone BACKUP. Shut the system
>> down *NOW*, before you run the chance of rendering the disk unreadable.
>
>Not necessarily. It's still running. Start looking. SHOW ERROR. Run
>Did a file get overwritten? When? What's running that might have done it?
>Is the disk structure intact? ANALYZE/DISK. We need to know now. Time
>is money.

All of these things can be found after a crash, provided the system disk
hasn't been corrupted beyond the ability of VERIFY to deal with it, and
provided you've got a dumpfile on the disk. Please note the code that writes
the dump file is in an entirely separate thread from the normal disk XQP, so
there's a good chance the dump file will be valid even if the XQP is somehow
hosed.

>I've already outlined a case where the very act of shutting down was one
>of the causes of the problem. You're advocating taking an action before
>diagnosing the problem. If it looks serious, I can at least stabilise
>things by chucking the users off and stopping queues. If it's really bad
>I'm going to have to restore from backup anyway.

And if it's something that is continuing to corrupt the system disk, the
longer you fool around trying to diagnose the problem while running the
corrupted system, the more likely the system disk will become too corrupted
for you to ever successfully diagnose the problem.

>... yeah, you get to find out, three hours and many thousand $ of lost
>production time later, that you accidentally deleted a non-critical file
>and could have fixed the problem in less than two minutes. I'd rather
>you explained that to the CEO.

Perhaps you can tell me which facility bugchecks because "you accidentally
deleted a non-critical file?" I'd appreciate it if you'd argue about the way
things actually work rather than against straw men of your own devising.

>Sites that care about their production have people available. I've
>crawled out of bed in the wee smalls more times than I care to count to
>attend to sick systems (usually hardware problems). Operators were
>instructed that if they didn't know what to do, they were to call; they'd
>get a far bigger blasting if they didn't than the odd curse they'd get for
>digging me out unnecessarily.

I'm also available 24 hours a day. But I'm not on site all that time. It can
take up to half an hour for me to get to the computer. A lot can happen in
half an hour.

>The nature of bugcheck code is that it must be simpleminded.
>
>Taking an action that may be destructive (and rebooting onto a sick system
>disk is a Bad Thing), may render a recoverable situation unrecoverable,
>or even just increases the downtime required to fix the problem, is not
>an appropriate response.

Who said anything about "rebooting onto a sick system disk"? Whether or not
that happens automatically is controlled by the SYSGEN parameter BUGREBOOT.

>> Fine. Then perhaps we're simply arguing about just how severe the problem must
>> be before the system takes matters out of our hands.
>> It might be useful if VMS
>> had more than two classes of BUGCHECKS, and allowed the system manager to set a
>> SYSGEN parameter that specified the lowest class that was considered fatal.
>
>Maybe, maybe not. I like how it is.

Then why are you bitching about it?
--------------------------------------------------------------------------------
Carl J Lydick | INTERnet: CARL@SOL1.GPS.CALTECH.EDU | NSI/HEPnet: SOL1::CARL

Disclaimer: Hey, I understand VAXen and VMS. That's what I get paid for. My
understanding of astronomy is purely at the amateur level (or below). So
unless what I'm saying is directly related to VAX/VMS, don't hold me or my
organization responsible for it. If it IS related to VAX/VMS, you can try to
hold me responsible for it, but my organization had nothing to do with it.