- Newsgroups: comp.sys.sun.admin
- Path: sparky!uunet!destroyer!cs.ubc.ca!van-bc!tfic.bc.ca!tbr
- From: tbr@tfic.bc.ca (Tom Rushworth (V))
- Subject: Re: DAT's on 4.1.2
- Message-ID: <1992Nov12.234219.21889@tfic.bc.ca>
- Summary: 4.1.1 worked, 4.1.2 doesn't
- Organization: Timberline Forest Inventory Consultants
- References: <2592@bigfoot.first.gmd.de> <BxJpsC.68J@immd4.informatik.uni-erlangen.de>
- Date: Thu, 12 Nov 1992 23:42:19 GMT
- Lines: 122
-
- We've just upgraded our machines from SunOS 4.1.1 to 4.1.2, and have found
- a problem with our Archive Python DAT drive. We've tried changing machines,
- cables and drives, but the problem persists. It looks like the problem is
- in the 4.1.2 SCSI tape drivers - has anyone else seen this? Does anyone
- have a suggestion?
-
- The problem is that at some point (many MB into the backup) we get what
- appears to be a SCSI renegotiation of synchronous transfer (possibly
- after a SCSI bus reset?) that the 4.1.2 SCSI driver treats as an error.
- In the write case, all we see before the error is the console message
- that the target is now synchronous. I don't know how to find out what
- caused it to renegotiate. Gory details are below....
-
- We have the st_conf.c entry for the DAT drive as follows:
- ---------------------------------------------------------
- /* Local addition for ArchiveST DAT drive systems */
- /* Added 1992 Nov 09 - tbr */
- /* Modifications Copyright 1992 Archive Corporation */
- {
- "ArchiveST 4mm DAT/DAT-DC", 14, "ARCHIVE Python",
- 0x30, 512,
- (ST_KNOWS_EOD | ST_BSF | ST_BSR | ST_VARIABLE),
- 5000, 5000,
- { 0, 0, 0, 0 }, { 0, 0, 0, 0 }
- },
- ---------------------------------------------------------
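A quick sanity check on the entry above, as a minimal sketch. It assumes that the second field (14) is the number of leading characters of the drive's inquiry vendor/product string that the st driver compares against "ARCHIVE Python"; that reading of the field is an assumption, and the dictionary keys below are illustrative labels, not names taken from the SunOS headers.

```python
# Hypothetical mirror of the st_conf.c entry; key names are illustrative only.
ENTRY = {
    "name": "ArchiveST 4mm DAT/DAT-DC",
    "vid_length": 14,          # second field of the entry
    "vid": "ARCHIVE Python",   # third field of the entry
    "type": 0x30,
    "block_size": 512,
    "read_retries": 5000,
    "write_retries": 5000,
}


def vid_length_ok(entry):
    """True if the declared length equals the length of the id string."""
    return entry["vid_length"] == len(entry["vid"])


def matches_inquiry(entry, inquiry_id):
    """Assumed matching rule: compare only the first vid_length characters."""
    prefix = entry["vid"][: entry["vid_length"]]
    return inquiry_id.startswith(prefix)
```

If the length and the string disagreed, the entry might never match the drive, so this is worth ruling out before blaming the driver. Here the two agree (the string is 14 characters long).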
-
- The following was extracted from /var/adm/messages, with the dates
- removed, "virgil vmunix" shortened to "vv" and some spacing compressed.....
-
- 17:26 - machine booted with new kernel, DAT configured for SCSI-2
-
- 17:26:49 vv: SunOS Release 4.1.2 (VIRGIL) #1: Mon Nov 9 14:43:42 PST 1992
- ...
- 17:26:49 vv: esp0 at SBus slot 0 0x800000 pri 3
- 17:26:49 vv: esp0: Target 0 now Synchronous at 4.0 mb/s max transmit rate
- 17:26:49 vv: sd3 at esp0 target 0 lun 0
- 17:26:49 vv: sd3: <Wren V 94181-702 cyl 1530 alt 2 hd 15 sec 52>
- 17:26:49 vv: esp0: Target 4 now Synchronous at 4.0 mb/s max transmit rate
- 17:26:49 vv: st0 at esp0 target 4 lun 0
- 17:26:49 vv: st0: <ArchiveST 4mm DAT/DAT-DC>
- ...
-
- 17:27 - insert tape, try "mt -f /dev/rst[01] status" to check message
-
- 17:27:47 vv: esp0: Target 4 now Synchronous at 4.0 mb/s max transmit rate
-
- 17:31 - start (the world's biggest, ugliest) backup shell script
-
- 17:31:35 vv: esp0: Target 4 now Synchronous at 4.0 mb/s max transmit rate
-
- 20:28 - office deserted - backup dies with write error somewhere in
- last file to be backed up (approx 650M written
- successfully, in earlier files)
-
- 20:28:18 vv: esp0: Target 4 now Synchronous at 4.0 mb/s max transmit rate
- 20:28:18 vv: st0: Error for command 'write', Error Level: 'Fatal'
- 20:28:18 vv: Block: 2404 File Number: 6
- 20:28:18 vv: Sense Key: Media Error
- 20:28:18 vv: Vendor (ArchiveST 4mm DAT/DAT-DC) Unique Error Code: 0x3b
- 20:28:18 vv: esp0: Target 4 now Synchronous at 4.0 mb/s max transmit rate
- 20:28:18 vv: st0: Error for command 'write file mark', Error Level: 'Fatal'
- 20:28:18 vv: Block: 2404
- 20:28:18 vv: Sense Key: Media Error
- 20:28:18 vv: Vendor (ArchiveST 4mm DAT/DAT-DC) Unique Error Code: 0x3b
-
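To check whether every fatal st0 error is immediately preceded by a "now Synchronous" message, the log can be scanned mechanically. The following is a minimal sketch that assumes lines in the compressed "HH:MM:SS vv: message" form used in this post (without the leading "- "); the function name and the sample list are made up for illustration.

```python
import re

LINE_RE = re.compile(r"^(\d\d:\d\d:\d\d) vv:\s*(.*)$")
SYNC_RE = re.compile(r"^esp0: Target (\d+) now Synchronous")
ERR_RE = re.compile(r"^st0: Error for command '([^']+)'")


def sync_then_error(lines):
    """Return (time, target, command) for each st0 error whose preceding
    log line is a 'now Synchronous' message with the same timestamp."""
    hits = []
    last_sync = None  # (time, target) if the previous line was a sync message
    for raw in lines:
        m = LINE_RE.match(raw.strip())
        if not m:
            last_sync = None
            continue
        when, msg = m.groups()
        sync = SYNC_RE.match(msg)
        err = ERR_RE.match(msg)
        if sync:
            last_sync = (when, int(sync.group(1)))
            continue
        if err and last_sync and last_sync[0] == when:
            hits.append((when, last_sync[1], err.group(1)))
        last_sync = None
    return hits


# Lines copied from the excerpt above.
SAMPLE = [
    "17:31:35 vv: esp0: Target 4 now Synchronous at 4.0 mb/s max transmit rate",
    "20:28:18 vv: esp0: Target 4 now Synchronous at 4.0 mb/s max transmit rate",
    "20:28:18 vv: st0: Error for command 'write', Error Level: 'Fatal'",
    "20:28:18 vv: Block: 2404 File Number: 6",
    "20:28:18 vv: Sense Key: Media Error",
    "20:28:18 vv: esp0: Target 4 now Synchronous at 4.0 mb/s max transmit rate",
    "20:28:18 vv: st0: Error for command 'write file mark', Error Level: 'Fatal'",
]

for hit in sync_then_error(SAMPLE):
    print(hit)
```

On the sample, both the 'write' and the 'write file mark' errors are reported as following a renegotiation message for target 4 in the same second. That shows correlation only; it does not say what triggered the renegotiation.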
- 06:30 - next morning - try reading last (supposedly) successfully
- written file with the following results:
-
- 06:28:25 vv: esp0: Target 4 now Synchronous at 4.0 mb/s max transmit rate
- - much reading happens ..., then BOOM!
- 07:13:46 vv: esp0: Disconnected command timeout for Target 4 Lun 0
- 07:13:46 vv: st0: transport completed with timeout
- 07:13:46 vv: st0: attempting a device reset
- 07:13:46 vv: st0: SCSI transport failed: reason 'timeout': giving up
- 07:13:53 vv: esp0: Target 4 didn't disconnect after sending COMMAND COMPLETE
- 07:13:53 vv: st0: transport completed with tran_err
- 07:13:53 vv: st0: attempting a device reset
- 07:13:53 vv: st0: attempting a bus reset
- 07:13:53 vv: esp0: spurious interrupt
- 07:13:56 vv: esp0: ILLEGAL bit set
- 07:13:56 vv: State=SELECT_SNDMSG Last State=FREE
- 07:13:56 vv: Latched stat=0x16<XZERO,MSG,CD> intr=0x40<ILL> fifo 0x20
- 07:13:56 vv: last msg out: <unknown msg 0xff>; last msg in: <unknown msg 0xff>
- 07:13:56 vv: DMA csr=0x96400210<EN,INTEN>
- 07:13:56 vv: addr=fff00000 last=fff00000 last_count=1
- 07:13:56 vv: Cmd dump for Target 0 Lun 0:
- 07:13:56 vv: cdb=[ 0xa 0x0 0x0 0x50 0x10 0x0 0x0 0x0 0x0 0x0 ]
- 07:13:56 vv: pkt_state 0x0 pkt_flags 0x0 pkt_statistics 0x0
- 07:13:56 vv: cmd_flags=0x23 cmd_timeout 35
- 07:13:56 vv: Mapped Dma Space:
- 07:13:56 vv: Base = 0x6000 Count = 0x2000
- 07:13:56 vv: Transfer History:
- 07:13:56 vv: Base = 0x6000 Count = 0x0
- 07:13:56 vv: current phase 0x60=SELECT_SNDMSG stat=0x16 0x0 0x0
- 07:13:56 vv: current phase 0x23=SYNCHOUT stat=0x16 0x2d 0xf
- 07:13:56 vv: current phase 0x21=PREEMPTED stat=0x0 0x4 0x0
- 07:13:56 vv: current phase 0x60=SELECT_SNDMSG stat=0x0 0x4 0x0
- 07:13:56 vv: current phase 0x23=SYNCHOUT stat=0x0 0x2d 0xf
- 07:13:56 vv: current phase 0x1c=RESET stat=0x0 0x10
- 07:13:56 vv: current phase 0x1c=RESET stat=0x0 0x7
- 07:13:56 vv: current phase 0x1c=RESET stat=0x0 0x10
- 07:13:56 vv: current phase 0x1c=RESET stat=0x13 0x7
- 07:13:56 vv: current phase 0x5=MSG_IN stat=0x13 0x0
- 07:13:56 vv: current phase 0x27=STATUS stat=0x13 0x2
- 07:13:56 vv: current phase 0xb=CMD_CMPLT stat=0x13
- 07:13:56 vv: current phase 0x60=SELECT_SNDMSG stat=0x10 0x4 0x0
- 07:13:56 vv: current phase 0x23=SYNCHOUT stat=0x10 0x2d 0xf
- 07:13:56 vv: current phase 0xb=CMD_CMPLT stat=0x17 0x2000
- 07:13:56 vv: current phase 0x27=STATUS stat=0x17 0x0
- 07:13:56 vv: sd3: SCSI transport failed: reason 'reset': retrying command
- 07:13:56 vv: st0: transport completed with reset
- 07:13:56 vv: esp0: Target 0 now Synchronous at 4.0 mb/s max transmit rate
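The command dump in the middle of that mess is for Target 0 (the disk, sd3), not the tape, and it can be decoded by hand. Below is a minimal sketch that assumes the first six bytes are a standard SCSI 6-byte WRITE for a direct-access device (opcode 0x0a, 21-bit block address, one-byte block count where 0 means 256) and that the disk uses 512-byte blocks; the function names are made up for illustration.

```python
def parse_cdb(text):
    """Turn '[ 0xa 0x0 ... ]' from the driver dump into a list of ints."""
    return [int(tok, 16) for tok in text.strip("[] ").split()]


def decode_write6(cdb):
    """Decode a 6-byte WRITE command for a direct-access (disk) device."""
    if cdb[0] != 0x0A:
        raise ValueError("not a WRITE(6) opcode: 0x%02x" % cdb[0])
    lba = ((cdb[1] & 0x1F) << 16) | (cdb[2] << 8) | cdb[3]
    blocks = cdb[4] or 256  # a count of 0 means 256 blocks
    return {"lun": cdb[1] >> 5, "lba": lba, "blocks": blocks}


cdb = parse_cdb("[ 0xa 0x0 0x0 0x50 0x10 0x0 0x0 0x0 0x0 0x0 ]")
info = decode_write6(cdb)
print(info, hex(info["blocks"] * 512))
```

Under those assumptions the dump is a 16-block write to block 80 on lun 0, and 16 x 512 bytes equals the 0x2000 shown as the mapped DMA count. That suggests the dumped command was an ordinary disk write caught by the bus reset rather than the tape command that started the trouble.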
-
- I'm more concerned about being able to write the tape than read it at the
- moment (who needs to read a backup anyway? :)), since I suspect that if I can
- find and fix the write problem the read problem may well go away too.
-
- Any ideas or suggestions appreciated! Thanks.
- ----
- Tom Rushworth (604) 733-0731 [FAX: 733-0634] | uunet!ubc-cs!van-bc!tacitus!tbr
- Timberline Forest Inventory Consultants | or: tbr@tfic.bc.ca
-