StorageExpress(TM) System: Troubleshooting Notes from Tech Support
WHAT TO DO IF YOU RECEIVE A DAEMON ERROR
Two actions resolve 80% of the daemon errors encountered when using
Exabyte drives:
* Install adequate memory in the StorageExpress CPU.
* Update the tape drive firmware.
The latest firmware for Exabyte drives is 06J0 (8500C drives) and 06L0
(8500 drives). This new firmware fixes daemon errors caused by an
undetected session-end marker. Other causes of daemon errors include
sub-standard or exhausted media, poorly maintained tape heads, and
unavailable buffers. To prevent daemon errors, or as the first step in
resolving them, Intel suggests that all StorageExpress system customers
using Exabyte 8500 and 8500C tape drives update the firmware on the
drive and ensure that adequate memory is installed in the
StorageExpress system. 8 MB of RAM is adequate for single-stream
systems; 16 MB of RAM is required for 2 to 4 streams.
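The memory guideline above can be written as a small lookup. This is
only an illustrative sketch: the function names are hypothetical, and
the notes say nothing about systems with more than 4 streams, so the
sketch rejects those rather than guess.

```python
def required_ram_mb(streams: int) -> int:
    """Return the RAM (in MB) the notes call for at a given stream count."""
    if streams == 1:
        return 8          # single-stream systems
    if 2 <= streams <= 4:
        return 16         # 2 to 4 streams
    # Assumption: anything else is outside what the notes describe.
    raise ValueError("the notes only cover 1 to 4 streams")


def has_adequate_ram(installed_mb: int, streams: int) -> bool:
    """True if the installed RAM meets the guideline for this stream count."""
    return installed_mb >= required_ram_mb(streams)
```

For example, has_adequate_ram(8, 2) is False, which matches the
dual-stream, 8 MB case singled out later in these notes as likely to
run out of memory.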
ACTIVITY LOG ERRORS: "RECEIVE TIMEOUT" FOLLOWED BY "FILE [FILE_NAME] IS TRUNCATED"
Product: StorageExpress(TM) Backup Server
Symptom:
The activity log produced during a workstation backup will report system
message number 251 "Receive TimeOut", followed by system message number
260 "File [File_name] is truncated", and the backup operation is
canceled.
Typically the StorageExpress system and the target workstation will be
separated by a Novell file server supporting internal routing.
Description:
The communication link between the StorageExpress system and the
workstation is supported by SPX. If hardware added to the link (such as
a routing file server) increases the communication delay between these
two nodes, the abort, listen, and verify timeout parameters must be
altered in the workstation's NET.CFG file to accommodate the latency.
Note: Changing the SPX parameters on the StorageExpress system (using
the SPXCONFG.NLM) will not affect the workstation communication process.
Solution:
The target workstation should reference a NET.CFG file when loading the
network drivers. The NET.CFG file should include SPX configuration
parameters and should be set as follows.
For DOS and Windows workstations:
SPX ABORT TIMEOUT=3000
SPX LISTEN TIMEOUT=108
SPX VERIFY TIMEOUT=108
SPX CONNECTIONS=50
For OS/2 2.1 Workstations:
PROTOCOL STACK SPX
    ABORT TIMEOUT 60000
    LISTEN TIMEOUT 12000
    VERIFY TIMEOUT 6000
    SESSIONS 50
Note: For OS/2 workstations, the LISTEN TIMEOUT value must be twice the
VERIFY TIMEOUT value, and the ABORT TIMEOUT value must be 10 times the
VERIFY TIMEOUT value. The maximum value for ABORT TIMEOUT is 65,535.
Refer to Novell documentation for further information on NET.CFG and SPX
configuration.
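The OS/2 timeout relationships in the note above are easy to get wrong
when scaling the values, so a quick check can help. The sketch below is
not an Intel or Novell tool; the function name is hypothetical, and it
assumes the note means exact multiples (as the sample values 60000,
12000, and 6000 are).

```python
MAX_ABORT_TIMEOUT = 65535  # maximum ABORT TIMEOUT stated in the note


def check_os2_spx_timeouts(abort: int, listen: int, verify: int) -> list:
    """Return a list of rule violations; an empty list means the values pass."""
    problems = []
    # Assumption: "two times" and "10 times" are read as exact multiples.
    if listen != 2 * verify:
        problems.append("LISTEN TIMEOUT should be 2 x VERIFY TIMEOUT")
    if abort != 10 * verify:
        problems.append("ABORT TIMEOUT should be 10 x VERIFY TIMEOUT")
    if abort > MAX_ABORT_TIMEOUT:
        problems.append("ABORT TIMEOUT exceeds the maximum of 65535")
    return problems
```

The values recommended above pass the check. A VERIFY TIMEOUT of 7000
cannot be used with these rules, because the matching ABORT TIMEOUT of
70000 would exceed the maximum.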
"OUT OF MEMORY ON HOST SERVER"
Symptom:
A Backup, Restore, or Merge job is halted and canceled. The error
message in the activity log states "out of memory on host server."
The monitor on the StorageExpress system reports "Cache Buffers
Maximum Limit Exceeded." This is most likely to occur on dual-stream
systems with 8 MB of memory installed on a Token Ring network.
Description:
The StorageExpress server is low on memory.
Solution:
Adding more memory to the StorageExpress system is the permanent
solution. See "Adding memory to the StorageExpress(TM) system control
unit" for more information.
STORAGEEXPRESS DOES NOT SHOW UP ON THE NETWORK DURING INITIAL SETUP
Symptom:
During initial installation, the StorageExpress system does not appear
on the network. It does not show in NLIST or SLIST.
Descriptions:
* A StorageExpress network adapter option is not set correctly:
- Adapter Type
- Frame Type
- IPX Network Number
- Cable Type
- Token Ring Speed
* Source Routing is not set correctly for a Token Ring network.
* There are routers between the Central Console workstation and the
StorageExpress system.
Solutions:
* Issue the RESET ROUTERS command on Novell file servers. Wait
several minutes for routers to update.
* Check the StorageExpress Token Ring Adapter settings. Document
6724 contains instructions.
* Check the source routing setup. It needs to be set up exactly like
the rest of the network. If other Novell file servers on the same
network ring load ROUTE.NLM then you need to have source routing
enabled.
* Load Central Console on a workstation that is on the same segment
or ring as the StorageExpress system.
* Issue the DISPLAY SERVERS command from a file server on the same
ring/segment as the StorageExpress unit to see whether the unit is
broadcasting and being received.
* Connect a keyboard and monitor to the StorageExpress unit and issue
the DISPLAY SERVERS command to see what the unit can find. Type
CONFIG at the colon prompt on the StorageExpress unit to verify the
environment configuration. Load the MONITOR NLM at the ":" prompt
and check LAN Information to verify packets sent and received.
* Reinitialize the StorageExpress system.
* Check Router and/or Bridge configurations for filtering.
* Update file server LAN drivers along with Router and Bridge
microcode to current versions.
STORAGEEXPRESS WILL NOT ACCEPT A TAPE DURING BACKUP OR RESTORE
Symptom:
The StorageExpress system is set to back up or restore files. It will
not accept a tape during the session.
Solution:
* Check the name of the tape. The StorageExpress system will look
either for specific tape names during the backup or restore
operation or a blank tape (Quick erase will blank a tape).
* Check that the tape heads have been cleaned with the cleaning tape
that came with the StorageExpress, according to the recommended
schedule.
* Make sure the tape is not a cleaning tape or an image tape of the
StorageExpress system.
CENTRAL CONSOLE RUNS SLOW OR NOT AT ALL
Symptom:
All requests made from the Central Console Application are sent to the
network (you will see the beach ball symbol and/or the network symbol).
As a result the application runs very slowly.
Solution:
* The workstation running Central Console must have the following IPX
and NETX or VLM versions loaded:
- IPX.COM 3.1 or later
- NETX.COM (XMS or EMS) 3.26 or later, or VLM 1.1
- IPXODI.COM 1.2 or later (if used)
- LSL.COM 1.21 or later (if used)
* If the workstation is using VLMs, VLM.EXE should be v1.10 or
later. The latest version of VLM.EXE and update instructions can
be found in the WINUPx.EXE and DOSUPx.EXE files on NetWire.
* Check that the workstation running Central Console has at least
4 MB of RAM.
* Check the SYSTEM.INI file under [386Enh]. The line "network="
should be:
network=*vnetbios,vnetware.386,vipx.386
* Tip: Best performance occurs when the StorageExpress unit and the
Central Console workstation are on the same network segment.
* Check that the file server you are logged on to is on the same
network segment or ring as the StorageExpress system.
* Check your Windows Setup (located in the Main group of the Program
Manager) for the Network shell version. It should say "Novell
NetWare (shell version x.xx)." The shell version should be 3.26 or
higher.
Do not run Central Console from a file server.
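The SYSTEM.INI check in the list above can be scripted. The following
is a minimal sketch, not part of the StorageExpress software: the
function names are hypothetical, and it assumes the file text has
already been read into a string. It parses the file by hand because
[386Enh] commonly repeats keys such as device=.

```python
EXPECTED_NETWORK = "*vnetbios,vnetware.386,vipx.386"


def find_network_line(ini_text: str):
    """Return the value of network= in [386Enh], or None if it is absent."""
    section = None
    for raw in ini_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip().lower()
            continue
        if section == "386enh" and "=" in line:
            key, _, value = line.partition("=")
            if key.strip().lower() == "network":
                return value.strip()
    return None


def network_line_ok(ini_text: str) -> bool:
    """True if network= matches the setting recommended in these notes."""
    value = find_network_line(ini_text)
    if value is None:
        return False
    # Assumption: spacing and letter case do not matter for the comparison.
    return value.replace(" ", "").lower() == EXPECTED_NETWORK
```

Calling network_line_ok() on the contents of the workstation's
SYSTEM.INI returns False when the line is missing or lists different
drivers.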
CENTRAL CONSOLE CANNOT SEE THE STORAGEEXPRESS SYSTEM
Symptom:
The Central Console workstation is not able to see the StorageExpress
unit.
Solutions:
* The workstation running Central Console must have the following IPX
and NETX or VLM versions loaded:
- IPX.COM 3.1 or later
- NETX.COM (XMS or EMS) 3.26 or later
- IPXODI.COM 1.2 or later (if used)
- LSL.COM 1.21 or later (if used)
* If the workstation is using VLMs, VLM.EXE should be v1.10 or
later. The latest version of VLM.EXE and update instructions can
be found in the WINUPx.EXE and DOSUPx.EXE files on NetWire.
* Check that the workstation running Central Console has at least
4 MB of RAM.
* Check the SYSTEM.INI file under [386Enh]. The line "network="
should be:
network=*vnetbios,vnetware.386,vipx.386
* Tip: Best performance occurs when the StorageExpress unit and the
Central Console workstation are on the same network segment.
* Check your Windows Setup (located in the Main group of the Program
Manager) for the Network shell version. It should say "Novell
NetWare (shell version x.xx)." The shell version should be 3.26 or
higher.
* Madge's Smart IPX and Smart ODI drivers will not work with the
Central Console workstation. Do not use these high performance
drivers. Use the standard Madge Novell drivers in this
workstation.
* Check that the internal IPX number of the StorageExpress system is
unique on the network.
* This problem can also occur when upgrading and adding DAT drives.
The tape driver will not load, so Central Console cannot see the
StorageExpress unit and cannot do the update. Run RCONSOLE to the
StorageExpress and type: Tape Driver is ready. This will cause the
StorageExpress system to think it has a driver and allow Central
Console to see the unit. This will permit the update to complete.
* Issue the RESET ROUTERS command on the server that the workstation
is logged in to.
BO_INSERT ERRORS AND CORRUPTED FTS (FILE TRACKER) DATABASE
Symptom: (for v1.30.A.E)
You may see one or more of the following:
* Data not seen in the FTS browser.
* Entry in the Activity Log indicating: "Failed BO_<some-term>".
Description:
This occurs when the StorageExpress system is powered off before it is
downed, or when there is a power spike to an unprotected unit. If the
StorageExpress system is powered off during a Quick tape erase, Full
tape erase, or Merge of a tape, the FTS database will be corrupted and
a BO_<some-term> error will show. Quick Erase removes the tape name
header and deletes information in the database associated with the
erased tape. This purging of information from the StorageExpress system
could take as long as 45 minutes.
Solution:
* Map a drive to, or log in as supervisor on, the StorageExpress
system. For example:
MAP U:=[SE_name]\SYS:ENGINE
* Copy all the FTS files from the NULLFTS directory to the ENGINE
directory:
NCOPY \NULLFTS\FTS*.BTR \ENGINE
Note: When you try to copy these files from the NULLFTS (or TEMPFTS)
directory to the ENGINE directory, an error may show that a file is in
use and you will be unable to copy the files. This error occurs because
a file is in use by Btrieve. To work around this problem, either use
Central Console to SHUTDOWN AND REBOOT TO NetWare MODE and try the
operation again, or RCONSOLE into the StorageExpress unit and type the
following at the colon prompt:
Unload MPP
Unload FHM
Unload Prune
Unload Update
Unload Btrieve
Then exit RCONSOLE, log in to the StorageExpress system as supervisor,
and continue with the file copy.
THE MESSAGE "ERROR" OR "CANNOT ASSOCIATE" RECEIVED WHEN ASSOCIATING SERVERS FROM CENTRAL CONSOLE
Symptom:
A message indicating "error" or "Cannot Associate" is received when
trying to associate the StorageExpress with a file server. If there
are other servers in the network that have been previously associated,
they may or may not show up as being currently associated. In most
cases they no longer show on the associated list. There are several
possible causes of this error:
Solutions:
* For v1.30.A.E or earlier, check for the user GUEST as an account on
the StorageExpress unit. The StorageExpress unit MUST have a GUEST
account in its bindery to perform several functions.
* Do not put a password on GUEST account.
* Check for restrictions to the user GUEST. Account, Time, or
Station restrictions should be checked in SYSCON to make sure any
changes made to GUEST are not causing problems.
* If the adapter card is using a NE3200.COM driver, update to the
newest driver and update the NET.CFG file:
Link Driver NE3200
    Double Buffer
    Frame Ethernet_802.3
    Protocol IPX 0 Ethernet_802.3
* Delete *.il and *.dl files from the central console directory
(normally C:\CENTRAL).
* For v1.40.A.E and v1.41.A.E, see if you can tell which FTS file is
corrupted:
-- Look in the Activity.log. If you see BO_ errors, the message
will be followed by the corrupted file name.
-- Run BUTIL from the StorageExpress System Console. (Currently
BUTIL does not come with the StorageExpress backup system; however,
it does come with NetWare, so you will have to get it from another
NetWare 3.11 file server.)
RCONSOLE to the StorageExpress
Copy BUTIL.NLM to the SYS:SYSTEM directory of the StorageExpress
:Load BUTIL -stat sys:\engine\ftsjob.btr
Run BUTIL for each Btrieve file. If the file is corrupted, you
will see the error message "The command did not complete due to
unrecoverable error".
-- If the corrupted file(s) are FTSPATH.BTR and/or FTSJBDTL.BTR,
then you can copy just those two files from the StorageExpress
SYS:NULLFTS directory.
Note: The FTSPATH and FTSJBDTL files must be copied together; never
copy just one of them.
-- If it is any other file, or you cannot determine which file it
is, then copy ALL the files from the SYS:NULLFTS directory. The
files are FTSPATH.BTR, FTSJBDTL.BTR, FTSJOB.BTR, SLSET.BTR and
SLSETDTL.BTR.
Once you have copied all the files from the SYS:NULLFTS
directory, you may need to delete and recreate the streamlined
jobs.
* Log in to the StorageExpress unit as supervisor. Change to the
PUBLIC directory. Delete the EP.* file. Reboot the StorageExpress
and the workstation. You will need to reassociate all the servers
and workstations to the StorageExpress.
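The rule for choosing which files to restore from SYS:NULLFTS can be
summarized in a few lines. This is only a sketch of the decision
described in these notes; the function name is hypothetical, and it
does not copy anything itself.

```python
ALL_FTS_FILES = [
    "FTSPATH.BTR", "FTSJBDTL.BTR", "FTSJOB.BTR", "SLSET.BTR", "SLSETDTL.BTR",
]
PAIRED_FILES = {"FTSPATH.BTR", "FTSJBDTL.BTR"}  # must always be copied together


def files_to_copy(corrupted):
    """Return the file names to copy, given the names BUTIL flagged as corrupted.

    Pass an empty list when the corrupted file could not be determined.
    """
    names = {name.upper() for name in corrupted}
    if names and names <= PAIRED_FILES:
        # Only FTSPATH and/or FTSJBDTL are affected: copy both, never just one.
        return sorted(PAIRED_FILES)
    # Any other file, or an unknown file: copy everything.
    return list(ALL_FTS_FILES)
```

Passing only "FTSPATH.BTR" still returns both paired files, which
reflects the note that the two must never be copied separately.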
CONFIGURATION - "INVALID PASSWORD"
Product: StorageExpress server 1.20.A.E.
Symptom:
When configuration changes that include a password change are saved to
the StorageExpress system through the "Configure StorageExpress" option
in Central Console, an "Invalid Password" error is displayed. The save
usually succeeds after several retries.
Description:
Passwords were not written correctly as bindery objects.
Solution:
Upgrade your StorageExpress system software to v1.30.A.E.
"CANNOT SUBMIT SELFBACKUP JOB" OR "FAILED TO GET QUEUE JOB LIST, FE' "
Product: StorageExpress(TM) server 1.20.A.E.
Symptom:
When you are setting up a self backup job, the error "Cannot submit
selfbackup job" is seen on the Central Console screen or the selfbackup
job does not run and the error "Failed to get queue job list, fe'" is
reported in the activity log.
Description:
Queues were not written correctly as bindery objects.
Solution:
Upgrade your StorageExpress system software to v1.30.A.E.
NETWORK CONNECTION - STORAGEEXPRESS SYSTEM HANGS WHEN LCD SAYS "ATTACHING DISK DRIVE..."
Product: v1.20.A.E. or earlier
Symptom:
The StorageExpress system hangs when the LCD display shows "Attaching
Disk Drive..." and the Central Console "Connect" screen does not show
the StorageExpress system.
Description:
The Transaction Tracking System (TTS) of the NetWare server software has
detected a bad transaction that needs to be backed out. The NetWare
software creates a new file and then requests a 'Y' or 'N' response to
delete the backout file. This prompt can be seen on a monitor but not
on the LCD display.
Solution:
Upgrade your StorageExpress system software to v1.30.A.E.
CPU UNIT HANGS AT "DETERMINING LOCAL NETWORK NUMBER...XXX" OR NAME UNIQUENESS
Symptom:
The CPU unit is attempting to connect to the network, and the server is
not answering at any of the addresses.
Solution:
* Check that there is a server running on the same ring or segment
and that REPLY TO GET NEAREST SERVER=ON.
* Check the physical network cable connection to the CPU unit. Be
sure this connection is working.
* Check that the frame type or source routing options match the
network settings.
CPU UNIT HANGS AT "ACTIVITY LOGGER STARTED" AND OTHER SYMPTOMS
Symptom:
The StorageExpress system cannot communicate with the tape drive in the
peripheral unit. The StorageExpress system attempts to communicate with
the tape unit(s) during initialization and hangs at the "Activity Logger
Started" screen. Other symptoms include:
* On a StorageExpress system with four tape drives in four separate
groups (parallel), SCSI IDs 3 and 4 time out while attempting a
self-restore. The self-restore may take 20 to 30 minutes and
completes successfully even though it appears to have failed.
Possible error messages:
- "Tape driver at SCSI id 3 may not have been initialized".
- "Tape driver at SCSI id 4 may not have been initialized".
* The CPU unit LCD indicates a drive 1 or drive 2 error.
* The Central Console configuration screen does not allow the
"parallel" drive setting with 2 tape drives.
Description:
These symptoms are caused when the CPU unit is unable to attach to the
peripheral unit. The symptoms during or after a self-restore are caused
by a problem in the way the LDENGINE.NLM is loading the drivers for the
self-restore.
Solution:
* Upgrade your StorageExpress system software to v1.30.A.E.
* Check the Peripheral Unit I/O Cable.
* Verify that the SCSI switches on the back of the peripheral unit are
correctly set for your tape drive type.
* Check the internal tape drive cabling in the peripheral unit.
* Check the SCSI ID switch. Information on the switch settings is in
document 6763.
CPU UNIT HANGS AT "SELF TEST" AFTER POWER UP
Symptom:
When the StorageExpress unit is powered up, it hangs at the "Self test"
screen.
Description:
The usual causes of this symptom are:
* Hardware configuration errors.
* Hardware failures.
* Improperly connected cables.
Solution:
If this installation includes the Token Ring Option, verify the switch
settings and EPROM location on the adapter board. Document 6724
contains Token Ring Option instructions.
Power the system off and re-seat the external and internal SCSI cables
as well as the External SCSI terminator. Power the system back on.
SEDEMO.EXE
This self-running demo contains an overview of the StorageExpress system
as well as a guided walk-through of the Windows-based central management
software. It will show you how easily you can configure automatic
backups and much more. The demo requires VGA, 640K RAM, uses 700K of
hard disk space uncompressed, and supports a mouse if available.