- Newsgroups: comp.sys.next.sysadmin
- Path: sparky!uunet!newshost!root
- From: dglattin@trirex.com (Dennis Glatting)
- Subject: Re: crippled netinfo
- Message-ID: <1992Aug28.135743.20043@Trirex.COM>
- Sender: root@Trirex.COM (Operator)
- Organization: Trirex Systems Inc.
- References: <MeZffxu00WD510h3U9@andrew.cmu.edu>
- Date: Fri, 28 Aug 1992 13:57:43 GMT
- Lines: 146
-
- In article <MeZffxu00WD510h3U9@andrew.cmu.edu> del+@CMU.EDU (Daniel Edward
- Lovinger) writes:
- >
- > We have over 12k current users, and are using NetInfo to link
- > together a set of about 20 NeXTstations that are part of our larger
- > installation. After the last rebuild to insert the latest 4k account
- > adds, the server will not stay up for more than a few minutes - it
- > goes into a tight loop after which all data in /users in the / domain
- > is emptied.
- >
- > Chronology of the last two days ...:
- >
- > * 3pm Thu, after trying to wish the / domain back to life
- > after the add, we rebuild the database from scratch.
- > This appears to work, and /users is visible and
- > lookups are working as the passwd file is loaded back
- > in. People can log in, telnet, etc.
- >
- > * noon Friday the database had zeroed the /users from the
- > previous day's rebuild. Start niload again on the
- > passwd file, same circumstances during load. Back up
- > the database after the niload finishes (seven hours
- > later).
- >
- > * 1:30pm Sat, the database has zeroed out /users again. Kill
- > netinfo, back in the database saveset, restart. Was
- > able to nidump groups and look at passwd information.
- > About two minutes later the netinfod for / goes into a
- > tight loop and becomes unresponsive. Look at
- > /usr/adm/messages and find nothing useful. Nothing
- > else unusual. Repeat backup installation with same
- > result. Begin waving dead chickens ...
- >
- > This is with the NeXT OS 2.0 netinfo suite. Before we tried
- > updating /users to 12k, we'd had few problems with the machine.
- > Does anyone have experience with NetInfo at this scale? I am
- > suspecting a magic number in the server that we must have crossed.
- >
- > I think my current options are limited to converting to YP on
- > the fly (something we have very little experience with). I have also
- > heard of third party NetInfo server implementations ... any
- > experience? Could a 3.0beta site comment on NetInfo changes? Would
- > that also be a realistic option? Can 2.0 machines contact a 3.0
- > server?
-
- I administer some large NeXT sites, and I have seen NetInfo
- weirdness before.
-
- Do you know who your NetInfo clones are? Until you have solved the
- problem do this:
-
- * Destroy all clones.
- * Go to *every* (except the master of course) NeXT in the network and look
- in /etc/netinfo. If there is anything there other than local.nidb, blow
- it away.
- * Go to *every* machine (except the NetInfo master) and replace
- /etc/hostconfig with the default from /usr/template/client/etc/hostconfig.
- It would probably help if you edited /etc/hostconfig and set IPNETMASK to
- -AUTOMATIC-. That of course is dependent upon your network configuration.
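
The check in the second step above can be sketched as a small shell function. This is only an illustration: the function name stray_nidbs is hypothetical, the default path /etc/netinfo and the name local.nidb are taken from the post, and it assumes a POSIX-style /bin/sh (the `test -e` operator may be missing from older shells). It only lists candidates; removing them is left to the administrator.

```shell
#!/bin/sh
# List every entry in a NetInfo directory other than local.nidb.
# stray_nidbs is a hypothetical helper name; the directory defaults to
# /etc/netinfo, the location named in the post.
stray_nidbs() {
    dir=${1:-/etc/netinfo}
    [ -d "$dir" ] || return 0            # no NetInfo directory, nothing to report
    for entry in "$dir"/*; do
        [ -e "$entry" ] || continue      # the glob matched nothing
        name=`basename "$entry"`
        [ "$name" = "local.nidb" ] && continue
        echo "$entry"                    # a leftover clone or other stray database
    done
}
```

Running `stray_nidbs` with no argument on each client prints whatever would need to be reviewed before it is blown away; no output means only local.nidb is present.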
-
- Your NetInfo network performance will suffer with only one NetInfo server,
- but it is extremely important that you work in a known environment.
- In the past I have found that clients had set up clones and later destroyed
- the clones' "serves" properties, but left the databases intact. Even later,
- when the NetInfo servers were down, machines were receiving outdated
- NetInfo information from those ex-NetInfo servers. I also saw those
- machines providing NetInfo information, thereby corrupting the master while
- we were rebuilding the master NetInfo database.
-
- I also suggest that you shut down all of the NeXT machines on your network
- when you rebuild the master.
-
- Here is a script I wrote that may help you. It backs up the NetInfo
- database. I run it under cron every eight hours. The script trims the
- backup directory so that only a week's worth of backups is retained.
- (Sorry about using csh -- I was playing that day :).)
-
-
- --------- cut here ---------
- #! /bin/sh
- # This is a shell archive, meaning:
- # 1. Remove everything above the #! /bin/sh line.
- # 2. Save the resulting text in a file.
- # 3. Execute the file with /bin/sh (not csh) to create the files:
- # niback
- # This archive created: Fri Aug 28 09:54:13 1992
- export PATH; PATH=/bin:$PATH
- if test -f 'niback'
- then
- echo shar: will not over-write existing file "'niback'"
- else
- cat << \SHAR_EOF > 'niback'
- #!/bin/csh
-
- # Dennis P. Glatting
- # Trirex Systems Inc.
- # 16-Jul-92
-
- # (c) Copyright Trirex Systems Inc., 1992
-
- # This is a modification to Amit's NetInfo backup script.
- # Rather than copy the databases verbatim, which uses a massive amount
- # of disk space, I am 'tar'ing and 'compress'ing the NetInfo data.
- # Also, I am doing the tar from the NetInfo directory level of /etc
- # rather than descending the directory tree. This makes restoration
- # cleaner but with a little more work.
-
- set date = `date`
- set day = `echo $date | awk '{print $1}'`
- set target_file = "netinfo."`echo $date | awk '{print $2 $3 $4}'`".tar.Z"
-
- set backup_dir = /etc/netinfo.old
- set source_dir = /etc/netinfo
-
-
- # If the backup directories do not exist, create them.
- if ( ! -d $backup_dir ) then
-     mkdir $backup_dir
-     chmod 700 $backup_dir
- endif
- if ( ! -d $backup_dir/$day ) then
-     mkdir $backup_dir/$day
-     chmod 700 $backup_dir/$day
- endif
-
- # Maintain only a week's amount of data.
- find $backup_dir -type f -mtime +7 -print | xargs rm -f
-
- # Do the backup
- cd $backup_dir
- tar cf - $source_dir | compress >$day/$target_file
-
-
- exit 0
- SHAR_EOF
- chmod +x 'niback'
- fi # end of overwriting check
- # End of shell archive
- exit 0
- --------- cut here ---------
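
The script's comments mention restoration without showing it, so here is a minimal sketch of unpacking one backup into a scratch directory for inspection before anything is copied back. The function name unpack_nibackup and the scratch-directory approach are assumptions, not part of the original script. It also assumes a `gzip` that can read compress(1) `.Z` files and a tar that strips the leading "/" from member names on extraction (as GNU tar does by default). An older tar that honours absolute paths would write straight into /etc/netinfo instead.

```shell
#!/bin/sh
# Unpack one niback archive into a scratch directory for inspection.
# unpack_nibackup is a hypothetical helper, not part of the original script.
unpack_nibackup() {
    archive=$1
    scratch=$2
    [ -f "$archive" ] || { echo "no such archive: $archive" >&2; return 1; }
    mkdir -p "$scratch" || return 1
    # Decompress to stdout and extract inside the scratch directory.
    # With a tar that strips the leading "/", the files land under
    # $scratch/etc/netinfo rather than over the live database.
    gzip -dc "$archive" | (cd "$scratch" && tar xf -)
}
```

A call might look like `unpack_nibackup /etc/netinfo.old/Fri/<archive>.tar.Z /tmp/ni-restore`, with the archive name taken from the backup directory. Copying the unpacked database into place should only happen with the NetInfo daemons stopped.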
-
- --
- Dennis P. Glatting / Sr. Technical Manager / Trirex Systems Inc.
- 315 Post Road West / Westport, Connecticut 06880 / (203)221-4600
- dennis_glatting@trirex.com (NeXTmail Ok)
- Member League for Programming Freedom
-