NetNews Usenet Archive 1992 #19

Text File | 1992-08-29 | 5.7 KB | 158 lines

Newsgroups: comp.sys.next.sysadmin
Path: sparky!uunet!newshost!root
From: dglattin@trirex.com (Dennis Glatting)
Subject: Re: crippled netinfo
Message-ID: <1992Aug28.135743.20043@Trirex.COM>
Sender: root@Trirex.COM (Operator)
Organization: Trirex Systems Inc.
References: <MeZffxu00WD510h3U9@andrew.cmu.edu>
Date: Fri, 28 Aug 1992 13:57:43 GMT
Lines: 146

In article <MeZffxu00WD510h3U9@andrew.cmu.edu> del+@CMU.EDU (Daniel Edward Lovinger) writes:
>
> We have over 12k current users, and are using NetInfo to link
> together a set of about 20 NeXTstations that are part of our larger
> installation. After the last rebuild to insert the latest 4k account
> adds, the server will not stay up for more than a few minutes - it
> goes into a tight loop after which all data in /users in the / domain
> is emptied.
>
> Chronology of the last two days ...:
>
> * 3pm Thu, after trying to wish the / domain back to life
>   after the add, we rebuild the database from scratch.
>   This appears to work, and /users is visible and
>   lookups are working as the passwd file is loaded back
>   in. People can log in, telnet, etc.
>
> * noon Friday the database had zeroed the /users from the
>   previous day's rebuild. Start niload again on the
>   passwd file, same circumstances during load. Back up
>   the database after the niload finishes (seven hours
>   later).
>
> * 1:30pm Sat, the database has zeroed out /users again. Kill
>   netinfo, back in the database saveset, restart. Was
>   able to nidump groups and look at passwd information.
>   About two minutes later the netinfod for / goes into a
>   tight loop and becomes unresponsive. Look at
>   /usr/adm/messages and find nothing useful. Nothing
>   else unusual. Repeat backup installation with same
>   result. Begin waving dead chickens ...
>
> This is with the NeXT OS 2.0 netinfo suite. Before we tried
> updating /users to 12k, we'd had little problems with the machine.
> Does anyone have experience with NetInfo at this scale? I am
> suspecting a magic number in the server that we must have crossed.
>
> I think my current options are limited to converting to YP on
> the fly (something we have very little experience with). I have also
> heard of third party NetInfo server implementations ... and
> experience? Could a 3.0beta site comment on NetInfo changes? Would
> that also be a realistic option? Can 2.0 machines contact a 3.0
> server?

I administer some large NeXT sites. I have seen NetInfo weirdness
before. Do you know who your NetInfo clones are? Until you have
solved the problem, do this:

 * Destroy all clones.

 * Go to *every* NeXT in the network (except the master, of course)
   and look in /etc/netinfo. If there is anything there other than
   local.nidb, blow it away.

 * Go to *every* machine (except the NetInfo master) and replace
   /etc/hostconfig with the default from
   /usr/template/client/etc/hostconfig. It would probably help if
   you edited /etc/hostconfig and set IPNETMASK to -AUTOMATIC-.
   That, of course, depends on your network configuration.

Your NetInfo network performance will suffer with only one NetInfo
server, but it is extremely important that you are working in a
known environment.

In the past I have found that clients set up clones and later
destroyed the clones' "serves" properties but left the databases
intact. Even later, when the NetInfo servers were down, machines
were receiving outdated NetInfo information from those ex-NetInfo
servers. I also saw those machines providing NetInfo information,
thereby corrupting the master while we were rebuilding the master
NetInfo database.

I also suggest that you shut down all of the NeXT machines on your
network when you rebuild the master.

Here is a script I wrote that may help you. It backs up the NetInfo
database. I run it under cron every eight hours. The script trims
the backup directory so that only a week's worth of backups is
retained. (Sorry about using csh -- I was playing that day :).)

--------- cut here ---------
#! /bin/sh
# This is a shell archive, meaning:
# 1. Remove everything above the #! /bin/sh line.
# 2. Save the resulting text in a file.
# 3. Execute the file with /bin/sh (not csh) to create the files:
#    niback
# This archive created: Fri Aug 28 09:54:13 1992
export PATH; PATH=/bin:$PATH
if test -f 'niback'
then
    echo shar: will not over-write existing file "'niback'"
else
cat << \SHAR_EOF > 'niback'
#!/bin/csh

# Dennis P. Glatting
# Trirex Systems Inc.
# 16-Jul-92
# (c) Copyright Trirex Systems Inc., 1992

# This is a modification to Amit's NetInfo backup script.
# Rather than copy the databases verbatim, which uses a massive amount
# of disk space, I am 'tar'ing and 'compress'ing the NetInfo data.
# Also, I am doing the tar from the NetInfo directory level of /etc
# rather than descending the directory tree. This makes restoration
# cleaner but with a little more work.

set date = `date`
set day = `echo $date | awk '{print $1}'`
set target_file = "netinfo."`echo $date | awk '{print $2 $3 $4}'`".tar.Z"
set backup_dir = /etc/netinfo.old
set source_dir = /etc/netinfo

# If the backup directories do not exist then make them.

if ( ! -d $backup_dir ) then
    mkdir $backup_dir
    chmod 700 $backup_dir
endif

if ( ! -d $backup_dir/$day ) then
    mkdir $backup_dir/$day
    chmod 700 $backup_dir/$day
endif

# Maintain only a week's amount of data.

find $backup_dir -type f -mtime +7 -print | xargs rm -f

# Do the backup

cd $backup_dir
tar cf - $source_dir | compress >$day/$target_file

exit 0
SHAR_EOF
chmod +x 'niback'
fi # end of overwriting check
# End of shell archive
exit 0
--------- cut here ---------

--
 Dennis P. Glatting / Sr. Technical Manager / Trirex Systems Inc.
 315 Post Road West / Westport, Connecticut 06880 / (203)221-4600
 dennis_glatting@trirex.com (NeXTmail Ok)
 Member League for Programming Freedom
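
The cleanup step in the post (look in /etc/netinfo on every non-master machine for anything other than local.nidb) can be checked as a dry run before anything is deleted. What follows is only a minimal sketch under stated assumptions: it assumes a POSIX shell with function support, which the stock NeXT /bin/sh of the time may not provide, and the function name list_stray_nidb is hypothetical, not a NetInfo tool. It prints candidates and removes nothing.

```shell
#!/bin/sh
# Dry run: list entries in a NetInfo directory other than local.nidb.
# Nothing is deleted; review the output before removing anything by hand.

# list_stray_nidb DIR  (hypothetical helper, not a NeXT-supplied tool)
list_stray_nidb() {
    dir=${1:-/etc/netinfo}
    # A machine with no such directory has nothing to report.
    [ -d "$dir" ] || return 0
    for entry in "$dir"/*; do
        # An unmatched glob stays literal; skip it.
        [ -e "$entry" ] || continue
        case $(basename "$entry") in
            local.nidb) ;;          # the machine's own local domain: keep
            *) echo "$entry" ;;     # candidate leftover, e.g. an old clone database
        esac
    done
}

# Run only when a directory is given on the command line.
if [ "$#" -gt 0 ]; then
    list_stray_nidb "$1"
fi
```

Saved under a name of your choosing and run as `sh <script> /etc/netinfo` on each client, it prints one path per leftover database. An empty result means only local.nidb is present. Do not run the removal step on the NetInfo master, which legitimately holds more than local.nidb.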