Xref: sparky comp.arch.storage:617 comp.databases:6519
Newsgroups: comp.arch.storage,comp.databases
Path: sparky!uunet!iWarp.intel.com!inews.Intel.COM!cad018!mfineman
From: mfineman@cad018.intel.com (Mark S. Fineman)
Subject: Re: Info on large, slow storage wanted (jukeboxes, etc.)
Message-ID: <Btz7HF.1BH@nntp-sc.Intel.COM>
Sender: news@nntp-sc.Intel.COM (USENET News System)
Nntp-Posting-Host: cad018
Organization: Intel Corporation, Santa Clara, CA
References: <1992Aug29.210553.8744@rhein-main.de> <1992Aug31.043738.19685@psg.com> <22302@venera.isi.edu>
Date: Thu, 3 Sep 1992 00:17:38 GMT
Lines: 68

In article <22302@venera.isi.edu> rod@venera.isi.edu (Rodney Doyle Van Meter III) writes:
>In article <1992Aug31.043738.19685@psg.com> randy@psg.com (Randy Bush) writes:
>>vhs@rhein-main.de (Volker Herminghaus-Shirai) writes:
>>
>>> I need to design a retrieval system for ~15TB of data, of which ~5TB are
>>> retrieved with a high frequency (25 requests/second) and an access time
>>> of avg. 30 seconds (max. 120 seconds). Retrieval lacks any locality, so the
>>> 5TB are real random-access. The rest of the data is still accessed at a
>>> rate of 5 requests/second.
>>
>>Optical RW store, a la HP.
>>--
>>randy@psg.com ...!uunet!m2xenix!randy
>
>
>The problem is not the volume -- 15 TB is huge but not enormous
>(like the distinction?:-)), the problem is the access rate. Even
>the fastest cart machines are ~15 seconds to replace a disk or tape
>(remove the old AND insert the new), so to get 25 reqs/sec., you're
>looking at not 40 but 400 cart machines.
>
>As pointed out, you probably want MO, not tape; VHS, 8mm, D-1, and
>D-2 are all fairly slow to load (I think Ampex' D-2 is the fastest),
>plus the seek times... w/ 400 cart machines, you'll have enough
>drives that seek time isn't a problem with your latency.
>
>An important question is the size of an average data request --
>if it's a bank account balance, the throughput of the device is irrelevant.
>If it's CAD files or fluid-flow simulation, you need one of the
>higher-speed alternatives.
>
>Are you prepared to handle acres of floor space, hundreds of
>drives, all the power, etc.?
>
>By the time you add it all up, I think the only realistic solution
>is RAID arrays. Yes, they're small and expensive, but I think it's
>the only way to get that kind of throughput. Oh, you'll need
>one mini-super for every several RAID arrays, too, don't forget.
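
Just to make the arithmetic behind that "400 cart machines" figure
explicit (the 15-second swap time is carried over from the post above,
so treat it as an assumption), a quick back-of-envelope in Python:

    # Rough sketch: drives needed just to keep up with the swap rate.
    requests_per_second = 25   # stated access rate for the hot 5TB
    swap_seconds = 15          # assumed cart remove-and-insert time
    # While one swap is in progress, swap_seconds * requests_per_second
    # further requests arrive, so that many drives must be swapping at once.
    drives = requests_per_second * swap_seconds
    print(drives)              # 375, i.e. roughly 400 cart machines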

I'd still like to know what his real application is: can we build an
effective index to reduce the number of disk sides that have to be mounted?

We have a worst-case response time requirement of 120 seconds. With
current technology that is only enough time for about 4 mounts per drive.
It is stated that 5000GB of data has to be accessed randomly. Even if
that only means the 10,000 sides (5000GB / 0.5GB per side) have to be
reachable within the deadline, we would need 2500 drives to meet the
maximum response time requested. That does not include time to read
any significant amount of data off each platter.

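Spelled out as a little script (the 30 seconds per mount is my
assumption for what makes "about 4 mounts per drive" fit in a
120-second budget):

    # Sketch of the 2500-drive figure above; per-mount time is an assumption.
    worst_case_seconds = 120
    seconds_per_mount = 30                          # assumed: ~4 mounts/drive in 120s
    mounts_per_drive = worst_case_seconds // seconds_per_mount   # 4
    hot_data_gb = 5000
    gb_per_side = 0.5
    sides = hot_data_gb / gb_per_side               # 10,000 sides, no locality
    drives = sides / mounts_per_drive               # 2,500 drives just to hit the deadline
    print(mounts_per_drive, sides, drives)
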
2500 drives * $3000/drive + 15,000 slots * $300/slot = $7.5M + $4.5M = $12,000,000
for 1GB optical disks holding 15TB, with 2500 drives.

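The same numbers as a script, so anyone can plug in different drive or
slot prices (the $3000/drive and $300/slot figures are rough estimates):

    # Optical jukebox cost sketch; drive and slot prices are rough guesses.
    drives, cost_per_drive = 2500, 3000
    slots, cost_per_slot = 15000, 300      # 15TB at 1GB per optical platter
    total = drives * cost_per_drive + slots * cost_per_slot
    print(total)                           # 7,500,000 + 4,500,000 = 12,000,000
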
Hard disks are about $2000/GB, i.e. $2,000,000/TB, or $30,000,000 for a
15TB system. If it turns out that you actually have to read through
1/3 of the data in 120 seconds, then we have:

      5E12 bytes
    --------------- = about 4E10 bytes/second
     1.2E2 seconds

This would require about 4E10/4E6 = 1E4 fast disks (assuming roughly
4E6 bytes/second sustained per disk). Current technology is 2E9 bytes
per disk, so 2E13 bytes would fit on that many disks, which is close to
the 1.5E13 bytes needed in total, so the $30,000,000 should still be in
the ballpark.
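
Again as a sketch, with the per-disk transfer rate (4E6 bytes/second)
and the 2GB-per-disk capacity as assumptions:

    # Magnetic-disk sketch: bandwidth-limited drive count and a capacity check.
    bytes_to_scan = 5e12                   # 1/3 of the data
    window_seconds = 120
    bandwidth_needed = bytes_to_scan / window_seconds    # ~4e10 bytes/second
    bytes_per_sec_per_disk = 4e6           # assumed sustained rate per fast disk
    disks = bandwidth_needed / bytes_per_sec_per_disk    # ~1e4 disks
    capacity = disks * 2e9                 # ~2e13 bytes vs. 1.5e13 bytes needed
    cost = 15000 * 2000                    # 15TB at ~$2000/GB -> $30,000,000
    print(disks, capacity, cost)
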
--
(408) 765-4277; MS SC3-36, 2220 Mission College Blvd.,
P.O. Box 58119, Santa Clara, CA 95052-8119
mfineman@scdt.intel.com