NetNews Usenet Archive 1992 #19

Path: sparky!uunet!cs.utexas.edu!usc!news!netlabs!lwall
From: lwall@netlabs.com (Larry Wall)
Newsgroups: comp.lang.perl
Subject: Re: Fast String Operations?
Message-ID: <1992Aug27.002721.1322@netlabs.com>
Date: 27 Aug 92 00:27:21 GMT
References: <1992Aug25.151625.3134@IDA.ORG>
Sender: news@netlabs.com
Organization: NetLabs, Inc.
Lines: 85
Nntp-Posting-Host: scalpel.netlabs.com

In article <1992Aug25.151625.3134@IDA.ORG> rlg@IDA.ORG (Randy garrett) writes:
: I'm looking for the fastest way to perform the following operation.
: Basically, this is a conversion of one database format to another.
: If I have a NULL field, indicated by 2 pipe symbols next to
: each other, I want to insert either a -1 or a ~ between the
: two pipes, depending upon whether the type of that field
: is a integer or a character.  I know the type of the field
: because I've already pre-filled that array with the correct
: types from the Data Dictionary (Thanks Sybase to Perl Interface!).
:
: So, I read in a series of lines from a file.  If I find 2 pipe
: symbols adjacent to each other, "||", in the input string, I want
: to insert either a ~ or a -1 depending on the type of that field,
: which I get from the @name array.

There's a very important piece of information that you left out.
Namely, what percentage of the lines do you expect will have a null
field?  If many lines don't have null fields, you want to say this:

    while (<INPUT>) {
        next unless /\|\|/;
        # replacement algorithm goes here
    }

This sort of short circuit can save you oodles of time, regardless of
how inefficient your replacement algorithm is.  Presuming, of course,
that most lines will be rejected immediately.

If most lines have null fields, then it's a waste of time to do the
short circuit, and you can just go straight for one of the split
methods previously posted (preferably the correct one).

Alternately, (there's always an alternately) you can do something fancy
like this:

    #!/usr/bin/perl
    $/ = '|';
    @default = (-1,'~',-1,-1,'~',-1,'~',-1);
    while (<>) {
        $field = 0, print "\n" if s/^\n//;
        next unless /^\|/;
        print $default[$field];
    }
    continue {
        print;
        ++$field;
    }

This presumes there's always a | right before the "\n".  It seemed from
your description that this is so, but it wasn't explicit.

Anyhoo, try them out on your data, and see which one is faster.  I can't
predict without seeing the data.

By the way.  Just between you and me and the gatepost, this *might* be
a better job for C than for Perl, if you only have to support this on
one architecture.  C is better than Perl at character crawling:

    #include <stdio.h>

    char *def[] = {"-1","~","-1","-1","~","-1","~","-1"};

    int main(void)
    {
        int ch;
        int lastch = 0;     /* initialized so the first comparison is defined */
        int field = 0;

        while ((ch = getc(stdin)) != EOF) {
            if (ch == '\n')
                field = 0;
            else if (ch == '|') {
                if (lastch == '|')
                    fputs(def[field], stdout);
                field++;
            }
            putc(ch, stdout);
            lastch = ch;
        }
        return 0;
    }

Or sumph'n like that.

Larry
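
Not part of the original post: a minimal sketch of how the short circuit and the character crawl might be combined per line in C. The function name fill_nulls and its parameters (out, outsz, defs, ndefs) are hypothetical, chosen only for illustration. It assumes the same field numbering as the C program above (the default is picked by how many pipes have been seen so far), and it adds bounds checks that the original does not have.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not from the post.  Copies `line` into `out`,
 * inserting defs[field] between each pair of adjacent pipes.
 * Returns the number of fields filled, or -1 if `out` is too small
 * or a null field has no entry in `defs`. */
int fill_nulls(const char *line, char *out, size_t outsz,
               const char *const *defs, size_t ndefs)
{
    size_t field = 0;   /* pipes seen so far on this line */
    size_t o = 0;       /* next write position in out */
    int filled = 0;
    char last = 0;

    assert(line != NULL && out != NULL && defs != NULL);

    /* Short circuit: no "||" means nothing to fill, so just copy. */
    if (strstr(line, "||") == NULL) {
        size_t len = strlen(line);
        if (len + 1 > outsz)
            return -1;
        memcpy(out, line, len + 1);
        return 0;
    }

    /* Slow path: crawl the line one character at a time. */
    for (; *line != '\0'; line++) {
        if (*line == '|') {
            if (last == '|') {
                size_t dlen;
                if (field >= ndefs)
                    return -1;
                dlen = strlen(defs[field]);
                if (o + dlen >= outsz)
                    return -1;
                memcpy(out + o, defs[field], dlen);
                o += dlen;
                filled++;
            }
            field++;
        }
        if (o + 1 >= outsz)
            return -1;
        out[o++] = *line;
        last = *line;
    }
    out[o] = '\0';
    return filled;
}
```

Called once per input line, this takes the cheap strstr() path whenever a line has no null field. Whether that beats the plain getc() loop depends on the data, as noted above, so it is worth timing both.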