- Newsgroups: comp.lang.perl
- Path: sparky!uunet!noc.near.net!meiko.com!mike
- From: mike@meiko.com (Mike Stok)
- Subject: Re: Fast String Operations?
- Message-ID: <1992Aug26.113147.10966@meiko.com>
- Sender: news@meiko.com
- Organization: Meiko Scientific Corp.
- References: <1992Aug25.151625.3134@IDA.ORG>
- Date: Wed, 26 Aug 1992 11:31:47 GMT
- Lines: 60
-
- In article <1992Aug25.151625.3134@IDA.ORG> rlg@IDA.ORG (Randy garrett) writes:
- >I'm looking for the fastest way to perform the following operation.
- >Basically, this is a conversion of one database format to another.
- >If I have a NULL field, indicated by 2 pipe symbols next to
- >each other, I want to insert either a -1 or a ~ between the
- >two pipes, depending upon whether the type of that field
- >is a integer or a character. I know the type of the field
- >because I've already pre-filled that array with the correct
- >types from the Data Dictionary (Thanks Sybase to Perl Interface!).
- >
- >So, I read in a series of lines from a file. If I find 2 pipe
- >symbols adjacent to each other, "||", in the input string, I want
- >to insert either a ~ or a -1 depending on the type of that field,
- >which I get from the @name array.
- >
- >Here's the code I'm using. It works, but the bad news is that
- >I process 800+ Megabytes of data at a crack with this. It
- >takes almost a day of CPU time on a Sparc 2! Any words of
- >wisdom on how to speed it up would be greatly appreciated.
-
- My guess at the code would be something like:
-
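- #!/usr/bin/perl
-
- $string = "a|bcd||g||i|||";        # sample record; '|' terminates each field
- @name = (1,2,1,1,2,1,2,1);         # field types: 1 = integer, 2 = character
-
- if ($string =~ /\|\|/)             # skip records with no adjacent pipes
- {
-     chop ($string);                # drop the trailing terminator
-     # limit = number of fields, so trailing empty fields are kept
-     @field = split (/\|/, $string, scalar (@name));
-     for ($index = 0; $index < @field; $index++)
-     {
-         # empty field becomes '-1' (integer) or '~' (character)
-         $field[$index] = ('', '-1', '~')[$name[$index]] if $field[$index] eq '';
-     }
-     $string = join ('|', @field) . '|';   # put the terminator back
- }
-
- print "$string\n";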
-
- This only does any processing if there are 2 adjacent |s in the
- string. I assume that the | is used as a field terminator rather
- than a separator (hence the chop and the . '|' after the join), that
- there will always be the right number of fields in a string, and
- that $[ is left at its default value of 0. It produces a different result
- to your code :-(, but if field 0 is the "a", field 1 is "bcd", and field 2
- is the first empty field, then $name[2] == 1 makes it an empty
- integer field, so it becomes '-1'. My result $string was a|bcd|-1|g|~|i|~|-1|
- but yours was a|bcd|~|g|-1|i|-1|~|...
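
To illustrate why the chop and the split limit matter when | is a terminator, here is a minimal sketch. The record 'a||b|||' is made up for the example, and the variable names are not from the original code. Without a limit, split silently drops trailing empty fields, so a record that ends in NULL fields would come back short.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $string = 'a||b|||';   # made-up record: five '|'-terminated fields

my @dropped = split /\|/, $string;       # no limit: trailing empty fields vanish
my @kept    = split /\|/, $string, -1;   # negative limit: every field is kept

chop(my $body = $string);                # remove the final terminator first
my @fields = split /\|/, $body, -1;      # one element per terminated field

print scalar(@dropped), " ", scalar(@kept), " ", scalar(@fields), "\n";
```

Only the last form gives one array element per field. Passing the field count as the limit has the same effect as -1 here, provided every record really has that many fields.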
-
- I haven't measured how fast it goes compared to your code, but I would
- happily buy you a beer if it's slower :-) (I guess I should owe you one
- anyway as I get a different answer)
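
One possible way to put a number on the speed is the Benchmark module that ships with Perl 5. This is only a sketch: the wrapper name split_join, the @filler array and the iteration count are assumptions made for the example, and the original poster's code is not available here to time alongside it.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark;   # exports timeit and timestr by default

my @name   = (1, 2, 1, 1, 2, 1, 2, 1);   # field types: 1 = integer, 2 = character
my @filler = ('', '-1', '~');            # replacement indexed by field type

# The split/join approach, wrapped in a sub (hypothetical name) so it can be timed.
sub split_join {
    my ($string) = @_;
    if ($string =~ /\|\|/) {
        chop $string;                                    # drop trailing terminator
        my @field = split /\|/, $string, scalar @name;   # keep trailing empties
        for my $i (0 .. $#field) {
            $field[$i] = $filler[ $name[$i] ] if $field[$i] eq '';
        }
        $string = join('|', @field) . '|';
    }
    return $string;
}

# 100_000 is an arbitrary iteration count chosen for illustration.
my $t = timeit(100_000, sub { split_join('a|bcd||g||i|||') });
print "split/join: ", timestr($t), "\n";
```

Timing a second sub the same way on the same input gives a like-for-like comparison. Real records from the 800+ Megabyte file would be a fairer test than one short sample string.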
-
- Mike
- --
- The "usual disclaimers" apply. | ... many were weak and confused, succumbing
- Mike Stok | to drink or drugs whenever possible ...
- mike@meiko.com |
- Meiko tel: (617) 890 7676 | Hunter S. Thompson
-