- Newsgroups: comp.unix.questions
- Path: sparky!uunet!cs.utexas.edu!zaphod.mps.ohio-state.edu!news.acns.nwu.edu!casbah.acns.nwu.edu!navarra
- From: navarra@casbah.acns.nwu.edu (John Navarra)
- Subject: Using sed to convert numbers (interesting)
- Message-ID: <1992Jul31.121118.27783@news.acns.nwu.edu>
- Sender: usenet@news.acns.nwu.edu (Usenet on news.acns)
- Organization: Northwestern University, Evanston Illinois.
- Date: Fri, 31 Jul 1992 12:11:18 GMT
- Lines: 97
-
- This is an interesting little problem I am working on. I am trying to
- convert numbers from scientific notation to regular (decimal) notation.
- I am using sed to do this. So far, this is what I have:
-
- [casbah:411] ~/bin -> cat Etodec
- sed -n '
- s/\([0-9]\)\.\([0-9]\)\([0-9]*\)E+01/\1\2.\3/p
- s/\([0-9]\)\.\([0-9][0-9]\)\([0-9]*\)E+02/\1\2.\3/p
- s/\([0-9]\)\.\([0-9]*\)E-01/.\1\2/p
- s/\([0-9]\)\.\([0-9]*\)E-02/.0\1\2/p' "$@"
- [casbah:411] ~/bin ->
-
- and I have the file with scientific numbers:
- [casbah:414] /tmp/navarra -> cat sci.numbers
- 1.2345E-01
- -1.2345E-01
- 1.2345E+01
- -1.2345E+01
- 1.2345E-02
- -1.2345E-02
- 1.2345E+02
- -1.2345E+02
-
- Notice first of all that all the numbers are of the form:
- [-]X.XXXXXE[+-]XX
-
- In particular, the fact that there is only one digit before the decimal
- point is crucial in the substitutions.
-
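The one-digit assumption can be checked before running Etodec. A minimal sketch, assuming one or more digits after the decimal point and a two-digit exponent; check_format is a hypothetical helper name, not part of the original script. It prints any input line that does not fit the expected form:

```shell
# Hypothetical helper: list lines that are NOT of the form [-]X.XXXXXE[+-]XX.
# Reads standard input; grep -v inverts the match, -E selects extended regexps.
check_format() {
    grep -v -E '^-?[0-9]\.[0-9]+E[+-][0-9][0-9]$'
}

# The second line has two digits before the decimal point, so it is reported.
printf '1.2345E-01\n12.345E+01\n' | check_format
```

To check a real file, something like `check_format < sci.numbers` should print nothing if every line fits.
-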
- So far, Etodec (scientific 'E' to 'decimal') is working fine:
- [casbah:415] /tmp/navarra -> Etodec sci.numbers
- .12345
- -.12345
- 12.345
- -12.345
- .012345
- -.012345
- 123.45
- -123.45
-
- However, just to spark some discussion on this, I have two questions.
-
- 1) for you sed gurus, I don't understand WHY my substitutions are working
- for negative numbers. For instance, the line
-
- s/\([0-9]\)\.\([0-9]*\)E-01/.\1\2/p
-
- looks for numbers of the form '1.2345E-01' or '-1.2345E-01'. However,
- I am confused as to how the \(...\) groups are working. The first thing
- I look for is the first digit before the decimal point (placing it in
- group \1), followed by a '.'. Then the rest of the digits are placed
- in group \2. What I don't understand is
- why the replacement pattern '.\1\2' doesn't produce something like:
-
- .-12345
-
- And similarly, for the case of E+01, why I don't get something like:
-
- 1-2.345
-
- If I try the line:
- s/\([0-9]\)\.\([0-9]*\)E-01//p
-
- I get -'s on the matched lines.
- How does sed know to put the '-' back at the beginning of the line?
-
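One way to see what happens to the '-' (a sketch using only the E-01 line from above): the pattern has no '^' anchor, so the match can start at the first digit, and the s command replaces only the matched text. Wrapping the match in '<' and '>' via '&' shows which part of the line was matched. The third command is one possible way to make the sign handling explicit; it is not something the original script does.

```shell
# Original substitution: the leading '-' is outside the match, so it is
# copied through untouched.
echo '-1.2345E-01' | sed 's/\([0-9]\)\.\([0-9]*\)E-01/.\1\2/'
# prints: -.12345

# '&' stands for the whole matched text; the brackets show where it starts.
echo '-1.2345E-01' | sed 's/\([0-9]\)\.\([0-9]*\)E-01/<&>/'
# prints: -<1.2345E-01>

# Anchored variant (an assumption, not the poster's code): capture the
# optional sign as \1 and put it back explicitly.
echo '-1.2345E-01' | sed 's/^\(-*\)\([0-9]\)\.\([0-9]*\)E-01/\1.\2\3/'
# prints: -.12345
```
-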
-
- 2) Secondly, as you can probably tell, as the exponents get bigger,
- the substitution lines get longer as well. I wrote the lines
- to limit the number of \(...\) groups to three for positive exponents
- and two for negative exponents. However, as the exponents grow,
- here is what those lines will look like:
-
- positive
- s/\([0-9]\)\.\([0-9]\)\([0-9]*\)E+01/\1\2.\3/p
- s/\([0-9]\)\.\([0-9][0-9]\)\([0-9]*\)E+02/\1\2.\3/p
- s/\([0-9]\)\.\([0-9][0-9][0-9]\)\([0-9]*\)E+03/\1\2.\3/p
- ...
- s/\([0-9]\)\.\([0-9]...[0-9]...[0-9][0-9]\)\([0-9]*\)E+XX/\1\2.\3/p
-
- negative
- s/\([0-9]\)\.\([0-9]*\)E-01/.\1\2/p
- s/\([0-9]\)\.\([0-9]*\)E-02/.0\1\2/p
- s/\([0-9]\)\.\([0-9]*\)E-03/.00\1\2/p
- ...
- s/\([0-9]\)\.\([0-9]*\)E-XX/.0000.....0....\1\2/p
-
- What I need is a variable which will pick up the value of 'XX'
- in E[+-]XX and, for positive exponents (E+XX), write out XX copies of
- '[0-9]' in the pattern, and for negative exponents (E-XX), write out
- XX-1 zeroes in the replacement string. Any suggestions on how to do this?
-
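One possible approach, sketched here under assumptions rather than offered as the definitive answer, is to let a shell loop write the substitution lines. The function name gen_etodec and its single argument (the largest exponent to cover) are made up for illustration. Each pass appends one more '[0-9]' to the E+XX pattern and one more '0' to the E-XX replacement, which reproduces the lines listed above for exponents 01 up to the given maximum:

```shell
# Hypothetical generator: print the Etodec substitution lines for
# exponents 01..max (max is the first argument).
gen_etodec() {
    max=$1
    n=1
    digits='[0-9]'   # XX copies of [0-9] for the E+XX pattern
    zeros=''         # XX-1 zeroes for the E-XX replacement
    while [ "$n" -le "$max" ]; do
        xx=$(printf '%02d' "$n")   # two-digit exponent: 01, 02, ...
        # In the printf format, \\ yields one backslash for sed.
        printf 's/\\([0-9]\\)\\.\\(%s\\)\\([0-9]*\\)E+%s/\\1\\2.\\3/p\n' "$digits" "$xx"
        printf 's/\\([0-9]\\)\\.\\([0-9]*\\)E-%s/.%s\\1\\2/p\n' "$xx" "$zeros"
        digits="${digits}[0-9]"
        zeros="${zeros}0"
        n=$((n + 1))
    done
}

# Usage: generate lines for exponents 01..03 and hand them to sed -n.
printf '1.2345E+03\n-1.2345E-03\n' | sed -n "$(gen_etodec 3)"
# prints:
# 1234.5
# -.0012345
```

The generated lines keep the same shape as the hand-written ones, so they share the same behavior and limits; the loop only saves the typing.
-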
- There is also one more problem I noticed for positive numbers
- which I will bring up if a discussion ensues.
-
- Any suggestions are appreciated,
- -tms
-