- Path: sparky!uunet!zaphod.mps.ohio-state.edu!pacific.mps.ohio-state.edu!cis.ohio-state.edu!physik.tu-muenchen.DE!Peer.Stritzinger
- From: Peer.Stritzinger@physik.tu-muenchen.DE
- Newsgroups: gnu.utils.bug
- Subject: Bug with NUL character in FS regexp in gawk 2.14
- Date: 25 Jan 1993 21:15:59 -0500
- Organization: GNUs Not Usenet
- Lines: 64
- Sender: daemon@cis.ohio-state.edu
- Approved: bug-gnu-utils@prep.ai.mit.edu
- Distribution: gnu
- Message-ID: <9301252057.AA08150@axon.t30.physik.tu-muenchen.de>
-
- Environment:
- gawk 2.14, patchlevel 0
-
- Intel 386 running SYSVR3.2 from Interactive.
- Sparc Sun running SUNOS 4.1x
- DEC 5000-25 running ULTRIX 4.2
-
- Problem:
- There is no way to use the NUL character (written as
- ^@ in the examples) in the FS variable as part of a regular
- expression.
-
- For the tests below this input file was used:
-
- tst.in:
- ------------------------------------------------------------
- 100 ^@ 200 ^@ 300
- 400 500 600
- ------------------------------------------------------------
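-
- Since ^@ cannot be pasted literally, the input file can be
- regenerated with printf. This is only a sketch: it assumes the
- fields are separated by single blanks, which the listing above
- does not guarantee.

```shell
# Recreate tst.in; \000 is the octal escape for the NUL byte (shown as ^@).
# Assumption: single blanks between the fields and the NUL bytes.
printf '100 \000 200 \000 300\n400 500 600\n' > tst.in

# Dump the bytes to confirm that both NULs really are in the file.
od -c tst.in
```

- The od listing should show two \0 entries in the first line.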
-
- Normally this should work, but the string is cut off at the
- NUL character. The cause may be the use of 'strlen' in the
- function 'set_FS', but I did not investigate further.
- ------------------------------------------------------------
- gawk 'BEGIN {FS = "[ \t\0]+"}
- {for (i = 1; i <= NF; i++) {
- print i, $i }}' <tst.in
- ------------------------------------------------------------
- gawk: cmd. line:1: fatal: Premature end of regular expression: /[ /
- ------------------------------------------------------------
-
- I tried this as a workaround, but it splits fields at
- blanks, tabs, backslashes, and the digit '0':
- ------------------------------------------------------------
- gawk 'BEGIN {FS = "[ \t\\0]+"}
- {for (i = 1; i <= NF; i++) {
- print i, $i }}' <tst.in
- ------------------------------------------------------------
- 1 1
- 2 ^@
- 3 2
- 4 ^@
- 5 3
- 6
- 1 4
- 2 5
- 3 6
- 4
- ------------------------------------------------------------
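-
- A different workaround, offered as a sketch and not as a fix
- for the bug: translate the NUL bytes to blanks with tr before
- awk reads the input, so FS never needs to contain NUL. This
- assumes NUL only ever occurs as a separator in the data. The
- file names and the single-blank layout are my assumptions, and
- plain awk is used because NUL no longer reaches it.

```shell
# Rebuild the test input (assumed layout: single blanks around each NUL).
printf '100 \000 200 \000 300\n400 500 600\n' > tst.in

# Turn every NUL into a blank, then split on runs of blanks and tabs only.
tr '\000' ' ' < tst.in |
awk 'BEGIN { FS = "[ \t]+" }
     { for (i = 1; i <= NF; i++) print i, $i }' > tst.out

cat tst.out
```

- With the input above, each line should come out as three
- fields (100, 200, 300 and 400, 500, 600).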
-
- The NUL character is preserved when the string is stored in
- a variable and printed, but assigning that variable to FS
- fails in the same way.
- ------------------------------------------------------------
- gawk 'BEGIN {tmp = "[ \t\0]+"; print tmp; FS = tmp}
- {for (i = 1; i <= NF; i++) {
- print i, $i }}' <tst.in
- ------------------------------------------------------------
- [ ^@]+
- gawk: cmd. line:1: fatal: Premature end of regular expression: /[ /
- ------------------------------------------------------------
-
- Peer Stritzinger
- stritzi@physik.tu-muenchen.de
-
-