NetNews Usenet Archive 1992 #16


Xref: sparky comp.os.os2.programmer:3705 comp.lang.fortran:2795
Path: sparky!uunet!haven.umd.edu!darwin.sura.net!wupost!usc!cs.utexas.edu!rutgers!modus!gear!cadlab!martelli
From: martelli@cadlab.sublink.org (Alex Martelli)
Newsgroups: comp.os.os2.programmer,comp.lang.fortran
Subject: Re: 32 Bit Fortran Compiler ??
Message-ID: <1992Jul21.140038.27900@cadlab.sublink.org>
Date: 21 Jul 92 14:00:38 GMT
References: <hjkirch.710065189@morpheus>
 <ignacij.710085875@meishan.animal.uiuc.edu>
 <1992Jul06.073701.19620@cadlab.sublink.org>
 <ignacij.710603357@meishan.animal.uiuc.edu>
 <1992Jul15.155450.22340@cadlab.sublink.org>
 <ignacij.711385250@meishan.animal.uiuc.edu>
Organization: CAD.LAB S.p.A., Bologna, Italia
Lines: 176

ignacij@meishan.animal.uiuc.edu (Ignacy Misztal) writes:
	...
:I agree that mixing types between the program and the subroutine should
:be avoided. What happens, if it brings a 4 times improvement in speed?
:Forget the standard if the code will run 1 week instead of 4, and it
:runs on almost every compiler. The 4 time improvement example is below,
:and it works with f77 compiler.
:c this is a slow program with f77 compilers
:      real x(1000)
:      read(1)n,(x(i),i=1,n)
:      end
:
:c this is a fast program with f77 compilers
:      real x(1000)
:      call get(n,x)
:      end
:      subroutine(n,x)
:      character x*4000
:      read(1)n,x(1:4*n)
:      end
:
:I know that the inability of the f77 compilers to fold implicit loops

I have tried to reproduce your results and failed utterly.  First of
all the runtimes of the above programs are of course in the noise.
So I tried to amplify them a bit as follows: as p1.f I have:

c this is an (allegedly) slow program with f77 compilers
      real x(1000)
      open(unit=1,file='pdat',form='unformatted')
      do 10,j=1,100
      rewind(1)
      read(1)n,(x(i),i=1,n)
10    continue
      end

As p2.f I have:

c this is an (allegedly) fast program with f77 compilers
      real x(1000)
      open(unit=1,file='pdat',form='unformatted')
      do 10,j=1,100
      rewind(1)
      call get(n,x)
10    continue
      end
      subroutine get(n,x)
      character x*4000
      read(1)n,x(1:4*n)
      end

I compile each with "f77 -o p1 p1.f" and "f77 -o p2 p2.f" on the
HP9000/400 workstation I'm on at present.  I build a datafile for
them to read with the following program p3.f:

c create file to try things out
      real x(1000)
      open(unit=1,file='pdat',form='unformatted')
      n=567
      do 10,i=1,n
10    x(i)=i/float(n)
      write(1)n,(x(i),i=1,n)
      end

compiled of course with "f77 -o p3 p3.f".  Then I perform:

    p3 ; p2 ; p1    # avoid caching effects
    for i in 1 2 3 4 5
    do
        time p1
    done >p1.tim 2>&1
    for i in 1 2 3 4 5
    do
        time p2
    done >p2.tim 2>&1

and here is p1.tim:

    real 0m0.20s   user 0m0.04s   sys 0m0.14s
    real 0m0.18s   user 0m0.02s   sys 0m0.12s
    real 0m0.20s   user 0m0.02s   sys 0m0.16s
    real 0m0.18s   user 0m0.02s   sys 0m0.12s
    real 0m0.18s   user 0m0.04s   sys 0m0.14s

and here is p2.tim:

    real 0m0.20s   user 0m0.02s   sys 0m0.16s
    real 0m0.20s   user 0m0.02s   sys 0m0.16s
    real 0m0.18s   user 0m0.00s   sys 0m0.16s
    real 0m0.18s   user 0m0.06s   sys 0m0.10s
    real 0m0.20s   user 0m0.04s   sys 0m0.16s

I fail to notice ANY statistically significant difference between
elapsed, usermode OR sysmode times of these two programs, MUCH LESS
the alleged "FOUR TIMES" improvement!  I *assume* if I took the time
and trouble to run these on our Apollo's, VAXen, IBM 6150's, DOS
boxes, RS6000's, SONY NeWS's, DECsystems, and so on ad nauseam, I
WOULD find situations in which either of these programs would
surprisingly come out much faster than the other one.
ON THE OTHER HAND, I *KNOW FOR SURE* that I would find situations in
which the second, standard-violating program would UNsurprisingly
crash and burn, while the first, allegedly "slow on f77" one (which
runs PERFECTLY FINE when compiled by f77 on this here 68030 box)
WILL run on all of these boxes.

:in the read statement is a weakness that will be corrected soon (in
:a couple of years?). But now, there is no alternative. I know that
:some people will suggest breaking the read statement into 2, etc., at a
:considerable cost. However, my program has to run fast now.

Try separating out the I/O; on most of the boxes I have around here,
doing the I/O *IN A C LANGUAGE SUBROUTINE* called from your Fortran
program (sigh!) will, depending on various factors, accelerate your
I/O bound programs by 20%-80%.  Sad, but true.  This will certainly
not hold everywhere, so you'll have to have at hand a Fortran routine
to link instead of the C one for the "exceptional" (HA!) boxes where
the Fortran runtime I/O library IS as well-tuned as it SHOULD be.

:By the way, how can I equivalence a section of an integer vector
:(position known at run time only), to a character string?

You equivalence the *total* integer vector to a LONG character string,
or to an array of longish character strings.  There are of course no
guarantees in the standard about how long you CAN make a character
string, or an array of them, OR an array of integers for that matter
(e.g. the infamous 64K-byte limits on old Fortran PC compilers), so
you may run against one or more such limitations.
Example:

      integer*4 iarra(1000)
      character chst*4000
      equivalence (iarra,chst)
      call iwantints(iarra(40),60)
      call iwantchar(chst(161:400))
      end
      subroutine iwantints(ints,n)
      integer*4 n,ints(n)
c     whatever
      end
      subroutine iwantchar(chars)
      character*(*) chars
c     whatever else
      end

[SORRY for the NON-standard '*4' on 'integer', but after getting
burned BADLY by old d*mned standard-VIOLATING 16-bit compilers
where "integer" meant TWO bytes, I guess I'm psychologically
scarred for life on this!-].
-- 
Email: martelli@cadlab.sublink.org                   Phone: ++39 (51) 6130360
CAD.LAB s.p.a., v. Ronzani 7/29, Casalecchio, Italia   Fax: ++39 (51) 6130294
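
As a rough illustration of the suggestion above to do the I/O in a C
subroutine called from Fortran, here is a minimal sketch.  None of it
comes from the post itself: the routine names (copen_, cput_, cget_,
cclose_), the file name "pdat.raw" and the calling convention are all
hypothetical.  It assumes the Fortran compiler appends an underscore to
external names, passes every argument by reference, and uses 4-byte
INTEGER and REAL that match C int and float; none of that is
guaranteed, so check your compiler's manual.  The file is raw bytes
(a count followed by the data), not a Fortran unformatted file, whose
record markers are compiler-specific, so it must be written by cput_
as well.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper routines, not from the original post.
   Assumed: trailing-underscore external names, arguments passed by
   reference, INTEGER == int and REAL == float (4 bytes each). */

static FILE *fp = NULL;   /* the single data file this sketch handles */

/* CALL COPEN(MODE, IERR): MODE = 0 opens for reading, else for writing. */
void copen_(const int *mode, int *ierr)
{
    assert(mode != NULL && ierr != NULL);
    if (fp != NULL) fclose(fp);
    fp = fopen("pdat.raw", *mode == 0 ? "rb" : "wb");
    *ierr = (fp == NULL);
}

/* CALL CPUT(N, X, IERR): write the count N, then N reals from X. */
void cput_(const int *n, const float *x, int *ierr)
{
    assert(n != NULL && x != NULL && ierr != NULL);
    *ierr = 1;
    if (fp == NULL || *n < 0) return;
    if (fwrite(n, sizeof *n, 1, fp) != 1) return;
    if (fwrite(x, sizeof *x, (size_t)*n, fp) != (size_t)*n) return;
    *ierr = 0;
}

/* CALL CGET(N, X, NMAX, IERR): read the count, then at most NMAX reals. */
void cget_(int *n, float *x, const int *nmax, int *ierr)
{
    assert(n != NULL && x != NULL && nmax != NULL && ierr != NULL);
    *ierr = 1;
    if (fp == NULL) return;
    if (fread(n, sizeof *n, 1, fp) != 1) return;
    if (*n < 0 || *n > *nmax) return;           /* never overrun X */
    if (fread(x, sizeof *x, (size_t)*n, fp) != (size_t)*n) return;
    *ierr = 0;
}

/* CALL CCLOSE(): close the file if it is open. */
void cclose_(void)
{
    if (fp != NULL) { fclose(fp); fp = NULL; }
}
```

Under those assumptions the Fortran side would be along the lines of
"call copen(0,ierr)" followed by "call cget(n,x,1000,ierr)", with a
pure Fortran version of the same routines kept at hand for the boxes
where the C one does not link or does not help.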