home *** CD-ROM | disk | FTP | other *** search
Text File | 1991-05-04 | 74.7 KB | 1,891 lines |
- .EF 'Strategies for Porting'- % -'March 1, 1988'
- .OF 'Strategies for Porting'- % -'March 1, 1988'
- .EH '''
- .OH '''
- .TL
- Strategies for Porting
- the X v11 Sample Server
- .AU
- Susan Angebranndt
- .AU
- Raymond Drewry
- .AU
- Philip Karlton
- .AU
- Todd Newman
- .AI
- Digital Equipment Corporation
- .AI
- minor revisions by
- .AU
- Bob Scheifler
- .AI
- Massachusetts Institute of Technology
- .AI
- Revised for Release 4 and Release 5 by
- .AU
- Keith Packard
- .AI
- MIT X Consortium
-
- .LP
- X is a graphic user interface system enabling an application
- program to run on a computer other than the user's workstation.
- A "display server" (or simply "server") is the user's workstation.
- A "client" is an application program running on a machine
- with which the user interacts.
- Accompanying this document is the X Window System source tape,
- which also contains the "Definition
- of the Porting Layer" document.
-
- This document is a guide to porting the X server
- software to your workstation.
-
- .IP 1)
- Read the first section of this document (Overview of Porting Process).
- It describes what you will go through.
-
- .IP 2)
- Skim the "Definition of the Porting Layer" document.
-
- .IP 3)
- Scan the Details section of this document.
-
- .IP 4)
- Start planning and working, referring to the Strategies
- and Definition documents.
-
- You may also want to look at the following documents:
- .IP \(bu 5
- "The X Window System"
- for an overview of X.
- .IP \(bu 5
- "Xlib - C Language X Interface"
- for a view of what the client programmer sees.
- .IP \(bu 5
- "X Window System Protocol"
- for a terse description of the bytestream protocol
- between the client and server.
- .IP \(bu 5
- "X11 Server Extensions Engineering Specification"
- for a description of how to add features to your X server
- in an agreeable way.
-
- UNIX is a trademark of AT&T.
- QVSS, LK201, ULTRIX, VMS, DEC, MicroVAX and
- VAX are trademarks of Digital Equipment Corporation.
- Macintosh and Apple are trademarks of Apple Computer, Inc.
- PostScript is a trademark of Adobe Systems, Inc.
- Ethernet is a trademark of Xerox Corporation.
- X and the X Window System are trademarks of
- Massachusetts Institute of Technology.
- Cray is a trademark of Cray Research, Inc.
-
- .NH 1
- Overview of the Porting Process
- .XS
- Overview of the Porting Process
- .XE
- .NH 2
- Directories
- .XS
- Directories
- .XE
- .LP
- This section describes some of the directories accompanying the tape.
- .LP
- lib/X
- .RS
- .RE
- The X client library is used for developing client programs.
- Since it is simple and straightforward, it should be very easy to port.
-
- .LP
- doc/
- .RS
- .RE
- This directory contains documentation, such as an online copy of this document
- and the others mentioned.
- .LP
- Makefile
- .RS
- .RE
- This is a master make file for everything on the tape.
-
- .LP
- fonts/
- .RS
- .RE
- This directory has font files in a text format and the compiler
- to compile them into a binary format that your server uses.
-
- .LP
- fonts/lib
- .RS
- .RE
- This directory contains a library of routines for handling fonts. It
- contains all of the interfaces used by both the X and Font servers. It also
- contains routines for manipulating font files which are used by bdftopcf and
- mkfontdir.
-
- .LP
- fonts/bdf
- .RS
- .RE
- This directory contains directories of font files.
- The font files are in a text format called "bdf." (See the
- "Definition of the Porting Layer" document, section 5.2.8.)
- There are three sections: misc, 75dpi and 100dpi. The misc
- directory contains fonts which do not fit a particular resolution; the
- cursor font, various fixed-pixel-size fonts and fonts which are needed
- by various toolkits for icon images. 75dpi and 100dpi contain identical
- font sets at 75 dots-per-inch and 100 dots-per-inch respectively. All of
- these fonts conform to the XLFD standard; most of them are in ISO-8859-1
- (Latin-1) encoding.
-
- .LP
- fonts/tools/bdftopcf
- .RS
- .RE
- This directory has the font compiler source.
- The old bdftosnf compiler had private definitions for the glyph format;
- bdftopcf inherits the definitions from the server so special configuration
- is not needed here. Besides, if you get it wrong, the server will still
- run, but will spend quite a bit of time reformating the fonts.
-
- .LP
- fonts/tools/mkfontdir
- .RS
- .RE
- mkfontdir is run over a directory containing font files. It produces
- the database which the server uses to map font-names to file-names.
-
- .LP
- X11/
- .RS
- .RE
- This is an important directory because it has some include files
- that are actually shared between client (Xlib) and server, such as
- X.h, Xproto.h. The files in this directory are actually installed from
- include/ during the initial build process.
-
- .LP
- server/
- .RS
- .RE
- This directory
- contains all the source for the server code.
- There are some other files that are needed from other top-level directories,
- such as .h files from the X11/ directory.
-
- .LP
- server/ddx
- .RS
- .RE
- This directory has all the Device Dependent X server code.
- Operating System dependent code is under server/os.
-
- .LP
- server/ddx/cfb
- .RS
- .RE
- This directory has "Pixblit" and other
- code for Color Frame Buffer displays.
- (In this document, "color" means multiple bits per pixel.
- You may be able to simply plug in a few
- special numbers into defines, then compile it and have it work for your
- display.
- Otherwise, it will serve as no more than an example of what you should
- implement as Pixblit routines for your own frame buffer.
-
- .LP
- server/ddx/mfb
- .RS
- .RE
- This directory has "Pixblit" and other
- code for Monochrome Frame Buffer displays.
- (In this document, "monochrome" means a black and white display with
- one bit per pixel.
- Even though the usual meaning of monochrome is more general, this special
- case is so common that we decided to reserve the word for this purpose.)
- If you have a monochrome screen, you may be able to simply plug in a few
- special numbers into defines, then compile it and have it work for your
- display.
- Otherwise, it will serve as no more than an example of what you should
- implement as Pixblit routines for your own frame buffer.
-
- .LP
- server/ddx/mi
- .RS
- .RE
- This directory contains Machine Independent DDX code.
- It supplies the "Drawing Primitive" procedures that DIX calls
- to perform graphics and calls the "Pixblit routines."
- It also contains other routines.
- It is portable to any system that can draw graphics by
- manipulating rows of pixels by hand.
- See the "Definition" document for an explanation of these terms.
-
- .LP
- server/ddx/apollo,dec,hp,ibm,macII,mips,sun,tek
- .RS
- .RE
- These directories contain device-specific code which runs
- on a wide variety of machines. All of these directories
- were, at least originally, contributed by the various hardware vendors.
- None of them represent actual production code entirely.
-
- .LP
- server/dix
- .RS
- .RE
- This is the Device Independent source.
- You should not have to change any of these routines.
-
- .LP
- server/include
- .RS
- .RE
- This directory has dozens of includes that are specific to the server.
- Many of them are pairs in the form of XXX.h and XXXstr.h,
- the former having #defines for the XXX topic, and the latter having
- struct definitions. This naming convention could not be carried out
- systematically as SysV allows only 14 character filenames, which truncates
- some of the XXXstr.h file names. The include files in this directory are
- only for DIX and DIX/DDX interfaces; individual modules which export
- functionality (such as mi) include the interface-definition header files in
- their own directories.
-
- .LP
- server/os
- .RS
- .RE
- This directory has Operating System specific source, mostly in
- subdirectories.
-
- .LP
- server/os/4.2bsd
- .RS
- .RE
- This is source for UNIX 4.2 BSD (Berkeley UNIX) source.
- It will also run on 4.3 BSD and ULTRIX. This code will also
- run on several vendors mixed SysV/4BSD systems. It provides
- many routines which are not very OS specific, but which haven't
- been moved elsewhere yet for lack of need (i.e. it runs on
- every device which is supported by the sample DDX directories).
-
- .LP
- This software is contributed to the public as a service.
- We welcome contributions from other development groups for inclusion on future distributions.
-
-
- .NH 2
- Areas of Work to be Done
- .XS
- Areas of Work to be Done
- .XE
- .LP
- Most of the code for the X server is
- on an industry standard 9 track magnetic tape in
- UNIX "tar" backup format.
- The missing parts that you must supply
- for your particular workstation fall into the following three
- categories:
- .LP
- operating system:
- .RS
- .RE
- Your operating system is the first and most obvious source of differences.
- If you have a UNIX 4.2 or 4.3 BSD system, this part will be trivial to port.
- The further you move away from that, the harder it will be.
- For systems that are not UNIX-based, the hardest part
- of the porting process may be the interface to clients.
-
- .LP
- input:
- .RS
- .RE
- You need to code specific interfaces for your particular pointing device
- (mouse or tablet) and keyboard.
- These have to be non-blocking; a scheduler must be supplied
- that can wait for input events and client requests to arrive.
-
- .LP
- output:
- .RS
- .RE
- This is potentially the largest section of code you will need to
- write. If you have a memory-mapped frame buffer display, most of the
- code has already been written for you (but optimization may be desired).
- If you have color and/or a special-purpose graphics
- processor that insists upon doing all of the
- work itself, you have a substantial task.
-
- .NH 2
- About DDX, mfb, cfb, and mi
- .XS
- About DDX, mfb, cfb, and mi
- .XE
- .LP
- The DDX (device dependent X) layer provides a software interface to a
- conceptual hardware device. The imagined device provides primatives for
- drawing lines, arcs, text, filling areas, etc.
- These primatives may be actually provided in your hardware, or you may have
- to build them out of simpler primatives your hardware does provide.
- The mi (machine independent) routines provide software simulation of the
- conceptual machine built out of very simple primatives such as GetSpans,
- SetSpans, FillSpans, PushPixels, etc., which we call the Pixblit routines.
-
- The mfb layer is one implementation of the software interface that connects
- to monochrome (one bit deep) framebuffers. The cfb layer is one
- implementation that connects to multi-bit framebuffers. In both cases, some
- functionality is provided by writing directly to the framebuffer. Some more
- esoteric functionality is achieved by calling the mi routines. In order to
- be able to use the mi routines, they must also implement Pixblit routines.
- Both cfb and mfb have been extensively tuned for Release 4 to run as quickly
- as portably possible on a wide variety of machines. Mfb, in particular, has
- some Gnu C Compiler "asm" statements which substantially increase
- performance of some operations on both Vax and 68020 CPUs. Cfb, on the
- other hand, can be tuned for new architectures by describing some CPU
- characteristics in server/include/servermd.h.
-
- The mi code should be portable to all systems.
- It calls the Pixblit routines to apply the pixels,
- all device dependencies are contained in there.
-
- Some routines in mi are not used by the mfb or cfb DDX implementations.
- They are provided to make it easier for you to get a simple port running
- quickly. Unfortunately, it is not possible to provide a complete DDX
- implementation in mi, you need the Pixblit routines which actually know how
- the hardware looks.
-
- The mi, mfb, and cfb routines were designed for portability over performance.
- Therefore, you may want to spend time optimizing them if you choose to use
- them.
-
- .NH 2
- What do I do?
- .XS
- What do I do?
- .XE
- .LP
- To start, you should get the simplest server running by
- modifying as little as possible, probably using mi and maybe using mfb or cfb.
- Later, you can carefully optimize it.
-
- The first step is to copy the source code off the tar tape onto your machine.
- If yours is a UNIX system, this will be easy.
- If not, it may be more difficult.
-
- Use the UNIX "tar" command to load the tape onto your machine, if appropriate.
- If you have a network running, you might be able to get it from
- some other machine on the net by using the UNIX "ftp" command
- (some non-UNIX systems also support ftp).
-
- One way to load the source onto a non-UNIX system is to load it onto
- a UNIX system and move it to your system.
- If you are porting to a non-UNIX system, we strongly recommend that you have
- a UNIX system available in house for purposes such as this and for testing.
-
- The next step is to create a subdirectory under the ddx and os directories
- as appropriate for your code. (See the
- "Definition of the Porting Layer" document for details on directories.)
- Copy files into these directories from sibling directories that seem closest
- to what you will need.
- For instance, if you are porting to an IBM 3279 display on an IBM 4361
- mini, you create the directories ddx/3279 and os/4361 (or os/370
- if you thought this would be portable to other 370 architecture machines).
- If you were porting to a 3279 display on a UNIX 4.2 system, you would
- make a directory ddx/3279 and use the os/4.2BSD directory the way
- it is, if you thought it would work.
- (If later in the process you found it did not, you would make your own subdirectory.)
-
- Start modifying the code.
- Begin with the OS code.
- There are file i/o routines to work on, and the byte stream to the client
- is important.
- Get the byte stream working between your own test programs.
-
- The second logical step is to get some form of the X server code running.
- Make dummy versions of the input routines and graphical output routines so you
- can concentrate on getting initialization right without having the system
- crash.
- Edit Xmd.h according to the instructions in the section "Machine Dependencies"
- later in this document.
- Then compile everything.
-
- Next, work on the graphical output.
- Fill in whatever you need so that a simple client program that just draws some
- graphics on the screen works.
- For monochrome screens, setting a few
- defines and recompiling the mfb files may be all you need.
- (See "Porting MFB" in the Details, below.)
- For color screens, setting a few
- defines and recompiling the cfb files may be all you need.
- (See "Porting CFB" in the Details, below.)
-
- The xclock program is a good candidate for testing graphical output.
- Depending on your networking software, it might be easiest to
- have this test client on the same machine as your server.
-
- Finally, work on the input. Fill in code to handle the keyboard and mouse
- (or other pointing device). The cursor that echoes the position of the
- pointing device is better implemented in hardware, but mi does provides
- support for a primitive software cursor which is very easy to use. See the
- section on cursors below.
-
- Next, optimize.
-
- You are done!
- For more explanation, see the Details section, below.
-
- .NH 2
- Cost
- .XS
- Cost
- .XE
- .LP
- We estimate that a basic monochrome or color server will
- take one to two months to develop if done on
- a UNIX 4.2 BSD system by an experienced C programmer who knows the hardware
- quite well.
-
- The more software you have to write, the longer it will take.
- If it is a non-4.2 UNIX system, add one to four weeks.
- If it is a non-UNIX system, add one to two months.
- If your operating system does not have a network,
- that must be taken into consideration.
- If you buy someone else's implementation, add one to four months.
- If you decide to write it yourself, add six months to two years.
-
- If special graphics hardware (a graphics processor, not just unusual
- bitplanes) is involved, it will take much longer.
- If you want the code optimized for maximum performance, it will take much,
- much longer.
-
- The more experienced you are, the less time it will take.
- If you are new to C, add some time.
- If your programmer is not familiar with your operating system, it will take
- longer.
- If you are not familiar with windowing systems, it will take longer; if
- you're not even familiar with 2-d raster graphics, it will take longer still.
- If you've done ports to X before, it will take less time.
- If you are really hot, it will take less time.
-
-
- Of course, all of these are just guesstimates.
-
- The above figures are for one programmer.
- Some gains may be achieved through the parallelism of adding programmers.
- But, as Fred Brooks puts it, the bearing
- of a child takes nine months, no matter how many women are assigned.
-
- If you do distribute the work, it would be best to devise a good partition.
- For instance, a reasonable partition might be to have one programmer
- work on the operating system, network and input code,
- have two more working on graphics output, with one of them concentrating on
- text graphics.
- We recommend no more than a few programmers at one time.
-
- At any rate, if you have a product that is robust enough to
- be useful, you are probably about half way to making that product a solid,
- finished release.
-
- .NH 1
- Details
- .XS
- Details
- .XE
- .LP
- .NH 2
- Tools
- .XS
- Tools
- .XE
- .LP
- .NH 3
- The C Compiler
- .XS
- The C Compiler
- .XE
- .LP
- Your C compiler can have a significant effect upon the time it takes you to
- finish the project.
- Since the original source was developed on a UNIX system, the closer your
- compiler approximates the UNIX C (pcc) compiler, the better.
- Depending upon your situation, it may be worthwhile to try more than one C
- compiler and use the one that works best.
- (Programmer time is quite expensive;
- software is frequently much less expensive, even if overpriced.)
- If, for instance, the DIX code does not compile without modifications, you may
- want to look elsewhere.
-
- Sometimes we intentionally call a routine with the wrong number of arguments.
- For instance, there is a routine NoopDDA() in dixutils.c that is used
- widely as a procedure that does nothing.
- It has zero arguments but is used for situations where routines get passed
- different numbers of arguments.
- If this causes problems on your machine, you might need to change the code
- or get another compiler.
-
- If you are using an 8086 architecture, we recommend you use "large" model
- to get the server running, then switch to mixed model for speed and
- space efficiency.
-
- .NH 3
- Make and Makefiles
- .XS
- Make and Makefiles
- .XE
- .LP
- "Make" is a UNIX program that manages the compilation process.
- It reads in a text file named Makefile describing the source files
- that need to be compiled and how.
- (This file is frequently called the dependencies file because it describes
- the chain of dependencies leading to the final product.)
- Make then checks the dates of source, intermediate, and object files,
- determines the minimum compiles needed to bring a given result
- file up to date, and runs each compilation step as a child process.
-
- This idea has been imported to a wide variety of operating systems
- (frequently still called "make").
- On non-multitasking operating systems, the program frequently
- generates just a batch file with the needed compile commands in it and then
- executes this batch file as its final operation.
- (Beware: few of these non-UNIX versions contain all the features of the
- original.)
-
- We recommend using Make or whatever useful substitute you have available.
- The makefiles for the UNIX system are included with the tar tape, and they
- should work on any UNIX system.
- this code does not support "near" and "far" pointers.
- This may not be necessary or desirable on 386 systems.
- They might not work on your system.
- To aid you in generating your own makefiles for your own system, we briefly
- describe the syntax of makefiles.
-
- The dependency relationships look like this:
- .nf
-
- fig.o : fig.c fig.h xyz.h
- cc -abc fig.c
-
- .fi
- This states that the file fig.o (an object) depends upon fig.c and the two .h
- files listed.
- If fig.o is found to be older than any of the dependencies,
- execute the command(s) listed below it to bring it up to date.
-
- Most makefiles look much more complicated.
- This is primarily due to the use of macros.
- When you have a statement of the form:
- .nf
-
- COPTS = -abc -x fig -FPa
-
- .fi
- this means that you can subsequently use "$(COPTS)" as a
- text substitution macro elsewhere in the makefile.
- .nf
-
- fig.o : fig.c fig.h xyz.h
- cc $(COPTS) fig.c
-
- .fi
- This is frequently used as shown to hold C compiler options.
- It is also used to hold lists of filenames.
- .nf
-
- HFILES = fig.h xyz.h
-
- fig.o : fig.c $(HFILES)
- cc $(COPTS) fig.c
-
- .fi
-
- Another common cause for confusion in makefiles is that there are special $
- symbols that signify "the dependencies" or "the product" in a command line.
- These can be used in powerful constructs that will indicate, in just a few lines,
- "compile all .c files that you need to compile and do it this way."
-
- Consult UNIX documentation for more details.
-
- The makefiles supplied with the sample server are not guaranteed to be
- nearly as portable as the code.
- In particular, there are situations where special techniques were used to
- get everything to compile.
-
- There are some routines that need to be compiled with #defines
- entered on the command line with the -D flag of the UNIX cc command
- instead of with a normal #define directive.
- If you don't have such a facility with your compiler, you should put such #defines
- in an .h file and do some file copying in the makefile to achieve the same result.
-
- .NH 3
- Debuggers
- .XS
- Debuggers
- .XE
- .LP
- Because you are drawing graphics on the display, you will probably want to use
- a debugger that does not use the display.
- On some systems, a terminal connected to a serial port is the best way to
- communicate with the debugger.
- On network systems, you may be able to log into your test machine remotely
- and run the debugger and server from there.
-
- .NH 3
- Profiling Tools
- .XS
- Profiling Tools
- .XE
- .LP
- After you have an initial implementation running, you may want to improve its
- performance. A profiler is invaluable for this purpose because it tells you
- where you are actually consuming CPU cycles. You can then change code based
- upon hard evidence. On UNIX systems, you might use the prof and gprof
- programs. To really analyze the code, it is very useful to use a
- basic-block profiler (like the MIPS pixie system); most of the frame-buffer
- graphics primitives are large functions wrapped around multiple small inner
- loops which perform the actual rendering.
-
- To gain more insight into performance deficits, the X client "x11perf"
- (contrib/demos/x11perf), contributed by DEC, offers a wide array of
- measurement tools and an easy base to add more to. It was used extensively
- in the development of the current cfb and mfb drivers, along with the
- release 4 changes to various data structures and mi algorithms with
- frequently astounding revelations. It is hard to recommend this program too
- much. Each time you sit down to optimize some area of the server, first
- develop a test case and integrate it into x11perf; measuring before and after
- to discover performance changes. X11perf is also useful in profiling the
- server as it provides a repeatable sequence of graphics requests; set it up
- to use a fixed number of iterations.
-
- .NH 2
- Operating System Details
- .XS
- Operating System Details
- .XE
-
- .NH 3
- Machine Dependencies
- .XS
- Machine Dependencies
- .XE
- .LP
- The sample server is written to be portable to a wide variety of architectures,
- including CPU chips with different word sizes and different bit and byte ordering.
- Before compiling the code, you should set some defines to indicate what kind of
- CPU you have.
-
- First, edit Xmd.h.
- Change the following:
-
- INT32, INT16, INT8 should be signed integers of 32, 16 and 8 bytes. CARD32,
- CARD16 and CARD8 should be equivalent unsigned integers. BITS32, BITS16 and
- BYTE should be types that are most convenient for bit-oriented data. BOOL
- is the most convenient boolean value type that fits in 8 bits. Change them
- according to your compiler. Unfortunately, most of the mfb and cfb code
- "knows" that both int and long are 32 bits and will not work on systems where
- this is not the case. The rest of the server is less encumbered, but as the
- sample code has never been run on such machines, it is unknown whether it
- will work. It will certainly not work if long != 32 bits or short != 16
- bits.
-
- IMAGE_BUFSIZE is the size of a buffer of bytes that GetImage will return.
- Smaller systems may want to keep this at 1k or less;
- larger systems may put it at a few dozen k.
-
- IMAGE_BYTE_ORDER indicates the order of bytes in the image.
- On VAXen, this is LSBFirst because the least significant byte is on the left,
- and is sent down the pipe first.
- On 68000s it is MSBFirst.
-
- BITMAP_BIT_ORDER is the equivalent order of bits within a byte.
- On VAXen, this is LSBFirst because the least significant bit is most
- toward the left on the screen.
- On 68000s it is MSBFirst.
-
- BITMAP_SCANLINE_UNIT is the biggest piece of memory in which
- IMAGE_BYTE_ORDER applies (in bits). For most hardware, 32 is a good value.
- Note that mfb and cfb both assume that addresses ascend across the screen
- from left to right and then proceed down the screen.
-
- BITMAP_SCANLINE_PAD is the chunk size to which
- bitmaps sent over the bytestream should be padded.
- In other words, if you had a bitmap that only had one bit in it,
- would you want to send 8 bits, 16 bits or 32 bits?
-
- LOG2_BITMAP_PAD must be the log base 2 of BITMAP_SCANLINE_PAD.
- If BITMAP_SCANLINE_PAD is 32, this must be 5.
-
- LOG2_BYTES_PER_SCANLINE_PAD is the log
- base 2 of (BITMAP_SCANLINE_PAD divided by 8, the number of bits in a byte).
- If BITMAP_SCANLINE_PAD is 32, this must be 2.
-
- .NH 3
- Client Access
- .XS
- Client Access
- .XE
- .LP
- On many systems, one large section of code to be written may be the client
- access.
- X requires a reliable byte stream that can handle binary data.
- The sample server has code in it to communicate over three different
- byte streams: TCP/IP Ethernet, DECNET, and UNIX domain sockets.
-
- If you do not have one of these already, you may find
- the byte stream somewhat time consuming to develop.
- If you have an operating system other than a UNIX 4.2 BSD system
- there is more work involved in client access.
- If it is another UNIX system, it is somewhat easier.
- The less it resembles 4.2 BSD, the more difficult it will be.
-
- If you can't use TCP/IP Ethernet, DECNET or UNIX domain sockets,
- the alternative is to use some other byte stream mechanism.
- This will also have to be dealt with on the X client side
- (there is an implementation-specific routine in the X library
- to communicate with the server).
- You might start out by implementing both sides in the same
- machine as long as the
- client and server are separate processes and there is a convenient interprocess
- bytestream mechanism.
- In particular, this may be a first step toward implementation of your
- alternate inter-machine client-communication scheme.
-
- In theory, any reliable byte stream will work.
- Its throughput should be approximately 5k bytes per second or more;
- otherwise performance will
- deteriorate.
-
- For instance, an RS-232 or RS-422 link would work,
- although its performance would leave much to be
- desired unless you could achieve a baud rate of 56kbaud or greater.
- Since 8-bit binary data is regularly transmitted, your bytestream
- cannot use command characters for
- handshaking and protocol (such as XON/XOFF).
- Many modems or other telecommunication
- equipment will not work if designed for just normal ASCII communications
- because they may intercept certain control characters.
- Also, an RS-422 link would only offer one client-server bytestream,
- whereas you may want more than one such connection.
-
- .NH 3
- Multi-Processor OS's and Graphic Processors
- .XS
- Multi-Processor OS's and Graphic Processors
- .XE
- .LP
- The X server runs as a single process that imitates multitasking
- using an event-dispatching loop that checks for things to do from all sources
- and processes them one at a time.
- Many operations do not consume much time, so the multitasking
- appearance is upheld; but certain graphics operations may consume
- substantial amounts of CPU time.
- If another CPU or a graphics processor were
- available for these tasks, a significant gain in performance
- could be realized.
-
- Graphics processors, in particular, can offer a unique opportunity to create
- a very high-performance X server. See the section "Implementing On Top of
- Another Graphics System" for more details if you have a graphics processor.
-
- The X sample server was written as a single-threaded program for a single processor.
- A multi-processor system with a core processor (running the main server
- code) might dispatch tasks to a set of slave processors that
- effect low-level graphics operations.
- Or it may even have a completely different scheduling system, with multiple
- processors participating in the dispatch loop.
- In such cases, large parts of the server code will probably need to be rewritten.
- In particular, there are shared resources among clients,
- and you need to ensure that requests received by the server are executed
- in apparent synchrony, and you must ensure that global data structures such as the
- window tree and the resource table are maintained correctly.
-
- X is merely a bytestream protocol and anyone can write any software
- to implement it in any language on any computer system.
- The sample server is merely one implementation.
-
- .NH 3
- Server Reset
- .XS
- Server Reset
- .XE
- .LP
- The X server will reset itself immediately after all clients terminate.
- It is helpful to provide a way
- for the user to cause the server to terminate all client connections and reset
- itself.
- At an appropriate time, your server can cause all clients to be terminated by
- calling DoomClients().
- The following cycle through the dispatch loop, all clients will be terminated
- in a somewhat reasonable way.
- This will cause a reset.
- Upon reset, you should instruct your network to close all open client
- connections.
-
- For instance, when the server process receives a SIGHUP
- signal on UNIX systems, the signal routine calls DoomClients().
- On a non-UNIX system, you may prefer a special sequence
- of modifiers and keys at the keyboard.
- Whatever the user does, all windows and
- applications will be closed and the user will have only an empty screen.
-
- .NH 3
- Shutdown
- .XS
- Shutdown
- .XE
- .LP
- Depending upon your workstation environment,
- you may want your X11 server to run forever, or
- you may want to provide a way for the user to cause the server to quit
- gracefully without turning off the machine.
- Your server can quit by calling KillServerResources(), closing all network
- connections and then calling exit().
-
- For instance, on UNIX systems, when the server process gets a SIGINT or SIGTERM
- signal, it calls KillServerResources() and then exit().
- On a non-UNIX system, you may prefer to have the user press a special sequence
- of modifiers and keys at the keyboard.
- Whatever the user does to accomplish this, it will cause the X11 server
- to return to your operating system and/or shell.
- You may want to clear the graphics screen(s) before exiting.
-
- .NH 2
- Input Details
- .XS
- Input Details
- .XE
- .LP
-
- .NH 3
- The WaitForSomething Scheduler
- .XS
- The WaitForSomething Scheduler
- .XE
- .LP
- WaitForSomething() must wait for any of three occurrences:
- a hardware input event is received,
- a request from a client is received, or a request from a new client to open a
- connection is received.
- In the interim, you can do anything you want.
- On a multitasking system, you probably want to block yourself.
- This can be accomplished using mechanisms such as select(2) on 4.2BSD, or
- poll(2) on V.3. On systems on which the entire machine is dedicated to the X
- server you can loop endlessly, checking for input and client requests.
-
- It would be unwise to depend exclusively upon
- idle times for polling the keyboard and pointing device.
- You should also poll these input devices at other times.
- In fact, these tasks should be monitored by an interrupt service routine
- checking at regular intervals.
- Otherwise, the users will be constantly annoyed when their keystrokes and mouse
- events are lost.
- Also, many paint-style programs depend upon regular
- pointing-device event-reporting to enable the user to draw
- smooth curves with the pointing device
- without leaps from one cursor location to another.
- (Even if the hardware can queue one or two such events, some graphic operations
- such as copying a large image can consume more time
- than a few keystrokes in rapid succession
- by a touch typist.)
-
- DIX will process requests from each client
- until the variable isItTimeToYield is set.
- If you do not set it, you will enable one client to lock out all others by constantly
- drawing graphics.
- Therefore, you should devise a strategy for setting isItTimeToYield
- and ending the "timeslice" of a time-consuming client.
- The sample server will set this after ten requests have been read from the same
- client.
-
- The DIX code will service each client in the order received from WaitForSomething().
- You might tune the server so that if you write an event to a client,
- the priority of that client increases, by returning him earlier in
- the list or allowing more time
- before setting isItTimeToYield.
- You might set isItTimeToYield if the current
- request changes the window tree (causing exposures).
-
- .NH 3
- Keyboards
- .XS
- Keyboards
- .XE
- .LP
- The keyboard consists of two kinds of keys, regular keys and modifier keys.
- Modifier keys,
- like Shift and Control, are keys the user presses while typing regular keys.
-
- Your keyboard must be able to indicate when the user presses or releases
- keys.
- More specifically, your keyboard-interface software must be able to generate
- a KeyPress when a modifier or a regular key is pressed
- and a KeyRelease when a modifier key is released.
- You must also generate a KeyRelease for a normal key,
- but you can generate it immediately after the KeyPress is queued.
- If you cannot at least do this, you may have problems.
-
- If your keyboard currently generates queue events
- upon each key motion or calls an
- interrupt routine that can do this, your situation is improved.
-
- If you have a system in which a keymap
- has one bit for each key that is
- being pressed, you simply need to check this keymap
- at regular intervals in an interrupt service routine and
- queue events on an internal queue you maintain.
-
- If you have a keyboard at the other end of a serial line, things become more difficult
- because you must reverse-map your ASCII characters
- into keycodes.
- In addition, you need to simulate modifier keys being used.
- For instance, when you get a lowercase "a", you must send a KeyPress
- for the "A" key, then a KeyRelease for "A".
- If you get an uppercase "A", you must send a KeyPress
- for the Shift key, send a KeyPress
- for the "A" key, then a KeyRelease for "A",
- then a KeyRelease for Shift.
- If you get a space character, you do not know if the shift key has been pressed,
- so you assume it has not.
- Between keystrokes, there is no way to know if the shift key has been pressed.
- Since with this scheme the client cannot ascertain
- when the user is pressing the shift key without typing any keys,
- some client applications that try to detect this will not operate properly.
-
- If you want autorepeat, you must simulate this in your code or hardware by
- generating KeyPress and KeyRelease events when appropriate.
- The X protocol specification describes in detail how these options are
- set by a client.
-
- .NH 3
- Pointing Devices
- .XS
- Pointing Devices
- .XE
- .LP
- The pointing device may be a mouse, a graphics tablet, a light pen,
- a touch screen, a trackball, a joystick, a pair of thumbwheels,
- or any other device that allows the user to indicate
- a location on a two-dimensional surface.
- The surface should bear some resemblance to the screen, because a visible
- cursor is displayed on the screen at a location that corresponds to the
- pointing-device location.
- The pointing device must report a location as a graphics coordinate on the screen.
-
- The pointing device must have one or more "buttons" or other momentary control
- that the user can touch or press, such that the software driver can report a
- "press" and a "release" event.
- For instance, a touch screen can report press and release events when the user touches
- the screen.
- A trackball will probably require one or more separate buttons.
-
- Some of these pointing devices are absolute, some are relative.
- For instance, with a touch screen, the user directly indicates
- the desired location on the screen.
- Mice and trackballs, on the other hand, only provide relative
- motion information; some other hardware or software must integrate
- these moves into a location.
- A graphics tablet is on the absolute side, but requires a mapping
- between the absolute coordinates on the tablet surface
- and the screen coordinates.
-
- Some relative devices, such as mice, have a scheme in software
- or firmware to "accelerate" the motion of the mouse.
- For instance, on the Apple Macintosh, the interrupt service routine
- for mouse motions checks each increment to be added to the
- cursor location. If the jump is past a certain threshold,
- it doubles the jump distance.
- In this way, the user can move the mouse quickly across the screen, while
- still retaining fine control over the location for detail work.
- Unfortunately, this technique is frequently used because
- the hardware simply cannot generate fine enough position increments.
- If you implement or have available such a scheme, you should allow standard
- control calls from a client to turn this effect off and on.
-
- Buttons are numbered starting with one.
- Probably, the left button on a mouse should be number one and
- they should be numbered towards the right from there.
- Client applications that use fewer buttons than you have will start with
- one and use only as many as needed.
- Since the X protocol specifies mechanism, and not policy,
- programs that depend upon more mouse buttons than you have
- may end up waiting for a long time before you
- hand it a button click which you cannot generate.
- On the other hand, light pens, graphics tablets with pens, and touch screens
- all implicitly have one "button", so it is reasonable to assume
- that client developers will be encouraged to consider one-button pointing devices.
-
- Keep in mind that the mainstream pointing
- devices will be mice with one or more
- buttons and graphics tablets.
- Client programs written with one pointing device
- in mind may prove hard to use with another
- pointing device.
- That is, programs written for a mouse
- usually assume that the mouse location
- can be chosen very accurately. If your touch
- screen is coarse, it may be very frustrating
- to use.
- Also, a touch screen usually cannot generate mouse move
- events while the mouse "button" is not "pressed".
-
- Make a mouse in a multiple screen environment
- move from one screen to the next by creating the impression that
- the screens are adjacent to one another;
- when the user moves the pointing device off the edge of one screen,
- the cursor moves onto another.
- X provides no policy for this, and you are free to make any geometric
- models you please.
-
- .NH 2
- Graphics Output Details
- .XS
- Graphics Output Details
- .XE
- .LP
- .NH 3
- Porting MFB
- .XS
- Porting MFB
- .XE
- .LP
- If your screen is a simple monochrome frame buffer, you probably want to
- start by porting the mi and mfb routines. These will get you up quickly so
- you have something that works on which to build. Mfb has been extensively
- tuned for a few environments; in particular mfb runs very well on 68020 and
- vax CPUs where GNU CC is available. It also runs quite well on many RISC
- processors, where C compiler technology is more able to optimize some of the
- common operations. Although you could easily expend considerable time
- optimizing it, it is not unreasonable to leave most mfb routines the way
- they are.
-
- The mfb routines are extremely portable.
- Most monochrome screens need only a half-dozen defines changed
- before the code works.
- System bit and byte order and other machine dependencies
- are given by #defines.
- (It assumes that byte ordering on the screen is
- the same as byte ordering in main memory.)
-
- First, make sure you have edited Xmd.h for your CPU.
- See the section "Machine Dependencies" for instructions on how to do this.
- Then edit server/include/servermd.h to set up bit/byte orders and font
- padding information (mfb will work with any font padding, but MSBFirst
- machines work best with GLYPHPADBYTES == 4, GETLEFTBITS_ALIGNMENT == 1).
-
- Next, write a screen initialization routine which sets the whitePixel and
- blackPixel values in the screen structure and calls mfbScreenInit with
- appropriate parameters; in particular you'll need to pass the address
- of the frame buffer, the screen size in pixels, both horizontal and
- vertical resolution in dots/inch (truncated to an int, which limits the
- accuracy a bit) and the frame buffer width in pixels. This last parameter
- may seem redundant, but many displays have extra framebuffer memory per
- scanline which is not visible on the display. Set this final parameter to
- the total pixel width of the display and mfb will ignore the invisible
- space. You could just type in a literal address in hexadecimal for the
- frame buffer address, but you may want to be a bit more sophisticated.
-
- In this screen initialization routine, you'll want to initialize the various
- screen functions which apply to your hardware; hardware cursor routines
- (or mi software cursors) should be set up here. Mfb requires that the
- CloseScreen function which it stores in the screen be called at server reset
- time, make sure you wrap it if you need your own hooks here. If
- mfbScreenInit returns without troubles (TRUE), call
- mfbCreateDefColormap(pScreen) to initialize the default colormap with
- appropriate values.
-
- That's it! All other machine dependencies should be
- taken care of, for
- most screens.
-
- If you have an interlaced screen, where rows of neighboring pixels
- are not neighboring in memory, there is a way to make mfb work on it.
- The changes needed are few; carry them out carefully.
- They involve changing the mapping from the row number to
- address. Look for places where we multiply by devKind or width.
-
-
- .NH 3
- Porting CFB
- .XS
- Porting CFB
- .XE
- .LP
- If your screen is a simple packed-pixel frame buffer (either gray scale or
- color), you will want to start by porting the mi, cfb and mfb routines.
- You'll need to use mfb, even though your screen is color, as each server is
- required to support 1-bit pixmaps. The cfb routines have been extensively
- tuned for 1-byte-per-pixel displays and will work quite well with little
- change. On other displays (2 bit up to 32 bit), the existing code will
- still work, but in many areas performance will be disappointing.
-
- The cfb routines are also extremely portable. Most color screens need only
- a few changes to Xmd.h and server/include/servermd.h. The 8-bit specific
- cfb text code works best with GLYPHPADBYTES == 4, GETLEFT_BITS_ALIGNMENT ==
- 1, but will function with any padding. Also in servermd.h are several
- CPU specific tuning parameters. Read the comments carefully at the top of
- the file and set the ones appropriate for your CPU. They do not affect the
- correctness of the code, but can offer substantial performance gains if set
- correctly. If you are unsure, guess and use a profiling tool to discover
- which set work best. As with the mfb code, you could spend almost unbounded
- effort tuning various portions of the cfb layer for your particular system;
- but most of the code should run well enough unchanged to not warrant the
- effort.
-
- Finally, set up an initialization routine which calls cfbScreenInit which
- uses arguments similar to those used by mfbScreenInit, the sizes are all
- still in pixels. CfbScreenInit does take an additional parameter before the
- framebuffer width, the visual class of the default visual. Set this
- appropriately (probably PseudoColor). You needn't set up
- whitePixel/blackPixel on pseudo color machines as cfbCreateDefColormap()
- will pick appropriate values and store them in the colormap when called
- after cfbScreenInit returns success. If you want to force the values for
- whitePixel/blackPixel, set them in the screen structure after cfbScreenInit
- and before cfbCreateDefColormap.
-
- .NH 3
- Implementing On Top of Another Graphics System
- .XS
- Implementing On Top of Another Graphics System
- .XE
- .LP
- Many workstations already have their own graphics library or even their own
- windowing system. In order to coexist with the rest of the world as
- peacefully as possible, you may want to implement your X server on top of
- such a library. In fact, your machine may come with its own graphics
- processor that can greatly speed up graphics if used judiciously. Beware,
- however, that many X clients draw small objects, or only a few at a time.
- The overhead for translating X requests into graphics-system primitives may
- dominate the drawing time and cause the resultant server to be slower than a
- simple dumb frame-buffer system. Do not casually assume that the graphics
- processor is the fastest way to do things. Profile, profile, profile.
-
- Since such graphic systems usually perform high level operations such
- as line drawing, text drawing, and area fill,
- you would start accommodating them at the "Drawing Primitives" level.
- In other words, you would rewrite one or more of the
- drawing primitive routines provided (such as miPutImage(),
- miPolyArc(), miPolyFillRectangle(), or miImageText8()).
- Instead of using the equivalent mi routine, you would
- write your own routine to use the graphics system.
-
- One problem with a graphics processor, which also occurs
- when trying to implement a server atop an outside graphics
- library, is that the definition of certain functions can change in
- subtle ways.
-
- For instance, a graphics processor may support text drawing only
- by ORing the glyphs into place;
- the X routines require more sophisticated text-drawing capabilities.
- A more difficult case is that in which a graphics processor can draw only fixed-width
- characters or can draw only 8-pixel-wide characters, or can draw characters
- only in its own hardwired font.
-
- There are several approaches to this problem.
- First, you can recognize the 80 percent of the situations
- that can be executed by your graphics system, using the graphics system
- for those cases, and then executing the remaining 20 percent
- with mi (and possibly even cfb or mfb) code.
- Your GC validate routine can route
- different requests to various
- routines to do things differently.
- (See the Definition document for more information on the GC validate routine.)
-
- Secondly, you can supplement the graphics processor's work.
- You can implement each X primative call for with
- more than one call to your graphics system, possibly with
- some auxiliary touch-up.
-
- Third, request changes in your graphics processor or library.
-
- By using as many of these approaches as appropriate, you can maximize the
- overall performance and compatibility of your workstation while
- correctly interpreting the X protocol.
-
- Example: Your graphics processor applies glyphs only by "ORing" them into
- the image.
- Make the ImageGlyph routine call the graphics processor to
- draw the character's rectangle in the
- background color, then call the graphics processor to draw the character.
- If using just a solid-fill style in OR mode,
- you make the PolyGlyph routine call the graphics processor to
- draw the character.
- You use the slower mi routines for PolyGlyph routine that must effect
- tiling, stippling, etc.
-
- Example:
- A graphics processor can draw only fixed-width
- characters.
- In this case, you use the Validate routine to change the primitive
- procedure pointers in the GC depending upon whether your font is
- fixed width or variable width.
- The fixed-width fonts go directly to the graphics processor.
- The variable-width fonts would be drawn in software, probably using
- routines borrowed from the sample server.
- (Depending upon the application, much text on the screen may be fixed width
- in the default font.)
-
- Example: The graphics processor cannot clip to an irregular region as the
- entire Drawing Primitive set must do. Each routine checks the clipping
- region and ascertains whether the entity to be drawn falls entirely within
- the region. If so, the drawing is executed by the graphics processor. If
- any part of the entity is clipped, it is handled by the mi, cfb or mfb code.
-
- Example:
- A graphics processor can draw text only with its own hardwired font.
- You create the font data that would correspond to your hardwired font,
- including the character glyph images.
- You make up a name for this font and make that your default font.
- Once again, you use the Validate routine to change the primitive
- procedure pointers in the GC depending upon whether your font is
- the hardwired font or not.
- The hardwired font goes directly to the graphics processor, as long as
- you can handle the fill style and clipping.
- Other fill styles or clipping may be handled by using hardware to draw
- into a pixmap and then applying it to the screen.
- Anything else would be drawn in software, probably using
- routines borrowed from the sample server.
-
- Example:
- In X, lines are drawn with a model borrowed from PostScript
- in which the width of a line is a scalar number
- and ends of lines can either be butt (squarely cut off perpendicular to line)
- round (semicircular end), or projecting (like butt but extending past end of
- line by 1/2 line width).
- Imagine your graphics processor draws lines by smearing
- a rectangle from the source to destination.
- You get to set the height and width of the rectangle, but nothing else.
- You will not be able to use this operation for X wide lines in any but
- the simplest (i.e. horizontal/vertical or zero-width) cases.
-
- In X there are few requirements placed on zero-width lines.
- (If you get a line width of zero, the intent is that it be "the fastest,
- easiest line," not an invisible line that has no width.)
- Fill-style rules still apply, the width should be approximately 1 pixel.
- The line style (dash style) should still be processed.
- The join style can be ignored because all join
- styles look the same at this resolution (except that miter joins for acute
- angles can get very long; you can ignore this effect).
- Your algorithm can be anything reasonable, it is desirable
- that you obey the cap style "NotLast" which indicates whether the
- ending pixel should be drawn. There is also a requirement that
- the lines be identical in the face of clipping; and a suggestion that
- the lines be identical when drawn in the reverse direction.
- Client programs that are picky about the lines they draw can draw width 1
- lines. Your GC Validate routine can change the line-drawing
- routine pointer in the GC so that zero width lines get drawn by
- the graphics processor and the others are drawn by mi.
-
- Of course, the facilities of each graphics processor are unique and
- each has special considerations.
- This is an area that will require meticulous attention to detail on your part.
-
- .NH 3
- Hardware Tiling and Stipples
- .XS
- Hardware Tiling and Stipples
- .XE
- .LP
- Some hardware has the ability to apply patterns to the graphic surface.
- X makes a distinction between a tile versus
- a stipple.
- A tile is a "full color" pattern, the depth of which matches the target
- drawable.
- A stipple is a binary pattern that writes the foreground color where there are 1-bits
- areas and (if opaque) the background color on 0-bit areas.
- In addition, X allows a tile or stipple cell to have any size.
-
- Some graphics processors can apply patterns that are only
- certain cell sizes, such as 8x8 or 16x16.
- Most CPU chips will apply patterns more efficiently to some frame buffers
- when the pattern
- width is 8, 16 or 32.
- In these cases, you use the GC validate routine to switch between
- fast pattern writing versus slow pattern writing via the mi routines.
- If your pattern size is a factor of your hardware pattern
- size (such as 2x4), you can simply
- replicate it to fill the hardware rectangle.
- (Many patterns will, in fact, be such sizes, so this will not be wasted effort.
- There is a request, QueryBestSize,
- that a client can execute to ascertain what sizes are optimal.)
-
- .NH 3
- Graphic Contexts in Hardware
- .XS
- Graphic Contexts in Hardware
- .XE
- .LP
- Many hardware and firmware graphics systems have internal state analogous to
- X's Graphic Contexts.
- Such settings as current line width, current font, and current foreground color
- can be set in hardware for subsequent drawing operations.
- The sample server provides a mechanism for conveniently and efficiently
- specifying these settings: the GC validate
- procedure, which is called when necessary just before drawing.
-
- Each drawable (window or pixmap) has a fixed serial number, which is unique
- for that drawable.
- Each GC has a serial number field that reflects the last
- drawable for which it was validated.
- Before a drawing operation with a drawable and a GC, the two serial numbers
- are compared;
- and, if different, the validate routine(s) are called to validate the GC.
-
- When a GC is validated for a drawable, its serial number
- is set to the serial number of the drawable
- so that the next time these two are used together, the validate routines are not called.
- But the GC serial number is changed when some of its fields are changed, forcing
- a validate the next time around (the high bit is changed- it is unused for anything else).
-
- In other words, by default this validate
- procedure you write is called only when
- the graphic context about to be
- used in a drawing operation has been changed since the
- last validate for this GC and drawable or if the last validate
- for this GC was for another drawable.
-
- If you have only one hardware GC state, however, the validate routine must be called
- more often, because it must also be called whenever you switch between different
- GC's.
- For instance, under normal conditions,
- if you drew with drawable a and GC A and then drew with drawable b
- and GC B and kept switching between aA and bB without changing the GC's,
- each would no longer need to be
- validated because their serial numbers would match.
-
- You could ensure that the validate routines are called for each change of
- the GC in use by keeping a static GC pointer variable that points to the
- last GC used. When a new GC is validated, the serial number of the last GC
- would be changed (change the high bit -- do not change the rest which is
- clipping information). Once this has been done, set your static GC pointer
- to point to the new GC. The validate routine will then be called whenever
- the hardware GC information needs to be changed. Unfortunately, the
- validate routine is probably much more involved than is necessary for this
- process. Instead, keep a global variable which points at the current GC in
- use and check in each graphics operation that the global GC pointer matches
- the GC passed in. If not, call a function to reload the hardware state from
- the new GC and change the global pointer to point at the new GC. DestroyGC
- would then check to ensure the cached GC pointer was invalidated when the GC
- was deleted.
-
- If you have a sophisticated graphics processor that
- has, for instance, eight "contexts" of graphic parameters among which it
- can switch, you can retain eight static GC pointers
- (in an array).
- Before each graphic operation, set the hardware
- to use the hardware GC it needs.
- (You might want to run benchmarks to ensure you are not spending
- more time switching hardware GC's than necessary.)
-
- See the Definition document for more details.
-
- .NH 3
- Implementing X on top of Another Window System
- .XS
- Implementing X on top of Another Window System
- .XE
- .LP
- If you have another windowing system on top of which you want X to run
- there are several procedures in the ScreenRec and WindowRec
- you can use to execute almost all window operations.
- (Remember, DIX does not interact with your screen
- directly, so there is considerable leeway in this area.)
-
- For instance, the window borders are always drawn with PaintWindowBorder()
- and the background with PaintWindowBackground(), which you supply.
- The contents of windows are drawn with the Drawing Primitives, which you supply.
- In addition, DIX calls your routines CreateWindow() and DestroyWindow() when
- it makes and destroys windows.
- Other hooks are provided for mapping and unmapping windows, moving them,
- and changing their attributes.
-
- See the Definition document section on windows for more details.
-
- .NH 3
- Color
- .XS
- Color
- .XE
- .LP
- Color requires special considerations.
- You need to decide what class of display you have (see the Definition
- document, the section on Visuals and Depths).
-
- Next, set up all of the visuals you will support.
- Each depth can have one or more visuals with which it is associated;
- if your screen has several modes, you can list them all.
- As with depths, it may be best to begin with the simplest
- and then add visuals one at a time.
-
- Cfb has quite a range of support for colormaps. It has routines
- which emulate any visual type on a pseudo color system, most of which are
- also appropriate for other hardware types. You'll need to implement
- StoreColors, InstallColormap, UninstallColormap and ListInstalledColormaps;
- all of which are typically quite short. Place pointers to these routines in
- the screen structure before calling cfbScreenInit. If you are not using
- cfb, you may want to extract the colormap code anyway; it is not dependent
- on the rest of that directory.
-
- You might want to construct your server so that
- it appears to support multiple lookup tables simultaneously, so you can have
- multiple Colormaps installed at the same time.
- For instance, if you had a display that had ten bits per pixel and
- a lookup table of 1024 entries, instead of declaring the obvious,
- you could declare that you had a display with depth 8
- and four lookup tables.
- The extra two bits in each pixel would determine the lookup table
- to use for that pixel.
- Each time you wrote into windows on this screen, you would need to write
- those extra two bits surreptitiously to indicate the lookup table
- to use for this pixel.
- When copying pixel data off the screen onto pixmaps, the window would
- be considered eight deep, the extra two bits would be ignored.
- CopyWindow() would have to attend to these extra bits as it changed
- the colormap allegiance of affected pixels.
-
- .NH 3
- Multiple Screens
- .XS
- Multiple Screens
- .XE
- .LP
- If you have multiple screens, the implementation is more complicated.
- Each screen may have its own method of managing windows or drawing graphics.
-
- Each screen may have a different scheme for its frame buffer.
- Each screen manages pixmaps whose format is specific to that screen.
- There are no commands available to the client
- to transfer pixels directly from one screen to
- another or between pixmaps of different screens.
-
- Each server must decide what depths and formats of image pixmaps it is
- willing to transfer between the client and server.
- This usually involves some consensus among the screens.
- A given server must support depth 1, and probably supports all of the depths of
- its screens.
-
- Fortunately, you need not implement routines to copy pixels between different
- depths. The only way for the client to copy pixels between drawables
- of different depths is with CopyPlane, which copies one plane from one
- drawable to another.
- The client can copy whatever planes it needs into 1-deep pixmaps
- and can then logically combine these to achieve any desired result.
-
- Every drawable has a fixed depth. Every GC has a fixed depth.
- The GC's depth must match the depth of the drawable for drawing, or an error
- results. Any tile pixmap used with a GC must be the same depth as the GC.
-
- All screens should have the same byte and bit ordering.
- If they don't, you need to declare the "real" bit and byte ordering
- to follow one of your screens and set the variables in the screenInfo struct
- to it.
- Conversion would happen in GetImage() and PutImage() for each screen.
-
- .NH 3
- Backing Store and Save-Unders
- .XS
- Backing Store and Save-Unders
- .XE
- .LP
- Backing Store and Save-Unders are schemes in which the server saves
- parts of windows concealed by other windows so that when they
- become exposed again, the server can replace the pixel values quickly instead
- of asking the client to repaint the window.
-
- Backing Store is a scheme where a window stores away obscured areas
- of itself when covered by
- other windows.
- Save-Unders is a scheme where a window saves away parts of the
- windows beneath it when it is placed in front.
- The basic idea is the same, but the subtle differences have important implications.
-
- With Backing Store, a window tracks its own contents.
- When the client draws into a window that is partially obscured,
- the window must intercept these drawing operations and either cause the
- drawing to happen to the saved backing or forget the saved
- backing so that an expose event is generated the next time
- that part is exposed.
-
- With Save-Unders, this is difficult because the window would need to
- know which pixels are associated with which windows;
- it would need to intercept all drawing commands to all windows.
- For this reason, Save-Unders is practical only for situations in which
- either there will be no drawing underneath, or if there is,
- it can be easily intercepted
- in one location in the code.
- (See the section on software cursors for an example of this.)
-
- Backing store, on the other hand, is more complicated in another way--
- the pieces of backing that need to be stored are often irregular shapes.
- In the case of X, windows are always rectangular, so the backing store can always
- be saved as a set of rectangular pixmaps.
- If this is done, though, drawing into the backing becomes extremely complicated and
- probably slows the system to the extent that your initial
- performance savings are severely diminished.
- If backing is saved as one large pixmap, you waste pixmap memory; you essentially
- retain a duplicate copy of each window in memory in which the only parts that
- are not used are those exposed on the screen.
-
- Thus, it is usually most practical simply to discard parts of backing
- store that are drawn onto while hidden;
- an expose event will always execute properly.
-
- The sample implementation of backing store is very device-independent.
- All that is needed to use it is a small vector of device-specific functions,
- only two of which are typically used; SaveAreas and RestoreAreas:
- .nf
-
- (*miBSFuncs->SaveAreas) (pixmap, region, x, y);
- PixmapPtr pixmap;
- RegionPtr region;
- int x, y;
-
- (*miBSFuncs->RestoreAreas) (pixmap, region, x, y);
- PixmapPtr pixmap;
- RegionPtr region;
- int x, y;
- .fi
- (*SaveAreas) copies the specified region (which is pixmap relative) from the
- screen starting at (x,y) to the pixmap; (*RestoreAreas) copies the specific
- region (which is screen relative) from the pixmap to the screen, starting at
- (x,y). If you can provide these two functions; call
- miInitializeBackingStore(pScreen, funcs) and the rest will be taken care
- of. Cfb and mfb already call miInitializeBackingStore.
-
- DIX provides SaveUnders when DDX provides BackingStore. This implementation
- is not as optimal as a real save unders implementation would be, but is
- better than nothing in most cases; the mi backing store implementation
- changes its behavior when bits are saved because of save unders instead
- of backing store.
- .NH 3
- Software Cursors
- .XS
- Software Cursors
- .XE
- .LP
- The sample server is designed for a hardware cursor that maintains
- a separate cursor bit map
- in hardware so that the video electronics mixes the image of the normal display
- and the cursor before being displayed.
- Nevertheless, a software cursor module is provided with hooks which require
- various levels of support.
-
- The easiest to use level is the DispCur module (server/ddx/mi/midispcur.c).
- This provides software cursors with nearly no device-specific code. All
- that is required is that you read events from the pointer device and send
- position update events to the mi routines. Four routines are required, one
- of which is implemented in mi for the truly meek. The relevant header file
- is "server/ddx/mi/mipointer.h"; this file contains the function vector
- definition and some useful function defines.
- .nf
-
- typedef struct {
- long (*EventTime)(); /* pScreen */
- Bool (*CursorOffScreen)(); /* ppScreen, px, py */
- void (*CrossScreen)(); /* pScreen, entering */
- void (*QueueEvent)(); /* pxE, pPointer, pScreen */
- } miPointerCursorFuncRec, *miPointerCursorFuncPtr;
- .fi
- An initialized structure of this type is passed, along with the screen which
- needs cursors to miDCInitialize(pScreen, &pointerCursorFuncs). (*EventTime)
- is required to return the time of the last event processed (as 32 bits of
- milliseconds). This allows the mi cursor support to build events and fill
- in the appropriate data.
-
- (*CursorOffScreen) is called whenever the cursor would be off of the
- current screen if the user motion were tracked exactly. This routine
- returns FALSE if the cursor should be confined to the screen, TRUE if cursor
- should wander to some other screen. ppScreen should be smashed to indicate
- the new screen, px and py should indicate the position on that screen; they
- are initialized to be the position of the cursor on the old screen if the
- cursor were not confined or warped (i.e. (x,y) is not on the screen).
-
- (*CrossScreen) is called whenever the cursor is moved on/off of the screen.
- entering is TRUE when pScreen is the screen now containing the cursor and
- FALSE when pScreen used to contain the cursor.
-
- (*QueueEvent) is called in response to WarpPointer protocol requests. It
- should place the event at the tail of the input queue to be processed in
- series with the other events; this is frequently quite difficult to
- implement, however, and the mi routine, miPointerQueueEvent, simply
- processes the motion event immediately can be used (this may cause occasional
- small protocol violations).
-
- When your pointer device moves, call
- .nf
-
- miPointerDeltaCursor (pScreen, dx, dy, generateEvent)
- ScreenPtr pScreen;
- int dx, dy;
- Bool generateEvent;
- .fi
- with the distance the device has moved and TRUE for generateEvent. If you
- device reports absolute coordinates instead, use miPointerMoveCursor
- instead (which replaces the delta coordinates with absolute ones).
-
- To fill in the current pointer position for other event types, use
- miPointerPosition (pScreen, &rootX, &rootY), passing the address of
- the event rootX, rootY fields which will be filled in as appropriate.
-
- The other two cursor layers can be investigated by looking through the
- miDispCur layer which uses them. In particular, you may want to
- use miSprite which allows you to provide device-specific cursor
- drawing primitives to speed up cursor rendering.
-
- .NH 3
- Limited Hardware Cursors
- .XS
- Limited Hardware Cursors
- .XE
- .LP
- Many hardware cursor systems limit the maximum size of the cursor (for
- instance, to 16 pixels square).
- The X specification, however, specifies that a cursor can be any size.
- It is allowable for the server simply to truncate the cursor to an appropriate
- n-by-m rectangle. This may be the top-left corner, or it may be any n by m
- pixel rectangle that is entirely within the cursor and contains the hotspot;
- the exact choice is implementation dependent.
-
- .NH 3
- Fonts in Off-Screen Memory
- .XS
- Fonts in Off-Screen Memory
- .XE
- .LP
- Fonts are probably stored on disk on the server when not in use, probably
- in a bitmap format in binary, a form that is ready to go.
- Character drawing consumes much of the CPU, so you should try to
- ease the burden.
-
- Of course, you need to read fonts into memory when they are needed.
- Unless you have an extra megabyte of main memory, it is probably
- best not to retain them in memory forever; users have
- a tendency to build up large font libraries.
-
- You should have some scheme for loading fonts into memory on demand and
- for purging old fonts when no longer needed.
- Rarely will people use more than a dozen fonts simultaneously.
- (The main exceptions are programs specifically designed to show a sample of each font
- and novice What You See Is What You Get word processor users.)
- You will probably want to record the font least recently used
- and purge it when required.
- Appropriate algorithms can be found in many places, or you can devise your own.
-
- The binary format in which the fonts are stored (probably snf) has glyphs
- aligned and padded to byte, 16-bit, or 32-bit boundaries.
- You can decide which based upon #defines.
-
- .NH 3
- Graphic Memory Usage
- .XS
- Graphic Memory Usage
- .XE
- .LP
- Some servers have extremely complex hardware,
- possibly consisting of multiple frame buffers among which the
- screen can switch, possibly having a graphics processor.
- Sometimes, the graphics processor has its own address space
- that may include memory in addition to the frame buffer that is displayed on
- the screen.
- Sometimes, the graphics processor can also access main memory in your server.
- Sometimes, your main processor can access graphics-processor memory.
- Sometimes, your main processor cannot access the frame buffer.
-
- For these situations, you should carefully consider what to put in
- graphics memory and what to put in main memory for your
- particular hardware configuration.
- You should consider putting the following in graphics memory:
-
- .IP \(bu 5
- Anything you must put in graphics memory
- because of the requirements of your graphics processor
- .IP \(bu 5
- Hardware color lookup tables
- .IP \(bu 5
- Hardware GC information
- .IP \(bu 5
- Cursors
- .IP \(bu 5
- Font Glyphs
- .IP \(bu 5
- Pixmaps
- .IP \(bu 5
- Regions
- .IP \(bu 5
- Save-Unders
- .IP \(bu 5
- Backing Store
- .LP
- Use the GC validate routine to move things in and out of graphics memory.
-
- If your graphics hardware has limited resources, you might want
- to consider drawing into pixmaps that live in main memory, rather
- than special graphics memory. To do this, you should provide an in-memory
- version of the Spans functions. When drawing to an in-memory
- pixmap, and swap these Spans functions and the mi output
- code into the GC at ValidateGC time. Then the mi code will draw
- the appropriate things into the bits in memory. This will probably
- be slower than using the graphics hardware, but may be
- easier that dealing with memory allocation on the graphics
- hardware.
-
- Furthermore, you might consider drawing into pixmaps in
- main memory if your hardware does not
- draw according to the X11 spec; mixing the two styles of drawing
- may produce odd results.
-
- After you have implemented the above, and you use your X server,
- reconsider your decisions.
- (It is difficult to know how you will use an X server before you actually do
- so.)
- You may find that you want to change the use of graphics memory.
-
- .NH 3
- Graphic Output Tuning
- .XS
- Graphic Output Tuning
- .XE
- .LP
- The mi code is designed to be portable by sacrificing
- a certain amount of performance.
- Once you have got it running and have a large user base,
- it might be appropriate to make it run faster. Mfb and cfb have already
- been extensively tuned for many platforms; it is unlikely that you could
- increase performance by substantial amounts without resorting to assembly or
- gratuitous code expansion.
-
- The overall rule in optimizing software is to collect experimental data.
- Do not subjectively judge whether something "feels" faster;
- subjectivity can be easily led astray.
- Do not merely assume where the performance bottlenecks are: use a profiler;
- run benchmarks; use a stopwatch.
-
- If you do not have a profiler, try running a series of benchmarks.
- For instance, if you think that a major bottleneck is a certain loop
- in ImageGlyph, try commenting out the loop to see what
- performance gains are effected.
- Run benchmarks before and after, while running a program that will exercise
- that function.
- This gives you an indication of whether your hunches are right concerning the
- location of the bottlenecks before
- you devote a great deal of time implementing and debugging a complex algorithm.
-
- Before you install an optimization, run benchmarks.
- After you install the optimization, run the benchmarks again to check
- performance gains.
- Complicated software that yields no substantial performance gains
- will simply be a liability later when the software needs to be modified.
-
- Much optimization effort should be directed toward the operations that are
- executed most frequently.
- Sometimes, you can make a quick routine to handle a special case
- that occurs frequently and leave the more unusual cases for more general
- software that takes the time to handle all cases.
- For instance, most items that are drawn will be entirely within the clip
- region.
- Most of those that are not will be entirely outside of the clip region.
- Most drawing is executed with a plane mask of all 1s, and with an alu mode of
- Copy. Most drawing is done with a solid fill style.
- If draw is done with another fill style, the tile or stipple frequently
- has a size that is a byte or word multiple.
- The cfb and mfb routines have already been optimized for some of these
- special cases.
-
- In general, start optimizing where you have a better algorithm or know more
- about the hardware than the portable routines.
-
- .NH 4
- First-Round Optimization
- .XS
- First-Round Optimization
- .XE
- .LP
- The most important things to optimize first are probably
- text drawing, zero-width lines,
- and large area pixel copying and filling.
-
- Text drawing is best optimized by working on the Glyph routines.
- You may want to rewrite them in assembly language or implement them in
- hardware.
- Since most glyphs are written with solid fill styles and the glyph images
- usually do not lie on a clip-region boundary, you may want to make your speedy
- routine handle just this special case, and handle everything else with mi,
- cfb and
- mfb routines.
-
- You can even optimize the mfb and cfb glyph routines to your machine without
- changing much.
- Fonts glyphs are
- padded to byte boundaries for each scanline.
- You can have this padded to 32-bit boundaries, if desired.
- The
- macro getleftbits() in maskbits.h gets glyph bits from glyphs;
- optimize it for your machine.
- (For instance, take into account
- byte, word and longword boundaries, whether your machine can
- address 16-bit or 32-bit words, and whether this is efficient.)
-
- Zero-width lines are a good candidate because the rules for drawing them are relaxed.
- You need not worry about many of the details.
- Frequently, hardware or firmware can generate these.
- The most common lines are vertical and horizontal;
- special routines to draw these may be worthwhile.
-
- CopyArea and CopyWindow optimization will improve window-movement
- performance.
- Frequently, a machine will have special hardware to perform such graphic operations.
-
- .NH 4
- Second Round Optimization
- .XS
- Second Round Optimization
- .XE
- .LP
- The next phase of optimization will probably concentrate on painting
- window backgrounds,
- wide lines, some of the easy-to-perform rectangle operations, and PushPixels().
-
- Wide lines no longer present much of an opportunity
- to invest a great deal of
- work into an optimization and
- receive much benefit from it.
- The mi code now uses integer arithmetic for nearly all of the
- wide line computations and provides nearly-perfect protocol-conforming
- lines. Experimentation with moving the rendering into cfb or mfb has shown
- that not much performance is to be gained; if you want to try, the polygon
- edge walking code is written using macros which could easily be used directly
- inside the graphics layer.
-
- The code you want to look at is in miwideline.c and miwideline.h
-
- Although wide arcs have seen a substantial speedup since the original
- protocol-conformant code was shipped with R3, it would be nice they ran much
- faster. Unfortunately, the protocol defines an object which is quartic in
- description, the straight line code solution of which involves several
- square roots and a couple cube roots. If you examine the implementation in
- miarc.c, you'll see a nightmare of complicated floating point arithmetic.
- These routines attempt to come as close as possible to the protocol
- definition for a wide arc, and suffer tremendously in performance because of
- it. Wide circles, however, do provide a reasonable opportunity for
- optimization. As the both inner and outer edges of a wide circle are
- circular (unlike elliptical arcs), a fully-integer circle edge walker is
- used to scan-convert them. Zero width arcs (like lines) are not completely
- specified by the protocol, and so the traditional integer walker is included
- in mfb, cfb and mi. If you can't use one of those version directly, look in
- the relevant code in ddx/mi/mizerarc.c,mizerarc.h, ddx/mfb/mfbzerarc.c and
- ddx/cfb/cfbzerarc.c.
-
- Also included in cfb for R5 are some interesting optimizations for clipping
- points to rectangles and rendering polygons. Look in ddx/cfb/cfb8line.c and
- ddx/cfb/cfbply1rct.c for these algorithms. The polygon code could easily be
- ported to mfb, however the zero-width line code is dependent on having
- addressable pixels.
-
- PushPixels may also be an important routine to optimize. This is because it
- is used by the software cursor code. Both mfb and cfb have heavily tuned
- PushPixels routines which work in solid fill/copy mode and provide adequate
- performance for software cursors.
-