BE ENGINEERING INSIGHTS: Changes in the BeOS Driver API
By Cyril Meurillon

Version Control

We finally realized that our driver API was not perfect, and that there was room for future improvements, or "additions." That's why we'll introduce version control in the driver API for R4. Every driver built then and thereafter will contain a version number that tells which API the driver complies with.

In concrete terms, the version number is a driver global variable that's exported and checked by the device file system at load time. In Drivers.h you'll find the following declarations:

#define  B_CUR_DRIVER_API_VERSION  2
extern  _EXPORT int32  api_version;

In your driver code, you'll need to add the following definition:

#include <Drivers.h>
...
int32  api_version = B_CUR_DRIVER_API_VERSION;

Driver API version 2 refers to the new (R4) API. Version 1 is the R3 API. If the driver API changes, we'll bump the version number to 3. Newly built drivers will then have to comply with the new API and declare 3 as their API version number. Old driver binaries will still declare an old version (1 or 2), forcing the device file system to translate them to the newer API (3). This incurs only a negligible overhead in loading drivers.

But, attendez, vous say. What about pre-R4 drivers, which don't declare what driver API they comply with? Well, devfs treats drivers without a version number as complying with the first version of the API -- the one documented today in the Be Book. Et voila.
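The fallback rule can be sketched in a few lines. This is only an illustration: how devfs actually looks up the exported symbol isn't described here, so the lookup is modeled as a pointer that is NULL when the driver exports no api_version. The names effective_api_version and SKETCH_FIRST_API_VERSION are hypothetical, and int32_t stands in for int32.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constant: the version assumed for drivers that export nothing. */
#define SKETCH_FIRST_API_VERSION 1

/*
 * exported points at the driver's api_version symbol, or is NULL when the
 * driver binary has no such symbol (a pre-R4 driver).
 */
static int32_t effective_api_version(const int32_t *exported)
{
    return exported != NULL ? *exported : SKETCH_FIRST_API_VERSION;
}
```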

New Entries in the device_hooks Structure

I know you're all dying to learn what's new in the R4 driver API... Here it is, revealed to you exclusively! We'll introduce scatter-gather and (a real) select in R4, and add a few entries in the device_hooks structure to let drivers deal with the new calls.

Scatter-gather

As discreetly announced by Trey in his article at http://www.be.com/aboutbe/benewsletter/volume_II/Issue35.html, we've added 2 new system calls, well known to the community of UNIX programmers:

struct iovec {
  void   *iov_base;
  size_t  iov_len;
};
typedef struct iovec iovec;


extern ssize_t   readv_pos(int fd, off_t pos,
  const iovec *vec, size_t count);
extern ssize_t   writev_pos(int fd, off_t pos,
  const iovec *vec, size_t count);

These calls let you read from or write to a file or a device using multiple buffers. They initiate an IO on the file or device referred to by fd, starting at position pos, using the count buffers described in the array vec.

One may think this is equivalent to issuing multiple simple reads and writes to the same file descriptor -- and, from a semantic standpoint, it is. But not when you look at performance!

Most devices that use DMA are capable of "scatter-gather." It means that the DMA can be programmed to handle, in one shot, buffers that are scattered throughout memory. Instead of programming N IOs that each point to a single buffer, only one IO needs to be programmed, with a vector of pointers that describe the scattered buffers. It means higher bandwidth.

At a lower level, we've added two entries in the device_hooks structure:

typedef status_t (*device_readv_hook)
  (void *cookie, off_t position, const iovec *vec,
   size_t count, size_t *numBytes);
typedef status_t (*device_writev_hook)
  (void *cookie, off_t position, const iovec *vec,
   size_t count, size_t *numBytes);


typedef struct {
  ...
  device_readv_hook  readv;
    /* scatter-gather read from the device */
  device_writev_hook writev;
    /* scatter-gather write to the device  */
} device_hooks;

Notice that the syntax is very similar to that of the single read and write hooks:

typedef status_t (*device_read_hook)
  (void *cookie, off_t position, void *data,
   size_t *numBytes);
typedef status_t (*device_write_hook)
  (void *cookie, off_t position, const void *data,
   size_t *numBytes);

Only the descriptions of the buffers differ.

Devices that can take advantage of scatter-gather should implement these hooks. Other drivers can simply declare them NULL. When a readv() or writev() call is issued to a driver that does not handle scatter-gather, the IO is broken down into smaller IOs, one per buffer. Of course, R3 drivers don't know about scatter-gather, and are treated accordingly.
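Here is a minimal sketch of what breaking a scatter-gather read into individual reads could look like. It is not the actual devfs code: the sketch_ types are stand-ins for status_t, iovec and device_read_hook so that the example compiles on its own, and readv_fallback and string_read are hypothetical names. The toy "device" simply reads from a string passed as the cookie.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>  /* off_t */

typedef int sketch_status;   /* stand-in for status_t */
#define SKETCH_OK 0          /* stand-in for B_OK */

typedef struct {             /* same fields as iovec */
    void  *iov_base;
    size_t iov_len;
} sketch_iovec;

/* Same shape as device_read_hook. */
typedef sketch_status (*sketch_read_hook)(void *cookie, off_t position,
    void *data, size_t *numBytes);

/* Emulate a scatter-gather read with one plain read per buffer. */
static sketch_status readv_fallback(sketch_read_hook read_hook, void *cookie,
    off_t position, const sketch_iovec *vec, size_t count, size_t *numBytes)
{
    size_t total = 0;
    sketch_status err = SKETCH_OK;

    for (size_t i = 0; i < count; i++) {
        size_t n = vec[i].iov_len;
        err = read_hook(cookie, position + (off_t)total, vec[i].iov_base, &n);
        if (err != SKETCH_OK)
            break;
        total += n;
        if (n < vec[i].iov_len)   /* short read: stop, the device has no more */
            break;
    }
    *numBytes = total;            /* bytes transferred before stopping */
    return err;
}

/* Toy device for trying the loop out: the cookie is a NUL-terminated string. */
static sketch_status string_read(void *cookie, off_t position, void *data,
    size_t *numBytes)
{
    const char *text = cookie;
    size_t len = strlen(text);
    size_t pos = (size_t)position;

    if (pos >= len) {             /* reading past the end yields nothing */
        *numBytes = 0;
        return SKETCH_OK;
    }
    if (*numBytes > len - pos)
        *numBytes = len - pos;
    memcpy(data, text + pos, *numBytes);
    return SKETCH_OK;
}
```

Reporting the bytes transferred so far when a later buffer fails is a choice made for this sketch. A driver that implements the readv hook itself avoids the loop entirely and hands the whole vector to the hardware.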

Select

I'm not breaking the news with this one either. In his article last week, Trey announced the coming of select(). This is another call that is very familiar to UNIX programmers:

extern int select(int nbits,
      struct fd_set *rbits,
      struct fd_set *wbits,
      struct fd_set *ebits,
      struct timeval *timeout);

rbits, wbits and ebits are bit vectors. Each bit represents a file descriptor to watch for a particular event:

* rbits: wait for input to be available (read returns something immediately without blocking)

* wbits: wait for output to drain (write of 1 byte does not block)

* ebits: wait for exceptions.

select() returns when at least one event has occurred, or when it times out. Upon exit, select() returns (in the different bit vectors) the file descriptors that are ready for the corresponding event.

select() is very convenient because it allows a single thread to deal with multiple streams of data. The current alternative is to spawn one thread for every file descriptor you want to control. This might be overkill in certain situations, especially if you deal with a lot of streams.
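To make the single-thread pattern concrete, here is a small sketch that watches several file descriptors for input at once. It is written against select() as found on UNIX systems today (where fd_set is a typedef rather than struct fd_set), on the assumption that the R4 call behaves the same way. The helper name first_readable is hypothetical.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Waits up to timeout_ms for input on any of the count descriptors in fds.
 * Returns the index of the first readable descriptor, or -1 on timeout
 * or error.
 */
static int first_readable(const int *fds, int count, long timeout_ms)
{
    fd_set rbits;
    struct timeval timeout;
    int maxfd = -1;

    FD_ZERO(&rbits);
    for (int i = 0; i < count; i++) {
        FD_SET(fds[i], &rbits);          /* one bit per descriptor to watch */
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    timeout.tv_sec = timeout_ms / 1000;
    timeout.tv_usec = (timeout_ms % 1000) * 1000;

    /* Only the read vector is used; write and exception vectors are NULL. */
    if (select(maxfd + 1, &rbits, NULL, NULL, &timeout) <= 0)
        return -1;

    for (int i = 0; i < count; i++) {
        if (FD_ISSET(fds[i], &rbits))    /* still set: input is waiting */
            return i;
    }
    return -1;
}
```

A thread would typically call this in a loop and read from whichever descriptor comes back, instead of parking one blocked thread on each stream.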

select() is broken down into two calls at the driver API level: one hook to ask the driver to start watching a given file descriptor, and another hook to stop watching.

Here are the two hooks we added to the device_hooks structure:

struct selectsync;
typedef struct selectsync selectsync;


typedef status_t (*device_select_hook)
  (void *cookie, uint8 event, uint32 ref, selectsync *sync);
typedef status_t (*device_deselect_hook)
  (void *cookie, uint8 event, selectsync *sync);


#define  B_SELECT_READ       1
#define  B_SELECT_WRITE      2
#define  B_SELECT_EXCEPTION  3


typedef struct {
  ...
  device_select_hook    select;    /* start select */
  device_deselect_hook  deselect;  /* stop select */
} device_hooks;

cookie represents the file descriptor to watch. event tells what kind of event we're waiting on for that file descriptor. If the event happens before the deselect hook is invoked, then the driver has to call:

extern void notify_select_event(selectsync *sync, uint32 ref);

with the sync and ref it was passed in the select hook. This happens typically at interrupt time, when input buffers are filled or when output buffers drain. Another place where notify_select_event() is likely to be called is in your select hook, in case the condition is already met there.

The deselect hook is called to indicate that the file descriptor shouldn't be watched any more, as the result of one or more events on a watched file descriptor, or of a timeout. It is a serious mistake to call notify_select_event() after your deselect hook has been invoked.

Drivers that don't implement select() should declare these hooks NULL. select(), when invoked on such drivers, will return an error.
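The following sketch shows one way a driver might wire these two hooks together for read events. It is an illustration under several assumptions: the sketch_ names and the cookie layout are hypothetical, notify_select_event() is replaced by a counting stub so the code can run outside the kernel, and the locking a real driver needs between its hooks and its interrupt handler is left out. Other event types are simply ignored here; what a real driver should return for them is not covered.

```c
#include <stddef.h>
#include <stdint.h>

typedef int sketch_status;              /* stand-in for status_t */
#define SKETCH_OK 0                     /* stand-in for B_OK */
#define B_SELECT_READ 1                 /* value from the declarations above */

typedef struct selectsync selectsync;   /* opaque to the driver */

/* Stub: in a driver, the system provides notify_select_event(). */
static int notify_count;
static void notify_select_event(selectsync *sync, uint32_t ref)
{
    (void)sync;
    (void)ref;
    notify_count++;
}

/* Hypothetical per-open state that the driver hands out as its cookie. */
typedef struct {
    size_t      bytes_ready;   /* input waiting to be read */
    selectsync *read_sync;     /* non-NULL while a read select is pending */
    uint32_t    read_ref;
} sketch_cookie;

static sketch_status sketch_select(void *cookie, uint8_t event, uint32_t ref,
    selectsync *sync)
{
    sketch_cookie *dev = cookie;

    if (event != B_SELECT_READ)
        return SKETCH_OK;               /* this sketch only tracks reads */

    dev->read_sync = sync;
    dev->read_ref = ref;
    if (dev->bytes_ready > 0)           /* condition already met: notify now */
        notify_select_event(sync, ref);
    return SKETCH_OK;
}

static sketch_status sketch_deselect(void *cookie, uint8_t event,
    selectsync *sync)
{
    sketch_cookie *dev = cookie;

    if (event == B_SELECT_READ && dev->read_sync == sync)
        dev->read_sync = NULL;          /* no notification after this point */
    return SKETCH_OK;
}

/* What the interrupt handler would do when n bytes of input arrive. */
static void sketch_input_arrived(sketch_cookie *dev, size_t n)
{
    dev->bytes_ready += n;
    if (dev->read_sync != NULL)
        notify_select_event(dev->read_sync, dev->read_ref);
}
```

Clearing read_sync in the deselect hook is what keeps the interrupt path from notifying after deselect has been invoked.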

Introduction of "Bus Managers"

Another big addition to R4 is the notion of "bus managers." Arve wrote a good article on this, which you'll find at:
http://www.be.com/aboutbe/benewsletter/volume_II/Issue20.html

Bus managers are loadable modules that drivers can use to access a hardware bus. In R3, for example, drivers reached the PCI bus through kernel calls that looked like this:

extern long get_nth_pci_info(long index, pci_info *info);
extern long read_pci_config(uchar bus, uchar device,
  uchar function, long offset, long size);
extern void write_pci_config(uchar bus, uchar device,
  uchar function, long offset, long size, long value);
...

Now, they're encapsulated in the PCI bus manager. The same happened for the ISA, SCSI and IDE bus related calls. More busses will come. This makes the kernel a lot more modular and lightweight, as only the code handling the busses that are present is loaded in memory.

A New Organization for the Drivers Directory

In R3, /boot/beos/system/add-ons/kernel/drivers/ and /boot/home/config/add-ons/kernel/drivers/ contained the drivers. This flat organization worked fine. But it had the unfortunate feature of not scaling very well as you add drivers to the system, because there is no direct relation between the name of a device you open and the name of the driver that serves it. This potentially causes all drivers to be searched when an unknown device is opened.

That's why we've broken down these directories into subdirectories that help the device file system locate drivers when new devices are opened.

* ../add-ons/kernel/drivers/dev/ mirrors the devfs name space using symlinks and directories

* ../add-ons/kernel/drivers/bin/ contains the driver binaries

For example, the serial driver publishes the following devices:

ports/serial1
ports/serial2

It lives under ../add-ons/kernel/drivers/bin/ as "serial", and has the following symbolic link set up:

../add-ons/kernel/drivers/dev/ports/serial -> ../../bin/serial

If "fred", a driver, wishes to publish a ports/XYZ device, then it should set up this symbolic link:

../add-ons/kernel/drivers/dev/ports/fred -> ../../bin/fred

If a driver publishes devices in more than one directory, then it must set up a symbolic link in every directory it publishes in. For example, if driver "foo" publishes:

fred/bar/machin
greg/bidule

then it should come with the following symbolic links:

../add-ons/kernel/drivers/dev/fred/bar/foo -> ../../../bin/foo
../add-ons/kernel/drivers/dev/greg/foo -> ../../bin/foo
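The link targets above follow a simple pattern: one "../" for each directory level the driver publishes in, plus one more to climb out of dev/, followed by bin/ and the driver name. A small helper can compute this. It is a sketch, not part of any Be API: the name driver_link_target is hypothetical, and dir is assumed to be a path relative to /dev with no leading or trailing slash.

```c
#include <string.h>

/*
 * Builds the relative symlink target for a driver that publishes devices
 * in dir (for example "fred/bar"). Returns 0 on success, or -1 if the
 * result does not fit in out.
 */
static int driver_link_target(const char *dir, const char *driver,
    char *out, size_t outlen)
{
    size_t ups = 1;                     /* one "../" to climb out of dev/ */

    if (dir[0] != '\0') {
        ups++;                          /* the first path component */
        for (const char *p = dir; *p != '\0'; p++) {
            if (*p == '/')
                ups++;                  /* each further component */
        }
    }

    /* "../" per level, then "bin/", the name, and the terminating NUL */
    if (ups * 3 + 4 + strlen(driver) + 1 > outlen)
        return -1;

    char *w = out;
    for (size_t i = 0; i < ups; i++) {
        memcpy(w, "../", 3);
        w += 3;
    }
    memcpy(w, "bin/", 4);
    w += 4;
    strcpy(w, driver);
    return 0;
}
```

With dir "ports" and driver "serial" this yields ../../bin/serial, and with dir "fred/bar" and driver "foo" it yields ../../../bin/foo, matching the links listed above.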

This new organization speeds up device name resolution a lot. Imagine that we're trying to find the driver that serves the device "/dev/fred/bar/machin". In R3, we have to ask all the drivers known to the system, one at a time, until we find the right one. In R4, we only have to ask the drivers pointed to by the links in ../add-ons/kernel/drivers/dev/fred/bar/.

Future Directions

You see that the driver world has undergone many changes in BeOS Release 4. All this is nice, but there are other features that did not make it in, which we'd like to implement in future releases. Perhaps the most important one is asynchronous IO. The asynchronous read() and write() calls don't block -- they return immediately instead of waiting for the IO to complete. Like select(), asynchronous IO makes it possible for a single thread to handle several IOs simultaneously, which is sometimes a better option than spawning one thread for each IO you want to do concurrently. This is true especially if there are a lot of them.

Thanks to the driver API versioning, we'll have no problems throwing the necessary hooks into the device_hooks structure while remaining backward compatible with existing drivers.

Copyright ©1998 Be, Inc. Be is a registered trademark, and BeOS, BeBox, BeWare, GeekPort, the Be logo and the BeOS logo are trademarks of Be, Inc. All other trademarks mentioned are the property of their respective owners.