HDK Technical Reference

Scatter/gather operations

Scatter/gather is used for DMA transfers of data that is read from or written to noncontiguous areas of memory. A scatter/gather list is a list of vectors, each of which gives the location and length of one segment in the overall read or write request. Many devices have a DMA controller on each adapter card that must be programmed in some driver-specific fashion. ISA devices generally use the system DMA controller for DMA transfers. See ``DMA'' for general information about implementing DMA operations.

To implement scatter/gather for a driver, the driver must obtain the physical addresses of each contiguous memory region with which to program the device, using the appropriate mechanism for the data type and interface version.
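As an illustration of the list-of-vectors idea described above, the following user-space C sketch models a scatter/gather list as an array of (physical address, length) pairs and sums the total transfer length. The names sg_vec_t and sg_total_len are hypothetical and are not part of any DDI or ODDI interface; a real driver uses the structures described in the sections that follow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one scatter/gather vector: one physically
 * contiguous segment of the overall transfer. */
typedef struct sg_vec {
    uint64_t base;   /* physical address of the segment */
    uint32_t len;    /* length of the segment, in bytes */
} sg_vec_t;

/* Return the total number of bytes described by the list. */
static uint64_t
sg_total_len(const sg_vec_t *list, size_t nelem)
{
    uint64_t total = 0;
    size_t i;

    assert(list != NULL || nelem == 0);
    for (i = 0; i < nelem; i++)
        total += list[i].len;
    return total;
}
```

The device is then programmed with one (address, length) pair per element, in whatever form its DMA engine expects.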

DDI implementation

DDI provides facilities for setting up scatter/gather operations for either STREAMS or non-STREAMS devices, depending on the type of device and the DDI version. The primary ones are:

DDI 8 non-STREAMS drivers

The following method works for most non-STREAMS drivers which, for DDI 8, use a buf(D4) structure even for I/O operations that do not go through the system buffer cache.

  1. From the CFG_ADD subfunction of the driver's config(D2) entry point routine, call the physreq_alloc(D3) function to allocate a physreq(D4) structure that defines the physical memory and contiguity requirements and populate it as follows:

    Set to the alignment requirement for the DMA engine or, if there are no alignment restrictions, set to 1.

    Set to the boundary requirement for the DMA engine or, if there are no boundary restrictions, set to 0.

    Set phys_dma_size to 32 for a single cycle PCI DMA device or to 64 for a dual cycle or 64-bit wide bus implementation. Never set it to 0 when doing scatter/gather (or other DMA operations).

    The DDI scatter/gather implementation supports both 32-bit and 64-bit data transfers. The kernel uses phys_dma_size to determine which format to use. See ``DMA up to 64 bits (DDI only)''.

    Set to the maximum number of elements in the scatter/gather list for the DMA engine, or to 1 if the DMA engine is not scatter/gather capable.

  2. Call the bcb_alloc(D3) function to allocate a bcb(D4) structure that defines how the buf_breakup(D3) function segments the data being transferred; populate it as follows:

    Set BA_SCGTH in bcb_addrtype to specify a scatter/gather list.

    Set to BCB_ONE_PIECE. This, in combination with setting bcb_granularity to 1, means that the I/O is not block-oriented.

    Maximum size of the total transfer.

    Set bcb_granularity to 1.

    Point to the physreq structure that was allocated and populated in Step 1.

  3. Call the physreq_prep(D3) and bcb_prep(D3) functions to ``prep'' the structures.

  4. Return the bcb structure from the devinfo(D2) entry point routine when it is called with the parm argument set to either DI_RBCBP or DI_WBCBP. Be sure that the D_RANDOM flag is not set in the drv_flags member of the drvinfo(D4) structure.

    The bcb structure is used by the buf_breakup(D3) function which, for DDI 8 and later drivers, is called before entering the biostart(D2) entry point routine that handles all read and write operations.

  5. This causes the biostart(D2) entry point routine to be passed a pointer to a buf(D4) that includes a pointer to the bcb structure and a pointer to the scgth(D4) structure that contains a list of physical addresses to be used for the scatter/gather operations.

    The scgth structure defines the physical memory region that is used for all scatter/gather operations in DDI 8 and later drivers.
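To make the constraints from Step 1 concrete, the following user-space C sketch checks a scatter/gather list against an alignment, a boundary, a DMA address size, and a maximum element count. It is a model only: the dma_constraints_t and sg_vec_t types and the function names are hypothetical, and the sketch assumes that a boundary is an address multiple that no single segment may cross. In a DDI 8 driver, the kernel enforces the physreq(D4) constraints before biostart(D2) is called.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the constraints described in Step 1. */
typedef struct dma_constraints {
    uint64_t align;      /* required alignment; 1 means no restriction */
    uint64_t boundary;   /* boundary a segment must not cross; 0 means none */
    unsigned dmasize;    /* address bits the DMA engine can drive (32 or 64) */
    size_t   max_scgth;  /* maximum list elements; 1 if no scatter/gather */
} dma_constraints_t;

typedef struct sg_vec {
    uint64_t base;       /* physical address of the segment */
    uint32_t len;        /* length of the segment, in bytes */
} sg_vec_t;

/* Return 1 if one segment meets the constraints, 0 otherwise. */
static int
sg_segment_ok(const dma_constraints_t *c, const sg_vec_t *v)
{
    uint64_t last;

    if (v->len == 0)
        return 0;
    last = v->base + v->len - 1;
    if (last < v->base)                      /* address arithmetic wrapped */
        return 0;
    if (c->align > 1 && (v->base % c->align) != 0)
        return 0;
    if (c->boundary != 0 && (v->base / c->boundary) != (last / c->boundary))
        return 0;
    if (c->dmasize < 64 && (last >> c->dmasize) != 0)
        return 0;                            /* beyond the reachable range */
    return 1;
}

/* Return 1 if the whole list meets the constraints, 0 otherwise. */
static int
sg_list_ok(const dma_constraints_t *c, const sg_vec_t *list, size_t nelem)
{
    size_t i;

    assert(c != NULL && list != NULL);
    if (nelem == 0 || nelem > c->max_scgth)
        return 0;
    for (i = 0; i < nelem; i++)
        if (!sg_segment_ok(c, &list[i]))
            return 0;
    return 1;
}
```

For example, with a 64KB boundary, a segment that starts at 0xFF00 and is 0x200 bytes long is rejected because it crosses the boundary at 0x10000.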

The system picks up the bcb when the driver's open(D2) entry point routine executes. I/O operations that originate from the read, pread, write, or pwrite system call enter the driver at the biostart(D2) entry point. At this point, the memory is ``wired'' down and meets all the physreq constraints. The driver can use the scatter/gather list of physical addresses that is associated with this bcb structure for its I/O operations.

If the memory cannot be wired down within the bcb constraints, then an error is generated to the caller and the biostart( ) entry point is not invoked.

Note that, if the I/O request is initiated with a read or write system call to a character device node of a non-STREAMS DDI 8 driver, the kernel creates a buffer of type BA_UIO to describe the data. This buffer is always passed through the buf_breakup(D3) function before the driver's biostart(D2) entry point routine is called. DDI 8 does not distinguish between block and character drivers, but if the driver specifies BA_UIO in the bcb_addrtype member of the bcb(D4) structure, buf_breakup( ) passes the buffer without manipulating it.

I/O operations that originate with readv or writev system calls attempt to invoke the biostart( ) entry point as many times as is specified by the iovcnt argument to the system call, but a scatter/gather list is not built for the iovec entries. This is a limitation of the current implementation; the solution is to ensure that the application does not call the readv( ) and writev( ) system calls for this device.
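One way an application might avoid writev( ) for such a device is to copy the iovec entries into a single buffer and issue one write( ) call. The following is a minimal application-side sketch under that assumption; write_coalesced is a hypothetical helper name, and the extra copy is a cost that may not be acceptable for large transfers.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: copy the iovec entries into one contiguous buffer
 * and issue a single write(), so that the driver receives one request
 * instead of one request per iovec entry. */
static ssize_t
write_coalesced(int fd, const struct iovec *iov, int iovcnt)
{
    size_t total = 0, off = 0;
    char *buf;
    ssize_t n;
    int i, saved_errno;

    assert(iov != NULL && iovcnt > 0);
    for (i = 0; i < iovcnt; i++)
        total += iov[i].iov_len;
    if (total == 0)
        return 0;

    buf = malloc(total);
    if (buf == NULL)
        return -1;
    for (i = 0; i < iovcnt; i++) {
        if (iov[i].iov_len == 0)
            continue;                 /* nothing to copy for this entry */
        memcpy(buf + off, iov[i].iov_base, iov[i].iov_len);
        off += iov[i].iov_len;
    }

    n = write(fd, buf, total);
    saved_errno = errno;              /* free() must not disturb errno */
    free(buf);
    errno = saved_errno;
    return n;
}
```

A matching read path would issue one read( ) into a single buffer and then distribute the data to the individual destinations.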

DDI 8 STREAMS devices

  1. Allocate a physreq(D4) structure with the physreq_alloc(D3) function, populate it as described for non-STREAMS drivers above, then prep it with the physreq_prep(D3) function.

  2. Allocate the raw message block with the allocb_physreq(D3str) function.

  3. Call the msgscgth(D3str) function to create a scgth(D4) list of physical scatter/gather addresses.

  4. Transmit or receive the data.

  5. Call freeb(D3str) to free the message block after a successful transmit or pass the received message upstream with the putnext(D3str) function.

SVR5 MDI driver notes

MDI drivers are a special type of STREAMS driver. The following notes provide additional information about how scatter/gather operations are implemented for SVR5 MDI drivers.

DDI 6 and 7 non-STREAMS drivers

  1. Call the physiock(D3) function from the ioctl(D2), read(D2), or write(D2) entry point routine to set up the unbuffered access to the device.

  2. Call the vtop(D3) function from the strategy(D2) function to convert the virtual address that comes from the system call request into a physical address that is required for the actual transfer to the device.

    SDI HBA drivers written for DDI versions prior to version 8 do this translation in their xlat(D2sdi) entry point routines.

  3. Test the DMA constraints. If the test fails, allocate a physreq(D4) structure with the physreq_alloc(D3) function, populate it, and prep it by calling the physreq_prep(D3) function, then call the kmem_alloc_physreq(D3) function to allocate memory according to the constraints specified in the physreq structure. Then call the vtop(D3) function to get the physical address of the memory that was allocated.

  4. Copy the data with the bcopy(D3) function.
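Steps 3 and 4 amount to a bounce-buffer pattern: if the caller's memory does not satisfy the DMA constraints, allocate memory that does and copy the data into it. The following user-space C sketch models only the alignment constraint. It uses posix_memalign and memcpy as stand-ins for kmem_alloc_physreq(D3) and bcopy(D3), and dma_prepare and is_aligned are hypothetical names.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Return nonzero if addr is a multiple of align (a power of two). */
static int
is_aligned(const void *addr, size_t align)
{
    return ((uintptr_t)addr & (align - 1)) == 0;
}

/* If src already meets the alignment constraint, return it unchanged and
 * set *bounce to NULL. Otherwise allocate an aligned bounce buffer, copy
 * the data into it, and return it; the caller frees *bounce when the
 * transfer is finished. Returns NULL if the allocation fails. */
static void *
dma_prepare(void *src, size_t len, size_t align, void **bounce)
{
    void *buf = NULL;

    assert(src != NULL && bounce != NULL && len > 0);
    /* posix_memalign needs a power of two that is a multiple of sizeof(void *) */
    assert(align >= sizeof(void *) && (align & (align - 1)) == 0);

    *bounce = NULL;
    if (is_aligned(src, align))
        return src;                   /* constraint test passed */
    if (posix_memalign(&buf, align, len) != 0)
        return NULL;
    memcpy(buf, src, len);            /* stand-in for bcopy(D3) */
    *bounce = buf;
    return buf;
}
```

For a read from the device, the copy runs in the other direction after the transfer completes.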

DDI 6 and 7 STREAMS drivers

  1. Allocate a physreq(D4) structure with the physreq_alloc(D3) function, populate it, then prep it with the physreq_prep(D3) function.

  2. Call the vtop(D3) function to convert the virtual address(es) of the message block and data buffers to physical addresses. MDI drivers should also call the mdi_end_of_contig_segment(D3mdi) function to get the end of the physically contiguous segment in the buffer.

  3. Transmit or receive the data.

  4. Call freeb(D3str) to free the message block after a successful transmit or pass the received message upstream with the putnext(D3str) function.

ODDI implementation

Scatter/gather mechanisms are defined for SCO OpenServer 5 SCSI host adapters and SCO OpenServer 5 MDI network adapter drivers.

SCO OpenServer 5 SCSI host adapters

Scatter/gather operations for SCO OpenServer 5 SCSI host adapters use a scatter/gather list that is defined in the scsi_io_req(D4osdi) structure:

location virtual address of an array of physical memory locations into which to transfer data.

pointer to the scatter/gather list, which is an array of type long consisting of 1KB elements.

total length, in bytes, of all the scatter/gather requests in the list.

offset into the media (usually disk) where the scatter/gather I/O will start, expressed in 512-byte blocks.

In addition, the following must be set in the scsi_ha_info(D4osdi) structure:

Set to 1 if the device can accept scatter/gather requests. The Sdsk_no_sg variable in the pack.d/Sdsk/space.c file must also be set to 0.

The Sram sample driver that is provided in the HDK O5hbasamp package illustrates how to implement scatter/gather operations in an SCO OpenServer 5 SCSI host adapter driver.

SCO OpenServer 5 MDI network adapter drivers

SCO OpenServer 5 MDI drivers use the mdi_end_of_contig_segment(D3mdi) function to check each block and break it into scatter/gather segments if necessary.

The following code example is from the shrk driver that is provided in the O5ndsampl package in the HDK. Note that this driver is written to compile and build for both SCO OpenServer 5 and SVR5 systems, and is useful for comparing the interfaces for the two platforms.

   /* streams scatter gather definitions and structures */
   #define SCGTH32          0

   typedef struct scgth_el32 {
           uint_t  sg_base;        /* base physical address */
           uint_t  sg_size;        /* size, in bytes, of this piece */
   } scgth_el32_t;

   typedef struct scgth {
           union {
                   scgth_el32_t    *el32;
           } sg_elem;
           unchar  sg_nelem;
           unchar  sg_format;
   } scgth_t;


   bzero(&sg, sizeof(scgth_t));
   /* with KM_NOSLEEP the allocation can return NULL; the excerpt omits the check */
   sg.sg_elem.el32 = (scgth_el32_t *)kmem_zalloc(
           shrk_max_txbuf_per_msg * sizeof(scgth_el32_t), KM_NOSLEEP);

   for (smp = mp; mp; mp = mp->b_cont) {
           for (start = mp->b_rptr; start < mp->b_wptr; ) {
                   /* find the last byte of the physically contiguous run */
                   end = mdi_end_of_contig_segment(start, mp->b_wptr - 1);
                   sg.sg_elem.el32[sg.sg_nelem].sg_base = (uint_t) kvtophys(start);
                   sg.sg_elem.el32[sg.sg_nelem].sg_size = end - start + 1;
                   /* the excerpt drops the start of a statement here; only
                    * its argument list "sg.sg_nelem, start, end);" survives */
                   sg.sg_nelem++;
                   start = end + 1;
           }
   }


   kmem_free(sg.sg_elem.el32, shrk_max_txbuf_per_msg * sizeof(scgth_el32_t));
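The segment-splitting loop in the shrk excerpt can be tried outside the kernel with a stand-in for mdi_end_of_contig_segment(D3mdi). The following user-space C sketch assumes, for illustration only, a 4096-byte page size and that no two adjacent pages are physically contiguous, so every segment ends at a page boundary. The names model_end_of_contig_segment and model_split are hypothetical. Unlike the excerpt, the sketch also stops when the segment array is full.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u   /* assumed page size for this model */

/* Stand-in for mdi_end_of_contig_segment(): return a pointer to the last
 * byte of the contiguous run that begins at start, never beyond last.
 * The model treats every page boundary as a physical discontinuity. */
static const unsigned char *
model_end_of_contig_segment(const unsigned char *start,
                            const unsigned char *last)
{
    size_t in_page = MODEL_PAGE_SIZE -
        (size_t)((uintptr_t)start % MODEL_PAGE_SIZE);
    size_t remaining = (size_t)(last - start) + 1;

    return start + ((remaining < in_page) ? remaining : in_page) - 1;
}

/* Split buf[0..len-1] into segments and store each segment size in
 * sizes[]. Returns the number of segments, or -1 if more than max
 * segments would be needed. */
static int
model_split(const unsigned char *buf, size_t len, size_t *sizes, int max)
{
    const unsigned char *start = buf;
    const unsigned char *last;
    int n = 0;

    assert(buf != NULL && sizes != NULL && len > 0);
    last = buf + len - 1;
    while (start <= last) {
        const unsigned char *end = model_end_of_contig_segment(start, last);

        if (n == max)
            return -1;                /* list is full */
        sizes[n++] = (size_t)(end - start) + 1;
        start = end + 1;
    }
    return n;
}
```

In a driver, the physical address of each segment would come from a call such as kvtophys, as in the excerpt; the model records only the segment sizes.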


© 2005 The SCO Group, Inc. All rights reserved.
OpenServer 6 and UnixWare (SVR5) HDK - June 2005