117 lines
4.9 KiB
Text
117 lines
4.9 KiB
Text
|
In Linux 2.5 kernels (and later), USB device drivers have additional control
|
||
|
over how DMA may be used to perform I/O operations. The APIs are detailed
|
||
|
in the kernel usb programming guide (kerneldoc, from the source code).
|
||
|
|
||
|
|
||
|
API OVERVIEW
|
||
|
|
||
|
The big picture is that USB drivers can continue to ignore most DMA issues,
|
||
|
though they still must provide DMA-ready buffers (see DMA-mapping.txt).
|
||
|
That's how they've worked through the 2.4 (and earlier) kernels.
|
||
|
|
||
|
OR: they can now be DMA-aware.
|
||
|
|
||
|
- New calls enable DMA-aware drivers, letting them allocate dma buffers and
|
||
|
manage dma mappings for existing dma-ready buffers (see below).
|
||
|
|
||
|
- URBs have an additional "transfer_dma" field, as well as a transfer_flags
|
||
|
bit saying if it's valid. (Control requests also have "setup_dma" and a
|
||
|
corresponding transfer_flags bit.)
|
||
|
|
||
|
- "usbcore" will map those DMA addresses, if a DMA-aware driver didn't do
|
||
|
it first and set URB_NO_TRANSFER_DMA_MAP or URB_NO_SETUP_DMA_MAP. HCDs
|
||
|
don't manage dma mappings for URBs.
|
||
|
|
||
|
- There's a new "generic DMA API", parts of which are usable by USB device
|
||
|
drivers. Never use dma_set_mask() on any USB interface or device; that
|
||
|
would potentially break all devices sharing that bus.
|
||
|
|
||
|
|
||
|
ELIMINATING COPIES
|
||
|
|
||
|
It's good to avoid making CPUs copy data needlessly. The costs can add up,
|
||
|
and effects like cache-trashing can impose subtle penalties.
|
||
|
|
||
|
- When you're allocating a buffer for DMA purposes anyway, use the buffer
|
||
|
primitives. Think of them as kmalloc and kfree that give you the right
|
||
|
kind of addresses to store in urb->transfer_buffer and urb->transfer_dma,
|
||
|
while guaranteeing that no hidden copies through DMA "bounce" buffers will
|
||
|
slow things down. You'd also set URB_NO_TRANSFER_DMA_MAP in
|
||
|
urb->transfer_flags:
|
||
|
|
||
|
void *usb_buffer_alloc (struct usb_device *dev, size_t size,
|
||
|
int mem_flags, dma_addr_t *dma);
|
||
|
|
||
|
void usb_buffer_free (struct usb_device *dev, size_t size,
|
||
|
void *addr, dma_addr_t dma);
|
||
|
|
||
|
For control transfers you can use the buffer primitives or not for each
|
||
|
of the transfer buffer and setup buffer independently. Set the flag bits
|
||
|
URB_NO_TRANSFER_DMA_MAP and URB_NO_SETUP_DMA_MAP to indicate which
|
||
|
buffers you have prepared. For non-control transfers URB_NO_SETUP_DMA_MAP
|
||
|
is ignored.
|
||
|
|
||
|
The memory buffer returned is "dma-coherent"; sometimes you might need to
|
||
|
force a consistent memory access ordering by using memory barriers. It's
|
||
|
not using a streaming DMA mapping, so it's good for small transfers on
|
||
|
systems where the I/O would otherwise tie up an IOMMU mapping. (See
|
||
|
Documentation/DMA-mapping.txt for definitions of "coherent" and "streaming"
|
||
|
DMA mappings.)
|
||
|
|
||
|
Asking for 1/Nth of a page (as well as asking for N pages) is reasonably
|
||
|
space-efficient.
|
||
|
|
||
|
- Devices on some EHCI controllers could handle DMA to/from high memory.
|
||
|
Driver probe() routines can notice this using a generic DMA call, then
|
||
|
tell higher level code (network, scsi, etc) about it like this:
|
||
|
|
||
|
if (dma_supported (&intf->dev, 0xffffffffffffffffULL))
|
||
|
net->features |= NETIF_F_HIGHDMA;
|
||
|
|
||
|
That can eliminate dma bounce buffering of requests that originate (or
|
||
|
terminate) in high memory, in cases where the buffers aren't allocated
|
||
|
with usb_buffer_alloc() but instead are dma-mapped.
|
||
|
|
||
|
|
||
|
WORKING WITH EXISTING BUFFERS
|
||
|
|
||
|
Existing buffers aren't usable for DMA without first being mapped into the
|
||
|
DMA address space of the device.
|
||
|
|
||
|
- When you're using scatterlists, you can map everything at once. On some
|
||
|
systems, this kicks in an IOMMU and turns the scatterlists into single
|
||
|
DMA transactions:
|
||
|
|
||
|
int usb_buffer_map_sg (struct usb_device *dev, unsigned pipe,
|
||
|
struct scatterlist *sg, int nents);
|
||
|
|
||
|
void usb_buffer_dmasync_sg (struct usb_device *dev, unsigned pipe,
|
||
|
struct scatterlist *sg, int n_hw_ents);
|
||
|
|
||
|
void usb_buffer_unmap_sg (struct usb_device *dev, unsigned pipe,
|
||
|
struct scatterlist *sg, int n_hw_ents);
|
||
|
|
||
|
It's probably easier to use the new usb_sg_*() calls, which do the DMA
|
||
|
mapping and apply other tweaks to make scatterlist i/o be fast.
|
||
|
|
||
|
- Some drivers may prefer to work with the model that they're mapping large
|
||
|
buffers, synchronizing their safe re-use. (If there's no re-use, then let
|
||
|
usbcore do the map/unmap.) Large periodic transfers make good examples
|
||
|
here, since it's cheaper to just synchronize the buffer than to unmap it
|
||
|
each time an urb completes and then re-map it on during resubmission.
|
||
|
|
||
|
These calls all work with initialized urbs: urb->dev, urb->pipe,
|
||
|
urb->transfer_buffer, and urb->transfer_buffer_length must all be
|
||
|
valid when these calls are used (urb->setup_packet must be valid too
|
||
|
if urb is a control request):
|
||
|
|
||
|
struct urb *usb_buffer_map (struct urb *urb);
|
||
|
|
||
|
void usb_buffer_dmasync (struct urb *urb);
|
||
|
|
||
|
void usb_buffer_unmap (struct urb *urb);
|
||
|
|
||
|
The calls manage urb->transfer_dma for you, and set URB_NO_TRANSFER_DMA_MAP
|
||
|
so that usbcore won't map or unmap the buffer. The same goes for
|
||
|
urb->setup_dma and URB_NO_SETUP_DMA_MAP for control requests.
|