Currently, all userspace verbs operations that call into the kernel
are serialized by ib_uverbs_idr_mutex. This can be a scalability
issue for some workloads, especially for devices driven by the ipath
driver, which needs to call into the kernel even for datapath
operations.
Fix this by adding reference counts to the userspace objects, and then
converting ib_uverbs_idr_mutex into a spinlock that only protects the
idrs long enough to take a reference on the object being looked up.
Because remove operations may fail, we have to do a slightly funky
two-step deletion, which is described in the comments at the top of
uverbs_cmd.c.
This also still leaves ib_uverbs_idr_lock as a single lock that is
possibly subject to contention. However, the lock hold time will only
be a single idr operation, so multiple threads should still be able to
make progress, even if ib_uverbs_idr_lock is being ping-ponged.
Surprisingly, these changes even shrink the object code:
add/remove: 23/5 grow/shrink: 4/21 up/down: 633/-693 (-60)
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add a function to initialize address handle attributes from a work
completion. This functionality is duplicated by both verbs and the CM.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add IB_EVENT_CLIENT_REREGISTER to enum so low-level drivers can
generate "client reregister" events.
Signed-off-by: Leonid Arsh <leonida@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add an LMC cache to struct ib_device, and add a function
ib_get_cached_lmc() to query the cache.
Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Push translation of static rate to HCA format into low-level drivers,
where it belongs. For static rate encoding, use encoding of rate
field from IB standard PathRecord, with addition of value 0, for
backwards compatibility with current usage. The changes are:
- Add enum ib_rate to midlayer includes.
- Get rid of static rate translation in IPoIB; just use static rate
directly from Path and MulticastGroup records.
- Update mthca driver to translate absolute static rate into the
format used by hardware. This also fixes mthca's static rate
handling for HCAs that are capable of 4X DDR.
Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Have mthca's create_srq method return the actual capacity of the SRQ
that gets created. Also update comments in <rdma/ib_verbs.h> to
clarify that this is what is expected from ib_create_srq().
Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The in-kernel mthca driver contains a table of which attributes are
valid for each queue pair state transition. It turns out that both
other IB drivers -- ipath and ehca -- which are being prepared for
merging have copied this table, errors and all.
To forestall this code duplication, move this table and the code to
check parameters against it into a midlayer library function,
ib_modify_qp_is_ok().
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This patch allows the consumer to set the page size of "pages" mapped
by the pool FMRs, which is a feature already existing in the base
verbs API. On the cosmetic side it changes ib_fmr_attr.page_size field
to be named page_shift.
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Expose a writable "node_desc" sysfs attribute for InfiniBand devices.
This allows userspace to update the node description with information
such as the node's hostname, so that IB network management software
can tie its view to the real world.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add support to uverbs to handle resizing userspace CQs (completion
queues), including adding an ABI for marshalling requests and
responses. The kernel midlayer already has ib_resize_cq().
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add a node_guid field to struct ib_device. It is the responsibility
of the low-level driver to initialize this field before registering a
device with the midlayer. Convert everyone to looking at this field
instead of calling ib_query_device() when all they want is the node
GUID, and remove the node_guid field from struct ib_device_attr.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Change the struct ib_device.resize_cq() method to take a plain integer
that holds the new CQ size, rather than a pointer to an integer that
it uses to return the new size. This makes the interface match the
exported ib_resize_cq() signature, and allows the low-level driver to
update the CQ size with proper locking if necessary.
No in-tree drivers are exporting this method yet.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The MAD layer was violating the DMA API by touching data buffers used
for sends after the DMA mapping was done. This causes problems on
non-cache-coherent architectures, because the device doing DMA won't
see updates to the payload buffers that exist only in the CPU cache.
Fix this by having all MAD consumers use ib_create_send_mad() to
allocate their send buffers, and moving the DMA mapping into the MAD
layer so it can be done just before calling send (and after any
modifications of the send buffer by the MAD layer).
Tested on a non-cache-coherent PowerPC 440SPe system.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Give each device a uverbs_cmd_mask, so that a low-level driver can
control which methods may be called on behalf of userspace.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add abi_version attribute to uverbs class devices to allow for
ABI versioning of device-specific interfaces.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Al Viro pointed out that the current IB userspace verbs interface
allows userspace to cause mischief by closing file descriptors before
we're ready, or issuing the same command twice at the same time. This
patch closes those races, and fixes other obvious problems such as a
module reference leak.
Some other interface bogosities will require an ABI change to fix
properly, so I'm deferring those fixes until 2.6.15.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Move the InfiniBand headers from drivers/infiniband/include to include/rdma.
This allows InfiniBand-using code to live elsewhere, and lets us remove the
ugly EXTRA_CFLAGS include path from the InfiniBand Makefiles.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-08-26 20:37:38 -07:00
Renamed from drivers/infiniband/include/ib_verbs.h (Browse further)