839fcaba35
The following patch adds experimental support for IPoIB connected mode, as defined by the draft from the IETF ipoib working group. The idea is to increase performance by increasing the MTU from the maximum of 2K (theoretically 4K) supported by IPoIB on top of UD.

With this code, I'm able to get 800MByte/sec or more with netperf without options on a Mellanox 4x back-to-back DDR system.

Some notes on code:

1. SRQ is used for scalability to large cluster sizes
2. Only RC connections are used (UC does not support SRQ now)
3. Retry count is set to 0 since spec draft warns against retries
4. Each connection is used for data transfers in only 1 direction, so each connection is either active (TX) or passive (RX). 2 sides that want to communicate create 2 connections.
5. Each active (TX) connection has a separate CQ for send completions - this keeps the code simple without CQ resize and other tricks
6. To detect stale passive side connections (where the remote side is down), we keep an LRU list of passive connections (updated once per second per connection) and destroy a connection after it has been unused for several seconds. The LRU rule makes it possible to avoid scanning connections that have recently been active.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
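The stale-connection rule in note 6 can be illustrated with a small user-space sketch. This is not the driver's code: the names `rx_conn`, `lru`, `conn_touch`, and `reap_stale` and the 5-second threshold `STALE_AFTER` are assumptions made for illustration (the message only says "several seconds"). It shows why an LRU ordering lets the reaper stop at the first connection that is still fresh instead of scanning every connection.

```c
#include <assert.h>
#include <stddef.h>

/* All names and values here are hypothetical, not taken from the patch. */
#define STALE_AFTER 5UL  /* assumed idle limit in seconds */

struct rx_conn {
    unsigned long last_used;      /* second of the last LRU update */
    struct rx_conn *prev, *next;  /* links in the circular LRU list */
    int alive;                    /* 1 while on the list, 0 once reaped */
};

struct lru {
    struct rx_conn head;  /* sentinel: head.next is newest, head.prev is oldest */
};

static void lru_init(struct lru *l)
{
    l->head.prev = l->head.next = &l->head;
}

static void lru_unlink(struct rx_conn *c)
{
    c->prev->next = c->next;
    c->next->prev = c->prev;
}

static void lru_push_front(struct lru *l, struct rx_conn *c)
{
    c->next = l->head.next;
    c->prev = &l->head;
    l->head.next->prev = c;
    l->head.next = c;
}

/* Register a new passive connection as the most recently used one. */
static void conn_add(struct lru *l, struct rx_conn *c, unsigned long now)
{
    c->last_used = now;
    c->alive = 1;
    lru_push_front(l, c);
}

/* Called on receive: move to the front at most once per second. */
static void conn_touch(struct lru *l, struct rx_conn *c, unsigned long now)
{
    assert(c->alive);
    if (now == c->last_used)
        return;  /* already updated during this second */
    c->last_used = now;
    lru_unlink(c);
    lru_push_front(l, c);
}

/* Reap from the oldest end; stop at the first connection that is fresh. */
static int reap_stale(struct lru *l, unsigned long now)
{
    int reaped = 0;

    while (l->head.prev != &l->head) {
        struct rx_conn *c = l->head.prev;

        if (now - c->last_used < STALE_AFTER)
            break;  /* everything closer to the front is newer still */
        lru_unlink(c);
        c->alive = 0;
        reaped++;
    }
    return reaped;
}
```

Because every touch moves a connection to the front, the list stays sorted by `last_used`, so the cost of one reaper pass is proportional to the number of stale connections rather than to the total number of connections.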
47 lines
1.8 KiB
Text
config INFINIBAND_IPOIB
	tristate "IP-over-InfiniBand"
	depends on INFINIBAND && NETDEVICES && INET && (IPV6 || IPV6=n)
	---help---
	  Support for the IP-over-InfiniBand protocol (IPoIB). This
	  transports IP packets over InfiniBand so you can use your IB
	  device as a fancy NIC.

	  See Documentation/infiniband/ipoib.txt for more information

config INFINIBAND_IPOIB_CM
	bool "IP-over-InfiniBand Connected Mode support"
	depends on INFINIBAND_IPOIB && EXPERIMENTAL
	default n
	---help---
	  This option enables experimental support for IPoIB connected mode.
	  After enabling this option, you need to switch to connected mode through
	  /sys/class/net/ibXXX/mode to actually create connections, and then increase
	  the interface MTU with e.g. ifconfig ib0 mtu 65520.

	  WARNING: Enabling connected mode will trigger some
	  packet drops for multicast and UD mode traffic from this interface,
	  unless you limit mtu for these destinations to 2044.

config INFINIBAND_IPOIB_DEBUG
	bool "IP-over-InfiniBand debugging" if EMBEDDED
	depends on INFINIBAND_IPOIB
	default y
	---help---
	  This option causes debugging code to be compiled into the
	  IPoIB driver. The output can be turned on via the
	  debug_level and mcast_debug_level module parameters (which
	  can also be set after the driver is loaded through sysfs).

	  This option also creates an "ipoib_debugfs," which can be
	  mounted to expose debugging information about IB multicast
	  groups used by the IPoIB driver.

config INFINIBAND_IPOIB_DEBUG_DATA
	bool "IP-over-InfiniBand data path debugging"
	depends on INFINIBAND_IPOIB_DEBUG
	---help---
	  This option compiles debugging code into the data path
	  of the IPoIB driver. The output can be turned on via the
	  data_debug_level module parameter; however, even with output
	  turned off, this debugging code will have some performance
	  impact.