Commit graph

615 commits

Author SHA1 Message Date
Eli Cohen
1f8f7b7a7b IB/mlx4: Fix check of max_qp_dest_rdma in modify QP
max_qp_dest_rdma is already in natural units - no need to shift.  This
was discovered by a test that deliberately requests more outstanding
atomic operation than the device supports.

Found by Sagi Rotem at Mellanox.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-19 08:51:56 -07:00
Ali Ayoub
de57c9f102 IB/mthca: Fix use-after-free on device restart
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-19 08:51:56 -07:00
Hoang-Nam Nguyen
bd5a6ccc0e IB/ehca: Return proper error code if register_mr fails
Set the return code of ehca_register_mr() to ENOMEM if the corresponding
firmware call fails due to out of resources.  Some other error codes
were explicitly mapped to EINVAL -- just remove those cases so they
get mapped to the default case, which already returns EINVAL anyway.

Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-19 08:51:54 -07:00
Roland Dreier
8b8c8bca3a IB/ipath: Fix potential deadlock with multicast spinlocks
Lockdep found the following potential deadlock between mcast_lock and
n_mcast_grps_lock: mcast_lock is taken from both interrupt context and
process context, so spin_lock_irqsave() must be used to take it.
n_mcast_grps_lock is only taken from process context, so at first it
seems safe to take it with plain spin_lock(); however, it also nests
inside mcast_lock, and hence we could deadlock:

  cpu A                                   cpu B
    ipath_mcast_add():
      spin_lock_irq(&mcast_lock);

                                            ipath_mcast_detach():
                                              spin_lock(&n_mcast_grps_lock);

                                            <enter interrupt>

                                            ipath_mcast_find():
                                              spin_lock_irqsave(&mcast_lock);

      spin_lock(&n_mcast_grps_lock);

Fix this by using spin_lock_irq() to take n_mcast_grps_lock.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-19 08:51:53 -07:00
Michael S. Tsirkin
bd18c11277 IB/mthca: Set cleaned CQEs back to HW ownership when cleaning CQ
mthca_cq_clean() updates the CQ consumer index without moving CQEs
back to HW ownership.  As a result, the same WRID might get reported
twice, resulting in a use-after-free.  This was observed in IPoIB CM.
Fix by moving all freed CQEs to HW ownership.

This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=617>

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-14 14:10:34 -07:00
Michael S. Tsirkin
3e28c56b9b IB/mthca: Fix posting >255 recv WRs for Tavor
Fix posting lists of > 255 receive WRs for Tavor: rq.next_ind must
be updated each doorbell, otherwise the next doorbell will use an
incorrect index.

Found by Ronni Zimmermann at Mellanox.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-14 14:10:34 -07:00
Joachim Fenkes
4e430dcb7b IB/ehca: Disable scaling code by default, bump version number
- Scaling code is still considered experimental, so disable it by default
- Increase version to SVNEHCA_0023

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-14 13:41:40 -07:00
Joachim Fenkes
bba9b6013e IB/ehca: Beautify sysfs attribute code and fix compiler warnings
eHCA's sysfs attributes are now being created via sysfs_create_group(),
making the process neatly table-driven. The return value is checked, thus
fixing a few compiler warnings.

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-14 13:40:45 -07:00
Joachim Fenkes
c7a14939e7 IB/ehca: Remove _irqsave, move #ifdef
- In ehca_process_eq(), we're IRQ safe throughout the whole function, so we
  don't need another _irqsave in the middle of flight.

- take_over_work() is only called by comp_pool_callback(), so it can move
  into the same #ifdef block.

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-14 13:40:05 -07:00
Hoang-Nam Nguyen
c55a0ddd8e IB/ehca: Fix AQP0/1 QP number
AQP0/1 should report qp_num={0|1} and the actual QP# should be stored
in struct ehca_qp, not the other way round.

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-14 13:39:31 -07:00
Joachim Fenkes
92761cdaf2 IB/ehca: Correctly set GRH mask bit in ehca_modify_qp()
The driver needs to always supply the "GRH present" flag to the
hypervisor, whether it's true or false. Not supplying it (i.e. not
setting the corresponding mask bit) amounts to a "perhaps", which we
don't want.

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-14 13:38:57 -07:00
Stefan Roscher
5d88278e3b IB/ehca: Serialize hypervisor calls in ehca_register_mr()
Some pSeries hypervisor versions show a race condition in the allocate
MR hCall.  Serialize this call per adapter to circumvent this problem.

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-14 13:38:11 -07:00
Arthur Jones
8f140b407f IB/ipath: Shadow the gpio_mask register
Once upon a time, GPIO interrupts were rare.  But then a chip bug in
the waldo series forced the use of a GPIO interrupt to signal packet
reception.  This greatly increased the frequency of GPIO interrupts
which have the gpio_mask bits set on the waldo chips.  Other bits in
the gpio_status register are used for I2C clock and data lines, these
bits are usually on.  An "unlikely" annotation leftover from the old
days was improperly applied to these bits, and an unnecessary chip
mmio read was being accessed in the interrupt fast path on waldo.

Remove the stagnant unlikely annotation in the interrupt handler and
keep a shadow copy of the gpio_mask register to avoid the slow mmio
read when testing for interruptable GPIO bits.

Signed-off-by: Arthur Jones <arthur.jones@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-14 13:22:42 -07:00
Jack Morgenstein
26c6bc7b81 IB/mlx4: Fix uninitialized spinlock for 32-bit archs
uar_lock spinlock was used in mlx4_ib_cq_arm without being initialized
(this only affects 32-bit archs, because uar_lock is not used on
64-bit archs and MLX4_INIT_DOORBELL_LOCK() is a NOP).

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-14 13:02:58 -07:00
Linus Torvalds
de5603748a Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband:
  IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters
  IB: Put rlimit accounting struct in struct ib_umem
  IB/uverbs: Export ib_umem_get()/ib_umem_release() to modules
2007-05-09 19:40:09 -07:00
Rafael J. Wysocki
8bb7844286 Add suspend-related notifications for CPU hotplug
Since nonboot CPUs are now disabled after tasks and devices have been
frozen and the CPU hotplug infrastructure is used for this purpose, we need
special CPU hotplug notifications that will help the CPU-hotplug-aware
subsystems distinguish normal CPU hotplug events from CPU hotplug events
related to a system-wide suspend or resume operation in progress.  This
patch introduces such notifications and causes them to be used during
suspend and resume transitions.  It also changes all of the
CPU-hotplug-aware subsystems to take these notifications into consideration
(for now they are handled in the same way as the corresponding "normal"
ones).

[oleg@tv-sign.ru: cleanups]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-09 12:30:56 -07:00
Roland Dreier
225c7b1fee IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters
Add an InfiniBand driver for Mellanox ConnectX adapters.  Because
these adapters can also be used as ethernet NICs and Fibre Channel 
HBAs, the driver is split into two modules: 
 
  mlx4_core: Handles low-level things like device initialization and 
    processing firmware commands.  Also controls resource allocation 
    so that the InfiniBand, ethernet and FC functions can share a 
    device without stepping on each other. 
 
  mlx4_ib: Handles InfiniBand-specific things; plugs into the 
    InfiniBand midlayer. 

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-08 18:00:38 -07:00
Roland Dreier
f7c6a7b5d5 IB/uverbs: Export ib_umem_get()/ib_umem_release() to modules
Export ib_umem_get()/ib_umem_release() and put low-level drivers in
control of when to call ib_umem_get() to pin and DMA map userspace,
rather than always calling it in ib_uverbs_reg_mr() before calling the
low-level driver's reg_user_mr method.

Also move these functions to be in the ib_core module instead of
ib_uverbs, so that driver modules using them do not depend on
ib_uverbs.

This has a number of advantages:
 - It is better design from the standpoint of making generic code a
   library that can be used or overridden by device-specific code as
   the details of specific devices dictate.
 - Drivers that do not need to pin userspace memory regions do not
   need to take the performance hit of calling ib_mem_get().  For
   example, although I have not tried to implement it in this patch,
   the ipath driver should be able to avoid pinning memory and just
   use copy_{to,from}_user() to access userspace memory regions.
 - Buffers that need special mapping treatment can be identified by
   the low-level driver.  For example, it may be possible to solve
   some Altix-specific memory ordering issues with mthca CQs in
   userspace by mapping CQ buffers with extra flags.
 - Drivers that need to pin and DMA map userspace memory for things
   other than memory regions can use ib_umem_get() directly, instead
   of hacks using extra parameters to their reg_phys_mr method.  For
   example, the mlx4 driver that is pending being merged needs to pin
   and DMA map QP and CQ buffers, but it does not need to create a
   memory key for these buffers.  So the cleanest solution is for mlx4
   to call ib_umem_get() in the create_qp and create_cq methods.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-08 18:00:37 -07:00
Linus Torvalds
df6d3916f3 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (77 commits)
  [POWERPC] Abolish powerpc_flash_init()
  [POWERPC] Early serial debug support for PPC44x
  [POWERPC] Support for the Ebony 440GP reference board in arch/powerpc
  [POWERPC] Add device tree for Ebony
  [POWERPC] Add powerpc/platforms/44x, disable platforms/4xx for now
  [POWERPC] MPIC U3/U4 MSI backend
  [POWERPC] MPIC MSI allocator
  [POWERPC] Enable MSI mappings for MPIC
  [POWERPC] Tell Phyp we support MSI
  [POWERPC] RTAS MSI implementation
  [POWERPC] PowerPC MSI infrastructure
  [POWERPC] Rip out the existing powerpc msi stubs
  [POWERPC] Remove use of 4level-fixup.h for ppc32
  [POWERPC] Add powerpc PCI-E reset API implementation
  [POWERPC] Holly bootwrapper
  [POWERPC] Holly DTS
  [POWERPC] Holly defconfig
  [POWERPC] Add support for 750CL Holly board
  [POWERPC] Generalize tsi108 PCI setup
  [POWERPC] Generalize tsi108 PHY types
  ...

Fixed conflict in include/asm-powerpc/kdebug.h manually

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:50:19 -07:00
Jeff Layton
1a1c9bb433 inode numbering: change libfs sb creation routines to avoid collisions with their root inodes
This patch makes it so that simple_fill_super and get_sb_pseudo assign their
root inodes to be number 1.  It also fixes up a couple of callers of
simple_fill_super that were passing in files arrays that had an index at
number 1, and adds a warning for any caller that sends in such an array.

It would have been nice to have made it so that it wasn't possible to make
such a collision, but some callers need to be able to control what inode
number their entries get, so I think this is the best that can be done.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:16 -07:00
Paul Mackerras
02bbc0f09c Merge branch 'linux-2.6' 2007-05-08 13:37:51 +10:00
Linus Torvalds
972d45fb43 Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband:
  IPoIB: Convert to NAPI
  IB: Return "maybe missed event" hint from ib_req_notify_cq()
  IB: Add CQ comp_vector support
  IB/ipath: Fix a race condition when generating ACKs
  IB/ipath: Fix two more spin lock problems
  IB/fmr_pool: Add prefix to all printks
  IB/srp: Set proc_name
  IB/srp: Add orig_dgid sysfs attribute to scsi_host
  IPoIB/cm: Don't crash if remote side uses one QP for both directions
  RDMA/cxgb3: Support for new abort logic
  RDMA/cxgb3: Initialize cpu_idx field in cpl_close_listserv_req message
  RDMA/cxgb3: Fail qp creation if the requested max_inline is too large
  RDMA/cxgb3: Fix TERM codes
  IPoIB/cm: Fix error handling in ipoib_cm_dev_open()
  IB/ipath: Don't corrupt pending mmap list when unmapped objects are freed
  IB/mthca: Work around kernel QP starvation
  IB/ipath: Don't put QP in timeout queue if waiting to send
  IB/ipath: Don't call spin_lock_irq() from interrupt context
2007-05-07 12:18:21 -07:00
Roland Dreier
ed23a72778 IB: Return "maybe missed event" hint from ib_req_notify_cq()
The semantics defined by the InfiniBand specification say that
completion events are only generated when a completions is added to a
completion queue (CQ) after completion notification is requested.  In
other words, this means that the following race is possible:

	while (CQ is not empty)
		ib_poll_cq(CQ);
	// new completion is added after while loop is exited
	ib_req_notify_cq(CQ);
	// no event is generated for the existing completion

To close this race, the IB spec recommends doing another poll of the
CQ after requesting notification.

However, it is not always possible to arrange code this way (for
example, we have found that NAPI for IPoIB cannot poll after
requesting notification).  Also, some hardware (eg Mellanox HCAs)
actually will generate an event for completions added before the call
to ib_req_notify_cq() -- which is allowed by the spec, since there's
no way for any upper-layer consumer to know exactly when a completion
was really added -- so the extra poll of the CQ is just a waste.

Motivated by this, we add a new flag "IB_CQ_REPORT_MISSED_EVENTS" for
ib_req_notify_cq() so that it can return a hint about whether the a
completion may have been added before the request for notification.
The return value of ib_req_notify_cq() is extended so:

	 < 0	means an error occurred while requesting notification
	== 0	means notification was requested successfully, and if
		IB_CQ_REPORT_MISSED_EVENTS was passed in, then no
		events were missed and it is safe to wait for another
		event.
	 > 0	is only returned if IB_CQ_REPORT_MISSED_EVENTS was
		passed in.  It means that the consumer must poll the
		CQ again to make sure it is empty to avoid the race
		described above.

We add a flag to enable this behavior rather than turning it on
unconditionally, because checking for missed events may incur
significant overhead for some low-level drivers, and consumers that
don't care about the results of this test shouldn't be forced to pay
for the test.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-06 21:18:11 -07:00
Michael S. Tsirkin
f4fd0b224d IB: Add CQ comp_vector support
Add a num_comp_vectors member to struct ib_device and extend
ib_create_cq() to pass in a comp_vector parameter -- this parallels
the userspace libibverbs API.  Update all hardware drivers to set
num_comp_vectors to 1 and have all ULPs pass 0 for the comp_vector
value.  Pass the value of num_comp_vectors to userspace rather than
hard-coding a value of 1.

We want multiple CQ event vector support (via MSI-X or similar for
adapters that can generate multiple interrupts), but it's not clear
how many vectors we want, or how we want to deal with policy issues
such as how to decide which vector to use or how to set up interrupt
affinity.  This patch is useful for experimenting, since no core
changes will be necessary when updating a driver to support multiple
vectors, and we know that we want to make at least these changes
anyway.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-06 21:18:11 -07:00
Ralph Campbell
154257f362 IB/ipath: Fix a race condition when generating ACKs
Fix a problem where simple ACKs can be sent ahead of RDMA read
responses thus implicitly NAKing the RDMA read.

Signed-off-by: Ralph Campbell <ralph.cambpell@qlogic.com>
Signed-off-by: Robert Walsh <robert.walsh@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-06 21:18:11 -07:00
Ralph Campbell
6ed89b9574 IB/ipath: Fix two more spin lock problems
Fix a missing unlock in ipath_rc_rcv_resp() and remove an extra unlock
from ipath_rc_rcv_error().

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-06 21:18:11 -07:00
Steve Wise
aff9e39d97 RDMA/cxgb3: Support for new abort logic
The HW now posts 2 ABORT_RPL and/or PEER_ABORT_REQ messages.  We need
to handle them by silenty dropping the 1st but mark that we're ready
for the final message.  This plugs some close races between the uP and
HW.  Also update the minimum required firmware version.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-06 21:18:08 -07:00
Jean Delvare
6473d160b4 PCI: Cleanup the includes of <linux/pci.h>
I noticed that many source files include <linux/pci.h> while they do
not appear to need it. Here is an attempt to clean it all up.

In order to find all possibly affected files, I searched for all
files including <linux/pci.h> but without any other occurence of "pci"
or "PCI". I removed the include statement from all of these, then I
compiled an allmodconfig kernel on both i386 and x86_64 and fixed the
false positives manually.

My tests covered 66% of the affected files, so there could be false
positives remaining. Untested files are:

arch/alpha/kernel/err_common.c
arch/alpha/kernel/err_ev6.c
arch/alpha/kernel/err_ev7.c
arch/ia64/sn/kernel/huberror.c
arch/ia64/sn/kernel/xpnet.c
arch/m68knommu/kernel/dma.c
arch/mips/lib/iomap.c
arch/powerpc/platforms/pseries/ras.c
arch/ppc/8260_io/enet.c
arch/ppc/8260_io/fcc_enet.c
arch/ppc/8xx_io/enet.c
arch/ppc/syslib/ppc4xx_sgdma.c
arch/sh64/mach-cayman/iomap.c
arch/xtensa/kernel/xtensa_ksyms.c
arch/xtensa/platform-iss/setup.c
drivers/i2c/busses/i2c-at91.c
drivers/i2c/busses/i2c-mpc.c
drivers/media/video/saa711x.c
drivers/misc/hdpuftrs/hdpu_cpustate.c
drivers/misc/hdpuftrs/hdpu_nexus.c
drivers/net/au1000_eth.c
drivers/net/fec_8xx/fec_main.c
drivers/net/fec_8xx/fec_mii.c
drivers/net/fs_enet/fs_enet-main.c
drivers/net/fs_enet/mac-fcc.c
drivers/net/fs_enet/mac-fec.c
drivers/net/fs_enet/mac-scc.c
drivers/net/fs_enet/mii-bitbang.c
drivers/net/fs_enet/mii-fec.c
drivers/net/ibm_emac/ibm_emac_core.c
drivers/net/lasi_82596.c
drivers/parisc/hppb.c
drivers/sbus/sbus.c
drivers/video/g364fb.c
drivers/video/platinumfb.c
drivers/video/stifb.c
drivers/video/valkyriefb.c
include/asm-arm/arch-ixp4xx/dma.h
sound/oss/au1550_ac97.c

I would welcome test reports for these files. I am fine with removing
the untested files from the patch if the general opinion is that these
changes aren't safe. The tested part would still be nice to have.

Note that this patch depends on another header fixup patch I submitted
to LKML yesterday:
  [PATCH] scatterlist.h needs types.h
  http://lkml.org/lkml/2007/3/01/141

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:35 -07:00
Stephen Rothwell
40cd3a4564 [POWERPC] Rename get_property to of_get_property: drivers
These are all the remaining instances of get_property.  Simple rename of
get_property to of_get_property.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-02 20:04:32 +10:00
Steve Wise
60be4b5966 RDMA/cxgb3: Initialize cpu_idx field in cpl_close_listserv_req message
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-30 17:30:28 -07:00
Steve Wise
1860cdf802 RDMA/cxgb3: Fail qp creation if the requested max_inline is too large
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-30 17:30:28 -07:00
Steve Wise
4a97d47ef7 RDMA/cxgb3: Fix TERM codes
Fix TERMINATE layer, type, and ecode values based on
conformance testing.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-30 17:30:28 -07:00
Robert Walsh
6b66b2da1e IB/ipath: Don't corrupt pending mmap list when unmapped objects are freed
Fix the pending mmap code so it doesn't corrupt the list of pending
mmaps and crash the machine when pending mmaps are destroyed without
first being mapped.  Also, remove an unused variable, and use standard
kernel lists instead of our own homebrewed linked list implementation
to keep the pending mmap list.

Signed-off-by: Robert Walsh <robert.walsh@qlogic.com>
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-30 17:30:28 -07:00
Michael S. Tsirkin
9ba6d5529d IB/mthca: Work around kernel QP starvation
With mthca, RC QPs can starve each other and even UD QPs on the same
hardware schedule queue.  As a result, userspace MPI can starve
e.g. IPoIB traffic, with netdev watchdog warnings getting printed out,
and TCP connections getting stuck or failing.

Reduce the chance of this happening by using three separate hardware
schedule queues: one for userspace RC QPs, one for kernel RC QPs, and
one for all other QPs.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-30 17:30:28 -07:00
Ralph Campbell
c3af664adb IB/ipath: Don't put QP in timeout queue if waiting to send
This fixes a problem which causes too many RC timeouts and
retransmits.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-30 17:30:28 -07:00
Ralph Campbell
35ff032e65 IB/ipath: Don't call spin_lock_irq() from interrupt context
This patch fixes the problem reported by Bernd Schubert <bs@q-leap.de>
with kernel debug options enabled:

    BUG: at kernel/lockdep.c:1860 trace_hardirqs_on()

This was caused by using spin_lock_irq()/spin_unlock_irq() from
interrupt context.  Fix all the places that might be called from
interrupts to use spin_lock_irqsave()/spin_unlock_irqrestore().

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-30 17:30:27 -07:00
Paul Mackerras
49e1900d4c Merge branch 'linux-2.6' into for-2.6.22 2007-04-30 12:38:01 +10:00
Linus Torvalds
afc2e82c08 Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband: (49 commits)
  IB: Set class_dev->dev in core for nice device symlink
  IB/ehca: Implement modify_port
  IB/umad: Clarify documentation of transaction ID
  IPoIB/cm: spin_lock_irqsave() -> spin_lock_irq() replacements
  IB/mad: Change SMI to use enums rather than magic return codes
  IB/umad: Implement GRH handling for sent/received MADs
  IB/ipoib: Use ib_init_ah_from_path to initialize ah_attr
  IB/sa: Set src_path_bits correctly in ib_init_ah_from_path()
  IB/ucm: Simplify ib_ucm_event()
  RDMA/ucma: Simplify ucma_get_event()
  IB/mthca: Simplify CQ cleaning in mthca_free_qp()
  IB/mthca: Fix mthca_write_mtt() on HCAs with hidden memory
  IB/mthca: Update HCA firmware revisions
  IB/ipath: Fix WC format drift between user and kernel space
  IB/ipath: Check that a UD work request's address handle is valid
  IB/ipath: Remove duplicate stuff from ipath_verbs.h
  IB/ipath: Check reserved memory keys
  IB/ipath: Fix unit selection when all CPU affinity bits set
  IB/ipath: Don't allow QPs 0 and 1 to be opened multiple times
  IB/ipath: Disable IB link earlier in shutdown sequence
  ...
2007-04-27 09:39:27 -07:00
Paul Mackerras
a48141db68 Revert "[POWERPC] Rename get_property to of_get_property: drivers"
This reverts commit d05c7a80cf,
which included changes which should go via other subsystem
maintainers.
2007-04-26 22:24:31 +10:00
Arnaldo Carvalho de Melo
d626f62b11 [SK_BUFF]: Introduce skb_copy_from_linear_data{_offset}
To clearly state the intent of copying from linear sk_buffs, _offset being a
overly long variant but interesting for the sake of saving some bytes.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2007-04-25 22:28:23 -07:00
Arnaldo Carvalho de Melo
4305b54135 [SK_BUFF]: Convert skb->end to sk_buff_data_t
Now to convert the last one, skb->data, that will allow many simplifications
and removal of some of the offset helpers.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:29 -07:00
Arnaldo Carvalho de Melo
27a884dc3c [SK_BUFF]: Convert skb->tail to sk_buff_data_t
So that it is also an offset from skb->head, reduces its size from 8 to 4 bytes
on 64bit architectures, allowing us to combine the 4 bytes hole left by the
layer headers conversion, reducing struct sk_buff size to 256 bytes, i.e. 4
64byte cachelines, and since the sk_buff slab cache is SLAB_HWCACHE_ALIGN...
:-)

Many calculations that previously required that skb->{transport,network,
mac}_header be first converted to a pointer now can be done directly, being
meaningful as offsets or pointers.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:28 -07:00
Arnaldo Carvalho de Melo
badff6d01a [SK_BUFF]: Introduce skb_reset_transport_header(skb)
For the common, open coded 'skb->h.raw = skb->data' operation, so that we can
later turn skb->h.raw into a offset, reducing the size of struct sk_buff in
64bit land while possibly keeping it as a pointer on 32bit.

This one touches just the most simple cases:

skb->h.raw = skb->data;
skb->h.raw = {skb_push|[__]skb_pull}()

The next ones will handle the slightly more "complex" cases.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:25:15 -07:00
Arnaldo Carvalho de Melo
4c13eb6657 [ETH]: Make eth_type_trans set skb->dev like the other *_type_trans
One less thing for drivers writers to worry about.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:24:30 -07:00
Joachim Fenkes
1912ffbb88 IB: Set class_dev->dev in core for nice device symlink
All RDMA drivers except ehca set class_dev->dev to their dma_device
value (ehca leaves this unset).  dma_device is the only value that
makes any sense, so move this assignment to core/sysfs.c.  This reduce
the duplicated code in the rest of the drivers and gives ehca a nice
/sys/class/infiniband/ehcaX/device symlink.

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-24 21:30:38 -07:00
Joachim Fenkes
c4ed790dfd IB/ehca: Implement modify_port
Add "Modify Port" verb support to eHCA driver.  The IB communication
manager needs this to set the IsCM port capability bit when
initializing.

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-24 21:30:38 -07:00
Roland Dreier
30c00986f3 IB/mthca: Simplify CQ cleaning in mthca_free_qp()
mthca_free_qp() already has local variables to hold the QP's send_cq
and recv_cq, so we can slightly clean up the calls to mthca_cq_clean()
by using those local variables instead of expressions like
to_mcq(qp->ibqp.send_cq).

Also, by cleaning the recv_cq first, we can avoid worrying about
whether the QP is attached to an SRQ for the second call, because we
would only clean send_cq if send_cq is not equal to recv_cq, and that
means send_cq cannot have any receive completions from the QP being
destroyed.

All this work even improves the generated code a bit:

add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-5 (-5)
function                                     old     new   delta
mthca_free_qp                                510     505      -5

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-24 16:31:11 -07:00
Roland Dreier
532c3b5817 IB/mthca: Fix mthca_write_mtt() on HCAs with hidden memory
Commit b2875d4c ("IB/mthca: Always fill MTTs from CPU") causes a crash
in mthca_write_mtt() with non-memfree HCAs that have their memory
hidden (that is, have only two PCI BARs instead of having a third BAR
that allows access to the RAM attached to the HCA) on 64-bit
architectures.  This is because the commit just before, c20e20ab
("IB/mthca: Merge MR and FMR space on 64-bit systems") makes
dev->mr_table.fmr_mtt_buddy equal to &dev->mr_table.mtt_buddy and
hence mthca_write_mtt() tries to write directly into the HCA's MTT
table.  However, since that table is in the HCA's memory, this is
impossible without the PCI BAR that gives access to that memory.

This causes a crash because mthca_tavor_write_mtt_seg() basically
tries to dereference some offset of a NULL pointer.  Fix this by
adding a test of MTHCA_FLAG_FMR in mthca_write_mtt() so that we always
use the WRITE_MTT firmware command rather than writing directly if
FMRs are not enabled.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-24 16:31:04 -07:00
Roland Dreier
3f114853d4 IB/mthca: Update HCA firmware revisions
Update the driver's list of current firmware versions with Mellanox's
latest releases.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:21:02 -07:00
Robert Walsh
40b90430ec IB/ipath: Fix WC format drift between user and kernel space
The kernel ib_wc structure now uses a QP pointer, but the user space
equivalent uses a QP number instead.  This means we can no longer use
a simple structure copy to copy stuff into user space.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:21:01 -07:00
Robert Walsh
6ce73b07db IB/ipath: Check that a UD work request's address handle is valid
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:21:00 -07:00
Robert Walsh
0d6172a428 IB/ipath: Remove duplicate stuff from ipath_verbs.h
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:21:00 -07:00
Robert Walsh
253fb39020 IB/ipath: Check reserved memory keys
Don't let userspace use the direct-physical-map L_key or R_key.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:21:00 -07:00
Bryan O'Sullivan
f0810daf74 IB/ipath: Fix unit selection when all CPU affinity bits set
At some point things changed so that all the affinity bits can be set,
but cpus_full() macro is not true.  This caused problems with the unit
selection logic on multi-unit (board) configurations.

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:59 -07:00
Bryan O'Sullivan
662af5813b IB/ipath: Don't allow QPs 0 and 1 to be opened multiple times
Signed-off-by: Robert Walsh <robert.walsh@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:59 -07:00
Bryan O'Sullivan
53c1d2c943 IB/ipath: Disable IB link earlier in shutdown sequence
Move the code that shuts down the IB link earlier in the unload
process, to be sure no new packets can arrive while we are unloading.

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:59 -07:00
Bryan O'Sullivan
490462c268 IB/ipath: Prevent random program use of diags interface
To prevent random utility reads and writes of the diag interface to the
chip, we first require a handshake of reading from offset 0 and writing
to offset 0 before any other reads or writes can be done through the
diags device.   Otherwise chip errors can be triggered.

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:59 -07:00
Bryan O'Sullivan
f5408ac7cc IB/ipath: On unrecoverable errors, force link down, LEDs off
If the chip is no longer usable, LEDs should be turned off so system
can be found easily in the cluster.

Also some minor reorganizing so both chips print hardware error
message at same point and only if there were unrecovered errors

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:59 -07:00
Michael Albaugh
27b044a815 IB/ipath: Fix driver crash (in interrupt or during unload) after chip reset
Re-init of the kernel structures after a chip reset was leaving the
portdata structure for port zero in an inconsistent state, and a
pointer to it either stale (in re-init code) or NULL (in devdata)
Fixing the order of operations on this struct, and the condition for
interrupt access, prevents the crashes.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:58 -07:00
Bryan O'Sullivan
9783ab4058 IB/ipath: Improve handling and reporting of parity errors
Mostly cleanup.

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:58 -07:00
Bryan O'Sullivan
820054b7ca IB/ipath: Print better error messages if kernel is misconfigured
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:58 -07:00
Arthur Jones
569b87b47f IB/ipath: Force PIOAvail update entry point
Due to a chip bug, the PIOAvail register is not always updated to
memory.  This patch allows userspace to force an update.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:58 -07:00
Arthur Jones
7b196e2ff3 IB/ipath: Call free_irq() on chip specific initialization failure
In initialization, if we bailed at chip specific initialization, we
forgot to clean up the irq we had requested.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:58 -07:00
Bryan O'Sullivan
5a7d4eea91 IB/ipath: Discard multicast packets without a GRH
This patch fixes a bug where multicast packets without a GRH were not
being dropped as per the IB spec.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:57 -07:00
Bryan O'Sullivan
0ed3c594e3 IB/ipath: Fix calculation for number of kernel PIO buffers
If the module parameter "kpiobufs" is set too high, the calculation to
reset it to a sane value was incorrect.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:57 -07:00
Bryan O'Sullivan
c8c6f5d496 IB/ipath: Remove unused ipath_read_kreg64_port()
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:57 -07:00
Ralph Campbell
dd5190b6be IB/ipath: Fix RDMA reads of length zero and error handling
Fix RDMA read response length checking for RDMA_READ_RESPONSE_ONLY to
allow a zero length response.  RDMA read responses which don't match
the expected length or occur in response to some other operation
should generate a completion queue error (see table 56, ch. 9.9.2.3 in
the IB spec).

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:57 -07:00
Mark Debbage
c7e29ff11f IB/ipath: Allow receive ports mapped into userspace to be shared
Improve port-sharing performance by allowing any process to receive
packets from the shared hardware port under a spin lock for mutual
exclusion. Previously, one process was nominated as the master and
that process was responsible for receiving all packets from the shared
hardware port and either consuming them or forwarding them to their
destination. This led to starvation problems for other processes when
the master process was busy in computation phases.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:57 -07:00
Ralph Campbell
0a5a83cffc IB/ipath: Fix port sharing on powerpc
The port sharing feature mixed kernel virtual addresses as well as
physical addresses for the offset used to describe the mmap address to
map the InfiniPath hardware into user space.  This had a conflict on
powerpc.  The new scheme converts it to a physical address so it
doesn't conflict with chip addresses and yet still fits in 40/44 bits
so it isn't truncated by 32-bit applications calling mmap64().

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:56 -07:00
Bryan O'Sullivan
041eab9136 IB/ipath: Fix CQ flushing when QP is modified to error state
If a receive work request has been removed from the queue but has not
had a CQ entry generated for it and the QP is modified to the error
state, the completion entry generated is incorrect.  This patch fixes
the problem.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:56 -07:00
Bryan O'Sullivan
614d49a21e IB/ipath: Fix bad argument to clear_bit()
Code was converted from a &= ~mask to clear_bit, but the bit was left
shifted instead of being used directly, so we were either trashing
memory several pages away, or sometimes taking a kernel page fault on
an invalid page.

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:56 -07:00
Bryan O'Sullivan
8ec1077b35 IB/ipath: Change packet problems vs chip errors handling and reporting
Some types of packet errors are moderately common with longer IB
cables and large clusters, and are not reported with prints by other
IB HCA drivers.  This suppresses those messages unless the new
__IPATH_ERRPKTDBG bit is set in ipath_debug.  Reporting of temporarily
disabled frequent error interrupts was also made clearer

We also distinguish between chip errors, and bad packets sent or
received in the wording of the messages.

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:55 -07:00
Ralph Campbell
6f5c407460 IB/ipath: Fix PSN update for RC retries
This patch fixes a number of bugs with updating the PSN for retries of
RC requests.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:55 -07:00
Ralph Campbell
0434d271fd IB/ipath: Fix QP error completion queue entries
When switching to the QP error state, the completion queue entries
(error or flush) were not being generated correctly.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:55 -07:00
Bryan O'Sullivan
39c0d0b919 IB/ipath: Fix up some debug messages
ipath_dbg doesn't need the same prefixes that printk does.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:55 -07:00
Ralph Campbell
3859e39d75 IB/ipath: Support larger IB_QP_MAX_DEST_RD_ATOMIC and IB_QP_MAX_QP_RD_ATOMIC
This patch adds support for multiple RDMA reads and atomics to be sent
before an ACK is required to be seen by the requester.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:55 -07:00
Ralph Campbell
7b21d26dda IB/ipath: NMI cpu lockup if local loopback used
If a post send is done in loopback and there is no receive queue
entry, the sending QP is put on a timeout list for a while so the
receiver has a chance to post a receive buffer. If the another post
send is done, the code incorrectly tried to put the QP on the timeout
list again an corrupted the timeout list. This eventually leads to a
spin lock deadlock NMI due to the timer function looping forever with
the lock held.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:54 -07:00
Ralph Campbell
9f9630d5e1 IB/ipath: Fix SRQ limit event causing dropped CQ entry
A silly programming error causes a CQ entry to not be generated if a
SRQ limit event is generated.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:54 -07:00
Ralph Campbell
947d7617a1 IB/ipath: Don't initialize port memory for subports
A recent change was made to allocate memory for a port after CPU
affinity is set. That change didn't account for subports and was
trying to allocate memory for the port twice.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:54 -07:00
Bryan O'Sullivan
1908574559 IB/ipath: Definitions of two RXE parity err bits were reversed
The chip documentation on the expected TID vs eager TID parity error
bits was reversed from what was implemented in the RTL, for both
chips.  This corrects the definitions.

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:54 -07:00
Bryan O'Sullivan
165c552c35 IB/ipath: Fix user memory region creation when IOMMU present
The loop which initializes the user memory region from an array of
pages was using the wrong limit for the array.  This worked OK when
dma_map_sg() returned the same number as the number of pages.  This
patch fixes the problem.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:54 -07:00
Bryan O'Sullivan
946db67fbf IB/ipath: Add ability to set and clear IB local loopback
This is a sticky state.  It is useful for diagnosing problems with
boards versus cable/switch problems.

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:53 -07:00
Michael S. Tsirkin
608d8268be IB/mthca: Fix data corruption after FMR unmap on Sinai
In mthca_arbel_fmr_unmap(), the high bits of the key are masked off.
This gets rid of the effect of adjust_key(), which makes sure that
bits 3 and 23 of the key are equal when the Sinai throughput
optimization is enabled, and so it may happen that an FMR will end up
with bits 3 and 23 in the key being different.  This causes data
corruption, because when enabling the throughput optimization, the
driver promises the HCA firmware that bits 3 and 23 of all memory keys
will always be equal.

Fix by re-applying adjust_key() after masking the key.

Thanks to Or Gerlitz for reproducing the problem, and Ariel Shahar for
help in debug.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-16 14:10:55 -07:00
Stephen Rothwell
d05c7a80cf [POWERPC] Rename get_property to of_get_property: drivers
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-04-13 03:55:19 +10:00
Stephen Rothwell
a7edd0e676 [POWERPC] get_property returns const
This just tidies up some of the remains.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-04-13 03:55:17 +10:00
Steve Wise
1ca19770c5 RDMA/cxgb3: Add set_tcb_rpl_handler
As of commit 6cdbd77e ("cxgb3 - missing CPL hanler and register
setting."), the cxgb3 ethernet NIC driver no longer handles SET_TCB
replies, so we need to do it in the iWARP driver.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Acked-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-12 10:37:11 -07:00
Michael S. Tsirkin
0264d88531 IB/mthca: Fix thinko in init_mr_table()
Commit c20e20ab ("IB/mthca: Merge MR and FMR space on 64-bit systems")
swapped the number of MTTs and MPTs when initializing the MR table. As
a result, we get a kernel oops when the number of MTT segments
allocated exceeds 0x20000.

Noted by Troy Benjegerdes <troy@scl.ameslab.gov>, and reproduced by
Dotan Barak <dotanb@mellanox.co.il>.  This fixes
https://bugs.openfabrics.org/show_bug.cgi?id=490

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-26 15:59:32 -07:00
Steve Wise
ed6ee5178e RDMA/cxgb3: Fix resource leak in cxio_hal_init_ctrl_qp()
This was spotted by the Coverity checker (CID 1554).

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-26 15:54:40 -07:00
Joachim Fenkes
73b9e9870f IB/ehca: Make scaling code work without CPU hotplug
eHCA scaling code must not depend on register_cpu_notifier() if
CONFIG_HOTPLUG_CPU is not set, so put all related code into #ifdefs.

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-22 14:40:16 -07:00
Steve Wise
d601347188 RDMA/cxgb3: Handle build_phys_page_list() failure in iwch_reregister_phys_mem()
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-22 14:40:16 -07:00
Bryan O'Sullivan
fae8773b73 IB/ipath: Check return value of lookup_one_len
This fixes kernel.org bug 8003.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-22 14:40:15 -07:00
Al Viro
62577fa324 [PATCH] fix ipath_dma_free_coherent() prototype
method gets u64, not dma_addr_t

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-14 15:27:49 -07:00
Steve Wise
e64518f373 RDMA/cxgb3: Fix MR permission problems
Fix memory region permission problems:

- remove useless and redundant iwch_mem_perms enum.

- create ib_to_tpt_access_rights() for mapping ib access rights
  to T3 TPT permissions.

- create ib_to_mwbind_access_rights() for mapping ib access rights
  to T3 MWBIND WR permissions.

- fix up the mem reg code to utilize the new functions.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-06 12:51:02 -08:00
Steve Wise
1f6a849b7c RDMA/cxgb3: Don't reuse skbs that are non-linear or cloned
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-06 12:50:57 -08:00
Steve Wise
8cfccf02bb RDMA/cxgb3: Squelch logging AE errors
Only print one AE error for a given connection in the kernel log.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-06 12:50:53 -08:00
Steve Wise
adf376b370 RDMA/cxgb3: Stop EP timer when MPA exchange is aborted by peer
Stop the endpoint timer when the MPA exchange is aborted by the peer.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-06 12:50:49 -08:00
Steve Wise
2df50da00e RDMA/cxgb3: Move QP to error on destroy if the state is IDLE
Change iwch_destroy_qp() to always move the QP to ERROR and let
iwch_modify_qp() decide what to do.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-06 12:50:45 -08:00
Steve Wise
42e3175354 RDMA/cxgb3: Fixes for "normal close" failures
Fixes for "normal close" failures:

- Start normal close timer when moving to CLOSING state.
- Handle ABORTING state in close_con_rpl().
- Stop timer correctly on abort during a normal close.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-06 12:50:37 -08:00
David Miller
c3bb1092c8 RDMA/cxgb3: Fix build on sparc64
cxgb3 uses dma_alloc_coherent() et al. thus needs linux/dma-mapping.h
include in order to build reliably.

Noticed on sparc64.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-06 12:45:57 -08:00
Steve Wise
aeb100e246 RDMA/cxgb3: Don't use mm after it's freed in iwch_mmap()
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-06 11:48:17 -08:00