Incorrect dma mask was used for blinkled (firmware assert) recovery or
user initiated reset during initialization portion. Ensure that all
callers of aac_fib_map_free null out the fib allocation references to
prevent multiple free. Although serious sounding, no reports of these
problems have surfaced...
Signed-off-by: Mark Salyzyn <aacraid@adaptec.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Add the ability for an application to issue a hardware reset to the
adapter via sysfs. Typical uses include restarting the adapter after it
has been flashed. Bumped revision number for the driver and added a
feature to periodically check the adapter's health (check_interval),
update the adapter's concept of time (update_interval) and block
checking/resetting of the adapter (check_reset).
Signed-off-by: Mark Salyzyn <aacraid@adaptec.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Inspired somewhat by Vignesh Babu <vignesh.babu@wipro.com> patch to
dpt_i2o.c to replace kmalloc/memset sequences with kzalloc, doing the
same for the aacraid driver.
Signed-off-by: Mark Salyzyn <aacraid@adaptec.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Add some likely() and unlikely() compiler hints in some of the aacraid
hardware interface layers. There should be no operational side effects
resulting from this patch and the changes should be mostly benign on x86
platforms.
Signed-off-by: Mark Salyzyn <aacraid@adaptec.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn,
This set of fixes improve error handling stability of the driver. A popular
manifestation of the problems is an NULL pointer reference in the interrupt
handler when referencing portions of the scsi command context, or in the
scsi_done handling when an offlined device is referenced.
The aacraid driver currently does not get notification of orphaned command
completions due to devices going offline. The driver also fails to handle the
commands that are finished by the error handler, and thus can complete again
later at the hands of the adapter causing situations of completion of an
invalid scsi command context. Test Unit Ready calls abort assuming that the
abort was successful, but are not, and thus when the interrupt from the adapter
occurs, they reference invalid command contexts. We add in a TIMED_OUT flag to
inform the aacraid FIB context that the interrupt service should merely release
the driver resources and not complete the command up. We take advantage of this
with the abort handler as well for select abortable commands. And we detect and
react if a command that can not be aborted is currently still outstanding to
the controller when reissued by the retry mechanism.
Signed-off-by: Mark Haverkamp <markh@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn,
Outstanding ioctl calls still have some problems with aborting cleanly
in the face of a reset iop recovery action should the adapter ever enter
into a Firmware Assert (BlinkLED) condition. The enclosed patch resolves
some uncovered flawed handling.
Signed-off-by: Mark Haverkamp <markh@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn,
This patch is to resolve a namespace issue that will result from a patch
expected in the future that adds a new interface; rationalized as
correcting a long term issue where hw_fib, instead of hw_fib_va, refers
to the virtual address space and hw_fib_pa refers to the physical
address space. A small fragment of this patch also cleans up an unused
variable that was close to the patch fragments.
Signed-off-by: Mark Haverkamp <markh@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn,
This patch updates the adapter restart function to deal with some
adapters that have specific IOP reset needs. Since the code for
restarting the adapter was in two places, changed over to utilizing a
platform function in one place.
Signed-off-by: Mark Haverkamp <markh@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn,
Replace all if/else communication transports with a platform function call.
This is in recognition of the need to migrate to up-and-coming transports.
Currently the Linux driver does not support two available communication
transports provided by our products, these will be added in future patches, and
will expand the platform function set.
Signed-off-by Mark Haverkamp <markh@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn:
Add code to abort outstanding management ioctl fibs when the blinkLED recovery
is performed. This code is 'clunky' and does not have any real feedback in that
the reset could progress before the user application has gotten it's
notification of command completion. We put a schedule() call to delay just the
right amount for most cases, because we tried a spin and still managed to find
cases where we would spin forever waiting for the management application to
acknowledge the impending doom surrounding the cause of the BlinkLED. Will
cause an oops in the context of the management application if we proceed too
quickly. I view this as the lesser of many evils since currently if there are
outstanding management ioctls during a need to reset/recover the adapter, the
management application just locks up and waits forever. The best practices fix
for this problem not going to be simple or easy (at least the fixes I imagine
today); and we found a balance between the needs of the driver to proceed, and
the applications that locked or confused that would hold back the driver. I
just do not like the idea of a kernel oops in an application to deal with low
priority, sluggish or misbehaving applications.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn:
Blinkled at startup is useful for catching Adapters in a lot of pain, in a
BlinkLED assert, quickly; rather than waiting several minutes for commands to
timeout.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn:
Until the system is stabilized, I am suggesting the enclosed
modification to prevent the driver from tickling the panic. Once sysfs
and friends are stabilized, the patch may be backed out. We have yet to
evaluate if we really want to relinquish existing Scsi Devices in any
case, holding on to them as configuration of arrays comes and goes makes
some sense as well. As a result, we have opted to pull the lines rather
than comment them in legacy.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn:
Basically cleanup, nothing here will have an affect. Adjusting some
error codes, removing superfluous definitions and code fragments.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn
If the adapter is in blinkled (Firmware Assert) when error recovery
timeout actions have been triggered, perform an adapter warm reset and
restart the initialization.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn
This patch allows the FSACTL_SEND_LARGE_FIB, FSACTL_SENDFIB and
FSACTL_SEND_RAW_SRB ioctl calls into the aacraid driver to be
interruptible. Only necessary if the adapter and/or the management
software has gone into some sort of misbehavior and the system is being
rebooted, thus permitting the user management software applications to
be killed relatively cleanly. The FIB queue resource is held out of the
free queue until the adapter finally, if ever, completes the command.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received From Mark Salyzyn
The queue tracking is just not being used, not even for debugging. Information
about outstanding commands can be acquired from the scsi structures.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received From Mark Salyzyn
Add the ability to adjust for unusual corner case failures. Both of
these additional module parameters deal with embedded, non-intel or
complicated system scenarios.
Aif_timeout can be increased past the default 2 minute timeout to drop
application registrations when a system has an unusually high event load
resulting from continuing management requests, or simultaneous builds,
or sluggish user space as a result of system load.
Startup_timeout can be increased past the default 3 minute timeout to
drop an adapter initialization for systems that have a very large number
of targets, or slow to spin-up targets, or a complicated set of array
configurations that extend the time for the firmware to declare that it
is operational. This timeout would only have an affect on non-intel
based systems, as the (more patient) BIOS would generally be where the
startup delay would be dealt with.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn
Remove superfluous code, optimize code, harden code, cast code, correct
some text, use msleep instead of schedule_timeout_interruptible. No
bugs.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn
Plug and play actions resulting from event sequences shall time out if
they take longer than 30 seconds to complete.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Use the kthread_ API instead of opencoding lots of hairy code for kernel
thread creation and teardown.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Salyzyn, Mark <mark_salyzyn@adaptec.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn,
Reduce the possibility of namespace collision. Prefix with aac_.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This is the drivers/scsi/ part of the big kfree cleanup patch.
Remove pointless checks for NULL prior to calling kfree() in drivers/scsi/.
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Acked-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Received from Mark Salyzyn.
This patch adds the 'new comm' interface, which modern AAC based
adapters that are less than a year old support in the name of much
improved performance. These modern adapters support both the legacy and
the 'new comm' interfaces.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn from Adaptec.
High Priority Queues have *never* been used in the entire history of the
aac based adapters. Associated with this, aac_insert_entry can be
removed, SavedIrql can be removed & padding variable can be removed.
With the movement of SavedIrql out & replaced with an automatic variable
qflags, the locking can be refined somewhat. The sparse warnings did not
catch the need for byte swapping in the 'dprintk' debugging print
macros, so fixed this up when this code was moved outside of the now
refined locking.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn from Adaptec.
In the rare instances where the adapter, or the motherboard, is
misbehaving; driver initialization or shutdown becomes problematic. By
introducing a 3 minute timeout on the first interrupt driven command
during initialization, or the issuance of the adapter shutdown command
during driver unload, we can resolve the lockup problems induced by
common (but rare) hardware misbehaviors.
The timeout during initialization, should it occur, is accompanied by a
message presented to the console and the logs indicating that the user
should inspect and resolve problems with interrupt routing.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn from Adaptec.
Hotplug sniffs the AIFs (events) from the adapter and if a container
change resulting in the device going offline (container zero), online
(container zero completed) or changing capacity (morph) it will take
actions by calling the appropriate API.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Recevied from Mark Salyzyn from Adaptec.
Aif pre-allocation is used to pull the kmalloc outside of the locks.
Applies to the scsi-misc-2.6 git tree.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn from Adaptec:
If more than two commands are outstanding to the controller, there is no
need to notify the adapter via a PCI bus transaction of additional
commands added into the queue; it will get to them when it works through
the produce/consumer indexes.
This reduced the PCI traffic in the driver to submit a command to the
queue to near zero allowing a significant number of commands to be
turned around with no need to block for the PCI bridge to flush the
notify request to the adapter.
Interrupt mitigation has always been present in the driver; it was
turned off because of a bug that prevented one from realizing the
usefulness of the feature. This bug is fixed in this patch.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
New code from the Adaptec driver. Performance enhancement for newer
adapters. I hope that this isn't too big for a single patch. I believe
that other than the few small cleanups mentioned, that the changes are
all related.
- Added Variable FIB size negotiation for new adapters.
- Added support to maximize scatter gather tables and thus permit
requests larger than 64KB/each.
- Limit Scatter Gather to 34 elements for ROMB platforms.
- aac_printf is only enabled with AAC_QUIRK_34SG
- Large FIB ioctl support
- some minor cleanup
Passes sparse check.
I have tested it on x86 and ppc64 machines.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch addresses the sparse -Wbitwise warnings that Christoph wanted
me to eliminate. This mostly consisted of making data structure
elements of hardware associated structures the __le* equivalent.
Although there were a couple places where there was mixing of cpu and le
variable math. These changes have been tested on both an x86 and ppc
machine running bonnie++. The usage of the LE32_ALL_ONES macro has been
eliminated.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch makes some needlessly global functions static.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!