* master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
[SCSI] sd: udev accessing an uninitialized scsi_disk field results in a crash
[SCSI] st: A MTIOCTOP/MTWEOF within the early warning will cause the file number to be incorrect
[SCSI] qla4xxx: bug fixes
[SCSI] Fix scsi_add_device() for async scanning
If an EH command times out today, the LLDD's abort handler
will be called to abort the command. It is assumed that this
completes successfully, which can result in the command getting
completed later resulting in an oops. Improve the current
implementation by escalating all the way to host reset if
necessary in order to clean up the EH command.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
In asd_initiate_ssp_tmf, the TMF result code is replaced with
TMF_RESP_FUNC_FAILED except when the TMF returns a result code immediately.
However, TMFs can return result codes via an ESCB... yet these codes are
also replaced with "FAILED". The only values that can fall into that case
are TMF_* codes anyway, so get rid of this code where COMPLETE and SUCCESS
are turned into FAILED. This also lets us propagate those TMF_* codes up
to the caller.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
After discussion with andmike and dougg, it seems that the purpose of
eh_device_reset_handler is to issue LU resets, and that
eh_bus_reset_handler would be a more appropriate place for a phy reset.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
libsas: Don't BUG when connecting two expanders via wide port
When a device is connected to an expander, the discovery process goes through
sas_ex_discover_dev to figure out what's attached to the phy. If it is the
case that the phy being discovered happens to be the second phy of a wide link
to an expander, that discover_dev function will incorrectly call
sas_ex_discover_expander, which creates another sas_port and tries to attach the
other sas_phys to the new port, thus triggering a BUG. The correct thing to do is
to check the other ex_phys of the expander to see if there's a sas_port for this
sas_phy, and attach the sas_phy to the existing sas_port.
This is easily triggered if one enables the phys of a wide port between
expanders one by one.
This second version of the patch fixes a small regression in the case where
all the phys show up at once and we accidentally try to attach to a port
that hasn't been created yet.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
On Thu, 1 Feb 2007, Andrew Morton wrote:
> On Thu, 1 Feb 2007 15:34:29 -0800
> bugme-daemon@bugzilla.kernel.org wrote:
>
> > http://bugzilla.kernel.org/show_bug.cgi?id=7919
> >
> > Summary: Tape dies if wrong block size used
> > Kernel Version: 2.6.20-rc5
> > Status: NEW
> > Severity: normal
> > Owner: scsi_drivers-other@kernel-bugs.osdl.org
> > Submitter: dmartin@sccd.ctc.edu
> >
> >
> > Most recent kernel where this bug did *NOT* occur: 2.6.17.14
> >
> > Other Kernels Tested and Results:
> >
> > OK 2.6.15.7
> > OK 2.6.16.37
> > OK 2.6.17.14
> > BAD 2.6.18.6
> > BAD 2.6.18-1.2869.fc6
> > BAD 2.6.19.2 +
> > BAD 2.6.20-rc5
> >
> > NOTE: 2.6.18-1.2869.fc6 is a Fedora modified kernel, all others are from kernel.org
> >
...
> > Steps to reproduce:
> > Get a Adaptec AHA-2940U/UW/D / AIC-7881U card and a tape drive,
> > install a recent kernel
> > set the tape block size - mt setblk 4096
> > read from or write to tape using wrong block size - tar -b 7 -cvf /dev/tape foo
> >
Write does not trigger this bug because, in fixed block mode, the driver
refuses writes that are not a multiple of the block size. Read does trigger
it on my system.
The bug is not associated with any specific HBA. st tries to do direct i/o
in fixed block mode with reads that are not a multiple of tape block size.
The patch in this message fixes the st problem by switching to using the
driver buffer up to the next close of the device file in fixed block mode
if the user asks for a read like this.
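
A minimal sketch of the decision described above, not the actual st.c
change. The struct and function names are hypothetical; the only behaviour
taken from the description is that a fixed block mode read which is not a
multiple of the block size switches to the driver buffer until the next
close.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-open state; the real st driver keeps its own flags. */
struct tape_state {
    size_t block_size;      /* 0 means variable block mode */
    bool   use_driver_buf;  /* sticky until the device file is closed */
};

/* Returns true when this read may still use direct I/O. */
static bool read_can_use_direct_io(struct tape_state *st, size_t count)
{
    /* Fixed block mode and a request that is not a whole number of blocks */
    if (st->block_size != 0 && count % st->block_size != 0)
        st->use_driver_buf = true;   /* stay buffered until close */
    return !st->use_driver_buf;
}

/* On close the next open starts with direct I/O allowed again. */
static void tape_close(struct tape_state *st)
{
    st->use_driver_buf = false;
}
```

With a 4096 byte block size, the 3584 byte reads issued by tar -b 7 would
take the buffered path in this sketch.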
I don't know why the bug has surfaced only after 2.6.17 although the st
problem is old. There may be another bug in the block subsystem and this
patch works around it. However, the patch fixes a problem in st and in
this way it is a valid fix.
This patch may also fix the bug 7900.
The patch compiles and is lightly tested.
Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
sd_probe() calls class_device_add() before initializing the
sdkp->device variable. class_device_add() eventually causes the user mode
udev program to be called, and udev can read the allow_restart
attribute of the newly created scsi device. The show function for
allow_restart (i.e. sd_show_allow_restart) returns the attribute value by
reading sdkp->device->allow_restart. Because sdkp->device is not yet
initialized when the user mode hotplug helper is called, this results in a
crash.
The patch below solves it by calling class_device_add() only after the
necessary fields in the scsi_disk structure are initialized properly.
Signed-off-by: Nagendra Singh Tomar <nagendra_tomar@adaptec.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
1) If the device reports an uncorrectable MEDIUM ERROR, such
as SK MEDIUM ERROR, ASC UNRECOVERED READ ERR, AMNF DATA
FIELD or RECORD NOT FOUND, then: In scsi_check_sense()
return SUCCESS so as to not retry -- the error is
uncorrectable -- this speeds up total processing time.
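
A rough sketch of the check described above, assuming the usual SPC
values (sense key 0x03 for MEDIUM ERROR, ASC 0x11, 0x13 and 0x14 for the
three conditions named). The enum and function names are illustrative
only and other sense keys are deliberately ignored here.

```c
#include <stdint.h>

/* Illustrative verdicts; the kernel has its own SUCCESS/NEEDS_RETRY codes. */
enum sense_verdict { VERDICT_SUCCESS, VERDICT_NEEDS_RETRY };

#define SK_MEDIUM_ERROR 0x03 /* SPC sense key: MEDIUM ERROR */

static enum sense_verdict check_medium_error(uint8_t sense_key, uint8_t asc)
{
    if (sense_key != SK_MEDIUM_ERROR)
        return VERDICT_NEEDS_RETRY; /* simplification: other keys not modelled */

    switch (asc) {
    case 0x11: /* UNRECOVERED READ ERROR */
    case 0x13: /* AMNF DATA FIELD */
    case 0x14: /* RECORD NOT FOUND */
        return VERDICT_SUCCESS;     /* uncorrectable: retrying only wastes time */
    default:
        return VERDICT_NEEDS_RETRY;
    }
}
```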
Signed-off-by: Luben Tuikov <ltuikov@yahoo.com>
Extracted the MEDIUM ERROR piece and
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Since mailbox commands are executed in a synchronous
manner, there is no need to have a separate spinlock
primitive to protect data/register access shared by callers.
Signed-off-by: Seokmann Ju <seokmann.ju@qlogic.com>
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
As ISP24xx firmware can return a CS_DATA_UNDERRUN completion
status when the storage has returned a
SAM_STAT_TASK_SET_FULL scsi-status.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Non-ISP24xx cards must have a loop-id in order to query host
statistics.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Limit assignments via qla2x00_model_name[] array to HBA
subsystem vendor IDs equal to QLogic.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Previous work to add asynchronous-scsi-scanning support
(d19044c32b) caused peculiar
semantic changes when no cabling was attached to the HBA
whereby unneeded and intrusive 'error-handling' would take
place due to the initial link state being unset.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Similarly to previous LOGO requests on non-24xx hardware,
perform an implicit-LOGO so as to avoid the potential 2 *
R_A_TOV delay which can result during an explicit-LOGO
request.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This includes BIOS, EFI, FCODE and firmware versions.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
No restriction should be placed on the IRQ number assigned
to a given ISP. Original code incorrectly assumed a
non-zero IRQ number assignment by the system. In these
circumstances the proper freeing of the IRQ (via free_irq())
would not take place.
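
The problem can be illustrated with a small sketch. This is not the
driver's code: the struct, the irq_requested flag and the counter that
stands in for free_irq() are all assumptions made for the example. It
shows why testing the IRQ number itself skips the release when the system
assigned IRQ 0.

```c
#include <stdbool.h>

/* Hypothetical adapter state for illustration only. */
struct demo_hba {
    unsigned int irq;           /* may legitimately be 0 */
    bool         irq_requested; /* set once the IRQ request succeeded */
    int          free_calls;    /* counts stand-in free_irq() calls */
};

/* Pattern with the bad assumption: IRQ 0 is treated as "no IRQ". */
static void demo_free_irq_buggy(struct demo_hba *ha)
{
    if (ha->irq)
        ha->free_calls++;
}

/* One possible alternative: track the request, not the number. */
static void demo_free_irq_fixed(struct demo_hba *ha)
{
    if (ha->irq_requested) {
        ha->free_calls++;
        ha->irq_requested = false;
    }
}
```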
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
What DMA for 16bit pcmcia card, anyway? We never do request_dma()
there and ->dma_channel never changes since initialization to -1.
IOW, that call is dead code.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch removes a duplicate device id from the IPR driver. Based on
the ipr.h file, I'm not so sure this was intended to be a duplicate, and
if so, the .h file should be modified to use the proper sub-device id
instead.
This was pointed out to me by Kay Sievers <kay.sievers@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Set allow_restart=1 for all SAS disks so that they are spun up when needed.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Register libsas's default device reset code with the scsi.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch moves the code that handles SAS failures out of the main EH
function and into a separate function. It also detects commands that have
no sas_task (i.e. they completed, but with error data) and sends them into
scsi_error for processing. This allows us to handle SCSI errors (and
enables auto-spinup as a side effect) instead of dropping them on the
floor and falling into an infinite loop. It also requires the
implementation of a device reset function, which the SAS failure code has
been modified to employ for REQ_DEVICE_RESET.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Export a couple of functions from scsi_error that are needed to handle
failed SCSI commands from the SAS EH.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
make exports GPL and
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Get rid of: "warning: ignoring return value of sysfs_create_link..."
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
sas_rphy_delete does two things: it removes the sas_rphy from the transport
layer and frees the sas_rphy. This can be broken down into two functions,
sas_rphy_remove and sas_rphy_free; sas_rphy_remove is of interest to
sas_discover_root_expander because it calls functions that require
sas_rphy_add as a prerequisite and can fail (namely sas_discover_expander).
In that case, sas_discover_root_expander needs to be able to undo the effects
of sas_rphy_add yet leave the job of freeing the sas_rphy to the caller of
sas_discover_root_expander.
This patch also removes some unnecessary code from sas_discover_end_dev
to eliminate an unnecessary cycle of sas_notify_lldd_gone/found for SAS
devices, thus eliminating a sas_rphy_remove call (and fixing a race condition
where a SCSI target scan can come in between the gone and found call).
It also moves the sas_rphy_free calls into sas_discover_domain and
sas_ex_discover_end_dev to complement the sas_rphy_allocation via
sas_get_port_device.
This patch does not change the semantics of sas_rphy_delete.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Currently, sas_form_port checks the given asd_sas_phy's sas_phy to see if
there's already a port attached. If so, the SAS addresses of the port and
the phy are compared to determine if we need to detach from the port
because the addresses don't match or if we can stop; the SAS address stored
in the sas_port reflects whatever device _was_ attached to the port/phy, and
the SAS address stored in the phy reflects whatever device we just
discovered. As written, the code detaches from the port if the addresses
_do_ match, and prints an error if they do _not_ match. I believe this to
be incorrect, as it seems more logical to keep the port if the addresses
match (i.e. the phy was reset but the device didn't change), and detach if
they do not (i.e. the device changed).
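
The intended comparison can be sketched as follows. The helper name is
hypothetical and the 8 byte address size is an assumption about the SAS
address length; the real sas_form_port does more than this.

```c
#include <stdbool.h>
#include <string.h>

#define DEMO_SAS_ADDR_SIZE 8 /* assumed SAS address length in bytes */

/*
 * Returns true when the existing port should be kept (same device as
 * before, e.g. after a phy reset) and false when the phy should be
 * detached because a different device is now attached.
 */
static bool keep_existing_port(const unsigned char *port_addr,
                               const unsigned char *phy_addr)
{
    return memcmp(port_addr, phy_addr, DEMO_SAS_ADDR_SIZE) == 0;
}
```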
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn,
Take the expose_physicals flag and allow the user to select default (physicals
available via /dev/sg), exposed (physicals available via /dev/sd for
experimental reasons) and hidden (physicals blocked from all access). This
expands the functionality of the previous expose_physicals insmod parameter
which was added to support some experimental configurations.
Signed-off-by: Mark Haverkamp <markh@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn,
Replace all if/else packet formations with platform function calls. This is in
recognition of the proliferation of read and write packet types, and in the
need to migrate to up-and-coming packets for new products.
Signed-off-by: Mark Haverkamp <markh@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn,
Add in the NEMER/ARK physical register mapping, represented in up and coming
products currently under test at Adaptec.
Signed-off-by: Mark Haverkamp <markh@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn,
Replace all if/else communication transports with a platform function call.
This is in recognition of the need to migrate to up-and-coming transports.
Currently the Linux driver does not support two available communication
transports provided by our products; these will be added in future patches
and will expand the platform function set.
Signed-off-by: Mark Haverkamp <markh@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Since the pci_block_user_cfg_access API was modified to track
block/unblocks, it was discovered that the ipr driver had a
path through its code (in PCI error recovery) which would unblock
when not previously blocked.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Don't fail initialization of an adapter if the PCI-X registers
cannot be found since it may be a PCI-E adapter.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Since ipr handles dynamic ids, it must handle driver_data
not being set, so remove the current usage of driver_data
so it can be used for other things in future patches.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
fix typos and bump version number
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Acked-by: Alexis Bruemmer <alexisb@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
On Wed, 24 Jan 2007, Andrew Morton wrote:
> On Mon, 22 Jan 2007 13:07:20 -0800
> bugme-daemon@bugzilla.kernel.org wrote:
>
> > http://bugzilla.kernel.org/show_bug.cgi?id=7864
> >
> > Summary: A MTIOCTOP/MTWEOF within the early warning will cause
> > the file number to be incorrect
> > Kernel Version: 2.6.19.2
> > Status: NEW
> > Severity: low
> > Owner: io_scsi@kernel-bugs.osdl.org
> > Submitter: ce_reisinger@yahoo.com
> >
> >
> > Write records to a SCSI tape until a write fails with a ENOSPC (you have reached
> > early warning.
> > Now perform a:
> > struct mtget before, after;
> > ioctl(fd, MTIOCGET, &before);
> > struct mtop mtop = { MTWEOF, 1 };
> > ioctl(fd, MTIOCTOP, &mtop);
> > ioctl(fd, MTIOCGET, &after);
> >
> > Check the value of mt_fileno in the before and after structures. Notice the
> > after is 2 greater then the before.
> >
> > The problem appears to be in the block of code starting at line 2817 in st.c.
> > This block is entered because the drive did return a CHECK CONDITION with NO
> > SENSE and the SENSE_EOM bit set. At lines 2824/5 the fileno is incremented. But
> > it has already been increased by the number of filemarks requested by the
> > MTIOCTOP. I believe that the residue count in the sense data should be
> > subtracted from fileno, not a increment as is done.
> >
>
> Thanks. Could you please send us a tested patch to fix these things, as
> per http://www.zip.com.au/~akpm/linux/patches/stuff/tpp.txt ?
>
The analysis is basically correct and explains the bug. According to the
SCSI standards, the sense code is NO SENSE or RECOVERED ERROR in case
writing filemark(s) succeeds. If it fails (partly or completely) the sense
code is VOLUME OVERFLOW. The patch below is tested to fix the case when
one filemark is successfully written after the EOM early warning. It
should also fix the case at real EOM but this has not been tested.
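
The arithmetic from the report can be written down as a small sketch. It
models only the reporter's description (the count is added and then
incremented once more) and the reporter's suggestion (subtract the
residue); it is not the st.c patch itself, and the function names are
made up for the example.

```c
/* What the report observed: the requested count is added, then one more. */
static int fileno_double_counted(int before, int requested)
{
    return before + requested + 1;
}

/* Reporter's suggestion: count only the filemarks actually written. */
static int fileno_expected(int before, int requested, int residue)
{
    return before + (requested - residue);
}
```

For one requested filemark with no residue, the first function ends up 2
above the starting value, which matches the symptom in the report, while
the second ends up 1 above it.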
Carl, thanks for reporting the bug and providing the analysis for the fix.
Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The included patch fixes the following issues:
1. qla3xxx/qla4xxx co-existence issue which can result in a lockup
when the qla3xxx driver is unloaded, or when ifdown; ifup is performed on
one of the interfaces corresponding to qla3xxx. This is because the qla4xxx
HBA supports one ethernet and one iscsi interface per port. Both iscsi
and ethernet interfaces share the same state machine. The problem has
to do with synchronizing access to the state machine in the event of a
reset.
2. mutex_lock() is sometimes not followed by mutex_unlock() prior to
invoking a msleep() in qla4xxx_mailbox_command()
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
I had thought that all drivers which didn't call scsi_scan_host()
called scsi_scan_target(). Some, such as sbp2, mptsas and libata-scsi,
call scsi_add_device() or __scsi_add_device(). We just need to wait
for the currently executing async scans to complete first. This is the
same code that's in scsi_scan_target(), except that we have to return
an error instead of void when we're declining to scan at all.
Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The EH should fall into I_T recovery (and potentially stronger
remedies) if ABORT TASK fails.
Signed-off-by: Alexis Bruemmer <alexisb@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Track sas_ha_struct state so that we ignore events that come in while
we're shutting things down.
Signed-off-by: Malahal Naineni <malahal@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Convert the phy port locks to use irq spinlocks.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Add the necessary hooks to the aic94xx driver to support the asynchronous SCSI
device scan infrastructure.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Allowing the phy reset controls to be world-triggerable does not seem like
a terribly good idea because SAS devices can be disrupted (and ATA devices
are really disrupted) by a phy reset. By default only root should be able
to do things like that.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Extend the use of the DDB lock to include all DDB accesses, because
DDB updates now occur from multiple threads. This fixes the SMP timeout
problems that we were occasionally seeing with a x260, because the
controller got confused when the DDBs got corrupted.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Ed Chim of Adaptec informs us that the DDB registers need to be zeroed at
initialization time and that some SCB initializations need to happen even if
we don't use the SCB.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The vmalloc() blob holding the sequencer firmware wasn't being released at
module unload time, which resulted in a memory leak.
Signed-off-by: Alexis Bruemmer <alexisb@us.ibm.com>
Acked-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Now that task aborts and device port resets are done by the EH, we can
remove all the code that set up workqueues and such and simply call
sas_task_abort and let libsas figure things out.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
sas_task_abort() should simply abort the upper-level SCSI command and wait
for the error handler to send the actual ABORT TASK command. By
deferring things to the EH we simplify the concurrency coordination and
eliminate some race conditions. Note that sas_task_abort has a few hooks
to handle libsas internal commands properly too.
Also rename do_sas_task_abort to __sas_task_abort just in case we really
want to abort the task *right now* and we don't have a scsi_cmnd attached
to the command. This is a hook for libata internal commands to abort.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
When a SAS LLDD needs to request a device port reset, it needs to have all
commands aborted before it can reset the port. Since commands are put on
the EH's list in the order that they were queued, the LLDD can set a "need
reset" flag in the last task to be aborted so that the EH can reset the
port after all commands are aborted.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This flag is no longer necessary because we push tasks to be aborted into
the EH as soon as we possibly can, and let the SCSI EH code take care of
the coordination for which this flag was used.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
In this driver, TMF_QUERY_TASK translates to QUERY_SSP_TASK. The
sequencer, it seems, is perfectly happy sending us a SSP response, which
this function promptly "converts" into TMF_RESP_FUNC_FAILED. This leads to
the SAS EH making bad decisions based on bad data, so we should not perform
the conversion in this case.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The aic94xx module has a parameter that looks like it should set
lldd_max_execute_num in the sas_ha, but it never sets this value. Either
we should set it or remove the parameter. This allows us to enable task
collector mode for this driver, though it is still off by default.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
If we use task collector mode, we can end up destroying the task collector
thread before we release the ports, which is bad if a port release causes
a disk I/O (such as cache flushing).
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Every so often, a scsi_cmnd will time out, and the libsas timeout handler
will discover that the scsi_cmnd does not have a sas_task attached to it.
This can happen in two cases: (1) the scsi_cmnd actually made it through
libsas to the HBA and is now going through scsi_done, or (2) the
scsi_cmnd has been held up (host lock, slab alloc, etc) and libsas has
not yet attached a sas_task. In both cases, it is safe to ask SCSI for
more time to process the command via EH_RESET_TIMER; we cannot blindly
return EH_HANDLED because if (2) happens, we could end up calling
scsi_done while another CPU is heading towards sas_queuecommand, which
causes slab corruption when sas_task_done updates the freed scsi_cmnd.
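
The decision above can be sketched like this. The enum mirrors the names
used in the description but is redefined locally with a DEMO_ prefix, the
struct is a stand-in for scsi_cmnd, and the branch taken when a task is
attached is an assumption made to keep the example complete.

```c
#include <stddef.h>

/* Local stand-ins; the real values live in the SCSI midlayer headers. */
enum demo_eh_timer_return {
    DEMO_EH_NOT_HANDLED,
    DEMO_EH_HANDLED,
    DEMO_EH_RESET_TIMER
};

struct demo_cmnd {
    void *sas_task; /* NULL when no sas_task is attached */
};

static enum demo_eh_timer_return demo_timed_out(const struct demo_cmnd *cmd)
{
    /*
     * No task attached: the command is either completing or has not been
     * queued yet, so ask for more time instead of completing it here.
     */
    if (cmd->sas_task == NULL)
        return DEMO_EH_RESET_TIMER;

    /* Assumed for the sketch: otherwise let the error handler take over. */
    return DEMO_EH_NOT_HANDLED;
}
```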
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch lets a user arbitrarily enable or disable a phy via sysfs.
Potential applications include shutting down a phy to replace one
lane of wide port, and (more importantly) providing a method for the
libata SATL to control the phy.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
On a system with many SAS targets, it appears possible that a scsi_cmnd
can time out without ever making it to the SAS LLDD or at the same time
that a completion is occurring. In both of these cases, telling the
LLDD to abort the sas_task makes no sense because the LLDD won't know
about the sas_task; what we really want to do is to increase the timer.
Note that this involves creating another sas_task bit to indicate
whether or not the task has been sent to the LLDD; I could have
implemented this by slightly redefining SAS_TASK_STATE_PENDING, but
this way seems cleaner.
This second version amends the aic94xx portion to set the
TASK_AT_INITIATOR flag for all sas_tasks that were passed to
lldd_execute_task.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
sas_get_port_device assigns a rphy to a domain device in anticipation
of finding a disk. When a discovery error occurs in
sas_discover_{sata,sas,expander}*, however, we need to clean up that
rphy and the port device list so that we don't GPF. In addition, we
need to check the result of the second sas_notify_lldd_dev_found.
This patch seems ok on a x260, x366 and x206m.
This patch fixes up sas_expander.c separately because jejb has some
cleanup patches of his own that are a prerequisite.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
sas_get_port_device assigns a rphy to a domain device in anticipation
of finding a disk. When a discovery error occurs in
sas_discover_{sata,sas,expander}*, however, we need to clean up that
rphy and the port device list so that we don't GPF. In addition, we
need to check the result of the second sas_notify_lldd_dev_found.
This patch seems ok on a x260, x366 and x206m.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Removed spin_unlock_irq()/spin_lock_irq() pairs surrounding
starget_for_each_device() calls.
As Matthew W. pointed out, starget_for_each_device() can be called with
a spinlock held.
The change has been tested and verified on qla2xxx.ko module.
Thanks Matthew W. and Hisashi H. for help.
Signed-off-by: Andrew Vasquez <Andrew.vasquez@qlogic.com>
Signed-off-by: Seokmann Ju <Seokmann.ju@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Hi,
Minor typo ...
In my first iteration of patches (that got merged), the
BLIST_ATTACH_PQ3 actually had the value 0x800000, but that
got changed later to avoid conflicts. This piece must have
been overlooked.
You could obviously do something like %x and then add the
bitflags, but that looks overkill for something that does
not tend to change.
Please merge.
(Patch applied against latest 2.6.20rc version that I tested.)
From: Kurt Garloff <kurt@garloff.de>
Subject: [SCSI SCAN] Fix logging message for PQ3 devices
The blacklist flag BLIST_ATTACH_PQ3 has value 0x1000000,
not 0x800000.
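
For reference, the %x alternative mentioned above might look like the
sketch below. The helper and DEMO_OTHER_FLAGS are hypothetical and the
output format is not the message the scan code actually prints; only the
BLIST_ATTACH_PQ3 value comes from this description.

```c
#include <stdio.h>

#define BLIST_ATTACH_PQ3 0x1000000 /* value stated in this commit */
#define DEMO_OTHER_FLAGS 0x240     /* hypothetical extra flags */

/* Formats a flag mask so the printed value always follows the defines. */
static int format_blist_flags(char *buf, size_t len, unsigned int flags)
{
    return snprintf(buf, len, "0x%x", flags);
}
```

Called with BLIST_ATTACH_PQ3 | DEMO_OTHER_FLAGS, the helper produces the
combined mask without a hard-coded constant in the message string.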
Signed-off-by: Kurt Garloff <garloff@suse.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
More megaraid kernel-doc fixes.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Sumant Patro <sumantp@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
kernel-doc modifications:
- change "@param var" notation to @var;
- change function/description separator from ':' to '-';
- change var/description separator from '-' to ':';
- fix a few doc. typos;
- don't use kernel-doc /** lead-in when the doc. block is not kernel-doc;
- use Linux common */ ending comment format instead of **/;
- use correct function parameter names;
- place function parameters immediately after the function short description;
- place kernel-doc immediately before its function or macro;
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Sumant Patro <sumantp@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
1. Changes in Initialization to fix kdump failure.
Send SYNC command on loading.
This command clears the pending commands in the adapter
and re-initializes its internal RAID structure.
Without this change, megaraid driver either panics or fails to
initialize the adapter during kdump's second kernel boot
if there are pending commands or interrupts from other devices
sharing the same IRQ.
2. Author's email-id domain name changed from lsil.com to lsi.com.
Also modified the MODULE_AUTHOR to megaraidlinux@lsi.com
Signed-off-by: Sumant Patro <sumant.patro@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
After discussions in the thread titled:
[PATCH] scsi_debug: illegal blocking memory allocation
here is a patch containing the discussed fix and some other
fixes and additions. The patch is against lk 2.6.20-rc3 .
The version is bumped to 1.81 .
ChangeLog:
- Change several GFP_KERNEL allocations to GFP_ATOMIC
as they can be called from queuecommand() context
- check above allocation returns and if out of memory
report DID_REQUEUE in two cases, DID_NO_CONNECT in
another, and fail slave configure() in another
- add support for WRITE BUFFER command
- add aborted_command error injection support
(opts mask 0x10), similar mechanism to
recovered_error injection.
Signed-off-by: Douglas Gilbert <dougg@torque.net>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
scsi_retry_command only has a single caller, so there is no point
in having this function. Additionally the memset of the sense
buffer it does is entirely superfluous as scsi_request_fn already
calls scsi_init_cmd_errh to perform this memset before the command
is reissued.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The D700 needs the burst length setting to the previous 53c700 default
of 8 otherwise it will be effectively disabled.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch allows not only disabling bursting but also specifying
different burst lengths. This feature is needed to get the 53c700 driver
working for the onboard SCSI controller of SNI RM machines, which only
work reliably with a 4 word burst length.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Setting .ConfigBase and .Present is now done at the pcmcia core.
The driver cleanup missed a few places where the driver did set .Present
to PRESENT_OPTION and later to the values from the CIS. Setting to
PRESENT_OPTION now overrides the values from the CIS. So just remove
those lines.
Signed-off-by: Daniel Ritz <daniel.ritz@gmx.ch>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jeremy caught a bug in the qla1280 driver where it didn't set the
residual value correctly.
Signed-off-by: Jeremy Higdon <jeremy@sgi.com>
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Update domain name change from lsil.com to lsi.com.
Change module author to megaraidlinux@lsi.com
Signed-off-by: Sumant Patro <sumant.patro@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The attached patch updates the 3ware 8000 driver:
- Free irq handler in __tw_shutdown().
- Turn on RCD bit for caching mode page.
- Serialize reset code.
Signed-off-by: Adam Radford <linuxraid@amcc.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
sr_block_ioctl() should proceed to SCSI ioctls if cdrom_ioctl()
returns -ENOSYS. However it tested for ENOSYS instead of -ENOSYS
rendering all SCSI ioctls other than GET_IDLUN and GET_BUS_NUMBER
inaccessible. Fix it.
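
The sign mistake is easy to show in isolation. The two helpers below are
hypothetical and only model the comparison, not sr_block_ioctl() itself:
kernel functions report errors as negative errno values, so a test
against the positive constant never matches.

```c
#include <errno.h>
#include <stdbool.h>

/* Buggy test: compares against the positive errno constant. */
static bool falls_through_buggy(int ret)
{
    return ret == ENOSYS;
}

/* Fixed test: compares against the negative value that is returned. */
static bool falls_through_fixed(int ret)
{
    return ret == -ENOSYS;
}
```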
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Add kmalloc failure check and fix the loop on error path. Without the
patch, the pool element at index [0] will not be freed.
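
A userspace sketch of an unwind loop that does reach index 0. Everything
here is illustrative: malloc stands in for kmalloc, the allocator is
passed in so that a failure can be simulated, and demo_alloc is a test
helper that succeeds twice and then fails. The driver's own loop is not
reproduced.

```c
#include <stdlib.h>

typedef void *(*demo_alloc_fn)(size_t);

/* Demonstration allocator: succeeds twice, then reports failure. */
static int demo_calls;
static void *demo_alloc(size_t size)
{
    return (++demo_calls > 2) ? NULL : malloc(size);
}

/* Fills pool[0..n-1]; on failure frees everything already allocated. */
static int pool_create(void **pool, size_t n, size_t elem_size,
                       demo_alloc_fn alloc)
{
    size_t i;

    for (i = 0; i < n; i++) {
        pool[i] = alloc(elem_size);
        if (!pool[i])
            goto fail;
    }
    return 0;

fail:
    /* Post-decrement so that the element at index 0 is freed as well. */
    while (i-- > 0) {
        free(pool[i]);
        pool[i] = NULL;
    }
    return -1;
}
```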
Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
Acked-by: James Smart <James.Smart@Emulex.Com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Update drivers/scsi/aacraid/linit.c and Documentation/scsi/aacraid.txt
with the current list of adapters supported by the aacraid driver.
Deprecated a few adapters that never shipped, corrected a few and added
new adapters that matched the family code support. No functional changes
to the driver. No side effects.
Signed-off-by: Mark Salyzyn <aacraid@adaptec.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Yanling Qi noted that when the sense data length of
a check-condition is greater than 0x7f (127), senselen = (data[0] << 8)
| data[1] will become negative. This causes different kinds of panics,
ranging from GPF and spin_lock deadlock to spin_lock recursion.
We were also swapping this value on big endian machines.
This patch fixes both issues by using be16_to_cpu().
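
The sign extension can be reproduced in userspace, assuming the buffer is
read through a signed byte type. Both helpers are hypothetical;
senselen_fixed plays the role that be16_to_cpu() has in the driver by
reading the two bytes as an unsigned big endian value.

```c
#include <stdint.h>

/* Buggy pattern: a low byte above 0x7f is negative as a signed char. */
static int senselen_buggy(const signed char *data)
{
    return (data[0] << 8) | data[1];
}

/* Endian-independent read of a 16-bit big endian length. */
static uint16_t senselen_fixed(const void *data)
{
    const uint8_t *p = data;

    return (uint16_t)((p[0] << 8) | p[1]);
}
```

For the two bytes 0x00 0x80 the first helper yields a negative length
while the second yields 128.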
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch cures two run-together printk messages in the iSCSI
driver.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The return value of crypto_alloc_hash() should be checked by
IS_ERR().
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The transition from crypto_digest_*() to the crypto_hash_*() family
introduced a bug into the data digest calculation: crypto_hash_update() is
called with the number of S/G elements instead of the S/G list's data size.
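
To make the difference concrete, here is a sketch with a hypothetical
stand-in for a scatter/gather entry. The byte count that a hash update
needs is the sum of the entry lengths, which is generally unrelated to
the number of entries.

```c
#include <stddef.h>

/* Hypothetical stand-in for one scatter/gather element. */
struct demo_sg_entry {
    const void *addr;
    size_t      length; /* bytes covered by this element */
};

/* Total data size of the list: this, not nents, is the digest length. */
static size_t demo_sg_total_bytes(const struct demo_sg_entry *sg, size_t nents)
{
    size_t total = 0;
    size_t i;

    for (i = 0; i < nents; i++)
        total += sg[i].length;
    return total;
}
```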
Signed-off-by: Arne Redlich <arne.redlich@xiranet.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Apparently no ATAPI CD/DVD actually supports REPORT LUNS (in spite of
claiming scsi-3 compliance, where it's mandatory) and worse, some
crash or flake out on being sent the command. This may actually be
due to a conflict between SPC and MMC with MMC not listing REPORT LUNS
as mandatory. The same standards conflict exists for RBC as well.
Fix all of this by reversing the blacklists for CDROM and RBC devices
(i.e. now they have to have the BLIST_REPORTLUNS2 flag set even if the
inquiry data returns scsi-3 compliance).
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Rather than a direct call, as was done in the case of a
RISC-paused state within the ISP24xx interrupt handler.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Original code would incorrectly use non-24xx code-paths.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Disable subsequent GPSC queries if Fabric Management services do
not support the operation.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>