Two misc fixes:
- Fix deadlock caused by return with host_lock held in lpfc_findnode_did
- Initialize all fields of the allocated mail box structure to zero.
Was causing some sysfs mailbox commands to fail immediately after load.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Introduce lpfc_reset_barrier() function for resets on dual channel adapters
Workaround for a hardware errata on dual channel asics. There is a
potential for the chip to lock up on a reset if a shared dma engine is in
use. The (ugly) work around requires a reset process which uses a mailbox
command to synchronize the independent channels prior to the reset to
avoid the issue. Unfortunately, the timing windows required to ensure this
workaround succeeds are very specific, meaning we can't release the cpu
during the barrier.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Fixed a timer panic due to timer firing after freeing ndlp
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Fixed RSCN handling when a PLOGI is in retry.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Fix Discovery processing for NPorts that change their NPortId on the fly
due to a cable swap.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Fix polling mode panic
Cause: Race between interrupt driven and polling path in harvesting iocbs
from
the response ring.
Signed-off-by: Jamie Wellnitz <Jamie.Wellnitz@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Protect NPL lists with host lock
Symptoms: lpfc_findnode_rpi and lpfc_findnode_did can be called
outside of the discovery thread context. We have to iterate
through the NPL lists under the host lock and all add/del
operations on those lists have to be done under host lock.
Signed-off-by: Jamie Wellnitz <Jamie.Wellnitz@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Fix deadlock in lpfc_fdmi_tmo_handler
lpfc_fdmi_tmo_handler was calling lpfc_fdmi_cmd with the host_lock
held. lpfc_fdmi_cmd assumes the host_lock is released as it calls functions
that acquire the host_lock. lpfc_fdmi_tmo_handler acquired the host_lock to
protect access to work_hba_events. This was already checked in the worker
thread so we can remove that code completely and remove access to the
host_lock.
Signed-off-by: Jamie Wellnitz <Jamie.Wellnitz@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Fix performance when using multiple SLI rings
Currently the driver allocates all of its SLI command and response ring
entries to one primary ring. Other rings get little, or no, resources.
Allow more resources to be given to ring 1
Signed-off-by: Jamie Wellnitz <Jamie.Wellnitz@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
PCI hrd_type should be obtained with pci_read_config_byte() macro
Driver keys off of this field to report the proper adapter type.
The pci subsystem explicitly clears the multiport bit in the copy of
the field given the driver. Thus, to properly name the card, obtain it
from config space.
Signed-off-by: Jamie Wellnitz <Jamie.Wellnitz@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Derive supported speeds from LMT field in the READ_CONFIG
Driver was keying off internal cores. Use what the firmware reports instead.
Signed-off-by: Jamie Wellnitz <Jamie.Wellnitz@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Modify RSCN handling to unregister rpis on lost FCP_TARGETs immediately
Signed-off-by: Jamie Wellnitz <Jamie.Wellnitz@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Fix panic caused by HBA resets and target side cable pulls
Signed-off-by: Jamie Wellnitz <Jamie.Wellnitz@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Make lpfc_els_rsp_rps_acc and lpfc_els_rsp_rpl_acc static
Signed-off-by: Jamie Wellnitz <Jamie.Wellnitz@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Correct use of the hostdata field in scsi_host
Signed-off-by: Jamie Wellnitz <Jamie.Wellnitz@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Misc FC Discovery changes :
- Added FC_BYPASSED_MODE statistic
- Corrected some log message data
- Fix up Discovery infrastructure to support FAN:
Allow Fabric entities to flow thru DSM
Fix up linkup/linkdown unregister login processing for Fabric entities
Clean up Discovery code
Utilize nodev_tmo for Fabric entities
- Use of 3 * ratov for CT handling timeouts
- Fix up DSM to make more appropriate decisions and clean up code.
Signed-off-by: Jamie Wellnitz <Jamie.Wellnitz@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Add module parameter to limit number of outstanding commands per lpfc HBA
Signed-off-by: Jamie Wellnitz <Jamie.Wellnitz@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Fixed a double insertion of mail box object to the SLI mailbox list.
Signed-off-by: Jamie Wellnitz <Jamie.Wellnitz@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Fixed system panic in lpfc_sli_brdreset during dynamic add of LP11K
Signed-off-by: Jamie Wellnitz <Jamie.Wellnitz@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Explicitly initialize the skip_post argument to lpfc_sli_send_reset
on a ERATT interrupt.
Signed-off-by: Jamie Wellnitz <Jamie.Wellnitz@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Fixed a race condition in the PLOGI retry logic.
Signed-off-by: Jamie Wellnitz <Jamie.Wellnitz@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Handling of ELS commands RRQ, RPS, RPL and LIRR correctly
Signed-off-by: Jamie Wellnitz <Jamie.Wellnitz@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: James Smart <James.Smart@Emulex.Com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
- Add functionality to run in polled mode only. Includes run time
attribute to enable mode.
- Enable runtime writable hba settings for coallescing and delay parameters
Customers have requested a mode in the driver to run strictly polled.
This is generally to support an environment where the server is extremely
loaded and is looking to reclaim some cpu cycles from adapter interrupt
handling.
This patch adds a new "poll" attribute, and the following behavior:
if value is 0 (default):
The driver uses the normal method for i/o completion. It uses the
firmware feature of interrupt coalesing. The firmware allows a
minimum number of i/o completions before an interrupt, or a maximum
time delay between interrupts. By default, the driver sets these
to no delay (disabled) or 1 i/o - meaning coalescing is disabled.
Attributes were provided to change the coalescing values, but it was
a module-load time only and global across all adapters.
This patch allows them to be writable on a per-adapter basis.
if value is 1 :
Interrupts are left enabled, expecting that the user has tuned the
interrupt coalescing values. When this setting is enabled, the driver
will attempt to service completed i/o whenever new i/o is submitted
to the adapter. If the coalescing values are large, and the i/o
generation rate steady, an interrupt will be avoided by servicing
completed i/o prior to the coalescing thresholds kicking in. However,
if the i/o completion load is high enough or i/o generation slow, the
coalescion values will ensure that completed i/o is serviced in a timely
fashion.
if value is 3 :
Turns off FCP i/o interrupts altogether. The coalescing values now have
no effect. A new attribute "poll_tmo" (default 10ms) exists to set
the polling interval for i/o completion. When this setting is enabled,
the driver will attempt to service completed i/o and restart the
interval timer whenever new i/o is submitted. This behavior allows for
servicing of completed i/o sooner than the interval timer, but ensures
that if no i/o is being issued, then the interval timer will kick in
to service the outstanding i/o.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
- Release task management command before counting outstanding commands.
TMF was being erroneously counted as an active outstanding command.
- Serialize EH calls and block requests when EH function is running.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Remove locking wrappers around error handlers. Wrappers were added in
early 2.6.13 api change
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
- Remove unnecessary scsi_block_requests calls on rport deletes.
This was deadlocking the sdev removals as they wanted to flush commands.
- No longer block requests when adding the remote port (to block
discovery). Instead, register, then change port role. Maps to Qlogic
behavior, and closer to the register-node-upon-first-ELS behavior.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Cause: Link bounces were causing discovery ELS's to be killed.
Driver was not properly flushing ELS commands upon the subsequent
link bounces. Thus, processing of ELS post link bounce erroneously
assumed discovery failure and device loss.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Miscellaneous Cleanups:
- Remove ProgType READ_REV mailbox command value check in lpfc_config_port_prep.
- Convert simple printk to an lpfc_printf_log in queuecommand.
- Modify lpfc_abort_handler message 0749 to display more accurate text and data.
- Minor style cleanup: fix 3 long lines in lpfc_hw.h
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Current upstream 'allmodconfig' build is broken. This is the obvious
patch...
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is the drivers/scsi/ part of the big kfree cleanup patch.
Remove pointless checks for NULL prior to calling kfree() in drivers/scsi/.
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Acked-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use schedule_timeout_uninterruptible() instead of
set_current_state()/schedule_timeout() to reduce kernel size.
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Return FAILED from eh_ routines if command(s) is(are) not completed
There were scenarios where we may have returned from the error
handlers prior to all affected commands being flushed to the midlayer.
Add changes to ensure this doesn't happen.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Adjust lpfc_scsi_buf allocation to account for lun_queue_depth and
error handling
Under high load and high duress, the error handler could steal some
command resources from the normal i/o path. Rework to allocate
additional resources to avoid this scneario.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Replace lpfc_sli_issue_iocb_wait_high_priority with lpfc_sli_issue_iocb_wait.
Simplify code paths, as there really wasn't a "priority"
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
From: James Smart <James.Smart@emulex.com>
There were scenarios where the error handlers could reuse an iotag
value of an active io. Remove all possibility of this by
pre-assigning iotag resources to command resources.
Signed-off-by: James Smart <James.Smart@emulex.com>
Rejections fixed up and
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Table was not providing a lot of value and injected a couple of
errors. Removed it and made functionality inline.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Fix for "Unknown IOCB command Data: x0 x3 x0 x0" messages and
inability to see devices
On some platforms, the host-memory based ring mgmt area was not
zero. Also, driver wasn't manipulating the entire 32bits of the ring
pointers.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Reuse macros defined for sysfs store callbacks in the initialization
code in order to enforce the same range checking.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Update adapter names to match Emulex naming conventions.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Cleanup white spaces in argument calls & initializations, prune if
statements, remove casting and remove redundant if checks.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
We recently went back to implement a board reset. When we perform the
reset, we wanted to tear down the internal data structures and rebuild
them. Unfortunately, when it came to the rport structure, things were
odd. If we deleted them, the scsi targets and sdevs would be
torn down. Not a good thing for a temporary reset. We could block the
rports, but we either maintain the internal structures to keep the
rport reference (perhaps even replicating what's in the transport),
or we have to fatten the fc transport with new search routines to find
the rport (and deal with a case of a dangling rport that the driver
forgets).
It dawned on me that we had actually reached this state incorrectly.
When the fc transport first started, we did the block/unblock first, then
added the rport interface. The purpose of block/unblock is to hide the
temporary disappearance of the rport (e.g. being deleted, then readded).
Why are we making the driver do the block/unblock ? We should be making
the transport have only an rport add/delete, and the let the transport
handle the block/unblock.
So... This patch removes the existing fc_remote_port_block/unblock
functions. It moves the block/unblock functionality into the
fc_remote_port_add/delete functions. Updates for the lpfc driver are
included. Qlogic driver updates are also enclosed, thanks to the
contributions of Andrew Vasquez. [Note: the qla2xxx changes are
relative to the scsi-misc-2.6 tree as of this morning - which does
not include the recent patches sent by Andrew]. The zfcp driver does
not use the block/unblock functions.
One last comment: The resulting behavior feels very clean. The LLDD is
concerned only with add/delete, which corresponds to the physical
disappearance. However, the fact that the scsi target and sdevs are
not immediately torn down after the LLDD calls delete causes an
interesting scenario... the midlayer can call the xxx_slave_alloc and
xxx_queuecommand functions with a sdev that is at the location the
rport used to be. The driver must validate the device exists when it
first enters these functions. In thinking about it, this has always
been the case for the LLDD and these routines. The existing drivers
already check for existence. However, this highlights that simple
validation via data structure dereferencing needs to be watched.
To deal with this, a new transport function, fc_remote_port_chkready()
was created that LLDDs should call when they first enter these two
routines. It validates the rport state, and returns a scsi result
which could be returned. In addition to solving the above, it also
creates consistent behavior from the LLDD's when the block and deletes
are occuring.
Rejections fixed up and
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Ok, here's a patch to add such a common API for fc transport users.
Relevant LLD changes (lpfc and qla2xxx) also present.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Acked-by: Smart, James <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Replace use of lpfc_put_lun with midlayer's int_to_scsilun
Remove driver's local definition of lpfc_put_lun (which converts an
int back to a 64-bit LUN) and replace it's use with the recently added
int_to_scsilun function provided by the midlayer.
Note: Embedding midlayer structure in our structure caused
need for more files to include midlayer headers.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Fix handling of the dev_loss and nodev timeouts.
Symptoms: when remote port disappears for a period of time longer then
either nodev_tmo or dev_loss_tmo, the lpfc driver worker thread will
stall removing that remote port.
Cause: removing remote port involves un-blocking and sync-ing
corresponding block device queue. But corresponding node in the lpfc
driver is still in the NPR(?node port recovery?) state and mid-layer
gets SCSI_MLQUEUE_HOST_BUSY as a return value when it is trying to call
queuecommand() with command for that node (AKA remote port)
Fix: Instead of returning SCSI_MLQUEUE_HOST_BUS from queuecommand() for
nodes in NPR states complete it with retry-able error code DID_BUS_BUSY
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Fix panic in lpfc_get_stats()
Symptoms: Panic on sysfs stats access
Cause: In lpfc_get_stats() we are writing to memory that we do not
own.
Fix: Fix our stats structure allocation. Embed phba->link_stats in
struct lpfc_hba and stop treating it like rogue structure.
Note: Embedding midlayer/transport structure in our structure caused
need for more files to include midlayer/transport headers.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Clear task management bits when preparing SCSI commands
In lpfc_scsi_prep_cmnd, clear the task management bits (fcpCntl2 member
in the fcp_cmd structure) when preparing regular SCSI commands.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Fix panic on lip and cable pull
Symptoms: Panic on lip or cable pull
Cause: Use after free of nlp in lpfc_nlp_remove()
Fix: Do not make FC transport calls after a node is removed. Transport
calls are disabled by ignoring the initial delete transition.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
IOCB BDE not getting fully initialized during reuse
Symptoms: Driver gets Status 3 and Reason 0x13 on IOCB completions.
Cause: The IOCB bpl.bdeSize and bdeFlags are not getting initialized on reuse.
Fix: Reinitialize these fields in prep_dma each time an IOCB is used.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
turn many #if $undefined_string into #ifdef $undefined_string to fix some
warnings after -Wno-def was added to global CFLAGS
Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Update copyright notice text and include year 2005.
Add Copyright notice for Christoph Hellwig to several files: lpfc.h
lpfc_attr.c lpfc_els.c lpfc_hbadisc.c lpfc_init.c lpfc_mbox.c
lpfc_mem.c lpfc_nportdisc.c lpfc_scsi.c lpfc_sli.c
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Symptom - An unmapped node (initiator) that goes away in a situation
such as cable pull, comes back as a mapped node. Fix - On ADISC
completion, put a list on the mapped list only if it is a FCP_TARGET.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Add completion handler to the abort iocbs to close a hole where we
could reuse an iotag.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
lpfc_els_unsol_event() checks rjt_err to determine is LS_RJT should be
sent. However, rjt_err was set to LSEXP_NOTHING_ELSE (which is 0) in
cases where an LS_RJT should be sent, so rjt_err was never true.
Change lpfc_els_unsol_event() to set rjt_err to 1 when LS_RJT should
be sent.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Fix driver not seeing LP6000. Fix: add PCI id to the pci_device_id
table and a short description for the HBA in get_hba_model_desc().
Also add a default clause to the switch statement that parses the
various PCI ID's.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Add max_sectors to the driver host template and initialize it with
0xFFFF since the driver has no limitations on the size a transfer
contained by a scsi command and that fits within the sg_tablesize
provisioned by the driver. This fixes a performance issue seen in
some configurations.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Bug reported via SourceForge - lpfc does not load on sparc. The lpfc
driver must byteswap all FCP IOCBs to recover the data into cpu native
format.
Also correct issue of "iotag not found" messages
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Analysis:
Timeout of READ_SPARM64 causes call to lpfc_mbox_timeout_handler which
reads psli->mbox_active to determine the timeout mbox. Timeout
handler then NULL's psli->mbox_active and calls
lpfc_mbx_cmpl_read_sparam(), which on timeout condition, calls
link_down(). link_down() now calls disc_done() which calls
mbox_timeout_hander() again since WORKER_MBOX_TMO is still set, which
goes back to read psli->mbox_active which is already NULL'ed.
Remove redundant if statement in lpfc_mbox_timeout_handler. pmbox is
assigned psli->mbox_active so there is no need to check if it actually
equals psli->mbox_active.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Fix issue where all hosts connected to SAN get spammed with nodev
message when other initiators go away. Display nodev message only
when FC targets go away. However this behavior will be overridden if
LOG_DISCOVERY is set.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
From: Christoph Hellwig <hch@lst.de>:
- rename PGP/HPH to lpfc_pgp/lpfc_hgp
- use __le32 types for the members to start fixing sparse -Wbitwise
issues
- remove lpfc_sli.MBhostaddr, we can always use the pointer from
SLI2_DESC directly
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>