If connection creation fails we end up calling list_del
on a invalid struct. This then causes an oops. We are not
acutally using the lists (old MCS code we thought might
be useful elsewhere) so this patch just removes that
code.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The transport class recv mempools are causing slab corruption.
We could hack around netlink's lack of mempool support like dm,
but it is just too ulgy (dm's hack is ugly enough :) when you need
to support broadcast.
This patch removes the recv pools. We have not used them even when
we were allocting 20 MB per session and the system only had 64 MBs.
And we have no pools on the send side and have been ok there. When
Peter's work gets merged we can use that since the network guys
are in favor of that approach and are not going to add mempools
everywhere.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Since it often takes around 20-30 seconds to scan a scsi bus, it's
highly advantageous to do this in parallel with other things. The bulk
of this patch is ensuring that devices don't change numbering, and that
all devices are discovered prior to trying to start init. For those
who build SCSI as modules, there's a new scsi_wait_scan module that will
ensure all bus scans are finished.
This patch only handles drivers which call scsi_scan_host. Fibre Channel,
SAS, SATA, USB and Firewire all need additional work.
Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This was necessitated by the need for a function to get back
to a scsi_cmnd, when an hba the posts its (corresponding) completion
interrupt with a block layer tag as its reference.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: David Somayajulu <david.somayajulu@qlogic.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
device_reprobe() should return an error code. When it does so,
scsi_device_reprobe() should propagate it back.
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make it possible to disable the block layer. Not all embedded devices require
it, some can make do with just JFFS2, NFS, ramfs, etc - none of which require
the block layer to be present.
This patch does the following:
(*) Introduces CONFIG_BLOCK to disable the block layer, buffering and blockdev
support.
(*) Adds dependencies on CONFIG_BLOCK to any configuration item that controls
an item that uses the block layer. This includes:
(*) Block I/O tracing.
(*) Disk partition code.
(*) All filesystems that are block based, eg: Ext3, ReiserFS, ISOFS.
(*) The SCSI layer. As far as I can tell, even SCSI chardevs use the
block layer to do scheduling. Some drivers that use SCSI facilities -
such as USB storage - end up disabled indirectly from this.
(*) Various block-based device drivers, such as IDE and the old CDROM
drivers.
(*) MTD blockdev handling and FTL.
(*) JFFS - which uses set_bdev_super(), something it could avoid doing by
taking a leaf out of JFFS2's book.
(*) Makes most of the contents of linux/blkdev.h, linux/buffer_head.h and
linux/elevator.h contingent on CONFIG_BLOCK being set. sector_div() is,
however, still used in places, and so is still available.
(*) Also made contingent are the contents of linux/mpage.h, linux/genhd.h and
parts of linux/fs.h.
(*) Makes a number of files in fs/ contingent on CONFIG_BLOCK.
(*) Makes mm/bounce.c (bounce buffering) contingent on CONFIG_BLOCK.
(*) set_page_dirty() doesn't call __set_page_dirty_buffers() if CONFIG_BLOCK
is not enabled.
(*) fs/no-block.c is created to hold out-of-line stubs and things that are
required when CONFIG_BLOCK is not set:
(*) Default blockdev file operations (to give error ENODEV on opening).
(*) Makes some /proc changes:
(*) /proc/devices does not list any blockdevs.
(*) /proc/diskstats and /proc/partitions are contingent on CONFIG_BLOCK.
(*) Makes some compat ioctl handling contingent on CONFIG_BLOCK.
(*) If CONFIG_BLOCK is not defined, makes sys_quotactl() return -ENODEV if
given command other than Q_SYNC or if a special device is specified.
(*) In init/do_mounts.c, no reference is made to the blockdev routines if
CONFIG_BLOCK is not defined. This does not prohibit NFS roots or JFFS2.
(*) The bdflush, ioprio_set and ioprio_get syscalls can now be absent (return
error ENOSYS by way of cond_syscall if so).
(*) The seclvl_bd_claim() and seclvl_bd_release() security calls do nothing if
CONFIG_BLOCK is not set, since they can't then happen.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Right now ->flags is a bit of a mess: some are request types, and
others are just modifiers. Clean this up by splitting it into
->cmd_type and ->cmd_flags. This allows introduction of generic
Linux block message types, useful for sending generic Linux commands
to block devices.
Signed-off-by: Jens Axboe <axboe@suse.de>
device_reprobe() should return an error code. When it does so,
scsi_device_reprobe() should propagate it back.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This sg driver patch addresses the problem with larger
page sizes reported by Brian King in this post:
http://marc.theaimsgroup.com/?l=linux-scsi&m=115867718623631&w=2
Some other related matters are also addressed. Some of these
prevent oopses when the SG_SCATTER_SZ or scatter_elem_sz are
set to inappropriate values.
The scatter_elem_sz has been tested up to 4 MB which should
make the largest data transfer with one SCSI command, 32 MB
less one block, achievable with a relatively small number
of elements in the scatter gather list.
ChangeLog:
- add scatter_elem_sz boot time parameter and sysfs module
parameter that is initialized to SG_SCATTER_SZ
- the driver will then adjust scatter_elem_sz to be the
max(given(scatter_elem_sz), PAGE_SIZE)
It will also round it up, if necessary, to be a power
of two
- clean up sg.h header, correct bad urls and some statements
that are no longer valid
- make the def_reserved_size sysfs module attribute writable
Signed-off-by: Douglas Gilbert <dougg@torque.net>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Key more of the domain validation settings off the inquiry data from
the disk (in particular, don't try IU or DT unless the disk claims to
support them.
Also add a new dv_in_progress flag to prevent recursive DV.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch implements the ability to set the minimum and maximum
linkrates for both libsas (for expanders) and aic94xx (for the host
phys). It also tidies up the setting of the hardware min and max to
make sure they're updated when the expander emits a change broadcast.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
According to SPEC, the minimum_linkrate and maximum_linkrate should be
settable by the user. This patch introduces a callback that allows the
sas class to pass these settings on to the driver.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
At the moment we have two separate linkspeed enumerations covering
roughly the same values. This patch consolidates on a single one enum
sas_linkspeed in scsi_transport_sas.h and uses it everywhere in the
aic94xx driver. Eventually I'll get around to removing the duplicated
fields in asd_sas_phy and sas_phy ...
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch adds the following functionality to the FC transport:
- dev_loss_tmo LLDD callback :
Called to essentially confirm the deletion of an rport. Thus, it is
called whenever the dev_loss_tmo fires, or when the rport is deleted
due to other circumstances (module unload, etc). It is expected that
the callback will initiate the termination of any outstanding i/o on
the rport.
- fast_io_fail_tmo and LLD callback:
There are some cases where it may take a long while to truly determine
device loss, but the system is in a multipathing configuration that if
the i/o was failed quickly (faster than dev_loss_tmo), it could be
redirected to a different path and completed sooner.
Many thanks to Mike Reed who cleaned up the initial RFC in support
of this post.
The original RFC is at:
http://marc.theaimsgroup.com/?l=linux-scsi&m=115505981027246&w=2
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
During discussions with Mike Christie, I became convinced that we needed
a larger vendor id. This patch extends the id from 32 to 64 bits.
This applies on top of the prior patches that add SCSI transport events
via netlink.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch formally adds support for the posting of FC events via netlink.
It is a followup to the original RFC at:
http://marc.theaimsgroup.com/?l=linux-scsi&m=114530667923464&w=2
and the initial posting at:
http://marc.theaimsgroup.com/?l=linux-scsi&m=115507374832500&w=2
The patch has been updated to optimize the send path, per the discussions
in the initial posting.
Per discussions at the Storage Summit and at OLS, we are to use netlink for
async events from transports. Also per discussions, to avoid a netlink
protocol per transport, I've create a single NETLINK_SCSITRANSPORT protocol,
which can then be used by all transports.
This patch:
- Creates new files scsi_netlink.c and scsi_netlink.h, which contains the
single and shared definitions for the SCSI Transport. It is tied into the
base SCSI subsystem intialization.
Contains a single interface routine, scsi_send_transport_event(), for a
transport to send an event (via multicast to a protocol specific group).
- Creates a new scsi_netlink_fc.h file, which contains the FC netlink event
messages
- Adds 3 new routines to the fc transport:
fc_get_event_number() - to get a FC event #
fc_host_post_event() - to send a simple FC event (32 bits of data)
fc_host_post_vendor_event() - to send a Vendor unique event, with
arbitrary amounts of data.
Note: the separation of event number allows for a LLD to send a standard
event, followed by vendor-specific data for the event.
Note: This patch assumes 2 prior fc transport patches have been installed:
http://marc.theaimsgroup.com/?l=linux-scsi&m=115555807316329&w=2http://marc.theaimsgroup.com/?l=linux-scsi&m=115581614930261&w=2
Sorry - next time I'll do something like making these individual
patches of the same posting when I know they'll be posted closely
together.
Signed-off-by: James Smart <James.Smart@emulex.com>
Tidy up configuration not to make SCSI always select NET
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
It is possible that a ctask could be completing and getting
cleaned up at the same time, we are finishing up the last
data transfer. This could then result in the data transfer
code using stale or invalid values. This patch adds a refcount
to the ctask. When the count goes to zero then we know the
transmit thread and recv thread or softirq are not touching
it and we can safely release it.
The eh should not need to grab a reference because it only cleans
up a task if it has both the xmit mutex and recv lock (or recv
side suspended).
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
iSCSI RFC states that the first burst length must be smaller than the
max burst length. We currently assume targets will be good, but that may
not be the case, so this patch adds a check.
This patch also moves the unsol data out offset to the lib so the LLDs
do not have to track it.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch adds support for sharing tag maps at the host level
(i.e. either every queue [LUN] has its own tag map or there's a single
one for the entire host). This formulation is primarily intended to
help single issue queue hardware, like the aic7xxx
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This is the end point of the separate aic94xx driver based on the
original driver and transport class from Luben Tuikov
<ltuikov@yahoo.com>
The log of the separate development is:
Alexis Bruemmer:
o aic94xx: fix hotplug/unplug for expanderless systems
o aic94xx: disable split completion timer/setting by default
o aic94xx: wide port off expander support
o aic94xx: remove various inline functions
o aic94xx: use bitops
o aic94xx: remove queue comment
o aic94xx: remove sas_common.c
o aic94xx: sas remove depot's
o aic94xx: use available list_for_each_entry_safe_reverse()
o aic94xx: sas header file merge
James Bottomley:
o aic94xx: fix TF_TMF_NO_CTX processing
o aic94xx: convert to request_firmware interface
o aic94xx: fix hotplug/unplug
o aic94xx: add link error counts to the expander phys
o aic94xx: add transport class phy reset capability
o aic94xx: remove local_attached flag
o Remove README
o Fixup Makefile variable for libsas rename
o Rename sas->libsas
o aic94xx: correct return code for sas_discover_event
o aic94xx: use parent backlink port
o aic94xx: remove channel abstraction
o aic94xx: fix routing algorithms
o aic94xx: add backlink port
o aic94xx: fix cascaded expander properties
o aic94xx: fix sleep under lock
o aic94xx: fix panic on module removal in complex topology
o aic94xx: make use of the new sas_port
o rename sas_port to asd_sas_port
o Fix for eh_strategy_handler move
o aic94xx: move entirely over to correct transport class formulation
o remove last vestages of sas_rphy_alloc()
o update for eh_timed_out move
o Preliminary expander support for aic94xx
o sas: remove event thread
o minor warning cleanups
o remove last vestiges of id mapping arrays
o Further updates
o Convert aic94xx over entirely to the transport class end device and
o update aic94xx/sas to use the new sas transport class end device
o [PATCH] aic94xx: attaching to the sas transport class
o Add missing completion removal from prior patch
o [PATCH] aic94xx: attaching to the sas transport class
o Build fixes from akpm
Jeff Garzik:
o [scsi aic94xx] Remove ->owner from PCI info table
Luben Tuikov:
o initial aic94xx driver
Mike Anderson:
o aic94xx: fix panic on module insertion
o aic94xx: stub out SATA_DEV case
o aic94xx: compile warning cleanups
o aic94xx: sas_alloc_task
o aic94xx: ref count update
o aic94xx nexus loss time value
o [PATCH] aic94xx: driver assertion in non-x86 BIOS env
Randy Dunlap:
o libsas: externs not needed
Robert Tarte:
o aic94xx: sequence patch - fixes SATA support
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This flag denotes local attachment of the phy. There are two problems
with it:
1) It's actually redundant ... you can get the same information simply
by seeing whether a host is the phys parent
2) we condition a lot of phy parameters on it on the false assumption
that we can only control local phys. I'm wiring up phy resets in the
aic94xx now, and it will be able to reset non-local phys as well.
I fixed 2) by moving the local check into the reset and stats function
of the mptsas, since that seems to be the only HBA that can't
(currently) control non-local phys.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch updates the fc transport for the following:
- Addition of a new attribute "system_hostname" which can be
used to set the fully qualified hostname that the fc_host
is attached to. The fc_host can then register this string
as the FDMI-based host name attribute.
Note: for NPIV, a fc_host could be associated with a system which
is not the local system.
- Add the inline function u64_to_wwn(), which is the inverse of the
existing wwn_to_u64() function.
- Slight reorg, just to keep dynamic attributes with each other, etc
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
- Replace scsi_device_types array API with scsi_device_type function API.
Gets rid of a lot of common code, as well as being easier to use.
- Add the new device types in SPC4 r05a, and rename some of the older ones.
- Reformat the printing of inquiry data; now fits on one line and
includes PQ.
I think I've addressed all the feedback from the previous versions. My
current test box prints:
scsi 2:0:1:0: Direct access HP 18.2G ATLAS10K3_18_SCA HP05 PQ: 0 ANSI: 2
Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
These aren't used anymore since the field in scsi_cmnd where it was
stored has been removed.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
We currently try to allocate a max_recv_data_segment_length
which can be very large (default is 64K), and common uses
are up to 1MB. It is very very difficult to allocte this
much contiguous memory and it turns out we never even use it.
We really only need a couple of pages, so this patch has us
allocates just what we know what we need today.
Later if vendors start adding vendor specific data and
we need to handle large buffers we can do this, but for
the last 4 years we have not seen anyone do this or request
it.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
When we enter recovery and flush the running commands
we cannot freee the connection before flushing the commands.
Some commands may have a reference to the connection
that needs to be released before. iscsi_stop was forcing
the term and suspend too early and was causing a oops
in iser, so this patch removes those callbacks all together
and allows the LLD to handle that detail.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Abort handler fixes.
If a connection is dropped and reconnected while an abort is
running then we should assume the recovery code will clean up
the abort. Not doing so causes a oops.
And if a command completes then we get the status for the abort, we do not
need to call into the LLD to cleanup the resources. Doing this causes
and oops in iser because it ends up freeing some resources twice.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The iscsi tcp code can pluck multiple rt2s from the tasks's r2tqueue
in the xmit code. This can result in the task being queued on the xmit queue
but gettting completed at the same time.
This patch fixes the above bug by making the fifo a list so
we always remove the entry on the list del.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch adds the ability to add a backlink to a particular port. The
idea is to represent properly ports on expanders that are used
specifically for linking to the parent device in the topology.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Currently struct scsi_cmnd has various fields that are used to backup
original data after the corresponding fields have been overridden for
EH commands. This means drivers can easily get at it and misuse it.
Due to the old_ naming this doesn't happen for most of them, but two
that have different names have been used wrong a lot (see previous
patch). Another downside is that they unessecarily bloat the scsi_cmnd
size.
This patch moves them onstack in scsi_send_eh_cmnd to fix those two
issues aswell as allowing future EH fixes like moving the EH command
submissions to use SG lists like everything else.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Some SAS HBAs don't want to go to the trouble of tracking port numbers,
so they'd simply like to say "add this port and give it a number".
This is especially beneficial from the hotplug point of view, since
tracking ports and the available number space can be a real pain.
The current implementation uses an incrementing number per expander to
add the port on. However, since there can never be more ports than
there are phys, a later implementation will try to be more intelligent
about this.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch adds or modifies the transport class functions
used to notify userspace of session state events.
We modify the session addition up event and add a destruction event
to notify userspace of session creation, relogin and destruction.
And we modify the conn error event to be sent by broadcast
since multiple listeners may want to listen for it.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
So the drivers do not use the channel numbers, but some do
use the target numbers. We were just adding some goofy
variable that just increases for the target nr. This is useless
for software iscsi because it is always zero. And for qla4xxx
the target nr is actually the index of the target/session
in its FW or FLASH tables. We needed to expose this to userspace
so apps could access those numbers so this patch just adds the
target nr to the iscsi session creation functions. This way
when qla4xxx's Hw thinks a session is at target nr 4
in its hw, it is exposed as that number in sysfs.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
qla4xxx is initialized in two steps like other HW drivers.
It allocates the host, sets up the HW, then adds the host.
For iscsi part of HW setup is setting up persistent iscsi
sessions. At that time, the interupts are off and the driver
is not completely set up so we just want to allocate them.
We do not want to add them to sysfs and expose them to userspace
because userspace could try to do lots of fun things with them
like scanning and at that time the driver is not ready.
So this patch breakes up the session creation like other
functions that use the driver model in two the alloc
and add parts. When the driver is ready, it can then add
the sessions and userspace can begin using them.
This also fixes a bug in the addition error patch where
we forgot to do a get on the session.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
I do not remember what I was thinking when we added the channel
as a argument to the session create function. It was probably
due to too much cut and paste work from the FC transport class.
The channel is meaningless for iscsi drivers so this patch drops
its usage everywhere in the iscsi related code.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Reduce duplication in the software iscsi_transport modules by
adding a libiscsi function to handle the common grunt work.
This also has the drivers return specifc -EXXX values for different
errors so userspace can finally handle them in a sane way.
Also just pass the sysfs buffers to the drivers so HW iscsi can
get/set its string values, like targetname, and initiatorname.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Patch from david.somayajulu@qlogic.com:
Add target discovery event. We may have a setup where the iscsi traffic
is on a different netowrk than the other network traffic. In this case
we will want to do discovery though the iscsi card. This patch adds
a event to the transport class that can be used by hw iscsi cards that
support this.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
this patch introduces a port object, separates out ports and phys,
with ports becoming the primary objects of the tree.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The scsi midlayer portion of the patch
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch simplifies "good_bytes" computation in sd_rw_intr().
sd: "good_bytes" computation is always done in terms of the resolution
of the device's medium, since after that it is the number of good bytes
we pass around and other layers/contexts (as opposed ot sd) can translate
that to their own resolution (block layer:512). It also makes
scsi_io_completion() processing more straightforward, eliminating the
3rd argument to the function.
It also fixes a couple of bugs like not checking return value,
using "break" instead of "return;", etc.
I've been running with this patch for some time now on a
test (do-it-all) system.
Signed-off-by: Luben Tuikov <ltuikov@yahoo.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
* git://git.infradead.org/hdrcleanup-2.6: (63 commits)
[S390] __FD_foo definitions.
Switch to __s32 types in joystick.h instead of C99 types for consistency.
Add <sys/types.h> to headers included for userspace in <linux/input.h>
Move inclusion of <linux/compat.h> out of user scope in asm-x86_64/mtrr.h
Remove struct fddi_statistics from user view in <linux/if_fddi.h>
Move user-visible parts of drivers/s390/crypto/z90crypt.h to include/asm-s390
Revert include/media changes: Mauro says those ioctls are only used in-kernel(!)
Include <linux/types.h> and use __uXX types in <linux/cramfs_fs.h>
Use __uXX types in <linux/i2o_dev.h>, include <linux/ioctl.h> too
Remove private struct dx_hash_info from public view in <linux/ext3_fs.h>
Include <linux/types.h> and use __uXX types in <linux/affs_hardblocks.h>
Use __uXX types in <linux/divert.h> for struct divert_blk et al.
Use __u32 for elf_addr_t in <asm-powerpc/elf.h>, not u32. It's user-visible.
Remove PPP_FCS from user view in <linux/ppp_defs.h>, remove __P mess entirely
Use __uXX types in user-visible structures in <linux/nbd.h>
Don't use 'u32' in user-visible struct ip_conntrack_old_tuple.
Use __uXX types for S390 DASD volume label definitions which are user-visible
S390 BIODASDREADCMB ioctl should use __u64 not u64 type.
Remove unneeded inclusion of <linux/time.h> from <linux/ufs_fs.h>
Fix private integer types used in V4L2 ioctls.
...
Manually resolve conflict in include/linux/mtd/physmap.h
This adds the Kbuild files listing the files which are to be installed by
the 'headers_install' make target, in generic directories.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Add enum values for I/O Class values from rev. 10 and rev. 16a SRP
drafts. The values are used to detect targets that implement obsolete
revisions of SRP, so that the initiator can use the old format for
port identifier when connecting to them.
Signed-off-by: Ramachandra K <rkuchimanchi@silverstorm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
With Achim patch the last user (gdth) is switched away from scsi_request
so we an kill it now. Also disables some code in i2o_scsi that was
broken since the sg driver stopped using scsi_requests.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
We can race and misset the suspend bit if iscsi_write_space is
called then iscsi_send returns with a failure indicating
there is no space.
To handle this this patch returns a error upwards allowing xmitworker
to decide if we need to try and transmit again. For the no
write space case xmitworker will not retry, and instead
let iscsi_write_space queue it back up if needed (this relies
on the work queue code to properly requeue us if needed).
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
If recovery failed or we are in recovery only overwrite the state
if we are going to terminate the session or if we logged back in.
STOP_CONN_SUSPEND and conn_cnt are not used. We only support
a single connection session ATM, so cleanup that code while
we are working around it.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Do not flush queues then block session. This will cause commands
to needlessly swing around on us and remove goofy
recovery_failed field and replace with state value.
And do not start recovery from within the host reset function.
This causeis too many problems becuase open-iscsi was desinged to
call out to userspace then have userpscae decide if we should
go into recovery or kill the session.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
libata implemented a feature to schedule EH without an associated EH
by manipulating shost->host_eh_scheduled in ata_scsi_schedule_eh()
directly. Move this function to scsi_error.c and rename it to
scsi_schedule_eh(). It is now an exported API for SCSI transports and
exported via new header file drivers/scsi/scsi_transport_api.h
This patch also de-export scsi_eh_wakeup() which was exported
specifically for ata_scsi_schedule_eh().
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
libata needs to invoke EH without scmd. This patch adds
shost->host_eh_scheduled to implement such behavior.
Currently the only user of this feature is libata and no general
interface is defined. This patch simply adds handling for
host_eh_scheduled where needed and exports scsi_eh_wakeup() to
modules. The rest is upto libata. This is the result of the
following discussion.
http://thread.gmane.org/gmane.linux.scsi/23853/focus=9760
In short, SCSI host is not supposed to know about exceptions unrelated
to specific device or command. Such exceptions should be handled by
transport layer proper. However, the distinction is not essential to
ATA and libata is planning to depart from SCSI, so, for the time
being, libata will be using SCSI EH to handle such exceptions.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Introduce scsi_req_abort_cmd(struct scsi_cmnd *).
This function requests that SCSI Core start recovery for the
command by deleting the timer and adding the command to the eh
queue. It can be called by either LLDDs or SCSI Core. LLDDs who
implement their own error recovery MAY ignore the timeout event if
they generated scsi_req_abort_cmd.
First post:
http://marc.theaimsgroup.com/?l=linux-scsi&m=113833937421677&w=2
Signed-off-by: Luben Tuikov <ltuikov@yahoo.com>
Signed-off-by: Tejun Heo <htejun@gmail.com>
debugged by Ming and Rohan:
The problem Ming and Rohan debugged was that during a normal session
login, open-iscsi is not incrementing the exp_statsn counter. It was
stuck at zero. From the RFC, it looks like if the login response PDU has
a successful status then we should be incrementing that value. Also from
the RFC, it looks like if when we drop a connection then reconnect, we
should be using the exp_statsn from the old connection in the next
relogin attempt.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
align printk output
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
add transport end point callbacks so iscsi drivers that cannot connect
from userspace, like iscsi tcp, using sockets do not have to
implement their own socket infrastructure.
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Several structs in <scsi/srp.h> get padded to a multiple of 8 bytes on
64-bit architectures and end up with a size that does not match the
definition in the SRP spec:
SRP spec 64-bit
sizeof (struct indirect_buf) 20 24
sizeof (struct srp_login_rsp) 52 56
sizeof (struct srp_rsp) 36 40
Fix this by adding __attribute__((packed)) to the offending structs.
Problem pointed out by Arne Redlich <arne.redlich@xiranet.com>.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The current dc395x driver uses PIO to transfer up to 4 bytes which do not
get transferred by DMA (under unclear circumstances). For this the driver
uses page_address() which is broken on highmem. Apart from this the
actual calculation of the virtual address is wrong (even without highmem).
So, e.g., for reading it reads bytes from the driver to a wrong address
and returns wrong data, I guess, for writing it would just output random
data to the device.
The proper fix, as suggested by many, is to dynamically map data using
kmap_atomic(page, KM_BIO_SRC_IRQ) / kunmap_atomic(virt). The reason why it
has not been done until now, although I've done some preliminary patches
more than a year ago was that nobody interested in fixing this problem was
able to reliably reproduce it. Now it changed - with the help from
Sebastian Frei (CC'ed) I was able to trigger the PIO path. Thus, I was
also able to test and debug it.
There are 4 cases when PIO is used in dc395x - data-in / -out with and
without scatter-gather. I was able to reproduce and test only data-in with
and without SG. So, the data-out path is still untested, but it is also
somewhat simpler than the data-in. Fredrik Roubert (also CC'ed) also had
PIO triggering on his system, and in his case it was data-out without SG.
It would be great if he could test the attached patch on his system, but
even if he cannot, I would still request to apply the patch and just wait
if anybody cries...
Implementation: I put 2 new functions in scsi_lib.c and their declarations
in scsi_cmnd.h. I exported them without _GPL, although, I don't feel
strongly about that - not many drivers are likely to use them. But there
is at least one more - I want to use them in tmscsim.c. Whether these are
the right files for the functions and their declarations - not sure
either. Actually, they are not scsi-specific, so, might go somewhere
around other scattergather magic? They are not platform specific either,
and most SG functions are defined under arch/*/... As these issues were
discussed previously there were some more routines suggested to manipulate
scattergather buffers, I think, some of them were needed around
crypto code... So, might be a common place reasonable, like
lib/scattergather.c? I am open here.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Conflicts:
include/scsi/scsi_devinfo.h
Same number for two BLIST flags: BLIST_MAX_512 and BLIST_ATTACH_PQ3
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
There is a lot of code duplcited between iscsi_tcp
and the upcoming iscsi_iser driver. This patch puts
the duplicated code in a lib. There is more code
to move around but this takes care of the
basics. For iscsi_offload if they use the lib we will
probably move some things around. For example in the
queuecommand we will not assume that the LLD wants
to do queue_work, but it is better to handle that
later when we know for sure what iscsi_offload looks
like (we could probably do this for iscsi_iser though to).
Ideally I would like to get the iscsi_transports modules
to a place where all they really have to do is put data
on the wire, but how to do that will hopefully be more clear
when we see other modules like iscsi_offload. Or maybe
iscsi_offload will not use the lib and it will just be
iscsi_iser and iscsi_tcp and maybe the iscsi_tcp_tgt if that
is allowed in mainline.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The current iscsi_tcp eh is not nicely setup for dm-multipath
and performs some extra task management functions when they
are not needed.
The attached patch:
- Fixes the TMF issues. If a session is rebuilt
then we do not send aborts.
- Fixes the problem where if the host reset fired, we would
return SUCCESS even though we had not really done anything
yet. This ends up causing problem with scsi_error.c's TUR.
- If someone has turned on the userspace nop daemon code to try
and detect network problems before the scsi command timeout
we can now drop and clean up the session before the scsi command
timesout and fires the eh speeding up the time it takes for a
command to go from one patch to another. For network problems
we fail the command with DID_BUS_BUSY so if failfast is set
scsi_decide_disposition fails the command up to dm for it to
try on another path.
- And we had to add some basic iscsi session block code. Previously
if we were trying to repair a session we would retrun a MLQUEUE code
in the queuecommand. This worked but it was not the most efficient
or pretty thing to do since it would take a while to relogin
to the target. For iscsi_tcp/open-iscsi a lot of the iscsi error handler
is in userspace the block code is pretty bare. We will be
adding to that for qla4xxx.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
For iscsi boot when going from initramfs to the real root we
need to stop the userpsace iscsi daemon. To later restart it
iscsid needs to be able to rebuild itself and part of that
process is matching a session running the kernel with the
iscsid representation. To do this the attached patch
adds several required iscsi values. If the LLD does not provide
them becuase, login is done in userspace, then the transport
class and userspace set ths up for the LLD.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
from hare@suse.de and michaelc@cs.wisc.edu
hw iscsi like qla4xxx does not allocate a host per session and
for userspace it is difficult to restart iscsid using the
"iscsi handles" for the session and connection, so this
patch just has the class or userspace allocate the id for
the session and connection.
Note: this breaks userspace and requires users to upgrade to the newest
open-iscsi tools. Sorry about his but open-iscsi is still too new to
say we have a stable user-kernel api and we were not good nough
designers to know that other hw iscsi drivers and iscsid itself would
need such changes. Actually we sorta did but at the time we did not
have the HW available to us so we could only guess.
Luckily, the only tools hooking into the class are the open-iscsi ones
or other tools like iscsitart hook into the open-iscsi engine from
userspace or prgroams like anaconda call our tools so they are not affected.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Some devices report a peripheral qualifier of 3 for LUN 0; with the original
code, we would still try a REPORT_LUNS scan (if SCSI level is >= 3 or if we
have the BLIST_REPORTLUNS2 passed in), but NOT any sequential scan.
Also, the device at LUN 0 (which is not connected according to the PQ) is not
registered with the OS.
Unfortunately, SANs exist that are SCSI-2 and do NOT support REPORT_LUNS, but
report a unknown device with PQ 3 on LUN 0. We still need to scan them, and
most probably we even need BLIST_SPARSELUN (and BLIST_LARGELUN). See the bug
reference for an infamous example.
This is patch 3/3:
3. Implement the blacklist flag BLIST_ATTACH_PQ3 that makes the scsi
scanning code register PQ3 devices and continues scanning; only sg
will attach thanks to scsi_bus_match().
Signed-off-by: Kurt Garloff <garloff@suse.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
As previously reported via Michael Reed, the FC transport took a hit
in 2.6.15 (perhaps a little earlier) when we solved a recursion error.
There are 2 deadlocks occurring:
- With scan and the delete items sharing the same workq, flushing the
workq for the delete code was getting it stalled behind a very long
running scan code path.
- There's a deadlock where scsi_remove_target() has to sit behind
scsi_scan_target() due to contention over the scan_lock().
This patch resolves the 1st deadlock and significantly reduces the
odds of the second. So far, we have only replicated the 2nd deadlock
on a highly-parallel SMP system. More on the 2nd deadlock in a following
email.
This patch reworks the transport to:
- Only use the scsi host workq for scanning
- Use 2 other workq's internally. One for deletions, the other for
scheduled deletions. Originally, we tried this with a single workq,
but the occassional flushes of the scheduled queues was hitting the
second deadlock with a slightly higher frequency. In the future, we'll
look at the LLDD's and the transport to see if we can get rid of this
extra overhead.
- When moving to the other workq's we tightened up some object states
and some lock handling.
- Properly syncs adds/deletes
- minor code cleanups
- directly reference fc_host_attrs, rather than through attribute
macros
- flush the right workq on delayed work cancel failures.
Large kudos to Michael Reed who has been working this issue for the last
month.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Original From: Ingo Flaschberger <if@xip.at>
To support the RA4100 array from Compaq.
This patch now correctly handles SCSI_UNKNOWN types with regard to
BLIST_REPORTLUNS2 (allow it) and cdb[1] LUN inclusion (don't).
It also allows a BLIST_MAX_512 flag to restrict the maximum transfer
length to 512 blocks (apparently this is an RA4100 problem).
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
We currently have two implementations of this obsolete ioctl, one in
the block layer and one in the scsi code. Both of them have drawbacks.
This patch kills the scsi layer version after updating the block version
with the missing bits:
- argument checking
- use scatterlist I/O
- set number of retries based on the submitted command
This is the last user of non-S/G I/O except for the gdth driver, so
getting this in ASAP and through the scsi tree would be nie to kill
the non-S/G I/O path. Jens, what do you think about adding a check
for non-S/G I/O in the midlayer?
Thanks to Or Gerlitz for testing this patch.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Overriding the whole EH code is a per-transport, not per-host thing.
Move ->eh_strategy_handler to the transport class, same as
->eh_timed_out.
Downside is that scsi_host_alloc can't check for the total lack of EH
anymore, but the transition period from old EH where we needed it is
long gone already.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
At the moment libata doesn't pass pm_message_t down ata_device_suspend.
This causes drives to be powered down when we just want a freeze,
causing unnecessary wear and tear. This patch gets pm_message_t passed
down so that it can be used to determine whether to power down the
drive.
Signed-off-by: Nigel Cunningham <nigel@suspend2.net>
drivers/scsi/libata-core.c | 5 +++--
drivers/scsi/libata-scsi.c | 4 ++--
drivers/scsi/scsi_sysfs.c | 2 +-
include/linux/libata.h | 4 ++--
include/scsi/scsi_host.h | 2 +-
5 files changed, 9 insertions(+), 8 deletions(-)
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This allows the removal of the contained flag and also does a bit of
class renaming (sas_rphy->sas_device).
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Original from Christoph Hellwig and Eric Moore. This version exports
the scsi_reprobe_device() function as an inline.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch makes expanders appear as labelled objects with properties in
the SAS tree.
I've also modified the phy code to make expander phys appear labelled by
host number, expander number and phy index.
So, for my current config, you see something like this in sysfs:
/sys/class/scsi_host/host1/device/phy-1:4/expander-1:0/phy-1-0:12/rphy-1:0-12/target1:0:1
And the expander properties are:
jejb@sparkweed> cd /sys/class/sas_expander/expander-1\:0/
jejb@sparkweed> for f in *; do echo -n $f ": "; cat $f; done
component_id : 29024
component_revision_id : 4
component_vendor_id : VITESSE
device : cat: device: Is a directory
level : 0
product_id : VSC7160 Eval Brd
product_rev : 4
uevent : cat: uevent: Permission denied
vendor_id : VITESSE
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This moves the eh_timed_out functionality from the scsi_host_template
to the transport_template. Given that this is now a transport function,
the EH_RESET_TIMER case no longer caps the timer reschedulings. The
transport guarantees that this is not an infinite condition.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Begin introducing the concept of sas remote devices that have an rphy
embedded. The first one (this) is a simple end device. All that an
end device really does is have port mode page parameters contained.
The next and more complex piece will be expander remote devices.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
I don't think these exist in silicon yet, but the aic94xx driver has a
register setting for them.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
In order to use the new execute_in_process_context() API, you have to
provide it with the work storage, which I do in SCSI in scsi_device and
scsi_target, but which also means that we can no longer queue up the
target reaps, so instead I moved the target to a state model which
allows target_alloc to detect if we've received a dying target and wait
for it to be gone. Hopefully, this should also solve the target
namespace race.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Some non-standard SCSI targets or protocols, such as USB UFI, report "no
LUN present" by setting the Peripheral Device Type to 0x1f and the
Peripheral Qualifier to 0 (not 3 as the standard requires) in the INQUIRY
response. This patch (as650b) adds a new target flag and code to
accomodate such targets.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Introduce new helpers:
- spi_populate_width_msg()
- spi_populate_sync_msg()
- spi_populate_ppr_msg()
and use them in drivers which already enable the SPI transport.
Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
As devfs has been disabled from the kernel tree for a number of months
now (5 to be exact), here's a patch against 2.6.16-rc1-git1 that removes
support for it from the SCSI subsystem.
The patch also removes the scsi_disk devfs_name field as it's no longer
needed.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
We have several points in the SCSI stack (primarily for our device
functions) where we need to guarantee process context, but (given the
place where the last reference was released) we cannot guarantee this.
This API gets around the issue by executing the function directly if
the caller has process context, but scheduling a workqueue to execute
in process context if the caller doesn't have it. Unfortunately, it
requires memory allocation in interrupt context, but it's better than
what we have previously. The true solution will require a bit of
re-engineering, so isn't appropriate for 2.6.16.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
From:
michaelc@cs.wisc.edufujita.tomonori@lab.ntt.co.jpda-x@monatomic.org
and err path fixup from:
ogerlitz@voltaire.com
This patch cleans up that interface by having the lld and class
pass a iscsi_cls_session or iscsi_cls_conn between each other when
the function is used by HW and SW iscsi llds. This way the lld
does not have to remember if it has to send a handle or pointer
and a handle or pointer to connection, session or host.
This also has the class verify the session handle that gets passed from
userspace instead of using the pointer passed into the kernel directly.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Alex Aizman <itn780@yahoo.com>
Signed-off-by: Dmitry Yusupov <dmitry_yus@yahoo.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>