Commit graph

6994 commits

Author SHA1 Message Date
FUJITA Tomonori
45e79a3acd bsg: add a request_queue argument to scsi_cmd_ioctl()
bsg uses scsi_cmd_ioctl() for some SCSI/sg ioctl
commands. scsi_cmd_ioctl() gets a request queue from a gendisk
arguement. This prevents bsg being bound to SCSI devices that don't
have a gendisk (like OSD). This adds a request_queue argument to
scsi_cmd_ioctl(). The SCSI/sg ioctl commands doesn't use a gendisk so
it's safe for any SCSI devices to use scsi_cmd_ioctl().

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-07-16 08:52:45 +02:00
FUJITA Tomonori
3862153b67 Replace s32, u32 and u64 with __s32, __u32 and __u64 in bsg.h for userspace
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-07-16 08:52:45 +02:00
Jens Axboe
1594a3f0eb bsg: use u32 etc instead of uint32_t
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-07-16 08:52:44 +02:00
FUJITA Tomonori
45977d0e87 bsg: add sg_io_v4 structure
This patch adds sg_io_v4 structure that Doug proposed last month.

There's one major change from the RFC. I dropped iovec, which needs
compat stuff. The bsg code simply calls blk_rq_map_user against
dout_xferp/din_xferp. So if possible, the page frames are directly
mapped. If not possible, the block layer allocates new page frames and
does memory copies.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-07-16 08:52:44 +02:00
FUJITA Tomonori
337ad41dea block: export blk_verify_command for SG v4
blk_fill_sghdr_rq doesn't work for SG v4 so verify_command needed to
be exported.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-07-16 08:52:44 +02:00
Jens Axboe
3d6392cfbd bsg: support for full generic block layer SG v3
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-07-16 08:52:44 +02:00
Linus Torvalds
d3502d7f25 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (53 commits)
  [TCP]: Verify the presence of RETRANS bit when leaving FRTO
  [IPV6]: Call inet6addr_chain notifiers on link down
  [NET_SCHED]: Kill CONFIG_NET_CLS_POLICE
  [NET_SCHED]: act_api: qdisc internal reclassify support
  [NET_SCHED]: sch_dsmark: act_api support
  [NET_SCHED]: sch_atm: act_api support
  [NET_SCHED]: sch_atm: Lindent
  [IPV6]: MSG_ERRQUEUE messages do not pass to connected raw sockets
  [IPV4]: Cleanup call to __neigh_lookup()
  [NET_SCHED]: Revert "avoid transmit softirq on watchdog wakeup" optimization
  [NETFILTER]: nf_conntrack: UDPLITE support
  [NETFILTER]: nf_conntrack: mark protocols __read_mostly
  [NETFILTER]: x_tables: add connlimit match
  [NETFILTER]: Lower *tables printk severity
  [NETFILTER]: nf_conntrack: Don't track locally generated special ICMP error
  [NETFILTER]: nf_conntrack: Introduces nf_ct_get_tuplepr and uses it
  [NETFILTER]: nf_conntrack: make l3proto->prepare() generic and renames it
  [NETFILTER]: nf_conntrack: Increment error count on parsing IPv4 header
  [NET]: Add ethtool support for NETIF_F_IPV6_CSUM devices.
  [AF_IUCV]: Add lock when updating accept_q
  ...
2007-07-15 16:50:46 -07:00
Al Viro
4381ca3c23 fix return type of skb_checksum_complete()
It returns __sum16, not unsigned int

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-15 16:40:51 -07:00
David S. Miller
d09f51b699 Merge master.kernel.org:/pub/scm/linux/kernel/git/herbert/crypto-2.6
Conflicts:

	crypto/Kconfig
2007-07-14 23:47:04 -07:00
Jan Engelhardt
370786f9cf [NETFILTER]: x_tables: add connlimit match
ipt_connlimit has been sitting in POM-NG for a long time.
Here is a new shiny xt_connlimit with:

 * xtables'ified
 * will request the layer3 module
   (previously it hotdropped every packet when it was not loaded)
 * fixed: there was a deadlock in case of an OOM condition
 * support for any layer4 protocol (e.g. UDP/SCTP)
 * using jhash, as suggested by Eric Dumazet
 * ipv6 support

Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 20:47:26 -07:00
Michael Chan
6460d948f3 [NET]: Add ethtool support for NETIF_F_IPV6_CSUM devices.
Add ethtool utility function to set or clear IPV6_CSUM feature flag.
Modify tg3.c and bnx2.c to use this function when doing ethtool -K
to change tx checksum.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 19:07:52 -07:00
David S. Miller
cf3842ec50 Merge branch 'upstream-davem' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6 2007-07-14 18:58:49 -07:00
Patrick McHardy
b863ceb7dd [NET]: Add macvlan driver
Add macvlan driver, which allows to create virtual ethernet devices
based on MAC address.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 18:55:06 -07:00
Patrick McHardy
56addd6eee [VLAN]: Use multicast list synchronization helpers
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 18:53:28 -07:00
Patrick McHardy
6c78dcbd47 [VLAN]: Fix promiscous/allmulti synchronization races
The set_multicast_list function may be called without holding the rtnl
mutex, resulting in races when changing the underlying device's promiscous
and allmulti state. Use the change_rx_mode hook, which is always invoked
under the rtnl.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 18:52:56 -07:00
Patrick McHardy
a0a400d79e [NET]: dev_mcast: add multicast list synchronization helpers
The method drivers currently use to synchronize multicast lists is not
very pretty:

- walk the multicast list
- search each entry on a copy of the previous list
- if new add to lower device
- walk the copy of the previous list
- search each entry on the current list
- if removed delete from lower device
- copy entire list

This patch adds a new field to struct dev_addr_list to store the
synchronization state and adds two helper functions for synchronization
and cleanup.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 18:52:02 -07:00
Patrick McHardy
24023451c8 [NET]: Add net_device change_rx_mode callback
Currently the set_multicast_list (and set_rx_mode) callbacks are
responsible for configuring the device according to the IFF_PROMISC,
IFF_MULTICAST and IFF_ALLMULTI flags and the mc_list (and uc_list in
case of set_rx_mode).

These callbacks can be invoked from BH context without the rtnl_mutex
by dev_mc_add/dev_mc_delete, which makes reading the device flags and
promiscous/allmulti count racy. For real hardware drivers that just
commit all changes to the hardware this is not a real problem since
the stack guarantees to call them for every change, so at least the
final call will not race and commit the correct configuration to the
hardware.

For software devices that want to synchronize promiscous and multicast
state to an underlying device however this can cause corruption of the
underlying device's flags or promisc/allmulti counts.

When the software device is concurrently put in promiscous or allmulti
mode while set_multicast_list is invoked from bottem half context, the
device might synchronize the change to the underlying device without
holding the rtnl_mutex, which races with concurrent changes to the
underlying device.

Add a dev->change_rx_flags hook that is invoked when any of the flags
that affect rx filtering change (under the rtnl_mutex), which allows
drivers to perform synchronization immediately and only synchronize
the address lists in set_multicast_list/set_rx_mode.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 18:51:31 -07:00
Linus Torvalds
16cefa8c38 Merge git://git.linux-nfs.org/pub/linux/nfs-2.6
* git://git.linux-nfs.org/pub/linux/nfs-2.6: (122 commits)
  sunrpc: drop BKL around wrap and unwrap
  NFSv4: Make sure unlock is really an unlock when cancelling a lock
  NLM: fix source address of callback to client
  SUNRPC client: add interface for binding to a local address
  SUNRPC server: record the destination address of a request
  SUNRPC: cleanup transport creation argument passing
  NFSv4: Make the NFS state model work with the nosharedcache mount option
  NFS: Error when mounting the same filesystem with different options
  NFS: Add the mount option "nosharecache"
  NFS: Add support for mounting NFSv4 file systems with string options
  NFS: Add final pieces to support in-kernel mount option parsing
  NFS: Introduce generic mount client API
  NFS: Add enums and match tables for mount option parsing
  NFS: Improve debugging output in NFS in-kernel mount client
  NFS: Clean up in-kernel NFS mount
  NFS: Remake nfsroot_mount as a permanent part of NFS client
  SUNRPC: Add a convenient default for the hostname when calling rpc_create()
  SUNRPC: Rename rpcb_getport to be consistent with new rpcb_getport_sync name
  SUNRPC: Rename rpcb_getport_external routine
  SUNRPC: Allow rpcbind requests to be interrupted by a signal.
  ...
2007-07-13 16:46:18 -07:00
Linus Torvalds
e030dbf91a Merge branch 'ioat-md-accel-for-linus' of git://lost.foo-projects.org/~dwillia2/git/iop
* 'ioat-md-accel-for-linus' of git://lost.foo-projects.org/~dwillia2/git/iop: (28 commits)
  ioatdma: add the unisys "i/oat" pci vendor/device id
  ARM: Add drivers/dma to arch/arm/Kconfig
  iop3xx: surface the iop3xx DMA and AAU units to the iop-adma driver
  iop13xx: surface the iop13xx adma units to the iop-adma driver
  dmaengine: driver for the iop32x, iop33x, and iop13xx raid engines
  md: remove raid5 compute_block and compute_parity5
  md: handle_stripe5 - request io processing in raid5_run_ops
  md: handle_stripe5 - add request/completion logic for async expand ops
  md: handle_stripe5 - add request/completion logic for async read ops
  md: handle_stripe5 - add request/completion logic for async check ops
  md: handle_stripe5 - add request/completion logic for async compute ops
  md: handle_stripe5 - add request/completion logic for async write ops
  md: common infrastructure for running operations with raid5_run_ops
  md: raid5_run_ops - run stripe operations outside sh->lock
  raid5: replace custom debug PRINTKs with standard pr_debug
  raid5: refactor handle_stripe5 and handle_stripe6 (v3)
  async_tx: add the async_tx api
  xor: make 'xor_blocks' a library routine for use with async_tx
  dmaengine: make clients responsible for managing channels
  dmaengine: refactor dmaengine around dma_async_tx_descriptor
  ...
2007-07-13 10:52:27 -07:00
Linus Torvalds
aba2da66cf Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched
* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
  [PATCH] sched: small topology.h cleanup
  [PATCH] sched: fix show_task()/show_tasks() output
  [PATCH] sched: remove stale version info from kernel/sched_debug.c
  [PATCH] sched: allow larger granularity
  [PATCH] sched: fix prio_to_wmult[] for nice 1

[ I re-did the commits to get rid of some bogus merge commit that
  Ingo had. - Linus ]

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-13 10:13:37 -07:00
Ingo Molnar
f787a50306 [PATCH] sched: small topology.h cleanup
trivial cleanup: LOCAL_DISTANCE and REMOTE_DISTANCE are only used in
topology.h and inside an #ifndef section - limit their existence to
that #ifndef.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-13 10:11:52 -07:00
Dan Williams
3039f0735a ioatdma: add the unisys "i/oat" pci vendor/device id
Cc: John Magolan <john.magolan@unisys.com>
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2007-07-13 08:06:19 -07:00
Dan Williams
b5e98d65d3 md: handle_stripe5 - add request/completion logic for async read ops
When a read bio is attached to the stripe and the corresponding block is
marked R5_UPTODATE, then a read (biofill) operation is scheduled to copy
the data from the stripe cache to the bio buffer.  handle_stripe flags the
blocks to be operated on with the R5_Wantfill flag.  If new read requests
arrive while raid5_run_ops is running they will not be handled until
handle_stripe is scheduled to run again.

Changelog:
* cleanup to_read and to_fill accounting
* do not fail reads that have reached the cache

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-By: NeilBrown <neilb@suse.de>
2007-07-13 08:06:17 -07:00
Dan Williams
f38e12199a md: handle_stripe5 - add request/completion logic for async compute ops
handle_stripe will compute a block when a backing disk has failed, or when
it determines it can save a disk read by computing the block from all the
other up-to-date blocks.

Previously a block would be computed under the lock and subsequent logic in
handle_stripe could use the newly up-to-date block.  With the raid5_run_ops
implementation the compute operation is carried out a later time outside
the lock.  To preserve the old functionality we take advantage of the
dependency chain feature of async_tx to flag the block as R5_Wantcompute
and then let other parts of handle_stripe operate on the block as if it
were up-to-date.  raid5_run_ops guarantees that the block will be ready
before it is used in another operation.

However, this only works in cases where the compute and the dependent
operation are scheduled at the same time.  If a previous call to
handle_stripe sets the R5_Wantcompute flag there is no facility to pass the
async_tx dependency chain across successive calls to raid5_run_ops.  The
req_compute variable protects against this case.

Changelog:
* remove the req_compute BUG_ON

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-By: NeilBrown <neilb@suse.de>
2007-07-13 08:06:17 -07:00
Dan Williams
91c0092484 md: raid5_run_ops - run stripe operations outside sh->lock
When the raid acceleration work was proposed, Neil laid out the following
attack plan:

1/ move the xor and copy operations outside spin_lock(&sh->lock)
2/ find/implement an asynchronous offload api

The raid5_run_ops routine uses the asynchronous offload api (async_tx) and
the stripe_operations member of a stripe_head to carry out xor+copy
operations asynchronously, outside the lock.

To perform operations outside the lock a new set of state flags is needed
to track new requests, in-flight requests, and completed requests.  In this
new model handle_stripe is tasked with scanning the stripe_head for work,
updating the stripe_operations structure, and finally dropping the lock and
calling raid5_run_ops for processing.  The following flags outline the
requests that handle_stripe can make of raid5_run_ops:

STRIPE_OP_BIOFILL
 - copy data into request buffers to satisfy a read request
STRIPE_OP_COMPUTE_BLK
 - generate a missing block in the cache from the other blocks
STRIPE_OP_PREXOR
 - subtract existing data as part of the read-modify-write process
STRIPE_OP_BIODRAIN
 - copy data out of request buffers to satisfy a write request
STRIPE_OP_POSTXOR
 - recalculate parity for new data that has entered the cache
STRIPE_OP_CHECK
 - verify that the parity is correct
STRIPE_OP_IO
 - submit i/o to the member disks (note this was already performed outside
   the stripe lock, but it made sense to add it as an operation type

The flow is:
1/ handle_stripe sets STRIPE_OP_* in sh->ops.pending
2/ raid5_run_ops reads sh->ops.pending, sets sh->ops.ack, and submits the
   operation to the async_tx api
3/ async_tx triggers the completion callback routine to set
   sh->ops.complete and release the stripe
4/ handle_stripe runs again to finish the operation and optionally submit
   new operations that were previously blocked

Note this patch just defines raid5_run_ops, subsequent commits (one per
major operation type) modify handle_stripe to take advantage of this
routine.

Changelog:
* removed ops_complete_biodrain in favor of ops_complete_postxor and
  ops_complete_write.
* removed the raid5_run_ops workqueue
* call bi_end_io for reads in ops_complete_biofill, saves a call to
  handle_stripe
* explicitly handle the 2-disk raid5 case (xor becomes memcpy), Neil Brown
* fix race between async engines and bi_end_io call for reads, Neil Brown
* remove unnecessary spin_lock from ops_complete_biofill
* remove test_and_set/test_and_clear BUG_ONs, Neil Brown
* remove explicit interrupt handling for channel switching, this feature
  was absorbed (i.e. it is now implicit) by the async_tx api
* use return_io in ops_complete_biofill

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-By: NeilBrown <neilb@suse.de>
2007-07-13 08:06:15 -07:00
Dan Williams
a445685647 raid5: refactor handle_stripe5 and handle_stripe6 (v3)
handle_stripe5 and handle_stripe6 have very deep logic paths handling the
various states of a stripe_head.  By introducing the 'stripe_head_state'
and 'r6_state' objects, large portions of the logic can be moved to
sub-routines.

'struct stripe_head_state' consumes all of the automatic variables that previously
stood alone in handle_stripe5,6.  'struct r6_state' contains the handle_stripe6
specific variables like p_failed and q_failed.

One of the nice side effects of the 'stripe_head_state' change is that it
allows for further reductions in code duplication between raid5 and raid6.
The following new routines are shared between raid5 and raid6:

	handle_completed_write_requests
	handle_requests_to_failed_array
	handle_stripe_expansion

Changes:
* v2: fixed 'conf->raid_disk-1' for the raid6 'handle_stripe_expansion' path
* v3: removed the unused 'dirty' field from struct stripe_head_state
* v3: coalesced open coded bi_end_io routines into return_io()

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-By: NeilBrown <neilb@suse.de>
2007-07-13 08:06:15 -07:00
Dan Williams
9bc89cd82d async_tx: add the async_tx api
The async_tx api provides methods for describing a chain of asynchronous
bulk memory transfers/transforms with support for inter-transactional
dependencies.  It is implemented as a dmaengine client that smooths over
the details of different hardware offload engine implementations.  Code
that is written to the api can optimize for asynchronous operation and the
api will fit the chain of operations to the available offload resources. 
 
	I imagine that any piece of ADMA hardware would register with the
	'async_*' subsystem, and a call to async_X would be routed as
	appropriate, or be run in-line. - Neil Brown

async_tx exploits the capabilities of struct dma_async_tx_descriptor to
provide an api of the following general format:

struct dma_async_tx_descriptor *
async_<operation>(..., struct dma_async_tx_descriptor *depend_tx,
			dma_async_tx_callback cb_fn, void *cb_param)
{
	struct dma_chan *chan = async_tx_find_channel(depend_tx, <operation>);
	struct dma_device *device = chan ? chan->device : NULL;
	int int_en = cb_fn ? 1 : 0;
	struct dma_async_tx_descriptor *tx = device ?
		device->device_prep_dma_<operation>(chan, len, int_en) : NULL;

	if (tx) { /* run <operation> asynchronously */
		...
		tx->tx_set_dest(addr, tx, index);
		...
		tx->tx_set_src(addr, tx, index);
		...
		async_tx_submit(chan, tx, flags, depend_tx, cb_fn, cb_param);
	} else { /* run <operation> synchronously */
		...
		<operation>
		...
		async_tx_sync_epilog(flags, depend_tx, cb_fn, cb_param);
	}

	return tx;
}

async_tx_find_channel() returns a capable channel from its pool.  The
channel pool is organized as a per-cpu array of channel pointers.  The
async_tx_rebalance() routine is tasked with managing these arrays.  In the
uniprocessor case async_tx_rebalance() tries to spread responsibility
evenly over channels of similar capabilities.  For example if there are two
copy+xor channels, one will handle copy operations and the other will
handle xor.  In the SMP case async_tx_rebalance() attempts to spread the
operations evenly over the cpus, e.g. cpu0 gets copy channel0 and xor
channel0 while cpu1 gets copy channel 1 and xor channel 1.  When a
dependency is specified async_tx_find_channel defaults to keeping the
operation on the same channel.  A xor->copy->xor chain will stay on one
channel if it supports both operation types, otherwise the transaction will
transition between a copy and a xor resource.

Currently the raid5 implementation in the MD raid456 driver has been
converted to the async_tx api.  A driver for the offload engines on the
Intel Xscale series of I/O processors, iop-adma, is provided in a later
commit.  With the iop-adma driver and async_tx, raid456 is able to offload
copy, xor, and xor-zero-sum operations to hardware engines.
 
On iop342 tiobench showed higher throughput for sequential writes (20 - 30%
improvement) and sequential reads to a degraded array (40 - 55%
improvement).  For the other cases performance was roughly equal, +/- a few
percentage points.  On a x86-smp platform the performance of the async_tx
implementation (in synchronous mode) was also +/- a few percentage points
of the original implementation.  According to 'top' on iop342 CPU
utilization drops from ~50% to ~15% during a 'resync' while the speed
according to /proc/mdstat doubles from ~25 MB/s to ~50 MB/s.
 
The tiobench command line used for testing was: tiobench --size 2048
--block 4096 --block 131072 --dir /mnt/raid --numruns 5
* iop342 had 1GB of memory available

Details:
* if CONFIG_DMA_ENGINE=n the asynchronous path is compiled away by making
  async_tx_find_channel a static inline routine that always returns NULL
* when a callback is specified for a given transaction an interrupt will
  fire at operation completion time and the callback will occur in a
  tasklet.  if the the channel does not support interrupts then a live
  polling wait will be performed
* the api is written as a dmaengine client that requests all available
  channels
* In support of dependencies the api implicitly schedules channel-switch
  interrupts.  The interrupt triggers the cleanup tasklet which causes
  pending operations to be scheduled on the next channel
* Xor engines treat an xor destination address differently than a software
  xor routine.  To the software routine the destination address is an implied
  source, whereas engines treat it as a write-only destination.  This patch
  modifies the xor_blocks routine to take a an explicit destination address
  to mirror the hardware.

Changelog:
* fixed a leftover debug print
* don't allow callbacks in async_interrupt_cond
* fixed xor_block changes
* fixed usage of ASYNC_TX_XOR_DROP_DEST
* drop dma mapping methods, suggested by Chris Leech
* printk warning fixups from Andrew Morton
* don't use inline in C files, Adrian Bunk
* select the API when MD is enabled
* BUG_ON xor source counts <= 1
* implicitly handle hardware concerns like channel switching and
  interrupts, Neil Brown
* remove the per operation type list, and distribute operation capabilities
  evenly amongst the available channels
* simplify async_tx_find_channel to optimize the fast path
* introduce the channel_table_initialized flag to prevent early calls to
  the api
* reorganize the code to mimic crypto
* include mm.h as not all archs include it in dma-mapping.h
* make the Kconfig options non-user visible, Adrian Bunk
* move async_tx under crypto since it is meant as 'core' functionality, and
  the two may share algorithms in the future
* move large inline functions into c files
* checkpatch.pl fixes
* gpl v2 only correction

Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-By: NeilBrown <neilb@suse.de>
2007-07-13 08:06:14 -07:00
Dan Williams
685784aaf3 xor: make 'xor_blocks' a library routine for use with async_tx
The async_tx api tries to use a dma engine for an operation, but will fall
back to an optimized software routine otherwise.  Xor support is
implemented using the raid5 xor routines.  For organizational purposes this
routine is moved to a common area.

The following fixes are also made:
* rename xor_block => xor_blocks, suggested by Adrian Bunk
* ensure that xor.o initializes before md.o in the built-in case
* checkpatch.pl fixes
* mark calibrate_xor_blocks __init, Adrian Bunk

Cc: Adrian Bunk <bunk@stusta.de>
Cc: NeilBrown <neilb@suse.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2007-07-13 08:06:14 -07:00
Dan Williams
d379b01e90 dmaengine: make clients responsible for managing channels
The current implementation assumes that a channel will only be used by one
client at a time.  In order to enable channel sharing the dmaengine core is
changed to a model where clients subscribe to channel-available-events.
Instead of tracking how many channels a client wants and how many it has
received the core just broadcasts the available channels and lets the
clients optionally take a reference.  The core learns about the clients'
needs at dma_event_callback time.

In support of multiple operation types, clients can specify a capability
mask to only be notified of channels that satisfy a certain set of
capabilities.

Changelog:
* removed DMA_TX_ARRAY_INIT, no longer needed
* dma_client_chan_free -> dma_chan_release: switch to global reference
  counting only at device unregistration time, before it was also happening
  at client unregistration time
* clients now return dma_state_client to dmaengine (ack, dup, nak)
* checkpatch.pl fixes
* fixup merge with git-ioat

Cc: Chris Leech <christopher.leech@intel.com>
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: David S. Miller <davem@davemloft.net>
2007-07-13 08:06:13 -07:00
Dan Williams
7405f74bad dmaengine: refactor dmaengine around dma_async_tx_descriptor
The current dmaengine interface defines mutliple routines per operation,
i.e. dma_async_memcpy_buf_to_buf, dma_async_memcpy_buf_to_page etc.  Adding
more operation types (xor, crc, etc) to this model would result in an
unmanageable number of method permutations.

	Are we really going to add a set of hooks for each DMA engine
	whizbang feature?
		- Jeff Garzik

The descriptor creation process is refactored using the new common
dma_async_tx_descriptor structure.  Instead of per driver
do_<operation>_<dest>_to_<src> methods, drivers integrate
dma_async_tx_descriptor into their private software descriptor and then
define a 'prep' routine per operation.  The prep routine allocates a
descriptor and ensures that the tx_set_src, tx_set_dest, tx_submit routines
are valid.  Descriptor creation and submission becomes:

struct dma_device *dev;
struct dma_chan *chan;
struct dma_async_tx_descriptor *tx;

tx = dev->device_prep_dma_<operation>(chan, len, int_flag)
tx->tx_set_src(dma_addr_t, tx, index /* for multi-source ops */)
tx->tx_set_dest(dma_addr_t, tx, index)
tx->tx_submit(tx)

In addition to the refactoring, dma_async_tx_descriptor also lays the
groundwork for definining cross-channel-operation dependencies, and a
callback facility for asynchronous notification of operation completion.

Changelog:
* drop dma mapping methods, suggested by Chris Leech
* fix ioat_dma_dependency_added, also caught by Andrew Morton
* fix dma_sync_wait, change from Andrew Morton
* uninline large functions, change from Andrew Morton
* add tx->callback = NULL to dmaengine calls to interoperate with async_tx
  calls
* hookup ioat_tx_submit
* convert channel capabilities to a 'cpumask_t like' bitmap
* removed DMA_TX_ARRAY_INIT, no longer needed
* checkpatch.pl fixes
* make set_src, set_dest, and tx_submit descriptor specific methods
* fixup git-ioat merge
* move group_list and phys to dma_async_tx_descriptor

Cc: Jeff Garzik <jeff@garzik.org>
Cc: Chris Leech <christopher.leech@intel.com>
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: David S. Miller <davem@davemloft.net>
2007-07-13 08:06:11 -07:00
Linus Torvalds
9374430a52 Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/usb-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/usb-2.6: (149 commits)
  USB: ohci-pnx4008: Remove unnecessary cast of return value of kzalloc
  USB: additions to the quirk list
  usb-storage: implement autosuspend
  USB: cdc-acm: add new device id to option driver
  USB: goku_udc trivial cleanups
  USB: usb gadget stack can now -DDEBUG with Kconfig
  usb gadget stack: remove usb_ep_*_buffer(), part 2
  usb gadget stack: remove usb_ep_*_buffer(), part 1
  USB: pxa2xx_udc -- cleanups, mostly removing dma hooks
  USB: pxa2xx_udc: use generic gpio layer
  USB: quirk for samsung printer
  USB: usb/dma doc updates
  USB: drivers/usb/storage/unusual_devs.h whitespace cleanup
  USB: remove Makefile reference to obsolete OHCI_AT91
  USB: io_*: remove bogus termios no change checks
  USB: mos7720: remove bogus no termios change check
  USB: visor and whiteheat: remove bogus termios change checks
  USB: pl2303: remove bogus checks and fix speed support to use tty_get_baud_rate()
  USB: mos7840.c: turn this into a serial driver
  USB: make the usb_device numa_node get assigned from controller
  ...
2007-07-12 16:46:58 -07:00
Linus Torvalds
0cdf6990e9 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (76 commits)
  IB: Update MAINTAINERS with Hal's new email address
  IB/mlx4: Implement query SRQ
  IB/mlx4: Implement query QP
  IB/cm: Send no match if a SIDR REQ does not match a listen
  IB/cm: Fix handling of duplicate SIDR REQs
  IB/cm: cm_msgs.h should include ib_cm.h
  IB/cm: Include HCA ACK delay in local ACK timeout
  IB/cm: Use spin_lock_irq() instead of spin_lock_irqsave() when possible
  IB/sa: Make sure SA queries use default P_Key
  IPoIB: Recycle loopback skbs instead of freeing and reallocating
  IB/mthca: Replace memset(<addr>, 0, PAGE_SIZE) with clear_page(<addr>)
  IPoIB/cm: Fix warning if IPV6 is not enabled
  IB/core: Take sizeof the correct pointer when calling kmalloc()
  IB/ehca: Improve latency by unlocking after triggering the hardware
  IB/ehca: Notify consumers of LID/PKEY/SM changes after nondisruptive events
  IB/ehca: Return QP pointer in poll_cq()
  IB/ehca: Change idr spinlocks into rwlocks
  IB/ehca: Refactor sync between completions and destroy_cq using atomic_t
  IB/ehca: Lock renaming, static initializers
  IB/ehca: Report RDMA atomic attributes in query_qp()
  ...
2007-07-12 16:45:40 -07:00
David Brownell
c67ab134ba usb gadget stack: remove usb_ep_*_buffer(), part 2
This patch removes controller driver infrastructure which supported
the now-removed usb_ep_{alloc,free}_buffer() calls.

As can be seen, many of the implementations of this were broken to
various degrees.  Many didn't properly return dma-coherent mappings;
those which did so were necessarily ugly because of bogosity in the
underlying dma_free_coherent() calls ... which on many platforms
can't be called from the same contexts (notably in_irq) from which
their dma_alloc_coherent() sibling can be called.

The main potential downside of removing this is that gadget drivers
wouldn't have specific knowledge that the controller drivers have:
endpoints that aren't dma-capable don't need any dma mappings at all.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-12 16:34:42 -07:00
David Brownell
9d8bab58b7 usb gadget stack: remove usb_ep_*_buffer(), part 1
Remove usb_ep_{alloc,free}_buffer() calls, for small dma-coherent buffers.
This patch just removes the interface and its users; later patches will
remove controller driver support.

  - This interface is invariably not implemented correctly in the
    controller drivers (e.g. using dma pools, a mechanism which
    post-dates the interface by several years).

  - At this point no gadget driver really *needs* to use it.  In
    current kernels, any driver that needs such a mechanism could
    allocate a dma pool themselves.

Removing this interface is thus a simplification and improvement.

Note that the gmidi.c driver had a bug in this area; fixed.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-12 16:34:42 -07:00
Craig W. Nadler
165fe97ed6 USB: add IAD support to usbfs and sysfs
USB_IAD: Adds support for USB Interface Association Descriptors.

This patch adds support to the USB host stack for parsing, storing, and
displaying Interface Association Descriptors. In /proc/bus/usb/devices
lines starting with A: show the fields in an IAD. In sysfs if an
interface on a USB device is referenced by an IAD the following files
will be added to the sysfs directory for that interface:
iad_bFirstInterface, iad_bInterfaceCount, iad_bFunctionClass, and
iad_bFunctionSubClass, iad_bFunctionProtocol

Signed-off-by: Craig W. Nadler <craig@nadler.us>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-12 16:34:40 -07:00
Marcel Holtmann
8b3b01c898 USB: Add URB_FREE_BUFFER flag and the logic behind it
USB: Add URB_FREE_BUFFER flag for freeing the transfer buffer

In some cases it is not needed that the driver keeps track of the
transfer buffer of an URB. It can be simply freed along with the
URB itself when the reference count goes down to zero. The new
flag URB_FREE_BUFFER enables this behavior.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-12 16:34:38 -07:00
Alan Stern
b41a60eca8 USB: add power/persist device attribute
This patch (as920) adds an extra level of protection to the
USB-Persist facility.  Now it will apply by default only to hubs; for
all other devices the user must enable it explicitly by setting the
power/persist device attribute.

The disconnect_all_children() routine in hub.c has been removed and
its code placed inline.  This is the way it was originally as part of
hub_pre_reset(); the revised usage in hub_reset_resume() is
sufficiently different that the code can no longer be shared.
Likewise, mark_children_for_reset() is now inline as part of
hub_reset_resume().  The end result looks much cleaner than before.

The sysfs interface is updated to add the new attribute file, and
there are corresponding documentation updates.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-12 16:34:30 -07:00
Alan Stern
f07600cf9e USB: add reset_resume method
This patch (as918) introduces a new USB driver method: reset_resume.
It is called when a device needs to be reset as part of a resume
procedure (whether because of a device quirk or because of the
USB-Persist facility), thereby taking over a role formerly assigned to
the post_reset method.  As a consequence, post_reset no longer needs
an argument indicating whether it is being called as part of a
reset-resume.  This separation of functions makes the code clearer.

In addition, the pre_reset and post_reset method return types are
changed; they now must return an error code.  The return value is
unused at present, but at some later time we may unbind drivers and
re-probe if they encounter an error during reset handling.

The existing pre_reset and post_reset methods in the usbhid,
usb-storage, and hub drivers are updated to match the new
requirements.  For usbhid the post_reset routine is also used for
reset_resume (duplicate method pointers); for the other drivers a new
reset_resume routine is added.  The change to hub.c looks bigger than
it really is, because mark_children_for_reset_resume() gets moved down
next to the new hub_reset_resume() routine.

A minor change to usb-storage makes the usb_stor_report_bus_reset()
routine acquire the host lock instead of requiring the caller to hold
it already.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
CC: Matthew Dharm <mdharm-usb@one-eyed-alien.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-12 16:34:30 -07:00
Oliver Neukum
51a2f077c4 USB: introduce usb_anchor
- introduction of usb_anchor and its methods

Signed-off-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-12 16:29:51 -07:00
David Brownell
a5262dcfda USB: export <linux/usb_gadgetfs> as <linux/usb/gadgetfs.h>
Make sure gadgetfs userspace interface is properly exported:

 - Move <linux/usb_gadgetfs.h> to <linux/usb/gadgetfs.h>;
 - Export it using Kbuild;
 - Add an #include guard;
 - Correct some internal documentation;
 - Update struct layout so it's the same on 32/64 bit kernels.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-12 16:29:50 -07:00
Daniel Drake
8538f96ae5 USB: add USB_DEVICE_AND_INTERFACE_INFO for device matching
Recently, the USB device matching code stopped matching generic interface
matches against devices with vendor-specific device class values.

Some drivers now need to explicitly match USB device ID's (in addition to
generic interface info) to retain the same behaviour as before. This new macro,
suggested by Alan Stern, makes the explicit device/interface matching a little
simpler for those users.

Signed-off-by: Daniel Drake <dsd@gentoo.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-12 16:29:48 -07:00
Alan Stern
6bc6cff52e USB: add RESET_RESUME device quirk
This patch (as888) adds a new USB device quirk for devices which are
unable to resume correctly.  By using the new code added for the
USB-persist facility, it is a simple matter to reset these devices
instead of resuming them.  To get things kicked off, a quirk entry is
added for the Philips PSC805.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-12 16:29:47 -07:00
Alan Stern
0458d5b4c9 USB: add USB-Persist facility
This patch (as886) adds the controversial USB-persist facility,
allowing USB devices to persist across a power loss during system
suspend.

The facility is controlled by a new Kconfig option (with appropriate
warnings about the potential dangers); when the option is off the
behavior will remain the same as it is now.  But when the option is
on, people will be able to use suspend-to-disk and keep their USB
filesystems intact -- something particularly valuable for small
machines where the root filesystem is on a USB device!

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-12 16:29:47 -07:00
Oliver Neukum
ec22559e0b USB: suspend support for usb serial
this implements generic support for suspend/resume for usb serial.

Signed-off-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-12 16:29:44 -07:00
Jack Morgenstein
65541cb7cf IB/mlx4: Implement query SRQ
Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-12 15:41:24 -07:00
Jack Morgenstein
6a775e2ba4 IB/mlx4: Implement query QP
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-12 15:41:00 -07:00
Linus Torvalds
bb50cbbd4b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6:
  security: unexport mmap_min_addr
  SELinux: use SECINITSID_NETMSG instead of SECINITSID_UNLABELED for NetLabel
  security: Protection for exploiting null dereference using mmap
  SELinux: Use %lu for inode->i_no when printing avc
  SELinux: allow preemption between transition permission checks
  selinux: introduce schedule points in policydb_destroy()
  selinux: add selinuxfs structure for object class discovery
  selinux: change sel_make_dir() to specify inode counter.
  selinux: rename sel_remove_bools() for more general usage.
  selinux: add support for querying object classes and permissions from the running policy
2007-07-12 13:46:48 -07:00
Linus Torvalds
0806ca2ab3 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] Support multiple CPUs going through OS_MCA
  [IA64] silence GCC ia64 unused variable warnings
  [IA64] prevent MCA when performing MMIO mmap to PCI config space
  [IA64] add sn_register_pmi_handler oemcall
  [IA64] Stop bit for brl instruction
  [IA64] SN: Correct ROM resource length for BIOS copy
  [IA64] Don't set psr.ic and psr.i simultaneously
2007-07-12 13:41:29 -07:00
Linus Torvalds
21ba0f88ae Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: (34 commits)
  PCI: Only build PCI syscalls on architectures that want them
  PCI: limit pci_get_bus_and_slot to domain 0
  PCI: hotplug: acpiphp: avoid acpiphp "cannot get bridge info" PCI hotplug failure
  PCI: hotplug: acpiphp: remove hot plug parameter write to PCI host bridge
  PCI: hotplug: acpiphp: fix slot poweroff problem on systems without _PS3
  PCI: hotplug: pciehp: wait for 1 second after power off slot
  PCI: pci_set_power_state(): check for PM capabilities earlier
  PCI: cpci_hotplug: Convert to use the kthread API
  PCI: add pci_try_set_mwi
  PCI: pcie: remove SPIN_LOCK_UNLOCKED
  PCI: ROUND_UP macro cleanup in drivers/pci
  PCI: remove pci_dac_dma_... APIs
  PCI: pci-x-pci-express-read-control-interfaces cleanups
  PCI: Fix typo in include/linux/pci.h
  PCI: pci_ids, remove double or more empty lines
  PCI: pci_ids, add atheros and 3com_2 vendors
  PCI: pci_ids, reorder some entries
  PCI: i386: traps, change VENDOR to DEVICE
  PCI: ATM: lanai, change VENDOR to DEVICE
  PCI: Change all drivers to use pci_device->revision
  ...
2007-07-12 13:40:57 -07:00
Linus Torvalds
dc690d8ef8 Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6: (61 commits)
  sysfs: add parameter "struct bin_attribute *" in .read/.write methods for sysfs binary attributes
  sysfs: make directory dentries and inodes reclaimable
  sysfs: implement sysfs_get_dentry()
  sysfs: move sysfs_drop_dentry() to dir.c and make it static
  sysfs: restructure add/remove paths and fix inode update
  sysfs: use sysfs_mutex to protect the sysfs_dirent tree
  sysfs: consolidate sysfs spinlocks
  sysfs: make kobj point to sysfs_dirent instead of dentry
  sysfs: implement sysfs_find_dirent() and sysfs_get_dirent()
  sysfs: implement SYSFS_FLAG_REMOVED flag
  sysfs: rename sysfs_dirent->s_type to s_flags and make room for flags
  sysfs: make sysfs_drop_dentry() access inodes using ilookup()
  sysfs: Fix oops in sysfs_drop_dentry on x86_64
  sysfs: use singly-linked list for sysfs_dirent tree
  sysfs: slim down sysfs_dirent->s_active
  sysfs: move s_active functions to fs/sysfs/dir.c
  sysfs: fix root sysfs_dirent -> root dentry association
  sysfs: use iget_locked() instead of new_inode()
  sysfs: reorganize sysfs_new_indoe() and sysfs_create()
  sysfs: fix parent refcounting during rename and move
  ...
2007-07-12 13:40:20 -07:00
Linus Torvalds
57399ec907 Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev: (21 commits)
  libata: remove irq_on from ata_bus_reset() and ata_std_postreset()
  ata_piix: kill incorrect invalid map value warning
  libata: add another Maxtor drive with broken NCQ to the list
  [libata] sata_mv: Fix and clean up per-chip-generation tests
  [libata] sata_mv: Convert to new exception handling (EH) infrastructure
  [libata] sata_mv: minor bug fixes, enhancements, and cleanups (prep for new EH)
  [libata] sata_mv: Minor cleanups and renaming, preparing for new EH & NCQ
  libata-link: add PMP related ATA constants
  libata-link: separate out ata_eh_handle_dev_fail()
  pata_hpt3x3: fix DMA Kconfig option to actually have a hope of working
  Add Hitachi HDS7250SASUN500G 0621KTAWSD to NCQ blacklist
  pata_scc.c: Workaround for errata A308
  libata: add FUJITSU MHV2080BH to NCQ blacklist
  pata_hpt3x3: major reworking and testing
  libata: clean up horkage handling
  libata: quirk IOMEGA ZIP 250 ATAPI FLOPPY
  libata: simplify PCI legacy SFF host handling
  pata_mpc52xx: suspend/resume support
  sata_promise: SATA hotplug support, take 2
  pata_sis: FIFO whack
  ...
2007-07-12 13:38:50 -07:00
Linus Torvalds
e1bd2ac5a6 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (183 commits)
  [TG3]: Update version to 3.78.
  [TG3]: Add missing NVRAM strapping.
  [TG3]: Enable auto MDI.
  [TG3]: Fix the polarity bit.
  [TG3]: Fix irq_sync race condition.
  [NET_SCHED]: ematch: module autoloading
  [TCP]: tcp probe wraparound handling and other changes
  [RTNETLINK]: rtnl_link: allow specifying initial device address
  [RTNETLINK]: rtnl_link API simplification
  [VLAN]: Fix MAC address handling
  [ETH]: Validate address in eth_mac_addr
  [NET]: Fix races in net_rx_action vs netpoll.
  [AF_UNIX]: Rewrite garbage collector, fixes race.
  [NETFILTER]: {ip, nf}_conntrack_sctp: fix remotely triggerable NULL ptr dereference (CVE-2007-2876)
  [NET]: Make all initialized struct seq_operations const.
  [UDP]: Fix length check.
  [IPV6]: Remove unneeded pointer idev from addrconf_cleanup().
  [DECNET]: Another unnecessary net/tcp.h inclusion in net/dn.h
  [IPV6]: Make IPV6_{RECV,2292}RTHDR boolean options.
  [IPV6]: Do not send RH0 anymore.
  ...

Fixed up trivial conflict in Documentation/feature-removal-schedule.txt
manually.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-12 13:31:22 -07:00
Linus Torvalds
068345f4a8 Merge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6
* 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6: (26 commits)
  i2c-rpx: Remove
  i2c-mpc: work around missing-9th-clock-pulse bug
  i2c: New PMC MSP71xx TWI bus driver
  i2c-savage4: Delete many unused defines
  i2c/tsl2550: Speed up initialization
  i2c: New bus driver for the TAOS evaluation modules
  i2c-i801: Use the internal 32-byte buffer on ICH4+
  i2c-i801: Various cleanups
  i2c: Add support for the TSL2550
  i2c-pxa: Support new-style I2C drivers
  i2c-gpio: Make some internal functions static
  i2c-gpio: Add support for new-style clients
  i2c-iop3xx: Switch to static adapter numbering
  i2c-sis5595: Resolve resource conflict with sis5595
  matroxfb: Clean-up i2c header inclusions
  i2c-nforce2: Add support for SMBus block transactions
  i2c-mpc: Use i2c_add_numbered_adapter
  i2c-mv64xxx: Use i2c_add_numbered_adapter
  i2c-piix4: Add support for the ATI SB700
  i2c: New DS1682 chip driver
  ...
2007-07-12 13:25:00 -07:00
Daniel Drake
5628221caf [PATCH] mac80211: ERP IE handling improvements
The "protection needed" flag is currently parsed out of the ERP IE in
beacons. This patch allows the ERP IE to be available at assocation time
and causes the appropriate actions to be performed earlier.

It is slightly complicated by the fact that most APs don't include the
ERP IE in association responses. To work around this, we store ERP
values in the ieee80211_sta_bss structure.

Also added some WLAN_ERP defines for use by upcoming patches.

Signed-off-by: Jiri Benc <jbenc@suse.cz>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-07-12 16:07:26 -04:00
H. Peter Anvin
c397368232 Remove old i386 setup code
This removes the old i386 setup code.  This is done as a separate patch
to avoid breaking git bisect as some of the i386 code was also used by
the old x86-64 code.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-12 10:55:56 -07:00
H. Peter Anvin
e087db510c Clean up struct screen_info (<linux/screen_info.h>)
struct screen_info has unaligned members, it needs to be packed.
In the process, fix the naming of some of the members, which don't
belong in this structure but are part of it anyway.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-12 10:55:54 -07:00
Jean Delvare
b9cdad7488 i2c: New bus driver for the TAOS evaluation modules
This is a new I2C bus driver for the TAOS evaluation modules. Developped
and tested on the TAOS TSL2550 EVM.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
2007-07-12 14:12:31 +02:00
Henry Su
c29c22218b i2c-piix4: Add support for the ATI SB700
Add the SMBus device ID for ATI SB700.

Signed-off-by: Henry Su <Henry.su@amd.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2007-07-12 14:12:29 +02:00
Jean Delvare
4b2643d7d9 i2c: Fix the i2c_smbus_read_i2c_block_data() prototype
Let the drivers specify how many bytes they want to read with
i2c_smbus_read_i2c_block_data(). So far, the block count was
hard-coded to I2C_SMBUS_BLOCK_MAX (32), which did not make much sense.
Many driver authors complained about this before, and I believe it's
about time to fix it. Right now, authors have to do technically stupid
things, such as individual byte reads or full-fledged I2C messaging,
to work around the problem. We do not want to encourage that.

I even found that some bus drivers (e.g. i2c-amd8111) already
implemented I2C block read the "right" way, that is, they didn't
follow the old, broken standard. The fact that it was never noticed
before just shows how little i2c_smbus_read_i2c_block_data() was used,
which isn't that surprising given how broken its prototype was so far.

There are some obvious compatiblity considerations:
* This changes the i2c_smbus_read_i2c_block_data() prototype. Users
  outside the kernel tree will notice at compilation time, and will
  have to update their code.
* User-space has access to i2c_smbus_xfer() directly using i2c-dev, so
  the changed expectations would affect tools such as i2cdump. In order
  to preserve binary compatibility, we give I2C_SMBUS_I2C_BLOCK_DATA
  a new numeric value, and define I2C_SMBUS_I2C_BLOCK_BROKEN with the
  old numeric value. When i2c-dev receives a transaction with the
  old value, it can convert it to the new format on the fly.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
2007-07-12 14:12:29 +02:00
Mark M. Hoffman
d75d53cd57 i2c: Fix sparse warning in i2c.h
Kill a sparse warning by un-nesting two container_of() calls.
    
Signed-off-by: Mark M. Hoffman <mhoffman@lightlink.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2007-07-12 14:12:28 +02:00
David Brownell
d64f73be1b i2c: Add kernel documentation
Generate I2C kerneldoc; fix various glitches and add "context" sections to
that documentation.  Most I2C and SMBus functions still have no kerneldoc.

Let me suggest providing kerneldoc for all the i2c_smbus_*() functions as
a small and mostly self-contained project for anyone so inclined.  :)

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2007-07-12 14:12:28 +02:00
Eric Paris
ed03218951 security: Protection for exploiting null dereference using mmap
Add a new security check on mmap operations to see if the user is attempting
to mmap to low area of the address space.  The amount of space protected is
indicated by the new proc tunable /proc/sys/vm/mmap_min_addr and defaults to
0, preserving existing behavior.

This patch uses a new SELinux security class "memprotect."  Policy already
contains a number of allow rules like a_t self:process * (unconfined_t being
one of them) which mean that putting this check in the process class (its
best current fit) would make it useless as all user processes, which we also
want to protect against, would be allowed. By taking the memprotect name of
the new class it will also make it possible for us to move some of the other
memory protect permissions out of 'process' and into the new class next time
we bump the policy version number (which I also think is a good future idea)

Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2007-07-11 22:52:29 -04:00
Patrick McHardy
db3d99c090 [NET_SCHED]: ematch: module autoloading
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-11 19:46:26 -07:00
Patrick McHardy
8c979c26a0 [VLAN]: Fix MAC address handling
The VLAN MAC address handling is broken in multiple ways. When the address
differs when setting it, the real device is put in promiscous mode twice,
but never taken out again. Additionally it doesn't resync when the real
device's address is changed and needlessly puts it in promiscous mode when
the vlan device is still down.

Fix by moving address handling to vlan_dev_open/vlan_dev_stop and properly
deal with address changes in the device notifier. Also switch to
dev_unicast_add (which needs the exact same handling).

Since the set_mac_address handler is identical to the generic ethernet one
with these changes, kill it and use ether_setup().

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-11 19:45:24 -07:00
Olaf Kirch
29578624e3 [NET]: Fix races in net_rx_action vs netpoll.
Keep netpoll/poll_napi from messing with the poll_list.
Only net_rx_action is allowed to manipulate the list.

Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-11 19:32:02 -07:00
Zhang Rui
91a6902958 sysfs: add parameter "struct bin_attribute *" in .read/.write methods for sysfs binary attributes
Well, first of all, I don't want to change so many files either.

What I do:
Adding a new parameter "struct bin_attribute *" in the
.read/.write methods for the sysfs binary attributes.

In fact, only the four lines change in fs/sysfs/bin.c and
include/linux/sysfs.h do the real work.
But I have to update all the files that use binary attributes
to make them compatible with the new .read and .write methods.
I'm not sure if I missed any. :(

Why I do this:
For a sysfs attribute, we can get a pointer pointing to the
struct attribute in the .show/.store method,
while we can't do this for the binary attributes.
I don't know why this is different, but this does make it not
so handy to use the binary attributes as the regular ones.
So I think this patch is reasonable. :)

Who benefits from it:
The patch that exposes ACPI tables in sysfs
requires such an improvement.
All the table binary attributes share the same .read method.
Parameter "struct bin_attribute *" is used to get
the table signature and instance number which are used to
distinguish different ACPI table binary attributes.

Without this parameter, we need to offer different .read methods
for different ACPI table binary attributes.
This is impossible as there are various ACPI tables on different
platforms, and we don't know what they are until they are loaded.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:09 -07:00
Tejun Heo
51225039f3 sysfs: make directory dentries and inodes reclaimable
This patch makes dentries and inodes for sysfs directories
reclaimable.

* sysfs_notify() is modified to walk sysfs_dirent tree instead of
  dentry tree.

* sysfs_update_file() and sysfs_chmod_file() use sysfs_get_dentry() to
  grab the victim dentry.

* sysfs_rename_dir() and sysfs_move_dir() grab all dentries using
  sysfs_get_dentry() on startup.

* Dentries for all shadowed directories are pinned in memory to serve
  as lookup start point.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:09 -07:00
Tejun Heo
608e266a2d sysfs: make kobj point to sysfs_dirent instead of dentry
As kobj sysfs dentries and inodes are gonna be made reclaimable,
dentry can't be used as naming token for sysfs file/directory, replace
kobj->dentry with kobj->sd.  The only external interface change is
shadow directory handling.  All other changes are contained in kobj
and sysfs.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:08 -07:00
Tejun Heo
380e6fbb72 sysfs: implement SYSFS_FLAG_REMOVED flag
Implement SYSFS_FLAG_REMOVED flag which currently is used only to
improve sanity check in sysfs_deactivate().  The flag will be used to
make directory entries reclamiable.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:08 -07:00
Tejun Heo
b402d72cf7 sysfs: rename sysfs_dirent->s_type to s_flags and make room for flags
Rename sysfs_dirent->s_type to s_flags, pack type into lower eight
bits and reserve the rest for flags.  sysfs_type() can used to access
the type.  All existing sd->s_type accesses are converted to use
sysfs_type().  While at it, type test is changed to equality test
instead of bit-and test where appropriate.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:08 -07:00
Tejun Heo
ad6a1e1c66 driver-core: make devt_attr and uevent_attr static
devt_attr and uevent_attr are either allocated dynamically with or
embedded in device and class_device as they needed their owner field
set to the module implementing the driver.  Now that sysfs implements
immediate disconnect and owner field removed from struct attribute,
there is no reason to do this.  Remove these attributes from
[class_]device and use static attribute structures instead.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:06 -07:00
Tejun Heo
7b595756ec sysfs: kill unnecessary attribute->owner
sysfs is now completely out of driver/module lifetime game.  After
deletion, a sysfs node doesn't access anything outside sysfs proper,
so there's no reason to hold onto the attribute owners.  Note that
often the wrong modules were accounted for as owners leading to
accessing removed modules.

This patch kills now unnecessary attribute->owner.  Note that with
this change, userland holding a sysfs node does not prevent the
backing module from being unloaded.

For more info regarding lifetime rule cleanup, please read the
following message.

  http://article.gmane.org/gmane.linux.kernel/510293

(tweaked by Greg to not delete the field just yet, to make it easier to
merge things properly.)

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:06 -07:00
Tejun Heo
0c096b507f sysfs: add sysfs_dirent->s_name
Add s_name to sysfs_dirent.  This is to further reduce dependency to
the associated dentry.  Name is copied for directories and symlinks
but not for attributes.

Where possible, name dereferences are converted to use sd->s_name.
sysfs_symlink->link_name and sysfs_get_name() are unused now and
removed.

This change allows symlink to be implemented using sysfs_dirent tree
proper, which is the last remaining dentry-dependent sysfs walk.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:04 -07:00
Tejun Heo
72dba584b6 ida: implement idr based id allocator
Implement idr based id allocator.  ida is used the same way idr is
used but lacks id -> ptr translation and thus consumes much less
memory.  struct ida_bitmap is attached as leaf nodes to idr tree which
is managed by the idr code.  Each ida_bitmap is 128bytes long and
contains slightly less than a thousand slots.

ida is more aggressive with releasing extra resources acquired using
ida_pre_get().  After every successful id allocation, ida frees one
reserved idr_layer if possible.  Reserved ida_bitmap is not freed
automatically but only one ida_bitmap is reserved and it's almost
always used right away.  Under most circumstances, ida won't hold on
to memory for too long which isn't actively used.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:03 -07:00
Rafael J. Wysocki
515c535762 PM: Remove prev_state from struct dev_pm_info
The prev_state member of struct dev_pm_info (defined in include/linux/pm.h) is
only used during a resume to check if the device's state before the suspend was
'off', in which case the device is not resumed.  However, in such cases the
decision whether or not to resume the device should be made on the driver level
and the resume callbacks from the device's bus and class should be executed
anyway (the may be needed for some things other than just powering on the
device).

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:02 -07:00
Rafael J. Wysocki
cc4900690b PM: Remove saved_state from struct dev_pm_info
The saved_state member of struct dev_pm_info, defined in include/linux/pm.h, is
not used anywhere, so it can be removed.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:01 -07:00
Rafael J. Wysocki
9cddad7757 PM: Remove pm_parent from struct dev_pm_info
The pm_parent member of struct dev_pm_info (defined in include/linux/pm.h) is
only used to check if the device's parent is in the right state while the
device is being suspended or resumed.  However, this can be done just as well
with the help of the parent pointer in struct device, so pm_parent can be
removed along with some code that handles it.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:01 -07:00
Lennart Poettering
4f5c791a85 DMI-based module autoloading
The patch below adds DMI/SMBIOS based module autoloading to the Linux
kernel. The idea is to load laptop drivers automatically (and other
drivers which cannot be autoloaded otherwise), based on the DMI system
identification information of the BIOS.

Right now most distros manually try to load all available laptop
drivers on bootup in the hope that at least one of them loads
successfully. This patch does away with all that, and uses udev to
automatically load matching drivers on the right machines.

Basically the patch just exports the DMI information that has been
parsed by the kernel anyway to userspace via a sysfs device
/sys/class/dmi/id and makes sure that proper modalias attributes are
available. Besides adding the "modalias" attribute it also adds
attributes for a few other DMI fields which might be useful for
writing udev rules.

This patch is not an attempt to export the entire DMI/SMBIOS data to
userspace. We already have "dmidecode" which parses the complete DMI
info from userspace. The purpose of this patch is machine model
identification and good udev integration.

To take advantage of DMI based module autoloading, a driver should
export one or more MODULE_ALIAS fields similar to these:

MODULE_ALIAS("dmi:*:svnMICRO-STARINT'LCO.,LTD:pnMS-1013:pvr0131*:cvnMICRO-STARINT'LCO.,LTD:ct10:*");
MODULE_ALIAS("dmi:*:svnMicro-StarInternational:pnMS-1058:pvr0581:rvnMSI:rnMS-1058:*:ct10:*");
MODULE_ALIAS("dmi:*:svnMicro-StarInternational:pnMS-1412:*:rvnMSI:rnMS-1412:*:cvnMICRO-STARINT'LCO.,LTD:ct10:*");
MODULE_ALIAS("dmi:*:svnNOTEBOOK:pnSAM2000:pvr0131*:cvnMICRO-STARINT'LCO.,LTD:ct10:*");

These lines are specific to my msi-laptop.c driver. They are basically
just a concatenation of a few carefully selected DMI fields with all
potentially bad characters stripped.

Besides laptop drivers, modules like "hdaps", the i2c modules
and the hwmon modules are good candidates for "dmi:" MODULE_ALIAS
lines.

Besides merely exporting the DMI data via sysfs the patch adds
support for a few more DMI fields. Especially the CHASSIS fields are
very useful to identify different laptop modules. The patch also adds
working MODULE_ALIAS lines to my msi-laptop.c driver.

I'd like to thank Kay Sievers for helping me to clean up this patch
for posting it on lkml.

Patch is against Linus' current GIT HEAD. Should probably apply to
older kernels as well without modification.


Signed-off-by: Lennart Poettering <mzxreary@0pointer.de>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:00 -07:00
Jan Kara
cfc94cdf8e debugfs: add rename for debugfs files
Implement debugfs_rename() to allow renaming files/directories in debugfs.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:00 -07:00
Randy Dunlap
694625c0b3 PCI: add pci_try_set_mwi
As suggested by Andrew, add pci_try_set_mwi(), which does not require
return-value checking.

- add pci_try_set_mwi() without __must_check
- make it return 0 on success, errno if the "try" failed or error
- review callers

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:02:11 -07:00
Rolf Eike Beer
579082df38 PCI: Fix typo in include/linux/pci.h
Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:02:10 -07:00
Jiri Slaby
12bedda9f4 PCI: pci_ids, remove double or more empty lines
pci_ids, remove two or more empty lines

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:02:10 -07:00
Jiri Slaby
f732ee0b71 PCI: pci_ids, add atheros and 3com_2 vendors
pci_ids, add atheros and 3com_2 vendors

Atheros is wifi vendor. 3com_2 (0xa727) is an vendor id for one card with
ath chip.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:02:10 -07:00
Jiri Slaby
03966e097d PCI: pci_ids, reorder some entries
pci_ids, reorder some entries

Some lines are not vendor sorted, reorder it to comply with the rest of
document.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:02:10 -07:00
Jiri Slaby
c43eaa02ab PCI: i386: traps, change VENDOR to DEVICE
traps, change VENDOR to DEVICE

Change macro for SGI lithium (arch/i386/mach-visws/traps.c) device from
VENDOR to DEVICE, because it's a device id.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:02:10 -07:00
Jiri Slaby
1d0ed384c1 PCI: ATM: lanai, change VENDOR to DEVICE
lanai, change VENDOR to DEVICE

There were 2 bad named macros in pci_ids (LANAI 2 and IHB). Rename it to
DEVICE, because it's device id. Also make some cleanpu in pci_device_id
table (use PCI_VDEVICE).

Cc: Mitchell Blank Jr <mitch@sfgoth.com>
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:02:10 -07:00
Auke Kok
b8a3a5214d PCI: read revision ID by default
Currently there are 97 occurrences where drivers need the pci
revision ID. We can do this once for all devices. Even the pci
subsystem needs the revision several times for quirks. The extra
u8 member pads out nicely in the pci_dev struct.

Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:02:09 -07:00
David Brownell
56906c612e PCI: remove useless pci driver method
Remove pointless and never-called enable_wake() hook from pci_driver and
from documentation.  Evidently this was introduced in the 2.4.6 kernel,
but there's no evidence it was ever called; and it was rarely implemented.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:02:08 -07:00
Stephen Hemminger
f0dce41193 PCI aer: add pci_cleanup_aer_correct_aer_status
Function to clear bogus correctable errors. Analog to pci_aer_uncorrect_are_status.
The Marvell chips seem to start out with a bogus value that needs to be
cleared.

Yanmin ported it to 2.6.22-rc4 by fixing a fuzz patch applying info.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Acked-by: Zhang Yanmin <yanmin.zhang@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:02:08 -07:00
Stephen Hemminger
65b3bc358a PCI aer: fix stub return values
The stubs used when advanced error reporting is not enabled
must have same return type as real functions.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Acked-by: Zhang Yanmin <yanmin.zhang@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:02:08 -07:00
Alan Cox
adf809d010 + pci_find_slot-mark-deprecated.patch added to -mm tree
We've now fixed up most users of pci_find_slot, and the remainder are either
hard and need someone with the hardware and info to work on it, or patches
exist but are not yet merged.

Time therefore for some gentle encouragement

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:02:08 -07:00
Michael Ellerman
a2cd52ca90 PCI: Make pcibios_add_platform_entries() return errors
Currently pcibios_add_platform_entries() returns void, but could fail,
so instead have it return an int and propagate errors up to
pci_create_sysfs_dev_files().

Fixes:
arch/powerpc/kernel/pci_64.c: In function 'pcibios_add_platform_entries':
arch/powerpc/kernel/pci_64.c:878: warning: ignoring return value of
	'device_create_file', declared with attribute warn_unused_result
arch/powerpc/kernel/pci_32.c: In function 'pcibios_add_platform_entries':
  arch/powerpc/kernel/pci_32.c:1043: warning: ignoring return value of
	'device_create_file', declared with attribute warn_unused_result

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:02:07 -07:00
Michael Ellerman
575e3348cb PCI: Use a weak symbol for the empty version of pcibios_add_platform_entries()
I'm not sure if this is going to fly, weak symbols work on the compilers I'm
using, but whether they work for all of the affected architectures I can't say.
I've cc'ed as many arch maintainers/lists as I could find.

But assuming they do, we can use a weak empty definition of
pcibios_add_platform_entries() to avoid having an empty definition on every
arch.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:02:07 -07:00
Peter Oruba
d556ad4bbe PCI: add PCI-X/PCI-Express read control interfaces
This patch introduces an interface to read and write PCI-X / PCI-Express 
maximum read byte count values from PCI config space. There is a second 
function that returns the maximum _designed_ read byte count, which marks the 
maximum value for a device, since some drivers try to set MMRBC to the 
highest allowed value and rely on such a function.

Based on patch set by Stephen Hemminger <shemminger@linux-foundation.org>

Cc: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Peter Oruba <peter.oruba@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:02:07 -07:00
Herbert Xu
e69ff734e1 [CRYPTO] cipher: Remove obsolete fields from cipher_tfm
This removes all the unused block cipher fields from cipher_tfm.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2007-07-11 20:58:54 +08:00
YOSHIFUJI Hideaki
4c752098f5 [IPV6]: Make IPV6_{RECV,2292}RTHDR boolean options.
Because reversing RH0 is no longer supported by deprecation
of RH0, let's make IPV6_{RECV,2292}RTHDR boolean options.
Boolean are more appropriate from standard POV.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:56:31 -07:00
YOSHIFUJI Hideaki
bb4dbf9e61 [IPV6]: Do not send RH0 anymore.
Based on <draft-ietf-ipv6-deprecate-rh0-00.txt>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:55:49 -07:00
Herbert Xu
c6c6e3e05c [NET]: Update comments for skb checksums
Rusty (whose comments we should all study and emulate :) pointed
out that our comments for skb checksums are no longer up-to-date.
So here is a patch to

1) add the case of partial checksums on input;
2) update partial checksum case to mention csum_start/csum_offset;
3) mention the new IPv6 feature bit.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:41:55 -07:00
Yasuyuki Kozakai
ce7663d84a [NETFILTER]: nfnetlink_queue: don't unregister handler of other subsystem
The queue handlers registered by ip[6]_queue.ko at initialization should
not be unregistered according to requests from userland program
using nfnetlink_queue. If we allow that, there is no way to register
the handlers of built-in ip[6]_queue again.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:18:21 -07:00
Patrick McHardy
0d53778e81 [NETFILTER]: Convert DEBUGP to pr_debug
Convert DEBUGP to pr_debug and fix lots of non-compiling debug statements.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:18:20 -07:00
Patrick McHardy
d3c3f4243e [NETFILTER]: ipt_CLUSTERIP: add compat code
Adjust structure size and don't expect pointers passed in from
userspace to be valid. Also replace an enum in an ABI structure
by a fixed size type.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:18:17 -07:00
Jozsef Kadlecsik
ba9dda3ab5 [NETFILTER]: x_tables: add TRACE target
The TRACE target can be used to follow IP and IPv6 packets through
the ruleset.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick NcHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:17:14 -07:00
Jan Engelhardt
1b50b8a371 [NETFILTER]: Add u32 match
Along comes... xt_u32, a revamped ipt_u32 from POM-NG,
Plus:

    *	2007-06-02: added ipv6 support

    *	2007-06-05: uses kmalloc for the big buffer

    *   2007-06-05: added inversion

    *   2007-06-20: use skb_copy_bits() and get rid of the big buffer
        and lock (suggested by Pablo Neira Ayuso)

Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:17:13 -07:00
Jan Engelhardt
e1931b784a [NETFILTER]: x_tables: switch xt_target->checkentry to bool
Switch the return type of target checkentry functions to boolean.

Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:16:59 -07:00
Jan Engelhardt
ccb79bdce7 [NETFILTER]: x_tables: switch xt_match->checkentry to bool
Switch the return type of match functions to boolean

Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:16:58 -07:00
Jan Engelhardt
1d93a9cbad [NETFILTER]: x_tables: switch xt_match->match to bool
Switch the return type of match functions to boolean

Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:16:57 -07:00
Jan Engelhardt
cff533ac12 [NETFILTER]: x_tables: switch hotdrop to bool
Switch the "hotdrop" variables to boolean

Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:16:56 -07:00
Yasuyuki Kozakai
7bfe246116 [NETFILTER]: ip6_tables: fix explanation of valid upper protocol number
This explains the allowed upper protocol numbers. IP6T_F_NOPROTO was
introduced to use 0 as Hop-by-Hop option header, not wildcard. But that
seemed to be forgotten. 0 has been used as wildcard since 2002-08-23.

Signed-off-by: Yasuyuki Kozakai <yasuyuki@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:16:55 -07:00
Samuel Ortiz
411725280b [IrDA]: Monitor mode.
Through the IrDA netlink set mode command, we switch to IrDA monitor
mode, where one IrLAP instance receives all the packets on the media,
without ever responding to them.

Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:16:44 -07:00
Samuel Ortiz
89da1ecf54 [IrDA]: Netlink layer.
First IrDA configuration netlink layer implementation.
Currently, we only support the set/get mode commands.

Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:16:43 -07:00
Guido Guenther
8c644623fe [NET]: Allow group ownership of TUN/TAP devices.
Introduce a new syscall TUNSETGROUP for group ownership setting of tap
devices. The user now is allowed to send packages if either his euid or
his egid matches the one specified via tunctl (via -u or -g
respecitvely). If both, gid and uid, are set via tunctl, both have to
match.

Signed-off-by: Guido Guenther <agx@sigxcpu.org>
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:16:42 -07:00
Patrick McHardy
61cbc2fca6 [NET]: Fix secondary unicast/multicast address count maintenance
When a reference to an existing address is increased or decreased without
hitting zero, the address count is incorrectly adjusted.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:16:23 -07:00
Peter P Waskiewicz Jr
d62733c8e4 [SCHED]: Qdisc changes and sch_rr added for multiqueue
Add the new sch_rr qdisc for multiqueue network device support.  Allow
sch_prio and sch_rr to be compiled with or without multiqueue hardware
support.

sch_rr is part of sch_prio, and is referenced from MODULE_ALIAS.  This
was done since sch_prio and sch_rr only differ in their dequeue
routine.

Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:16:22 -07:00
Peter P Waskiewicz Jr
f25f4e4480 [CORE] Stack changes to add multiqueue hardware support API
Add the multiqueue hardware device support API to the core network
stack.  Allow drivers to allocate multiple queues and manage them at
the netdev level if they choose to do so.

Added a new field to sk_buff, namely queue_mapping, for drivers to
know which tx_ring to select based on OS classification of the flow.

Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:16:21 -07:00
James Chapman
cf14a4d067 [L2TP]: Changes to existing ppp and socket kernel headers for L2TP
Add struct sockaddr_pppol2tp to carry L2TP-specific address
information for the PPPoX (PPPoL2TP) socket. Unfortunately we can't
use the union inside struct sockaddr_pppox because the L2TP-specific
data is larger than the current size of the union and we must preserve
the size of struct sockaddr_pppox for binary compatibility.

Also add a PPPIOCGL2TPSTATS ioctl to allow userspace to obtain
L2TP counters and state from the kernel.

Add new if_pppol2tp.h header.

[ Modified to use aligned_u64 in statistics structure -DaveM ]

Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:15:58 -07:00
James Chapman
342f0234c7 [UDP]: Introduce UDP encapsulation type for L2TP
This patch adds a new UDP_ENCAP_L2TPINUDP encapsulation type for UDP
sockets. When a UDP socket's encap_type is UDP_ENCAP_L2TPINUDP, the
skb is delivered to a function pointed to by the udp_sock's
encap_rcv funcptr. If the skb isn't wanted by L2TP, it returns >0, which
causes it to be passed through to UDP.

Include padding to put the new encap_rcv field on a 4-byte boundary.

Previously, the only user of UDP encap sockets was ESP, so when
CONFIG_XFRM was not defined, some of the encap code was compiled
out. This patch changes that. As a result, udp_encap_rcv() will
now do a little more work when CONFIG_XFRM is not defined.

Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:15:57 -07:00
Patrick McHardy
4417da668c [NET]: dev: secondary unicast address support
Add support for configuring secondary unicast addresses on network
devices. To support this devices capable of filtering multiple
unicast addresses need to change their set_multicast_list function
to configure unicast filters as well and assign it to dev->set_rx_mode
instead of dev->set_multicast_list. Other devices are put into promiscous
mode when secondary unicast addresses are present.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:15:56 -07:00
Patrick McHardy
3fba5a8b1e [NET]: dev_mcast: switch to generic net_device address lists
Use generic net_device address lists for multicast list handling.
Some defines are used to keep drivers working.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:15:55 -07:00
Patrick McHardy
bf742482d7 [NET]: dev: introduce generic net_device address lists
Introduce struct dev_addr_list and list maintenance functions
based on dev_mc_list and the related functions. This will be
used by follow-up patches for both multicast and secondary
unicast addresses.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:15:54 -07:00
Stephen Hemminger
d212f87b06 [NET]: IPV6 checksum offloading in network devices
The existing model for checksum offload does not correctly handle
devices that can offload IPV4 and IPV6 only. The NETIF_F_HW_CSUM flag
implies device can do any arbitrary protocol.

This patch:
 * adds NETIF_F_IPV6_CSUM for those devices
 * fixes bnx2 and tg3 devices that need it
 * add NETIF_F_IPV6_CSUM to ipv6 output (incl GSO)
 * fixes assumptions about NETIF_F_ALL_CSUM in nat
 * adjusts bridge union of checksumming computation

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:15:52 -07:00
Masahide NAKAMURA
59fbb3a61e [IPV6] MIP6: Loadable module support for MIPv6.
This patch makes MIPv6 loadable module named "mip6".

Here is a modprobe.conf(5) example to load it automatically
when user application uses XFRM state for MIPv6:

alias xfrm-type-10-43 mip6
alias xfrm-type-10-60 mip6

Some MIPv6 feature is not included by this modular, however,
it should not be affected to other features like either IPsec
or IPv6 with and without the patch.
We may discuss XFRM, MH (RAW socket) and ancillary data/sockopt
separately for future work.

Loadable features:
* MH receiving check (to send ICMP error back)
* RO header parsing and building (i.e. RH2 and HAO in DSTOPTS)
* XFRM policy/state database handling for RO

These are NOT covered as loadable:
* Home Address flags and its rule on source address selection
* XFRM sub policy (depends on its own kernel option)
* XFRM functions to receive RO as IPv6 extension header
* MH sending/receiving through raw socket if user application
  opens it (since raw socket allows to do so)
* RH2 sending as ancillary data
* RH2 operation with setsockopt(2)

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:15:42 -07:00
Patrick McHardy
2371baa4bd [RTNETLINK]: Fix rtnetlink compat attribute patch
Sent the wrong patch previously.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:15:40 -07:00
Patrick McHardy
afdc3238ec [RTNETLINK]: Add nested compat attribute
Add a nested compat attribute type that can be used to convert
attributes that contain a structure to nested attributes in a
backwards compatible way.

The attribute looks like this:

struct {
        [ compat contents ]
        struct rtattr {
                .rta_len        = total size,
                .rta_type       = type,
        } rta;
        struct old_structure struct;

        [ nested top-level attribute ]
        struct rtattr {
                .rta_len        = nest size,
                .rta_type       = type,
        } nest_attr;

        [ optional 0 .. n nested attributes ]
        struct rtattr {
                .rta_len        = private attribute len,
                .rta_type       = private attribute typ,
        } nested_attr;
        struct nested_data data;
};

Since both userspace and kernel deal correctly with attributes that are
larger than expected old versions will just parse the compat part and
ignore the rest.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:15:39 -07:00
Patrick McHardy
334a8132d9 [SKBUFF]: Keep track of writable header len of headerless clones
Currently NAT (and others) that want to modify cloned skbs copy them,
even if in the vast majority of cases its not necessary because the
skb is a clone made by TCP and the portion NAT wants to modify is
actually writable because TCP release the header reference before
cloning.

The problem is that there is no clean way for NAT to find out how
long the writable header area is, so this patch introduces skb->hdr_len
to hold this length. When a headerless skb is cloned skb->hdr_len
is set to the current headroom, for regular clones it is copied from
the original. A new function skb_clone_writable(skb, len) returns
whether the skb is writable up to len bytes from skb->data. To avoid
enlarging the skb the mac_len field is reduced to 16 bit and the
new hdr_len field is put in the remaining 16 bit.

I've done a few rough benchmarks of NAT (not with this exact patch,
but a very similar one). As expected it saves huge amounts of system
time in case of sendfile, bringing it down to basically the same
amount as without NAT, with sendmsg it only helps on loopback,
probably because of the large MTU.

Transmit a 1GB file using sendfile/sendmsg over eth0/lo with and
without NAT:

- sendfile eth0, no NAT:	sys     0m0.388s
- sendfile eth0, NAT:		sys     0m1.835s
- sendfile eth0: NAT + path:	sys     0m0.370s	(~ -80%)

- sendfile lo, no NAT:		sys     0m0.258s
- sendfile lo, NAT:		sys     0m2.609s
- sendfile lo, NAT + patch:	sys     0m0.260s	(~ -90%)

- sendmsg eth0, no NAT:		sys     0m2.508s
- sendmsg eth0, NAT:		sys     0m2.539s
- sendmsg eth0, NAT + patch:	sys     0m2.445s	(no change)

- sendmsg lo, no NAT:		sys	0m2.151s
- sendmsg lo, NAT:		sys     0m3.557s
- sendmsg lo, NAT + patch:	sys     0m2.159s	(~ -40%)

I expect other users can see a similar performance improvement,
packet mangling iptables targets, ipip and ip_gre come to mind ..

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:15:37 -07:00
Arnaldo Carvalho de Melo
1e180f726a [KTIME]: Introduce ktime_add_us
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
2007-07-10 22:15:26 -07:00
Gerrit Renker
f1c91da447 [KTIME]: Introduce ktime_us_delta
This provides a reusable time difference function which returns the difference in
microseconds, as often used in the DCCP code.

Commiter note: renamed ktime_delta to ktime_us_delta and put it in ktime.h.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
2007-07-10 22:15:25 -07:00
Patrick McHardy
07b5b17e15 [VLAN]: Use rtnl_link API
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:15:03 -07:00
Patrick McHardy
a4bf3af4ac [VLAN]: Introduce symbolic constants for flag values
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:15:02 -07:00
Patrick McHardy
b020cb4885 [VLAN]: Keep track of number of QoS mappings
Keep track of the number of configured ingress/egress QoS mappings to
avoid iteration while calculating the netlink attribute size.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:15:01 -07:00
Patrick McHardy
734423cf38 [VLAN]: Use 32 bit value for skb->priority mapping
skb->priority has only 32 bits and even VLAN uses 32 bit values in its API.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:15:00 -07:00
Patrick McHardy
38f7b870d4 [RTNETLINK]: Link creation API
Add rtnetlink API for creating, changing and deleting software devices.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:14:20 -07:00
Patrick McHardy
6472ce6096 [NET]: Mark struct net_device * argument to netdev_priv const
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:14:13 -07:00
David S. Miller
8c7b7faaa6 [NET]: Kill eth_copy_and_sum().
It hasn't "summed" anything in over 7 years, and it's
just a straight mempcy ala skb_copy_to_linear_data()
so just get rid of it.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:08:12 -07:00
David S. Miller
e06e7c6158 [IPV4]: The scheduled removal of multipath cached routing support.
With help from Chris Wedgwood.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:05:57 -07:00
Frank van Maarseveen
c98451bdb2 NLM: fix source address of callback to client
Use the destination address of the original NLM request as the
source address in callbacks to the client.

Signed-off-by: Frank van Maarseveen <frankvm@frankvm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:49 -04:00
Frank van Maarseveen
d3bc9a1deb SUNRPC client: add interface for binding to a local address
In addition to binding to a local privileged port the NFS client should
allow binding to a specific local address. This is used by the server
for callbacks. The patch adds the necessary interface.

Signed-off-by: Frank van Maarseveen <frankvm@frankvm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:49 -04:00
Frank van Maarseveen
a97476926e SUNRPC server: record the destination address of a request
Save the destination address of an incoming request over TCP like is
done already for UDP. It is necessary later for callbacks by the server.

Signed-off-by: Frank van Maarseveen <frankvm@frankvm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:49 -04:00
Frank van Maarseveen
96802a0951 SUNRPC: cleanup transport creation argument passing
Cleanup argument passing to functions for creating an RPC transport.

Signed-off-by: Frank van Maarseveen <frankvm@frankvm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:49 -04:00
Trond Myklebust
75180df2ed NFS: Add the mount option "nosharecache"
Prior to David Howell's mount changes in 2.6.18, users who mounted
different directories which happened to be from the same filesystem on the
server would get different super blocks, and hence could choose different
mount options. As long as there were no hard linked files that crossed from
one subtree to another, this was quite safe.
Post the changes, if the two directories are on the same filesystem (have
the same 'fsid'), they will share the same super block, and hence the same
mount options.

Add a flag to allow users to elect not to share the NFS super block with
another mount point, even if the fsids are the same. This will allow
users to set different mount options for the two different super blocks, as
was previously possible. It is still up to the user to ensure that there
are no cache coherency issues when doing this, however the default
behaviour will be to share super blocks whenever two paths result in
the same fsid.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:48 -04:00
Chuck Lever
8007122520 NFS: Add support for mounting NFSv4 file systems with string options
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:48 -04:00
Chuck Lever
3ea97309e6 NFS: Remake nfsroot_mount as a permanent part of NFS client
In preparation for supporting NFSv2 and NFSv3 mount option handling in the
kernel NFS client, convert mount_clnt.c to be a permanent part of the NFS
client, instead of built only when CONFIG_ROOT_NFS is enabled.

In addition, we also replace the "struct sockaddr_in *" argument with
something more generic, to help support IPv6 at some later point.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:46 -04:00
Chuck Lever
45160d6275 SUNRPC: Rename rpcb_getport to be consistent with new rpcb_getport_sync name
Clean up, for consistency.  Rename rpcb_getport as rpcb_getport_async, to
match the naming scheme of rpcb_getport_sync.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:46 -04:00
Chuck Lever
cce63cd637 SUNRPC: Rename rpcb_getport_external routine
In preparation for handling NFS mount option parsing in the kernel,
rename rpcb_getport_external as rpcb_get_port_sync, and make it available
always (instead of only when CONFIG_ROOT_NFS is enabled).

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:46 -04:00
Chuck Lever
f18289931d NFS: Add a new NFS debugging flag just for mount processing
Note to self: fix up /usr/sbin/rpcdebug too

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:45 -04:00
Chuck Lever
5680d48be8 NFS: Clean-up: Define macros for maximum host and export path name lengths
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:44 -04:00
Chuck Lever
433c92379d NFS: Clean up nfs_size_to_loff_t()
Use the same file size limit that lockd uses.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:43 -04:00
Trond Myklebust
412c77cee6 NFSv4: Defer inode revalidation when setting up a delegation
Currently we force a synchronous call to __nfs_revalidate_inode() in
nfs_inode_set_delegation(). This not only ensures that we cannot call
nfs_inode_set_delegation from an asynchronous context, but it also slows
down any call to open().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:41 -04:00
Trond Myklebust
9f958ab885 NFSv4: Reduce the chances of an open_owner identifier collision
Currently we just use a 32-bit counter.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:39 -04:00
Trond Myklebust
7af654f8d1 NFSv4: Don't reuse expired nfs4_state_owner structs
That just confuses certain NFSv4 servers.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:38 -04:00
Trond Myklebust
587142f85f NFS: Replace NFS_I(inode)->req_lock with inode->i_lock
There is no justification for keeping a special spinlock for the exclusive
use of the NFS writeback code.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:38 -04:00