Commit graph

25728 commits

Author SHA1 Message Date
Gerrit Renker
c2f42077bd dccp ccid-2: Schedule Sync as out-of-band mechanism
The problem with Ack Vectors is that 

  i) their length is variable and can in principle grow quite large,
 ii) it is hard to predict exactly how large they will be.

Due to the second point it seems not a good idea to reduce the MPS; in
particular when on average there is enough room for the Ack Vector and an
increase in length is momentarily due to some burst loss, after which the
Ack Vector returns to its normal/average length.

The solution taken by this patch is to subtract a minimum-expected Ack Vector
length from the MPS (previous patch), and to defer any larger Ack Vectors onto
a separate Sync - but only if indeed there is no space left on the skb.

This patch provides the infrastructure to schedule Sync-packets for transporting
(urgent) out-of-band data. Its signalling is quicker than scheduling an Ack, since
it does not need to wait for new application data.

It can thus serve other parts of the DCCP code as well.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04 07:45:37 +02:00
Gerrit Renker
f10ecaee6d dccp: Replace magic CCID-specific numbers by symbolic constants
The constants DCCPO_{MIN,MAX}_CCID_SPECIFIC are nowhere used in the code, but
instead for the CCID-specific options numbers are used.

This patch unifies the use of CCID-specific option numbers, by adding symbolic
names reflecting the definitions in RFC 4340, 10.3.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04 07:45:34 +02:00
Gerrit Renker
0a4822679d dccp: Initialisation and type-checking of feature sysctls
This patch takes care of initialising and type-checking sysctls related to
feature negotiation. Type checking is important since some of the sysctls
now directly act on the feature-negotiation process.

The sysctls are initialised with the known default values for each feature.
For the type-checking the value constraints from RFC 4340 are used:

 * Sequence Window uses the specified Wmin=32, the maximum is ulong (4 bytes),
   tested and confirmed that it works up to 4294967295 - for Gbps speed;
 * Ack Ratio is between 0 .. 0xffff (2-byte unsigned integer);
 * CCIDs are between 0 .. 255;
 * request_retries, retries1, retries2 also between 0..255 for good measure;
 * tx_qlen is checked to be non-negative;
 * sync_ratelimit remains as before.

Further changes:
----------------
Performed s@sysctl_dccp_feat@sysctl_dccp@g since the sysctls are now in feat.c.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
2008-09-04 07:45:32 +02:00
Gerrit Renker
51c7d4fa26 dccp: Implement both feature-local and feature-remote Sequence Window feature
This adds full support for local/remote Sequence Window feature, from which the 
  * sequence-number-validity (W) and 
  * acknowledgment-number-validity (W') windows 
derive as specified in RFC 4340, 7.5.3. 

Specifically, the following changes are introduced:
  * integrated new socket fields into dccp_sk;
  * updated the update_gsr/gss routines with regard to these fields;
  * updated handler code: the Sequence Window feature is located at the TX side,
    so the local feature is meant if the handler-rx flag is false;
  * the initialisation of `rcv_wnd' in reqsk is removed, since
    - rcv_wnd is not used by the code anywhere;
    - sequence number checks are not done in the LISTEN state (cf. 7.5.3);
    - dccp_check_req checks the Ack number validity more rigorously;
  * the `struct dccp_minisock' became empty and is now removed.

Until the handshake completes with activating negotiated values, the local/remote
Sequence-Window values are undefined and thus can not reliably be estimated.
This issue is addressed in a separate patch.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
2008-09-04 07:45:32 +02:00
Gerrit Renker
5d3dac267a dccp: Initialisation framework for feature negotiation
This initialises feature negotiation from two tables, which are initialised
from sysctls. 

As a novel feature, specifics of the implementation (e.g. currently short
seqnos and ECN are not supported) are advertised for robustness.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
2008-09-04 07:45:31 +02:00
Gerrit Renker
b235dc4abb dccp ccid-2: Phase out the use of boolean Ack Vector sysctl
This removes the use of the sysctl and the minisock variable for the Send Ack
Vector feature, which is now handled fully dynamically via feature negotiation;
i.e. when CCID2 is enabled, Ack Vectors are automatically enabled (as per
RFC 4341, 4.).

Using a sysctl in parallel to this implementation would open the door to
crashes, since much of the code relies on tests of the boolean minisock /
sysctl variable. Thus, this patch replaces all tests of type

	if (dccp_msk(sk)->dccpms_send_ack_vector)
		/* ... */
with
	if (dp->dccps_hc_rx_ackvec != NULL)
		/* ... */

The dccps_hc_rx_ackvec is allocated by the dccp_hdlr_ackvec() when feature
negotiation concluded that Ack Vectors are to be used on the half-connection.
Otherwise, it is NULL (due to dccp_init_sock/dccp_create_openreq_child),
so that the test is a valid one.

The activation handler for Ack Vectors is called as soon as the feature
negotiation has concluded at the
 * server when the Ack marking the transition RESPOND => OPEN arrives;
 * client after it has sent its ACK, marking the transition REQUEST => PARTOPEN.

Adding the sequence number of the Response packet to the Ack Vector has been 
removed, since
 (a) connection establishment implies that the Response has been received;
 (b) the CCIDs only look at packets received in the (PART)OPEN state, i.e.
     this entry will always be ignored;
 (c) it can not be used for anything useful - to detect loss for instance, only
     packets received after the loss can serve as pseudo-dupacks.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
2008-09-04 07:45:31 +02:00
Gerrit Renker
68e074bfce dccp: Remove manual influence on NDP Count feature
Updating the NDP count feature is handled automatically now:
 * for CCID-2 it is disabled, since the code does not use NDP counts;
 * for CCID-3 it is enabled, as NDP counts are used to determine loss lengths.

Allowing the user to change NDP values leads to unpredictable and failing
behaviour, since it is then possible to disable NDP counts even when they
are needed (e.g. in CCID-3).

This means that only those user settings are sensible that agree with the
values for Send NDP Count implied by the choice of CCID. But those settings
are already activated by the feature negotiation (CCID dependency tracking),
hence this form of support is redundant.

At startup the initialisation of the NDP count feature is with the default
value of 0, which is done implicitly by the zeroing-out of the socket when
it is allocated. If the choice of CCID or feature negotiation enables NDP
count, this will then be updated via the NDP activation handler.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
2008-09-04 07:45:31 +02:00
Gerrit Renker
78673e24df dccp: Remove obsolete parts of the old CCID interface
The TX/RX CCIDs of the minisock are now redundant: similar to the Ack Vector
case, their value equals initially that of the sysctl, but at the end of
feature negotiation may be something different.

The old interface removed by this patch thus has been replaced by the newer
interface to dynamically query the currently loaded CCIDs earlier in this
patch set.

Also removed the constructors for the TX CCID and the RX CCID, since the
switch rx/non-rx is done by the handler in minisocks.c (and the handler is
the only place in the code where CCIDs are loaded).

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
2008-09-04 07:45:31 +02:00
Gerrit Renker
fade756f18 dccp: Set per-connection CCIDs via socket options
With this patch, TX/RX CCIDs can now be changed on a per-connection basis, which
overrides the defaults set by the global sysctl variables for TX/RX CCIDs.

To make full use of this facility, the remaining patches of this patch set are
needed, which track dependencies and activate negotiated feature values.

Note on the maximum number of CCIDs that can be registered:
-----------------------------------------------------------
The maximum number of CCIDs that can be registered on the socket is constrained
by the space in a Confirm/Change feature negotiation option. 

The space in these in turn depends on the size of header options as defined
in RFC 4340, 5.8. Since this is a recurring constant, it has been moved from
ackvec.h into linux/dccp.h, clarifying its purpose.

Relative to this size, the maximum number of CCID identifiers that can be 
present in a Confirm option (which always consumes 1 byte more than a Change
option, cf. 6.1) is 2 bytes less than the maximum TLV size: one for the
CCID-feature-type and one for the selected value.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04 07:45:28 +02:00
Gerrit Renker
17c30b40ed dccp: Deprecate Ack Ratio sysctl
This patch deprecates the Ack Ratio sysctl, since
 * Ack Ratio is entirely ignored by CCID-3 and CCID-4,
 * Ack Ratio currently doesn't work in CCID-2 (i.e. is always set to 1);
 * even if it would work in CCID-2, there is no point for a user to change it:
   - Ack Ratio is constrained by cwnd (RFC 4341, 6.1.2),
   - if Ack Ratio > cwnd, the system resorts to spurious RTO timeouts 
     (since waiting for Acks which will never arrive in this window),
   - cwnd is not a user-configurable value.	

The only reasonable place for Ack Ratio is to print it for debugging. It is
planned to do this later on, as part of e.g. dccp_probe.

With this patch Ack Ratio is now under full control of feature negotiation:
 * Ack Ratio is resolved as a dependency of the selected CCID;
 * if the chosen CCID supports it (i.e. CCID == CCID-2), Ack Ratio is set to
   the default of 2, following RFC 4340, 11.3 - "New connections start with Ack
   Ratio 2 for both endpoints";
 * what happens then is part of another patch set, since it concerns the 
   dynamic update of Ack Ratio while the connection is in full flight.

Thanks to Tomasz Grobelny for discussion leading up to this patch.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2008-09-04 07:45:28 +02:00
Gerrit Renker
20f41eee82 dccp: Feature negotiation for minimum-checksum-coverage
This provides feature negotiation for server minimum checksum coverage
which so far has been missing.

Since sender/receiver coverage values range only from 0...15, their
type has also been reduced in size from u16 to u4.

Feature-negotiation options are now generated for both sender and receiver
coverage, i.e. when the peer has `forgotten' to enable partial coverage
then feature negotiation will automatically enable (negotiate) the partial
coverage value for this connection.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
2008-09-04 07:45:27 +02:00
Gerrit Renker
668144f7b4 dccp: Deprecate old setsockopt framework
The previous setsockopt interface, which passed socket options via struct 
dccp_so_feat, is complicated/difficult to use. Continuing to support it leads to
ugly code since the old approach did not distinguish between NN and SP values.

This patch removes the old setsockopt interface and replaces it with two new
functions to register NN/SP values for feature negotiation. These are 
essentially wrappers around the internal __feat_register functions, with 
checking added to avoid
 * wrong usage (type);
 * changing values while the connection is in progress.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04 07:45:27 +02:00
Gerrit Renker
71bb49596b dccp: Query supported CCIDs
This provides a data structure to record which CCIDs are locally supported
and three accessor functions:
 - a test function for internal use which is used to validate CCID requests
   made by the user;
 - a copy function so that the list can be used for feature-negotiation;   
 - documented getsockopt() support so that the user can query capabilities.

The data structure is a table which is filled in at compile-time with the
list of available CCIDs (which in turn depends on the Kconfig choices).

Using the copy function for cloning the list of supported CCIDs is useful for
feature negotiation, since the negotiation is now with the full list of available
CCIDs (e.g. {2, 3}) instead of the default value {2}. This means negotiation 
will not fail if the peer requests to use CCID3 instead of CCID2. 

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
2008-09-04 07:45:27 +02:00
Gerrit Renker
828755cee0 dccp: Per-socket initialisation of feature negotiation
This provides feature-negotiation initialisation for both DCCP sockets and
DCCP request_sockets, to support feature negotiation during connection setup.

It also resolves a FIXME regarding the congestion control initialisation.

Thanks to Wei Yongjun for help with the IPv6 side of this patch.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
2008-09-04 07:45:26 +02:00
Gerrit Renker
b4eec20637 dccp: Implement lookup table for feature-negotiation information
A lookup table for feature-negotiation information, extracted from RFC 4340/42,
is provided by this patch. All currently known features can be found in this 
table, along with their feature location, their default value, and type.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
2008-09-04 07:45:26 +02:00
David S. Miller
7c19a3d280 net: Unbreak userspace usage of linux/mroute.h
Nothing in linux/pim.h should be exported to userspace.

This should fix the XORP build failure reported by
Jose Calhariz, the debain package maintainer.

Nothing originally in linux/mroute.h was exported to userspace
ever, but some of this stuff started to be when it was moved into
this new linux/pim.h, and that was wrong.  If we didn't provide these
definitions for 10 years we can reasonably expect that applications
defined this stuff locally or used GLIBC headers providing the
protocol definitions.  And as such the only result of this can
be conflict and userland build breakage.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-29 14:37:23 -07:00
Jarek Poplawski
fe439dd09d pkt_sched: Fix sch_tree_lock()
Use new qdisc_root_sleeping_lock() instead of qdisc_root_lock() as
sch_tree_lock() because this lock could be used while dev is
deactivated, but we never need to use this with noop_qdisc as a root.

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-27 02:27:10 -07:00
Jarek Poplawski
f6f9b93f16 pkt_sched: Fix gen_estimator locks
While passing a qdisc root lock to gen_new_estimator() and
gen_replace_estimator() dev could be deactivated or even before
grafting proper root qdisc as qdisc_sleeping (e.g. qdisc_create), so
using qdisc_root_lock() is not enough. This patch adds
qdisc_root_sleeping_lock() for this, plus additional checks, where
necessary.

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-27 02:25:17 -07:00
Jarek Poplawski
f6e0b239a2 pkt_sched: Fix qdisc list locking
Since some qdiscs call qdisc_tree_decrease_qlen() (so qdisc_lookup())
without rtnl_lock(), adding and deleting from a qdisc list needs
additional locking. This patch adds global spinlock qdisc_list_lock
and wrapper functions for modifying the list. It is considered as a
temporary solution until hfsc_dequeue(), netem_dequeue() and
tbf_dequeue() (or qdisc_tree_decrease_qlen()) are redone.

With feedback from Herbert Xu and David S. Miller.

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-22 03:31:39 -07:00
Jarek Poplawski
2540e0511e pkt_sched: Fix qdisc_watchdog() vs. dev_deactivate() race
dev_deactivate() can skip rescheduling of a qdisc by qdisc_watchdog()
or other timer calling netif_schedule() after dev_queue_deactivate().
We prevent this checking aliveness before scheduling the timer. Since
during deactivation the root qdisc is available only as qdisc_sleeping
additional accessor qdisc_root_sleeping() is created.

With feedback from Herbert Xu <herbert@gondor.apana.org.au>

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-21 05:11:14 -07:00
Ian Campbell
d847471d06 fbdefio: add set_page_dirty handler to deferred IO FB
Fixes kernel BUG at lib/radix-tree.c:473.

Previously the handler was incidentally provided by tmpfs but this was
removed with:

  commit 14fcc23fdc
  Author: Hugh Dickins <hugh@veritas.com>
  Date:   Mon Jul 28 15:46:19 2008 -0700

    tmpfs: fix kernel BUG in shmem_delete_inode

relying on this behaviour was incorrect in any case and the BUG also
appeared when the device node was on an ext3 filesystem.

v2: override a_ops at open() time rather than mmap() time to minimise
races per AKPM's concerns.

Signed-off-by: Ian Campbell <ijc@hellion.org.uk>
Cc: Jaya Kumar <jayakumar.lkml@gmail.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Johannes Weiner <hannes@saeurebad.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Kel Modderman <kel@otaku42.de>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Cc: <stable@kernel.org> [14fcc23fd is in 2.6.25.14 and 2.6.26.1]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-20 15:40:32 -07:00
Nick Piggin
479db0bf40 mm: dirty page tracking race fix
There is a race with dirty page accounting where a page may not properly
be accounted for.

clear_page_dirty_for_io() calls page_mkclean; then TestClearPageDirty.

page_mkclean walks the rmaps for that page, and for each one it cleans and
write protects the pte if it was dirty.  It uses page_check_address to
find the pte.  That function has a shortcut to avoid the ptl if the pte is
not present.  Unfortunately, the pte can be switched to not-present then
back to present by other code while holding the page table lock -- this
should not be a signal for page_mkclean to ignore that pte, because it may
be dirty.

For example, powerpc64's set_pte_at will clear a previously present pte
before setting it to the desired value.  There may also be other code in
core mm or in arch which do similar things.

The consequence of the bug is loss of data integrity due to msync, and
loss of dirty page accounting accuracy.  XIP's __xip_unmap could easily
also be unreliable (depending on the exact XIP locking scheme), which can
lead to data corruption.

Fix this by having an option to always take ptl to check the pte in
page_check_address.

It's possible to retain this optimization for page_referenced and
try_to_unmap.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Jared Hulbert <jaredeh@gmail.com>
Cc: Carsten Otte <cotte@freenet.de>
Cc: Hugh Dickins <hugh@veritas.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-20 15:40:32 -07:00
Ken Chen
2d70b68d42 fix setpriority(PRIO_PGRP) thread iterator breakage
When user calls sys_setpriority(PRIO_PGRP ...) on a NPTL style multi-LWP
process, only the task leader of the process is affected, all other
sibling LWP threads didn't receive the setting.  The problem was that the
iterator used in sys_setpriority() only iteartes over one task for each
process, ignoring all other sibling thread.

Introduce a new macro do_each_pid_thread / while_each_pid_thread to walk
each thread of a process.  Convert 4 call sites in {set/get}priority and
ioprio_{set/get}.

Signed-off-by: Ken Chen <kenchen@google.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-20 15:40:32 -07:00
David Howells
0c7281c0fa FRV: Provide ioremap_wc() for FRV
Provide ioremap_wc() for FRV.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-20 13:19:52 -07:00
David Howells
449f76506f MN10300: Supply ioremap_wc() for MN10300
Supply ioremap_wc() for MN10300.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-20 13:19:51 -07:00
David Woodhouse
e4464facd6 Reserve NFS fileid values for btrfs
Purely cosmetic for now, but we might as well get it merged ASAP.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-20 13:19:51 -07:00
David Woodhouse
1c8e40e406 Reduce brokenness of CRIS headers_install
I won't say 'fix', because they still look broken, although this will at
least allow 'make ARCH=CRIS headers_install' to _complete_.

For headers which are exported, we should probably choose between
asm/arch-v10 and asm/arch-v32 by something that GCC defines -- we can't
rely on a generated symlink. And we certainly can't export an arch/
directory which doesn't even exist.

And the only thing that we seem to include from the arch/ directory is
<asm/arch/ptrace.h> from <asm/ptrace.h> ... and that isn't exported in
either arch-v10 or arch-v32 _anyway_.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-20 13:19:51 -07:00
Linus Torvalds
c5b0079c0a Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: (22 commits)
  [SCSI] ibmvfc: Driver version 1.0.2
  [SCSI] ibmvfc: Add details to async event log
  [SCSI] ibmvfc: Sanitize response lengths
  [SCSI] ibmvfc: Fix for lost async events
  [SCSI] ibmvfc: Fixup host state during reinit
  [SCSI] ibmvfc: Fix another hang on module removal
  [SCSI] ibmvscsi: Fixup desired DMA value for shared memory partitions
  [SCSI] megaraid_sas: remove sysfs dbg_lvl world writeable permissions
  [SCSI] qla2xxx: Update version number to 8.02.01-k7.
  [SCSI] qla2xxx: Explicitly tear-down vports during PCI remove_one().
  [SCSI] qla2xxx: Reference proper ha during SBR handling.
  [SCSI] qla2xxx: Set npiv_supported flag for FCoE HBAs.
  [SCSI] qla2xxx: Don't leak SG-DMA mappings while aborting commands.
  [SCSI] qla2xxx: Correct vport-state management issues during ISP-ABORT.
  [SCSI] qla2xxx: Correct synchronization of software/firmware fcport states.
  [SCSI] scsi_dh: Initialize lun_state in check_ownership()
  [SCSI] scsi_dh: Do not use scsilun in rdac hardware handler
  [SCSI] megaraid_sas: version and Documentation Update
  [SCSI] megaraid_sas: add new controllers (0x78 0x79)
  [SCSI] megaraid_sas: add the shutdown DCMD cmd to driver shutdown routine
  ...
2008-08-20 08:42:53 -07:00
Linus Torvalds
ddd13dc606 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI: add acpi_find_root_bridge_handle
  PCI: acpi_pcihp: run _OSC on a root bridge
  x86/PCI: irq and pci_ids patch for Intel Ibex Peak PCHs
  x86/PCI: allow scanning of 255 PCI busses
  x86, pci: detect end_bus_number according to acpi/e820 reserved, v2
  pci: debug extra pci bus resources
  pci: debug extra pci resources range
2008-08-19 13:55:47 -07:00
Linus Torvalds
4309e09242 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (94 commits)
  pkt_sched: Prevent livelock in TX queue running.
  Revert "pkt_sched: Add BH protection for qdisc_stab_lock."
  Revert "pkt_sched: Protect gen estimators under est_lock."
  pkt_sched: remove bogus block (cleanup)
  nf_nat: use secure_ipv4_port_ephemeral() for NAT port randomization
  netfilter: ctnetlink: sleepable allocation with spin lock bh
  netfilter: ctnetlink: fix sleep in read-side lock section
  netfilter: ctnetlink: fix double helper assignation for NAT'ed conntracks
  netfilter: ipt_addrtype: Fix matching of inverted destination address type
  dccp: Fix panic caused by too early termination of retransmission mechanism
  pkt_sched: Don't hold qdisc lock over qdisc_destroy().
  pkt_sched: Add lockdep annotation for qdisc locks
  pkt_sched: Never schedule non-root qdiscs.
  removed unused #include <version.h>
  rt2x00: Fix txdone_entry_desc_flags
  b43: Fix for another Bluetooth Coexistence SPROM Programming error for BCM4306
  mac80211: remove kdoc references to IEEE80211_HW_HOST_GEN_BEACON_TEMPLATE
  p54u: reset skb's data/tail pointer on requeue
  p54: move p54_vdcf_init to the right place.
  iwlwifi: fix printk newlines
  ...
2008-08-19 09:59:02 -07:00
David S. Miller
8e0f36ec37 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2008-08-18 21:15:44 -07:00
Linus Torvalds
b689e83961 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
  ata: add missing ATA_* defines
  ata: add missing ATA_CMD_* defines
  ata: add missing ATA_ID_* defines (take 2)
  sgiioc4: fixup message on resource allocation failure
  ide-cd: use bcd2bin/bin2bcd
  cdrom: handle TOC
  gdrom: add dummy audio_ioctl handler
  viocd: add dummy audio ioctl handler
  cleanup powerpc/include/asm/ide.h
  drivers/ide/pci/: use __devexit_p()
2008-08-18 17:40:13 -07:00
Jiri Slaby
056c58e8eb PCI: add acpi_find_root_bridge_handle
Consolidate finding of a root bridge and getting its handle to the one
inline function. It's cut & pasted on multiple places. Use this new
inline in those.

Cc: kristen.c.accardi@intel.com
Acked-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-08-18 13:48:04 -07:00
Bartlomiej Zolnierkiewicz
b59116205c ata: add missing ATA_* defines
Add missing ATA_* defines to <linux/ata.h>.  Also add
ATAPI_{LFS,EOM,ILI,IO,CODE} defines while at it.

Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-08-18 21:40:05 +02:00
Bartlomiej Zolnierkiewicz
476d9894dd ata: add missing ATA_CMD_* defines
Add missing ATA_CMD_* defines to <linux/ata.h>.  Also add
ATA_EXABYTE_ENABLE_NEST, SETFEATURES_AAM_* and ATA_SMART_*
defines while at it.

Partially based on earlier work by Chris Wedgwood.

Acked-by: Chris Wedgwood <cw@f00f.org>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-08-18 21:40:05 +02:00
Bartlomiej Zolnierkiewicz
37014c6407 ata: add missing ATA_ID_* defines (take 2)
Add missing ATA_ID_* defines and update {ata,atapi}_*()
inlines accordingly.  The currently unused defines are
needed for the forthcoming drivers/ide/ changes.

v2:
Add ATA_ID_SPG.

Acked-by: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-08-18 21:40:05 +02:00
Linus Torvalds
a7f5aaf36d Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: fix build warnings in real mode code
  x86, calgary: fix section mismatch warning - get_tce_space_from_tar
  x86: silence section mismatch warning - get_local_pda
  x86, percpu: silence section mismatch warnings related to EARLY_PER_CPU variables
  x86: fix i486 suspend to disk CR4 oops
  x86: mpparse.c: fix section mismatch warning
  x86: mmconf: fix section mismatch warning
  x86: fix MP_processor_info section mismatch warning
  x86, tsc: fix section mismatch warning
  x86: correct register constraints for 64-bit atomic operations
2008-08-18 12:10:14 -07:00
Luis R. Rodriguez
546c80c91f mac80211: remove kdoc references to IEEE80211_HW_HOST_GEN_BEACON_TEMPLATE
IEEE80211_HW_HOST_GEN_BEACON_TEMPLATE was made unnecessary in
the recent revamp on beacon configuration.

Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-08-18 11:05:14 -04:00
Marcin Slusarz
c6a92a2501 x86, percpu: silence section mismatch warnings related to EARLY_PER_CPU variables
Quoting Mike Travis in "x86: cleanup early per cpu variables/accesses v4"
(23ca4bba3e):

    The DEFINE macro defines the per_cpu variable as well as the early
    map and pointer.  It also initializes the per_cpu variable and map
    elements to "_initvalue".  The early_* macros provide access to
    the initial map (usually setup during system init) and the early
    pointer.  This pointer is initialized to point to the early map
    but is then NULL'ed when the actual per_cpu areas are setup.  After
    that the per_cpu variable is the correct access to the variable.

As these variables are NULL'ed before __init sections are dropped
(in setup_per_cpu_maps), they can be safely annotated as __ref.

This change silences following section mismatch warnings:

WARNING: vmlinux.o(.data+0x46c0): Section mismatch in reference from the variable x86_cpu_to_apicid_early_ptr to the variable .init.data:x86_cpu_to_apicid_early_map
The variable x86_cpu_to_apicid_early_ptr references
the variable __initdata x86_cpu_to_apicid_early_map
If the reference is valid then annotate the
variable with __init* (see linux/init.h) or name the variable:
*driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,

WARNING: vmlinux.o(.data+0x46c8): Section mismatch in reference from the variable x86_bios_cpu_apicid_early_ptr to the variable .init.data:x86_bios_cpu_apicid_early_map
The variable x86_bios_cpu_apicid_early_ptr references
the variable __initdata x86_bios_cpu_apicid_early_map
If the reference is valid then annotate the
variable with __init* (see linux/init.h) or name the variable:
*driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,

WARNING: vmlinux.o(.data+0x46d0): Section mismatch in reference from the variable x86_cpu_to_node_map_early_ptr to the variable .init.data:x86_cpu_to_node_map_early_map
The variable x86_cpu_to_node_map_early_ptr references
the variable __initdata x86_cpu_to_node_map_early_map
If the reference is valid then annotate the
variable with __init* (see linux/init.h) or name the variable:
*driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-18 09:10:55 +02:00
Marcin Slusarz
c72a5efec1 x86: mmconf: fix section mismatch warning
WARNING: arch/x86/kernel/built-in.o(.cpuinit.text+0x1591): Section mismatch in reference from the function init_amd() to the function .init.text:check_enable_amd_mmconf_dmi()
The function __cpuinit init_amd() references
a function __init check_enable_amd_mmconf_dmi().
If check_enable_amd_mmconf_dmi is only used by init_amd then
annotate check_enable_amd_mmconf_dmi with a matching annotation.

check_enable_amd_mmconf_dmi is only called from init_amd which is __cpuinit

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-18 07:49:06 +02:00
Mathieu Desnoyers
3c3b5c3b0b x86: correct register constraints for 64-bit atomic operations
x86_64 add/sub atomic ops does not seems to accept integer values bigger
than 32 bits as immediates. Intel's add/sub documentation specifies they
have to be passed as registers.

The only operations in the x86-64 architecture which accept arbitrary
64-bit immediates is "movq" to any register; similarly, the only
operation which accept arbitrary 64-bit displacement is "movabs" to or
from al/ax/eax/rax.

http://gcc.gnu.org/onlinedocs/gcc-4.3.0/gcc/Machine-Constraints.html

states :

e
    32-bit signed integer constant, or a symbolic reference known to fit
    that range (for immediate operands in sign-extending x86-64
    instructions).
Z
    32-bit unsigned integer constant, or a symbolic reference known to
    fit that range (for immediate operands in zero-extending x86-64
    instructions).

Since add/sub does sign extension, using the "e" constraint seems appropriate.

It applies to 2.6.27-rc, 2.6.26, 2.6.25...

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-18 07:47:30 +02:00
David S. Miller
1e0d5a5747 pkt_sched: No longer destroy qdiscs from RCU.
We can now kill them synchronously with all of the
previous dev_deactivate() cures.

This makes netdev destruction and shutdown saner as
the qdiscs hold references to the device.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-17 22:31:26 -07:00
David S. Miller
a9312ae893 pkt_sched: Add 'deactivated' state.
This new state lets dev_deactivate() mark a qdisc as having been
deactivated.

dev_queue_xmit() and ing_filter() check for this bit and do not
try to process the qdisc if the bit is set.

dev_deactivate() polls the qdisc after setting the bit, waiting
for both __QDISC_STATE_RUNNING and __QDISC_STATE_SCHED to clear.

This isn't perfect yet, but subsequent changesets will make it so.
This part is just one piece of the puzzle.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-17 21:51:03 -07:00
Alexander Beregalov
5e186b57e7 security.h: fix build failure
security.h: fix build failure

include/linux/security.h: In function 'security_ptrace_traceme':
include/linux/security.h:1760: error: 'parent' undeclared (first use in this function)

Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: James Morris <jmorris@namei.org>
2008-08-17 22:47:30 +10:00
Linus Torvalds
0473b79929 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (32 commits)
  x86: add MAP_STACK mmap flag
  x86: fix section mismatch warning - spp_getpage()
  x86: change init_gdt to update the gdt via write_gdt, rather than a direct write.
  x86-64: fix overlap of modules and fixmap areas
  x86, geode-mfgpt: check IRQ before using MFGPT as clocksource
  x86, acpi: cleanup, temp_stack is used only when CONFIG_SMP is set
  x86: fix spin_is_contended()
  x86, nmi: clean UP NMI watchdog failure message
  x86, NMI: fix watchdog failure message
  x86: fix /proc/meminfo DirectMap
  x86: fix readb() et al compile error with gcc-3.2.3
  arch/x86/Kconfig: clean up, experimental adjustement
  x86: invalidate caches before going into suspend
  x86, perfctr: don't use CCCR_OVF_PMI1 on Pentium 4Ds
  x86, AMD IOMMU: initialize dma_ops after sysfs registration
  x86m AMD IOMMU: cleanup: replace LOW_U32 macro with generic lower_32_bits
  x86, AMD IOMMU: initialize device table properly
  x86, AMD IOMMU: use status bit instead of memory write-back for completion wait
  x86: silence mmconfig printk
  x86, msr: fix NULL pointer deref due to msr_open on nonexistent CPUs
  ...
2008-08-16 17:14:07 -07:00
Linus Torvalds
9c0d2a20fe Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm: (38 commits)
  [ARM] 5191/1: ARM: remove CVS keywords
  [ARM] pxafb: fix the warning of incorrect lccr when lcd_conn is specified
  [ARM] pxafb: add flag to specify output format on LDD pins when base is RGBT16
  [ARM] pxafb: fix the incorrect configuration of GPIO77 as ACBIAS for TFT LCD
  [ARM] 5198/1: PalmTX: PCMCIA fixes
  [ARM] Fix a pile of broken watchdog drivers
  [ARM] update mach-types
  [ARM] 5196/1: fix inline asm constraints for preload
  [ARM] 5194/1: update .gitignore
  [ARM] add proc-macros.S include to proc-arm940 and proc-arm946
  [ARM] 5192/1: ARM TLB: add v7wbi_{possible,always}_flags to {possible,always}_tlb_flags
  [ARM] 5193/1: Wire up missing syscalls
  [ARM] traps: don't call undef hook functions with spinlock held
  [ARM] 5183/2: Provide Poodle LoCoMo GPIO names
  [ARM] dma-mapping: provide sync_range APIs
  [ARM] dma-mapping: improve type-safeness of DMA translations
  [ARM] Kirkwood: instantiate the orion_spi driver in the platform code
  [ARM] prevent crashing when too much RAM installed
  [ARM] Kirkwood: Instantiate mv_xor driver
  [ARM] Orion: Instantiate mv_xor driver for 5182
  ...
2008-08-16 16:48:45 -07:00
David Woodhouse
5e6b83ed8c Fix header export of videodev2.h, ivtv.h, ivtvfb.h
The exported copy of videodev2.h contains this line:

	#define #include <sys/time.h>

This is because for some reason it defines __user for itself -- despite
the fact that we remove all instances of __user when exporting headers.
_All_ pointers in userspace are user pointers. Fix it by removing the
unnecessary '#define __user' from the file.

The new headers ivtv.h and ivtvfb.h would have the same problem... if
whoever put them there had actually remembered to add them to the Kbuild
file while he was at it. Fix those too, and export them as was
presumably intended.

Note that includes of <linux/compiler.h> are also stripped by the header
export process, so those don't need to be conditional.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Acked-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-16 16:46:57 -07:00
Hugh Dickins
605d9288b3 mm: VM_flags comment fixes
Try to comment away a little of the confusion between mm's vm_area_struct
vm_flags and vmalloc's vm_struct flags: based on an idea by Ulrich Drepper.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-16 16:45:56 -07:00
Adrian Bunk
66bfa2f031 [ARM] 5191/1: ARM: remove CVS keywords
This patch removes CVS keywords that weren't updated for a long time.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-08-16 20:01:18 +01:00
Rusty Russell
db543c1f97 net: skb_copy_datagram_from_iovec()
There's an skb_copy_datagram_iovec() to copy out of a paged skb, but
nothing the other way around (because we don't do that).

We want to allocate big skbs in tun.c, so let's add the function.
It's a carbon copy of skb_copy_datagram_iovec() with enough changes to
be annoying.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-15 19:52:30 -07:00