The nfs server rdma transport was mapping rdma read target pages for
TO_DEVICE instead of FROM_DEVICE. This causes data corruption on non
cache-coherent systems if frmrs are used.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Related-to: commit 325fb5b4d2
The compat path suffers from a similar problem. It only uses a __be32
when all of the recent code uses, and expects, an nf_inet_addr
everywhere. As a result, addresses stored by xt_recents were
filled with whatever other stuff was on the stack following the be32.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
With a minor compile fix from Roman.
Reported-and-tested-by: Roman Hoog Antink <rha@open.ch>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch adds missing role attribute to the DCCP type, otherwise
the creation of entries is not of any use.
The attribute added is CTA_PROTOINFO_DCCP_ROLE which contains the
role of the conntrack original tuple.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Commit d0dba725 (netfilter: ctnetlink: add callbacks to the per-proto
nlattrs) changed the protocol registration function to abort if the
to-be registered protocol doesn't provide a new callback function.
The DCCP and UDP-Lite IPv6 protocols were missed in this conversion,
add the required callback pointer.
Reported-and-tested-by: Steven Jan Springl <steven@springl.ukfsn.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch fixes a (bogus?) gcc warning during compilation:
net/netfilter/nf_conntrack_netlink.c🔢 warning: 'helpname' may be used uninitialized in this function
net/netfilter/nf_conntrack_netlink.c:991: warning: 'helpname' may be used uninitialized in this function
In fact, helpname is initialized by ctnetlink_parse_help() so
I cannot see a way to use it without being initialized.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patch "af_rose/x25: Sanity check the maximum user frame size"
(commit 83e0bbcbe2) from Alan Cox got
locking wrong. If we bail out due to user frame size being too large,
we must unlock the socket beforehand.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The NetLabel address selector mechanism has a problem where it can get
mistakenly remove the wrong selector when similar addresses are used. The
problem is caused when multiple addresses are configured that have different
netmasks but the same address, e.g. 127.0.0.0/8 and 127.0.0.0/24. This patch
fixes the problem.
Reported-by: Etienne Basset <etienne.basset@numericable.fr>
Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
Tested-by: Etienne Basset <etienne.basset@numericable.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
AF_IUCV runs into a race when queuing incoming iucv messages
and receiving the resulting backlog.
If the Linux system is under pressure (high load or steal time),
the message queue grows up, but messages are not received and queued
onto the backlog queue. In that case, applications do not
receive any data with recvmsg() even if AF_IUCV puts incoming
messages onto the message queue.
The race can be avoided if the message queue spinlock in the
message_pending callback is spreaded across the entire callback
function.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add few more sk states in iucv_sock_shutdown().
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reject incoming iucv messages if the receive direction has been shut down.
It avoids that the queue of outstanding messages increases and exceeds the
message limit of the iucv communication path.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If iucv_sock_recvmsg() is called with MSG_PEEK flag set, the skb is enqueued
twice. If the socket is then closed, the pointer to the skb is freed twice.
Remove the skb_queue_head() call for MSG_PEEK, because the skb_recv_datagram()
function already handles MSG_PEEK (does not dequeue the skb).
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sure a second invocation of iucv_sock_close() guarantees proper
freeing of an iucv path.
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In non-SMP mode, the variable section attribute specified by DECLARE_PER_CPU()
does not agree with that specified by DEFINE_PER_CPU(). This means that
architectures that have a small data section references relative to a base
register may throw up linkage errors due to too great a displacement between
where the base register points and the per-CPU variable.
On FRV, the .h declaration says that the variable is in the .sdata section, but
the .c definition says it's actually in the .data section. The linker throws
up the following errors:
kernel/built-in.o: In function `release_task':
kernel/exit.c:78: relocation truncated to fit: R_FRV_GPREL12 against symbol `per_cpu__process_counts' defined in .data section in kernel/built-in.o
kernel/exit.c:78: relocation truncated to fit: R_FRV_GPREL12 against symbol `per_cpu__process_counts' defined in .data section in kernel/built-in.o
To fix this, DECLARE_PER_CPU() should simply apply the same section attribute
as does DEFINE_PER_CPU(). However, this is made slightly more complex by
virtue of the fact that there are several variants on DEFINE, so these need to
be matched by variants on DECLARE.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When checking whether or not a given frame needs to be
moved to be properly aligned to a 4-byte boundary, we
use & 4 which wasn't intended, this code should check
the lowest two bits.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
It is expected that config interface will always succeed as mac80211
will only request what driver supports. The exception here is when a
device has rfkill enabled. At this time the rfkill state is unknown to
mac80211 and config interface can fail. When this happens we deal with
this error instead of printing a WARN.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
"mac80211: fix basic rates setting from association response"
introduced a copy/paste error.
Unfortunately, this not just leads to wrong data being passed
to the driver but is remotely exploitable for some hardware or
driver combinations.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: stable@kernel.org [2.6.29]
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Currently beacon loss detection triggers after a scan. A probe request
is sent and a message like this is printed to the log:
wlan0: beacon loss from AP 00:12:17:e7:98:de - sending probe request
But in fact there is no beacon loss, the beacons are just not received
because of the ongoing scan. Fix it by updating last_beacon after
the scan has finished.
Reported-by: Jaswinder Singh Rajput <jaswinder@kernel.org>
Signed-off-by: Kalle Valo <kalle.valo@iki.fi>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
One of the code paths sending deauth/disassoc events ends up calling
this function with rcu_read_lock held, so we must use GFP_ATOMIC in
allocation routines.
Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Jouni Malinen <j@w1.fi>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Remove this unused Kconfig variable, which Intel apparently once
promised to make use of but never did.
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
br_nf_dev_queue_xmit only checks for ETH_P_IP packets for fragmenting but not
VLAN packets. This results in dropping of large VLAN packets. This can be
observed when connection tracking is enabled. Connection tracking re-assembles
fragmented packets, and these have to re-fragmented when transmitting out. Also,
make sure only refragmented packets are defragmented as per suggestion from
Patrick McHardy.
Signed-off-by: Saikiran Madugula <hummerbliss@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This loop over fragments in napi_fraginfo_skb() was "interesting".
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Just noticed while doing some new work that the recent
mid-wq adjustment logic will misbehave when FACK is not
in use (happens either due sysctl'ed off or auto-detected
reordering) because I forgot the relevant TCPCB tagbit.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Sidorenko reported:
"while experimenting with 'netem' we have found some strange behaviour. It
seemed that ingress delay as measured by 'ping' command shows up on some
hosts but not on others.
After some investigation I have found that the problem is that skbuff->tstamp
field value depends on whether there are any packet sniffers enabled. That
is:
- if any ptype_all handler is registered, the tstamp field is as expected
- if there are no ptype_all handlers, the tstamp field does not show the delay"
This patch prevents unnecessary update of tstamp in dev_queue_xmit_nit()
on ingress path (with act_mirred) adding a check, so minimal overhead on
the fast path, but only when sniffers etc. are active.
Since netem at ingress seems to logically emulate a network before a host,
tstamp is zeroed to trigger the update and pretend delays are from the
outside.
Reported-by: Alex Sidorenko <alexandre.sidorenko@hp.com>
Tested-by: Alex Sidorenko <alexandre.sidorenko@hp.com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This has been broken for a while. I happened to catch it testing because one
app "knew" that the top line of the calls data was the policy line and got
confused.
Put the header back.
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Broadcom chips with 2.1 firmware handle the fallback case to a SCO
link wrongly when setting up eSCO connections.
< HCI Command: Setup Synchronous Connection (0x01|0x0028) plen 17
handle 11 voice setting 0x0060
> HCI Event: Command Status (0x0f) plen 4
Setup Synchronous Connection (0x01|0x0028) status 0x00 ncmd 1
> HCI Event: Connect Complete (0x03) plen 11
status 0x00 handle 1 bdaddr 00:1E:3A:xx:xx:xx type SCO encrypt 0x01
The Link Manager negotiates the fallback to SCO, but then sends out
a Connect Complete event. This is wrong and the Link Manager should
actually send a Synchronous Connection Complete event if the Setup
Synchronous Connection has been used. Only the remote side is allowed
to use Connect Complete to indicate the missing support for eSCO in
the host stack.
This patch adds a workaround for this which clearly should not be
needed, but reality is that broken Broadcom devices are deployed.
Based on a report by Ville Tervo <ville.tervo@nokia.com>
Signed-off-by: Marcel Holtman <marcel@holtmann.org>
Some Bluetooth chips (like the ones from Texas Instruments) don't do
proper eSCO negotiations inside the Link Manager. They just return an
error code and in case of the Kyocera ED-8800 headset it is just a
random error.
< HCI Command: Setup Synchronous Connection 0x01|0x0028) plen 17
handle 1 voice setting 0x0060
> HCI Event: Command Status (0x0f) plen 4
Setup Synchronous Connection (0x01|0x0028) status 0x00 ncmd 1
> HCI Event: Synchronous Connect Complete (0x2c) plen 17
status 0x1f handle 257 bdaddr 00:14:0A:xx:xx:xx type eSCO
Error: Unspecified Error
In these cases it is up to the host stack to fallback to a SCO setup
and so retry with SCO parameters.
Based on a report by Nick Pelly <npelly@google.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
There is a missing call to rfcomm_dlc_clear_timer in the case that
DEFER_SETUP is used and so the connection gets disconnected after the
timeout even if it was successfully accepted previously.
This patch adds a call to rfcomm_dlc_clear_timer to rfcomm_dlc_accept
which will get called when the user accepts the connection by calling
read() on the socket.
Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Check whether the underlying device provides a set of ethtool ops before
checking for individual handlers to avoid NULL pointer dereferences.
Reported-by: Art van Breemen <ard@telegraafnet.nl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The TIM IE must not be shorter than 4 bytes, so verify that
when parsing it.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Instead, allocate extra IE memory if necessary. Normally,
this isn't even necessary since there's enough space.
This is a better way of correcting the "held BSS can
disappear" issue, but also a lot more code. It is also
necessary for proper auth/assoc BSS handling in the
future.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When we receive a probe response frame we can replace the
BSS struct in our list -- but if that struct is held then
we need to hold the new one as well.
We really should fix this completely and not replace the
struct, but this is a bandaid for now.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Using the scan_sdata variable here is terribly wrong,
if there has never been a scan then we fail. However,
we need a bandaid...
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: stable@kernel.org [2.6.29]
Signed-off-by: John W. Linville <linville@tuxdriver.com>
With this patch, nfnetlink returns -ENOMEM instead of -EPERM if we
fail to create the nfnetlink netlink socket during the module
loading. This is exactly what rtnetlink does in this case.
Ideally, it would be better if we propagate the error that has
happened in netlink_kernel_create(), however, this function still
does not implement this yet.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch fixes an inconsistency that results in no error reports
to user-space listeners if we fail to allocate the event message.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
After calling skb_gro_receive skb->len can no longer be relied
on since if the skb was merged using frags, then its pages will
have been removed and the length reduced.
This caused tcp_gro_receive to prematurely end merging which
resulted in suboptimal performance with ixgbe.
The fix is to store skb->len on the stack.
Reported-by: Mark Wagner <mwagner@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit ead2ceb0ec ("Network Drop
Monitor: Adding kfree_skb_clean for non-drops and modifying
end-of-line points for skbs") so called end-of-line points for skb's
should use consume_skb() to free the socket buffer.
In opposite to consume_skb() the function kfree_skb() is intended to
be used for unexpected skb drops e.g. in error conditions that now can
trigger the network drop monitor if enabled.
This patch moves the skb end-of-line point in af_can.c to use
consume_skb().
Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The removal of the SAME target accidentally removed one feature that is
not available from the normal NAT targets so far, having multi-range
mappings that use the same mapping for each connection from a single
client. The current behaviour is to choose the address from the range
based on source and destination IP, which breaks when communicating
with sites having multiple addresses that require all connections to
originate from the same IP address.
Introduce a IP_NAT_RANGE_PERSISTENT option that controls whether the
destination address is taken into account for selecting addresses.
http://bugzilla.kernel.org/show_bug.cgi?id=12954
Signed-off-by: Patrick McHardy <kaber@trash.net>
mac80211: Fragmentation threshold (typo)
ieee80211_ioctl_siwfrag() sets the fragmentation_threshold to 2352
when frame fragmentation is to be disabled, yet the corresponding
'get' function tests for 2353 bytes instead.
This causes user-space tools to display a fragmentation threshold
of 2352 bytes even if fragmentation has been disabled.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
On Sunday 05 April 2009 11:29:38 Michael Buesch wrote:
> On Sunday 05 April 2009 11:23:59 Jaswinder Singh Rajput wrote:
> > With latest linus tree I am getting, .config file attached:
> >
> > [ 22.895051] r8169: eth0: link down
> > [ 22.897564] ADDRCONF(NETDEV_UP): eth0: link is not ready
> > [ 22.928047] ADDRCONF(NETDEV_UP): wlan0: link is not ready
> > [ 22.982292] libvirtd used greatest stack depth: 4200 bytes left
> > [ 63.709879] wlan0: authenticate with AP 00:11:95:9e:df:f6
> > [ 63.712096] wlan0: authenticated
> > [ 63.712127] wlan0: associate with AP 00:11:95:9e:df:f6
> > [ 63.726831] wlan0: RX AssocResp from 00:11:95:9e:df:f6 (capab=0x471 status=0 aid=1)
> > [ 63.726855] wlan0: associated
> > [ 63.730093] ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
> > [ 74.296087] wlan0: no IPv6 routers present
> > [ 79.349044] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 119.358200] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 179.354292] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 259.366044] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 359.348292] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 361.953459] packagekitd used greatest stack depth: 4160 bytes left
> > [ 478.824258] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 598.813343] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 718.817292] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 838.824567] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 958.815402] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 1078.848434] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 1198.822913] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 1318.824931] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 1438.814157] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 1558.827336] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 1678.823011] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 1798.830589] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 1918.828044] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 2038.827224] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 2116.517152] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 2158.840243] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 2278.827427] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
>
>
> I think this message should only show if CONFIG_MAC80211_VERBOSE_DEBUG is set.
> It's kind of expected that we lose a beacon once in a while, so we shouldn't print
> verbose messages to the kernel log (even if they are KERN_DEBUG).
>
> And besides that, I think one can easily remotely trigger this message and flood the logs.
> So it should probably _also_ be ratelimited.
Something like this:
Signed-off-by: Michael Buesch <mb@bu3sch.de>
Wext makes no assumptions about the contents of
data->txpower.fixed and data->txpower.value when
data->txpower.disabled is set, so do not update
the user-requested power level while disabling.
Also, when wext configures a really _fixed_ power
output [1], we should reject it instead of limiting it
to the regulatory constraint. If the user wants to set
a _limit_ [2] then we should honour that.
[1] iwconfig wlan0 txpower 20dBm fixed
[2] iwconfig wlan0 txpower 10dBm
This fixes
http://www.intellinuxwireless.org/bugzilla/show_bug.cgi?id=1942
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Currently rx status for frames which are completed from reorder buffer
is taken from it's cb area which is not always right, cb is not holding
the rx status when driver uses mac80211's non-irq rx handler to pass it's
received frames. This results in dropping almost all frames from reorder
buffer when security is enabled by doing double decryption (first in hw,
second in sw because of wrong rx status). This patch copies rx status into
cb area before the frame is put into reorder buffer. After this patch,
there is a significant improvement in throughput with ath9k + WPA2(AES).
Signed-off-by: Vasanthakumar Thiagarajan <vasanth@atheros.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Cc: stable@kernel.org
Signed-off-by: John W. Linville <linville@tuxdriver.com>
We won't ever get here as regulatory_hint_core() can only fail
on -ENOMEM and in that case we don't initialize cfg80211 but this is
technically correct code.
This is actually good for stable, where we don't check for -ENOMEM
failure on __regulatory_hint()'s failure.
Cc: stable@kernel.org
Reported-by: Quentin Armitage <Quentin@armitage.org.uk>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Commit ea781f197d (netfilter: nf_conntrack: use SLAB_DESTROY_BY_RCU and)
get rid of call_rcu() was missing one conversion to the hlist_nulls
functions, causing a crash when unloading conntrack helper modules.
Reported-and-tested-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
commit ca735b3aaa
'netfilter: use a linked list of loggers'
introduced an array of list_head in "struct nf_logger", but
forgot to initialize it in nf_log_register(). This resulted
in oops when calling nf_log_unregister() at module unload time.
Reported-and-tested-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Acked-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This reverts commit 244f46ae6e.
Alan Cox did the research, and just like the other radio protocols
zero-length frames have meaning because at the top level ROSE is
X.25 PLP.
So this zero-length filtering is invalid.
Signed-off-by: David S. Miller <davem@davemloft.net>
Since everybody has been focusing on baremetal GRO performance
no one noticed when I added a bug that zapped gso_size for all
GRO packets. This only gets picked up when you forward the skb
out of an interface.
Thanks to Mark Wagner for noticing this bug when testing kvm.
Reported-by: Mark Wagner <mwagner@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
After switch (rthdr->type) {...},the check below is completely useless.Because:
if the type is 2,then hdrlen must be 2 and segments_left must be 1,clearly the
check is redundant;if the type is not 2,then goto sticky_done,the check is useless
too.
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Reviewed-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A long-standing feature in tcp_init_metrics() is such that
any of its goto reset prevents call to tcp_init_cwnd().
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
When vlan acceleration is used on receive, the vlan tag is maintained
outside of the skb data. The existing vlan tag match only works on TX
path because it uses vlan_get_tag which tests for VLAN_HW_TX_ACCEL.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hi:
gro: Normalise skb before bypassing GRO on netpoll VLAN path
When we detect netpoll RX on the GRO VLAN path we bail out and
call the normal VLAN receive handler. However, the packet needs
to be normalised by calling eth_type_trans since that's what the
normal path expects (normally the GRO path does the fixup).
This patch adds the necessary call to vlan_gro_frags.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Thanks,
Signed-off-by: David S. Miller <davem@davemloft.net>
Add dev_put() after dev_get_by_index() to avoid leakage
of device.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently netif_device_attach/detach are only stopping one queue. They
should be starting and stopping all the queues on a given device.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
b44: Use kernel DMA addresses for the kernel DMA API
forcedeth: Fix resume from hibernation regression.
xfrm: fix fragmentation on inter family tunnels
ibm_newemac: Fix dangerous struct assumption
gigaset: documentation update
gigaset: in file ops, check for device disconnect before anything else
bas_gigaset: use tasklet_hi_schedule for timing critical tasklets
net/802/fddi.c: add MODULE_LICENSE
smsc911x: remove unused #include <linux/version.h>
axnet_cs: fix phy_id detection for bogus Asix chip.
bnx2: Use request_firmware()
b44: Fix sizes passed to b44_sync_dma_desc_for_{device,cpu}()
socket: use percpu_add() while updating sockets_in_use
virtio_net: Set the mac config only when VIRITO_NET_F_MAC
myri_sbus: use request_firmware
e1000: fix loss of multicast packets
vxge: should include tcp.h
Conflict in firmware/WHENCE (SCSI vs net firmware)
If an ipv4 packet (not locally generated with IP_DF flag not set) bigger
than mtu size is supposed to go via a xfrm ipv6 tunnel, the packetsize
check in xfrm4_tunnel_check_size() is omited and ipv6 drops the packet
without sending a notice to the original sender of the ipv4 packet.
Another issue is that ipv4 connection tracking does reassembling of
incomming fragmented packets. If such a reassembled packet is supposed to
go via a xfrm ipv6 tunnel it will be droped, even if the original sender
did proper fragmentation.
According to RFC 2473 (section 7) tunnel ipv6 packets resulting from the
encapsulation of an original packet are considered as locally generated
packets. If such a packet passed the checks in xfrm{4,6}_tunnel_check_size()
fragmentation is allowed according to RFC 2473 (section 7.1/7.2).
This patch sets skb->local_df in xfrm6_prepare_output() to achieve
fragmentation in this case.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'for-2.6.30' of git://linux-nfs.org/~bfields/linux: (81 commits)
nfsd41: define nfsd4_set_statp as noop for !CONFIG_NFSD_V4
nfsd41: define NFSD_DRC_SIZE_SHIFT in set_max_drc
nfsd41: Documentation/filesystems/nfs41-server.txt
nfsd41: CREATE_EXCLUSIVE4_1
nfsd41: SUPPATTR_EXCLCREAT attribute
nfsd41: support for 3-word long attribute bitmask
nfsd: dynamically skip encoded fattr bitmap in _nfsd4_verify
nfsd41: pass writable attrs mask to nfsd4_decode_fattr
nfsd41: provide support for minor version 1 at rpc level
nfsd41: control nfsv4.1 svc via /proc/fs/nfsd/versions
nfsd41: add OPEN4_SHARE_ACCESS_WANT nfs4_stateid bmap
nfsd41: access_valid
nfsd41: clientid handling
nfsd41: check encode size for sessions maxresponse cached
nfsd41: stateid handling
nfsd: pass nfsd4_compound_state* to nfs4_preprocess_{state,seq}id_op
nfsd41: destroy_session operation
nfsd41: non-page DRC for solo sequence responses
nfsd41: Add a create session replay cache
nfsd41: create_session operation
...
This patch fixes a regression (introduced by myself in commit 19abb7b:
netfilter: ctnetlink: deliver events for conntracks changed from
userspace) that results in an expectation re-insertion since
__nf_ct_expect_check() may return 0 for expectation timer refreshing.
This patch also removes a unnecessary refcount bump that
pretended to avoid a possible race condition with event delivery
and expectation timers (as said, not needed since we hold a
reference to the object since until we finish the expectation
setup). This also merges nf_ct_expect_related_report() and
nf_ct_expect_related() which look basically the same.
Reported-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Currently the 9p code crashes when a operation is interrupted, i.e. for
example when the user presses ^C while reading from a file.
This patch fixes the code that is responsible for interruption and flushing
of 9P operations.
Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
p9_client_stat function doesn't return correct value if it fails.
p9_client_stat should return ERR_PTR of the error value when it fails.
Instead, it always returns a value to the allocated p9_wstat struct even
when it is not populated correctly.
This patch makes p9_client_stat to handle failure correctly.
Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
Reviewed-by: Eric Van Hensbergen <ericvh@gmail.com>
The 9P2000 Twstat message requires the size of the stat structure to be
specified. Currently the 9p code writes zero instead of the actual size.
This behavior confuses some of the file servers that check if the size is
correct.
This patch adds a new function that calculcates the stat size and puts the
value in the appropriate place in the 9P message.
Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
Reviewed-by: Eric Van Hensbergen <ericvh@gmail.com>
See http://bugzilla.kernel.org/show_bug.cgi?id=13034
If the port gets into a TIME_WAIT state, then we cannot reconnect without
binding to a new port.
Tested-by: Petr Vandrovec <petr@vandrovec.name>
Tested-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (24 commits)
e100: do not go D3 in shutdown unless system is powering off
netfilter: revised locking for x_tables
Bluetooth: Fix connection establishment with low security requirement
Bluetooth: Add different pairing timeout for Legacy Pairing
Bluetooth: Ensure that HCI sysfs add/del is preempt safe
net: Avoid extra wakeups of threads blocked in wait_for_packet()
net: Fix typo in net_device_ops description.
ipv4: Limit size of route cache hash table
Add reference to CAPI 2.0 standard
Documentation/isdn/INTERFACE.CAPI
update Documentation/isdn/00-INDEX
ixgbe: Fix WoL functionality for 82599 KX4 devices
veth: prevent oops caused by netdev destructor
xfrm: wrong hash value for temporary SA
forcedeth: tx timeout fix
net: Fix LL_MAX_HEADER for CONFIG_TR_MODULE
mlx4_en: Handle page allocation failure during receive
mlx4_en: Fix cleanup flow on cq activation
vlan: update vlan carrier state for admin up/down
netfilter: xt_recent: fix stack overread in compat code
...
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (94 commits)
netfilter: ctnetlink: fix gcc warning during compilation
net/netrom: Fix socket locking
netlabel: Always remove the correct address selector
ucc_geth.c: Fix upsmr setting in RMII mode
8139too: fix HW initial flow
af_iucv: Fix race when queuing incoming iucv messages
af_iucv: Test additional sk states in iucv_sock_shutdown
af_iucv: Reject incoming msgs if RECV_SHUTDOWN is set
af_iucv: fix oops in iucv_sock_recvmsg() for MSG_PEEK flag
af_iucv: consider state IUCV_CLOSING when closing a socket
iwlwifi: DMA fixes
iwlwifi: add debugging for TX path
mwl8: fix build warning.
mac80211: fix alignment calculation bug
mac80211: do not print WARN if config interface
iwl3945: use cancel_delayed_work_sync to cancel rfkill_poll
iwlwifi: fix EEPROM validation mask to include OTP only devices
atmel: fix netdev ops conversion
pcnet_cs: add cis(firmware) of the Allied Telesis LA-PCM
mlx4_en: Fix cleanup if workqueue create in mlx4_en_add() fails
...
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-cpumask: (36 commits)
cpumask: remove cpumask allocation from idle_balance, fix
numa, cpumask: move numa_node_id default implementation to topology.h, fix
cpumask: remove cpumask allocation from idle_balance
x86: cpumask: x86 mmio-mod.c use cpumask_var_t for downed_cpus
x86: cpumask: update 32-bit APM not to mug current->cpus_allowed
x86: microcode: cleanup
x86: cpumask: use work_on_cpu in arch/x86/kernel/microcode_core.c
cpumask: fix CONFIG_CPUMASK_OFFSTACK=y cpu hotunplug crash
numa, cpumask: move numa_node_id default implementation to topology.h
cpumask: convert node_to_cpumask_map[] to cpumask_var_t
cpumask: remove x86 cpumask_t uses.
cpumask: use cpumask_var_t in uv_flush_tlb_others.
cpumask: remove cpumask_t assignment from vector_allocation_domain()
cpumask: make Xen use the new operators.
cpumask: clean up summit's send_IPI functions
cpumask: use new cpumask functions throughout x86
x86: unify cpu_callin_mask/cpu_callout_mask/cpu_initialized_mask/cpu_sibling_setup_mask
cpumask: convert struct cpuinfo_x86's llc_shared_map to cpumask_var_t
cpumask: convert node_to_cpumask_map[] to cpumask_var_t
x86: unify 32 and 64-bit node_to_cpumask_map
...
On an NFSv4.1 server cache miss that causes an upcall, NFS4ERR_DELAY will be
returned. It is up to the NFSv4.1 client to resend only the operations that
have not been processed.
Initialize rq_usedeferral to 1 in svc_process(). It sill be turned off in
nfsd4_proc_compound() only when NFSv4.1 Sessions are used.
Note: this isn't an adequate solution on its own. It's acceptable as a way
to get some minimal 4.1 up and working, but we're going to have to find a
way to avoid returning DELAY in all common cases before 4.1 can really be
considered ready.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: reverse rq_nodeferral negative logic]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[sunrpc: initialize rq_usedeferral]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (28 commits)
trivial: Update my email address
trivial: NULL noise: drivers/mtd/tests/mtd_*test.c
trivial: NULL noise: drivers/media/dvb/frontends/drx397xD_fw.h
trivial: Fix misspelling of "Celsius".
trivial: remove unused variable 'path' in alloc_file()
trivial: fix a pdlfush -> pdflush typo in comment
trivial: jbd header comment typo fix for JBD_PARANOID_IOFAIL
trivial: wusb: Storage class should be before const qualifier
trivial: drivers/char/bsr.c: Storage class should be before const qualifier
trivial: h8300: Storage class should be before const qualifier
trivial: fix where cgroup documentation is not correctly referred to
trivial: Give the right path in Documentation example
trivial: MTD: remove EOL from MODULE_DESCRIPTION
trivial: Fix typo in bio_split()'s documentation
trivial: PWM: fix of #endif comment
trivial: fix typos/grammar errors in Kconfig texts
trivial: Fix misspelling of firmware
trivial: cgroups: documentation typo and spelling corrections
trivial: Update contact info for Jochen Hein
trivial: fix typo "resgister" -> "register"
...
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
Remove two unneeded exports and make two symbols static in fs/mpage.c
Cleanup after commit 585d3bc06f
Trim includes of fdtable.h
Don't crap into descriptor table in binfmt_som
Trim includes in binfmt_elf
Don't mess with descriptor table in load_elf_binary()
Get rid of indirect include of fs_struct.h
New helper - current_umask()
check_unsafe_exec() doesn't care about signal handlers sharing
New locking/refcounting for fs_struct
Take fs_struct handling to new file (fs/fs_struct.c)
Get rid of bumping fs_struct refcount in pivot_root(2)
Kill unsharing fs_struct in __set_personality()
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (54 commits)
glge: remove unused #include <version.h>
dnet: remove unused #include <version.h>
tcp: miscounts due to tcp_fragment pcount reset
tcp: add helper for counter tweaking due mid-wq change
hso: fix for the 'invalid frame length' messages
hso: fix for crash when unplugging the device
fsl_pq_mdio: Fix compile failure
fsl_pq_mdio: Revive UCC MDIO support
ucc_geth: Pass proper device to DMA routines, otherwise oops happens
i.MX31: Fixing cs89x0 network building to i.MX31ADS
tc35815: Fix build error if NAPI enabled
hso: add Vendor/Product ID's for new devices
ucc_geth: Remove unused header
gianfar: Remove unused header
kaweth: Fix locking to be SMP-safe
net: allow multiple dev per napi with GRO
r8169: reset IntrStatus after chip reset
ixgbe: Fix potential memory leak/driver panic issue while setting up Tx & Rx ring parameters
ixgbe: fix ethtool -A|a behavior
ixgbe: Patch to fix driver panic while freeing up tx & rx resources
...
It seems that trivial reset of pcount to one was not sufficient
in tcp_retransmit_skb. Multiple counters experience a positive
miscount when skb's pcount gets lowered without the necessary
adjustments (depending on skb's sacked bits which exactly), at
worst a packets_out miscount can crash at RTO if the write queue
is empty!
Triggering this requires mss change, so bidir tcp or mtu probe or
like.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Tested-by: Uwe Bugla <uwe.bugla@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need full-scale adjustment to fix a TCP miscount in the next
patch, so just move it into a helper and call for that from the
other places.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
GRO assumes that there is a one-to-one relationship between NAPI
structure and network device. Some devices like sky2 share multiple
devices on a single interrupt so only have one NAPI handler. Rather than
split GRO from NAPI, just have GRO assume if device changes that
it is a different flow.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 784544739a
(netfilter: iptables: lock free counters) forgot to disable BH
in arpt_do_table(), ipt_do_table() and ip6t_do_table()
Use rcu_read_lock_bh() instead of rcu_read_lock() cures the problem.
Reported-and-bisected-by: Roman Mindalev <r000n@r000n.net>
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We have a 64bit value that needs to be set atomically.
This is easy and quick on all 64bit archs, and can also be done
on x86/32 with set_64bit() (uses cmpxchg8b). However other
32b archs don't have this.
I actually changed this to the current state in preparation for
mainline because the old way (using a spinlock on 32b) resulted in
unsightly #ifdefs in the code. But obviously, being correct takes
precedence.
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes a bug where a connection was unexpectedly
not on *any* list while being destroyed. It also
cleans up some code duplication and regularizes some
function names.
* Grab appropriate lock in conn_free() and explain in comment
* Ensure via locking that a conn is never not on either
a dev's list or the nodev list
* Add rds_xx_remove_conn() to match rds_xx_add_conn()
* Make rds_xx_add_conn() return void
* Rename remove_{,nodev_}conns() to
destroy_{,nodev_}conns() and unify their implementation
in a helper function
* Document lock ordering as nodev conn_lock before
dev_conn_lock
Reported-by: Yosef Etigin <yosefe@voltaire.com>
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
rs_send_drop_to() is called during socket close. If it takes
m_rs_lock without disabling interrupts, then
rds_send_remove_from_sock() can run from the rx completion
handler and thus deadlock.
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Also ensure that we use the protocol family instead of the address
family when calling sock_create_kern().
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Add support for event-aware wakeups to the sockets code. Events are
delivered to the wakeup target, so that epoll can avoid spurious wakeups
for non-interesting events.
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: William Lee Irwin III <wli@movementarian.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
struct tty_operations::proc_fops took it's place and there is one less
create_proc_read_entry() user now!
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
wireless: remove duplicated .ndo_set_mac_address
netfilter: xtables: fix IPv6 dependency in the cluster match
tg3: Add GRO support.
niu: Add GRO support.
ucc_geth: Fix use-after-of_node_put() in ucc_geth_probe().
gianfar: Fix use-after-of_node_put() in gfar_of_init().
kernel: remove HIPQUAD()
netpoll: store local and remote ip in net-endian
netfilter: fix endian bug in conntrack printks
dmascc: fix incomplete conversion to network_device_ops
gso: Fix support for linear packets
skbuff.h: fix missing kernel-doc
ni5010: convert to net_device_ops