Make the socket transport kick the event queue to start socket connects
immediately. This should improve responsiveness of applications that are
sensitive to slow mount operations (like automounters).
We are now also careful to cancel the connect worker before destroying
the xprt. This eliminates a race where xprt_destroy can finish before
the connect worker is even allowed to run.
Test-plan:
Destructive testing (unplugging the network temporarily). Connectathon
with UDP and TCP. Hard-code impossibly small connect timeout.
Version: Fri, 29 Apr 2005 15:32:01 -0400
Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
When the network layer reports a connection close, the RPC task
waiting to reconnect should be notified so it can retry immediately
instead of waiting for the normal connection establishment timeout.
This reverts a change made in 2.6.6 as part of adding client support
for RPC over TCP socket idle timeouts.
Test-plan:
Destructive testing with NFS over TCP mounts.
Version: Fri, 29 Apr 2005 15:31:46 -0400
Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cancel autodisconnect requests inside xprt_transmit() in order to avoid
races.
Use more efficient del_singleshot_timer_sync()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
For internal purposes, the rpc_clnt_sigmask() call is replaced by
a call to rpc_task_sigmask(), which ensures that the current task
sigmask respects both the client cl_intr flag and the per-task NOINTR flag.
Problem noted by Jiaying Zhang.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The NFS and NFSACL programs run on the same RPC transport. This patch adds
support for this by converting svc_program into a chained list of programs
(server-side).
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Olaf Kirch <okir@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Currently we return -ENOMEM for every single failure to create a new auth.
This is actually accurate for auth_null and auth_unix, but for auth_gss it's a
bit confusing.
Allow rpcauth_create (and the ->create methods) to return errors. With this
patch, the user may sometimes see an EINVAL instead. Whee.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We shouldn't be silently falling back from krb5p to krb5i.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Ensure that we don't create an RPC client without checking that the server
does indeed support the RPC program + version that we are trying to set up.
This enables us to immediately return an error to "mount" if it turns out
that the server is only supporting NFSv2, when we requested NFSv3 or NFSv4.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Fix up call_header() so that it calls xdr_adjust_iovec().
Fix calculation of the scratch buffer length in xdr_init_encode().
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
If the task->tk_exit() wants to restart the RPC call after delaying
then the current RPC code will clobber the timer by calling
rpc_delete_timer() immediately after re-entering the loop in
__rpc_execute().
Problem noticed by Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Kconfig option had an extra double quote at the end of the line
which was causing in warning when building.
Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Sit tunnel logging is currently broken:
MAC=01:23:45:67:89:ab->01:23:45:47:89:ac TUNNEL=123.123. 0.123-> 12.123. 6.123
Apart from the broken IP address, MAC addresses are printed differently
for sit tunnels than for everything else.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Drop reference before handing the packets to raw_rcv()
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Keir Fraser <Keir.Fraser@xl.cam.ac.uk>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since expectation timeouts were made compulsory [1], there is no need to
check for them in ip_conntrack_expect_insert.
[1] https://lists.netfilter.org/pipermail/netfilter-devel/2005-January/018143.html
Signed-off-by: Phil Oester <kernel@linuxace.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Netfilter assumes that skb->data == skb->nh.ipv6h
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Here is a simplified version of the patch to fix a bug in IPv6
multicasting. It:
1) adds existence check & EADDRINUSE error for regular joins
2) adds an exception for EADDRINUSE in the source-specific multicast
join (where a prior join is ok)
3) adds a missing/needed read_lock on sock_mc_list; would've raced
with destroying the socket on interface down without
4) adds a "leave group" in the (INCLUDE, empty) source filter case.
This frees unneeded socket buffer memory, but also prevents
an inappropriate interaction among the 8 socket options that
mess with this. Some would fail as if in the group when you
aren't really.
Item #4 had a locking bug in the last version of this patch; rather than
removing the idev->lock read lock only, I've simplified it to remove
all lock state in the path and treat it as a direct "leave group" call for
the (INCLUDE,empty) case it covers. Tested on an MP machine. :-)
Much thanks to HoerdtMickael <hoerdt@clarinet.u-strasbg.fr> who
reported the original bug.
Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Essentially netlink at the moment always reports a pid and sequence of 0
always for v6 route activities.
To understand the repurcassions of this look at:
http://lists.quagga.net/pipermail/quagga-dev/2005-June/003507.html
While fixing this, i took the liberty to resolve the outstanding issue
of IPV6 routes inserted via ioctls to have the correct pids as well.
This patch tries to behave as close as possible to the v4 routes i.e
maintains whatever PID the socket issuing the command owns as opposed to
the process. That made the patch a little bulky.
I have tested against both netlink derived utility to add/del routes as
well as ioctl derived one. The Quagga folks have tested against quagga.
This fixes the problem and so far hasnt been detected to introduce any
new issues.
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Below is a more generic patch to do fib_lookup via netlink. For others
we should say that we discussed this as a way to verify route selection.
It's also possible there are others uses for this.
In short the fist half of struct fib_result_nl is filled in by caller
and netlink call fills in the other half and returns it.
In case anyone is interested there is a corresponding user app to compare
the full routing table this was used to test implementation of the LC-trie.
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the flag XFRM_STATE_NOPMTUDISC for xfrm states. It is
similar to the nopmtudisc on IPIP/GRE tunnels. It only has an effect
on IPv4 tunnel mode states. For these states, it will ensure that the
DF flag is always cleared.
This is primarily useful to work around ICMP blackholes.
In future this flag could also allow a larger MTU to be set within the
tunnel just like IPIP/GRE tunnels. This could be useful for short haul
tunnels where temporary fragmentation outside the tunnel is desired over
smaller fragments inside the tunnel.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: James Morris <jmorris@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the xfrm_state_afinfo->init_flags hook which allows
each address family to perform any common initialisation that does
not require a corresponding destructor call.
It will be used subsequently to set the XFRM_STATE_NOPMTUDISC flag
in IPv4.
It also fixes up the error codes returned by xfrm_init_state.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: James Morris <jmorris@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds xfrm_init_state which is simply a wrapper that calls
xfrm_get_type and subsequently x->type->init_state. It also gets rid
of the unused args argument.
Abstracting it out allows us to add common initialisation code, e.g.,
to set family-specific flags.
The add_time setting in xfrm_user.c was deleted because it's already
set by xfrm_state_alloc.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: James Morris <jmorris@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implements sctp_connectx() as defined in the SCTP sockets API draft by
tunneling the request through a setsockopt().
Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When enabled, this should disable UCOPY prequeue'ing altogether,
but it does not due to a missing test.
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch changes the type of the third parameter 'length' of the
raw_send_hdrinc() function from 'int' to 'size_t'.
This makes sense since this function is only ever called from one
location, and the value passed as the third parameter in that location is
itself of type size_t, so this makes the recieving functions parameter
type match. Also, inside raw_send_hdrinc() the 'length' variable is
used in comparisons with unsigned values and passed as parameter to
functions expecting unsigned values (it's used in a single comparison with
a signed value, but that one can never actually be negative so the patch
also casts that one to size_t to stop gcc worrying, and it is passed in a
single instance to memcpy_fromiovecend() which expects a signed int, but
as far as I can see that's not a problem since the value of 'length'
shouldn't ever exceed the value of a signed int).
Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch changes the type of the local variable 'i' in
raw_probe_proto_opt() from 'int' to 'unsigned int'. The only use of 'i' in
this function is as a counter in a for() loop and subsequent index into
the msg->msg_iov[] array.
Since 'i' is compared in a loop to the unsigned variable msg->msg_iovlen
gcc -W generates this warning :
net/ipv4/raw.c:340: warning: comparison between signed and unsigned
Changing 'i' to unsigned silences this warning and is safe since the array
index can never be negative anyway, so unsigned int is the logical type to
use for 'i' and also enables a larger msg_iov[] array (but I don't know if
that will ever matter).
Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch gets rid of the following gcc -W warning in net/ipv4/raw.c :
net/ipv4/raw.c:387: warning: comparison of unsigned expression < 0 is always false
Since 'len' is of type size_t it is unsigned and can thus never be <0, and
since this is obvious from the function declaration just a few lines above
I think it's ok to remove the pointless check for len<0.
Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch silences these two gcc -W warnings in net/ipv4/raw.c :
net/ipv4/raw.c:517: warning: signed and unsigned type in conditional expression
net/ipv4/raw.c:613: warning: signed and unsigned type in conditional expression
It doesn't change the behaviour of the code, simply writes the conditional
expression with plain 'if()' syntax instead of '? :' , but since this
breaks it into sepperate statements gcc no longer complains about having
both a signed and unsigned value in the same conditional expression.
Signed-off-by: David S. Miller <davem@davemloft.net>
Removes the skb trimming code which is not needed since we never
touch the skb upon failure. Removes unnecessary initializers,
and simplifies the code a bit.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
prio2list() returns the relevant sk_buff_head for the
band specified by the priority for a given skb.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Removes the skb trimming code which is not needed since we never
touch the skb upon failure. Removes unnecessary includes,
initializers, and simplifies the code a bit.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The simplicity of the fifo qdisc allows several qdisc operations to be
redirected to the relevant queue management function directly. Saves
a lot of code lines and gives the pfifo a byte based backlog.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch replaces the spin_lock_irqsave call on the receive queue
lock in SCTP with spin_lock_bh. Despite the proliferation of
spin_lock_irqsave calls in this stack, it is only entered from the
IPv4/IPv6 stack and user space. That is, it is never entered from
hardirq context.
The call in question is only called from recvmsg which means that
IRQs aren't disabled. Therefore it is safe to replace it with
spin_lock_bh.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
In light of my recent patch to net/ipv4/udp.c that replaced the
spin_lock_irq calls on the receive queue lock with spin_lock_bh,
here is a similar patch for all other occurences of spin_lock_irq
on receive/error queue locks in IPv4 and IPv6.
In these stacks, we know that they can only be entered from user
or softirq context. Therefore it's safe to disable BH only.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>