Commit graph

361466 commits

Author SHA1 Message Date
J. Bruce Fields
dd30333cf5 nfsd4: better error return to indicate SSV non-support
As 4.1 becomes less experimental and SSV still isn't implemented, we
have to admit it's not going to be, and return some sensible error
rather than just saying "our server's broken".  Discussion in the ietf
group hasn't turned up any objections to using NFS4ERR_ENC_ALG_UNSUPP
for that purpose.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-26 16:18:15 -04:00
J. Bruce Fields
aa387d6ce1 nfsd: fix EXDEV checking in rename
We again check for the EXDEV a little later on, so the first check is
redundant.  This check is also slightly racier, since a badly timed
eviction from the export cache could leave us with the two fh_export
pointers pointing to two different cache entries which each refer to the
same underlying export.

It's better to compare vfsmounts as the later check does, but that
leaves a minor security hole in the case where the two exports refer to
two different directories especially if (for example) they have
different root-squashing options.

So, compare ex_path.dentry too.

Reported-by: Joe Habermann <joe.habermann@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-26 16:18:15 -04:00
Simo Sorce
030d794bf4 SUNRPC: Use gssproxy upcall for server RPCGSS authentication.
The main advantge of this new upcall mechanism is that it can handle
big tickets as seen in Kerberos implementations where tickets carry
authorization data like the MS-PAC buffer with AD or the Posix Authorization
Data being discussed in IETF on the krbwg working group.

The Gssproxy program is used to perform the accept_sec_context call on the
kernel's behalf. The code is changed to also pass the input buffer straight
to upcall mechanism to avoid allocating and copying many pages as tokens can
be as big (potentially more in future) as 64KiB.

Signed-off-by: Simo Sorce <simo@redhat.com>
[bfields: containerization, negotiation api]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-26 11:41:28 -04:00
Simo Sorce
1d658336b0 SUNRPC: Add RPC based upcall mechanism for RPCGSS auth
This patch implements a sunrpc client to use the services of the gssproxy
userspace daemon.

In particular it allows to perform calls in user space using an RPC
call instead of custom hand-coded upcall/downcall messages.

Currently only accept_sec_context is implemented as that is all is needed for
the server case.

File server modules like NFS and CIFS can use full gssapi services this way,
once init_sec_context is also implemented.

For the NFS server case this code allow to lift the limit of max 2k krb5
tickets. This limit is prevents legitimate kerberos deployments from using krb5
authentication with the Linux NFS server as they have normally ticket that are
many kilobytes large.

It will also allow to lift the limitation on the size of the credential set
(uid,gid,gids) passed down from user space for users that have very many groups
associated. Currently the downcall mechanism used by rpc.svcgssd is limited
to around 2k secondary groups of the 65k allowed by kernel structures.

Signed-off-by: Simo Sorce <simo@redhat.com>
[bfields: containerization, concurrent upcalls, misc. fixes and cleanup]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-26 11:41:27 -04:00
Simo Sorce
400f26b542 SUNRPC: conditionally return endtime from import_sec_context
We expose this parameter for a future caller.
It will be used to extract the endtime from the gss-proxy upcall mechanism,
in order to set the rsc cache expiration time.

Signed-off-by: Simo Sorce <simo@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-26 11:41:27 -04:00
J. Bruce Fields
33d90ac058 SUNRPC: allow disabling idle timeout
In the gss-proxy case we don't want to have to reconnect at random--we
want to connect only on gss-proxy startup when we can steal gss-proxy's
context to do the connect in the right namespace.

So, provide a flag that allows the rpc_create caller to turn off the
idle timeout.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-26 11:41:26 -04:00
J. Bruce Fields
7073ea8799 SUNRPC: attempt AF_LOCAL connect on setup
In the gss-proxy case, setup time is when I know I'll have the right
namespace for the connect.

In other cases, it might be useful to get any connection errors
earlier--though actually in practice it doesn't make any difference for
rpcbind.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-26 11:41:25 -04:00
J. Bruce Fields
c85b03ab20 Merge Trond's nfs-for-next
Merging Trond's nfs-for-next branch, mainly to get
b7993cebb8 "SUNRPC: Allow rpc_create() to
request that TCP slots be unlimited", which a small piece of the
gss-proxy work depends on.
2013-04-26 11:37:43 -04:00
Bryan Schumaker
bf8d909705 nfsd: Decode and send 64bit time values
The seconds field of an nfstime4 structure is 64bit, but we are assuming
that the first 32bits are zero-filled.  So if the client tries to set
atime to a value before the epoch (touch -t 196001010101), then the
server will save the wrong value on disk.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-23 14:49:37 -04:00
Trond Myklebust
fd068b200f NFSv4: Ensure that we clear the NFS_OPEN_STATE flag when appropriate
We should always clear it before initiating file recovery.
Also ensure that we clear it after a CLOSE and/or after TEST_STATEID fails.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-22 11:29:51 -04:00
Trond Myklebust
1dfd89af86 LOCKD: Ensure that nlmclnt_block resets block->b_status after a server reboot
After a server reboot, the reclaimer thread will recover all the existing
locks. For locks that are blocked, however, it will change the value
of block->b_status to nlm_lck_denied_grace_period in order to signal that
they need to wake up and resend the original blocking lock request.

Due to a bug, however, the block->b_status never gets reset after the
blocked locks have been woken up, and so the process goes into an
infinite loop of resends until the blocked lock is satisfied.

Reported-by: Marc Eshel <eshel@us.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2013-04-21 18:08:42 -04:00
Trond Myklebust
8e472f33b5 NFSv4: Ensure the LOCK call cannot use the delegation stateid
Defensive patch to ensure that we copy the state->open_stateid, which
can never be set to the delegation stateid.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-20 01:39:54 -04:00
Trond Myklebust
92b40e9384 NFSv4: Use the open stateid if the delegation has the wrong mode
Fix nfs4_select_rw_stateid() so that it chooses the open stateid
(or an all-zero stateid) if the delegation does not match the selected
read/write mode.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-20 01:39:42 -04:00
Bryan Schumaker
042ad0b398 nfs: Send atime and mtime as a 64bit value
RFC 3530 says that the seconds value of a nfstime4 structure is a 64bit
value, but we are instead sending a 32-bit 0 and then a 32bit conversion
of the 64bit Linux value.  This means that if we try to set atime to a
value before the epoch (touch -t 196001010101) the client will only send
part of the new value due to lost precision.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-19 17:21:07 -04:00
Fengguang Wu
ba138435d1 nfsd4: put_client_renew_locked can be static
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-16 22:15:00 -04:00
J. Bruce Fields
9aeb5aeeb0 nfsd4: remove unused macro
Cleanup a piece I forgot to remove in
9411b1d4c7 "nfsd4: cleanup handling of
nfsv4.0 closed stateid's".

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-16 21:51:55 -04:00
Trond Myklebust
549b19cc9f NFSv4: Record the OPEN create mode used in the nfs4_opendata structure
If we're doing NFSv4.1 against a server that has persistent sessions,
then we should not need to call SETATTR in order to reset the file
attributes immediately after doing an exclusive create.

Note that since the create mode depends on the type of session that
has been negotiated with the server, we should not choose the
mode until after we've got a session slot.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-16 18:58:26 -04:00
fanchaoting
53584f6652 nfsd4: remove some useless code
The "list_empty(&oo->oo_owner.so_stateids)" is aways true, so remove it.

Signed-off-by: fanchaoting <fanchaoting@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-16 10:59:31 -04:00
J. Bruce Fields
3bd64a5ba1 nfsd4: implement SEQ4_STATUS_RECALLABLE_STATE_REVOKED
A 4.1 server must notify a client that has had any state revoked using
the SEQ4_STATUS_RECALLABLE_STATE_REVOKED flag.  The client can figure
out exactly which state is the problem using CHECK_STATEID and then free
it using FREE_STATEID.  The status flag will be unset once all such
revoked stateids are freed.

Our server's only recallable state is delegations.  So we keep with each
4.1 client a list of delegations that have timed out and been recalled,
but haven't yet been freed by FREE_STATEID.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-16 10:59:30 -04:00
Trond Myklebust
98f98cf571 NFSv4.1: Set the RPC_CLNT_CREATE_INFINITE_SLOTS flag for NFSv4.1 transports
This ensures that the RPC layer doesn't override the NFS session
negotiation.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-14 12:59:28 -04:00
Trond Myklebust
b7993cebb8 SUNRPC: Allow rpc_create() to request that TCP slots be unlimited
This is mainly for use by NFSv4.1, where the session negotiation
ultimately wants to decide how many RPC slots we can fill.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-14 12:26:03 -04:00
Trond Myklebust
ba60eb25ff SUNRPC: Fix a livelock problem in the xprt->backlog queue
This patch ensures that we throttle new RPC requests if there are
requests already waiting in the xprt->backlog queue. The reason for
doing this is to fix livelock issues that can occur when an existing
(high priority) task is waiting in the backlog queue, gets woken up
by xprt_free_slot(), but a new task then steals the slot.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-14 12:26:02 -04:00
Trond Myklebust
b570a975ed NFSv4: Fix handling of revoked delegations by setattr
Currently, _nfs4_do_setattr() will use the delegation stateid if no
writeable open file stateid is available.
If the server revokes that delegation stateid, then the call to
nfs4_handle_exception() will fail to handle the error due to the
lack of a struct nfs4_state, and will just convert the error into
an EIO.

This patch just removes the requirement that we must have a
struct nfs4_state in order to invalidate the delegation and
retry.

Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-12 15:21:15 -04:00
Andy Adamson
b9536ad521 NFSv4 release the sequence id in the return on close case
Otherwise we deadlock if state recovery is initiated while we
sleep.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-11 09:39:53 -04:00
Jeff Layton
314d7cc05d nfs: remove unnecessary check for NULL inode->i_flock from nfs_delegation_claim_locks
The second check was added in commit 65b62a29 but it will never be true.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-10 15:40:31 -04:00
J. Bruce Fields
23340032e6 nfsd4: clean up validate_stateid
The logic here is better expressed with a switch statement.

While we're here, CLOSED stateids (or stateids of an unkown type--which
would indicate a server bug) should probably return nfserr_bad_stateid,
though this behavior shouldn't affect any non-buggy client.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 17:42:28 -04:00
J. Bruce Fields
06b332a522 nfsd4: check backchannel attributes on create_session
Make sure the client gives us an adequate backchannel.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 16:53:56 -04:00
J. Bruce Fields
55c760cfc4 nfsd4: fix forechannel attribute negotiation
Negotiation of the 4.1 session forechannel attributes is a mess.  Fix:

	- Move it all into check_forechannel_attrs instead of spreading
	  it between that, alloc_session, and init_forechannel_attrs.
	- set a minimum "slotsize" so that our drc memory limits apply
	  even for small maxresponsesize_cached.  This also fixes some
	  bugs when slotsize becomes <= 0.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 16:43:44 -04:00
J. Bruce Fields
373cd4098a nfsd4: cleanup check_forechannel_attrs
Pass this struct by reference, not by value, and return an error instead
of a boolean to allow for future additions.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 15:49:50 -04:00
J. Bruce Fields
0c7c3e67ab nfsd4: don't close read-write opens too soon
Don't actually close any opens until we don't need them at all.

This means being left with write access when it's not really necessary,
but that's better than putting a file that might still have posix locks
held on it, as we have been.

Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 09:08:57 -04:00
J. Bruce Fields
eb2099f31b nfsd4: release lockowners on last unlock in 4.1 case
In the 4.1 case we're supposed to release lockowners as soon as they're
no longer used.

It would probably be more efficient to reference count them, but that's
slightly fiddly due to the need to have callbacks from locks.c to take
into account lock merging and splitting.

For most cases just scanning the inode's lock list on unlock for
matching locks will be sufficient.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 09:08:56 -04:00
J. Bruce Fields
bbc9c36c31 nfsd4: more sessions/open-owner-replay cleanup
More logic that's unnecessary in the 4.1 case.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 09:08:56 -04:00
J. Bruce Fields
3d74e6a5b6 nfsd4: no need for replay_owner in sessions case
The replay_owner will never be used in the sessions case.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 09:08:55 -04:00
J. Bruce Fields
c383747ef6 nfsd4: remove some redundant comments
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 09:08:54 -04:00
Wei Yongjun
2c44a23471 nfsd: use kmem_cache_free() instead of kfree()
memory allocated by kmem_cache_alloc() should be freed using
kmem_cache_free(), not kfree().

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 09:08:47 -04:00
Trond Myklebust
7a8203d8cb NFS: Ensure that NFS file unlock waits for readahead to complete
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-08 22:12:42 -04:00
Trond Myklebust
577b42327d NFS: Add functionality to allow waiting on all outstanding reads to complete
This will later allow NFS locking code to wait for readahead to complete
before releasing byte range locks.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-08 22:12:33 -04:00
Trond Myklebust
bc7a05ca51 NFSv4: Handle timeouts correctly when probing for lease validity
When we send a RENEW or SEQUENCE operation in order to probe if the
lease is still valid, we want it to be able to time out since the
lease we are probing is likely to time out too. Currently, because
we use soft mount semantics for these RPC calls, the return value
is EIO, which causes the state manager to exit with an "unhandled
error" message.
This patch changes the call semantics, so that the RPC layer returns
ETIMEDOUT instead of EIO. We then have the state manager default to
a simple retry instead of exiting.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-08 18:01:59 -04:00
J. Bruce Fields
9411b1d4c7 nfsd4: cleanup handling of nfsv4.0 closed stateid's
Closed stateid's are kept around a little while to handle close replays
in the 4.0 case.  So we stash them in the last-used stateid in the
oo_last_closed_stateid field of the open owner.  We can free that in
encode_seqid_op_tail once the seqid on the open owner is next
incremented.  But we don't want to do that on the close itself; so we
set NFS4_OO_PURGE_CLOSE flag set on the open owner, skip freeing it the
first time through encode_seqid_op_tail, then when we see that flag set
next time we free it.

This is unnecessarily baroque.

Instead, just move the logic that increments the seqid out of the xdr
code and into the operation code itself.

The justification given for the current placement is that we need to
wait till the last minute to be sure we know whether the status is a
sequence-id-mutating error or not, but examination of the code shows
that can't actually happen.

Reported-by: Yanchuan Nian <ycnian@gmail.com>
Tested-by: Yanchuan Nian <ycnian@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-08 09:55:32 -04:00
Trond Myklebust
826e001308 NFSv4: Fix CB_RECALL_ANY to only return delegations that are not in use
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-05 17:03:57 -04:00
Trond Myklebust
b02ba0b660 NFSv4: Clean up nfs_expire_all_delegations
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-05 17:03:56 -04:00
Trond Myklebust
5c31e2368f NFSv4: Fix nfs_server_return_all_delegations
If the state manager thread is already running, we may end up
racing with it in nfs_client_return_marked_delegations. Better to
just allow the state manager thread to do the job.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-05 17:03:56 -04:00
Trond Myklebust
b757144fd7 NFSv4: Be less aggressive about returning delegations for open files
Currently, if the application that holds the file open isn't doing
I/O, we may end up returning the delegation. This means that we can
no longer cache the file as aggressively, and often also that we
multiply the state that both the server and the client needs to track.

This patch adds a check for open files to the routine that scans
for delegations that are unreferenced.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-05 17:03:55 -04:00
Trond Myklebust
db4f2e637f NFSv4: Clean up delegation recall error handling
Unify the error handling in nfs4_open_delegation_recall and
nfs4_lock_delegation_recall.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-05 17:03:55 -04:00
Trond Myklebust
be76b5b68d NFSv4: Clean up nfs4_open_delegation_recall
Make it symmetric with nfs4_lock_delegation_recall

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-05 17:03:54 -04:00
Trond Myklebust
4a706fa09f NFSv4: Clean up nfs4_lock_delegation_recall
All error cases are handled by the switch() statement, meaning that the
call to nfs4_handle_exception() is unreachable.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-05 17:03:54 -04:00
Trond Myklebust
8b6cc4d6f8 NFSv4: Handle NFS4ERR_DELAY and NFS4ERR_GRACE in nfs4_open_delegation_recall
A server shouldn't normally return NFS4ERR_GRACE if the client holds a
delegation, since no conflicting lock reclaims can be granted, however
the spec does not require the server to grant the open in this
instance

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2013-04-05 17:03:53 -04:00
Trond Myklebust
dbb21c25a3 NFSv4: Handle NFS4ERR_DELAY and NFS4ERR_GRACE in nfs4_lock_delegation_recall
A server shouldn't normally return NFS4ERR_GRACE if the client holds a
delegation, since no conflicting lock reclaims can be granted, however
the spec does not require the server to grant the lock in this
instance.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2013-04-05 17:03:53 -04:00
Paul Bolle
cb60718747 sunrpc: drop "select NETVM"
The Kconfig entry for SUNRPC_SWAP selects NETVM. That select statement
was added in commit a564b8f039 ("nfs:
enable swap on NFS"). But there's no Kconfig symbol NETVM. It apparently
was only in used in development versions of the swap over nfs
functionality but never entered mainline. Anyhow, it is a nop and can
safely be dropped.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-05 17:03:52 -04:00
Jeff Layton
25d280aad8 nfs: allow the v4.1 callback thread to freeze
The v4.1 callback thread has set_freezable() at the top, but it doesn't
ever try to freeze within the loop. Have it call try_to_freeze() at the
top of the loop. If a freeze event occurs, recheck kthread_should_stop()
after thawing.

Reported-by: Yanchuan Nian <ycnian@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-05 17:03:52 -04:00