Commit graph

79700 commits

Author SHA1 Message Date
Glauber de Oliveira Costa
4665ac8e28 lguest: makes special fields be per-vcpu
The lguest struct holds some fields, namely cr2, ts, esp1
and ss1, that are not really guest-wide but vcpu-wide.

This patch moves them into the vcpu struct.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-30 22:50:13 +11:00
Glauber de Oliveira Costa
66686c2ab0 lguest: per-vcpu lguest task management
lguest uses tasks to control its running behaviour (like sending
breaks, controlling halted state, etc). In a per-vcpu environment,
each vcpu will have its own underlying task. So this patch
makes the infrastructure for that possible

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-30 22:50:12 +11:00
Glauber de Oliveira Costa
fc708b3e40 lguest: replace lguest_arch with lg_cpu_arch.
The fields found in lguest_arch are not really per-guest,
but per-cpu (gdt, idt, etc). So this patch turns lguest_arch
into lg_cpu_arch.

It makes sense to have a per-guest per-arch struct, but this
can be addressed later, when the need arrives.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-30 22:50:11 +11:00
Glauber de Oliveira Costa
a53a35a8b4 lguest: make registers per-vcpu
This is the most obvious per-vcpu field: registers.

So this patch moves them from struct lguest to struct vcpu,
and patches the places in which they are used accordingly.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-30 22:50:11 +11:00
Glauber de Oliveira Costa
a3863f68b0 lguest: make emulate_insn receive a vcpu struct.
emulate_insn() needs to know about current eip, which will be,
in the future, a per-vcpu thing. So in this patch, the function
prototype is modified to receive a vcpu struct

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-30 22:50:10 +11:00
Glauber de Oliveira Costa
0c78441cf4 lguest: map_switcher_in_guest() per-vcpu
The switcher needs to be mapped per-vcpu, because different vcpus
will potentially have different page tables (they don't have to,
because threads will share the same).

So our first step is to make the function receive a vcpu struct.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-30 22:50:09 +11:00
Glauber de Oliveira Costa
177e449dc5 lguest: per-vcpu interrupt processing.
This patch adapts interrupt processing for using the vcpu struct.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-30 22:50:09 +11:00
Glauber de Oliveira Costa
ad8d8f3bc6 lguest: per-vcpu lguest timers
Here, I introduce per-vcpu timers. With this, we can have
local expiries, needed for accounting time in smp guests

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-30 22:50:08 +11:00
Glauber de Oliveira Costa
73044f05a4 lguest: make hypercalls use the vcpu struct
This patch changes the do_hcall() and do_async_hcall() interfaces (and their
callers) to take a vcpu struct. Again, a vcpu services the hypercall, not the whole
guest.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-30 22:50:08 +11:00
Glauber de Oliveira Costa
7ea07a1500 lguest: make write() operation smp aware
This patch makes the write() file operation smp aware: it receives
the vcpu_id value through the offset parameter, so it knows which
vcpu it is talking to.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-30 22:50:07 +11:00
Glauber de Oliveira Costa
d0953d42c3 lguest: per-cpu run guest
This patch makes the run_guest() routine use the lg_cpu struct.
This is required since in an smp guest environment, there is no
longer the notion of "running the guest", but rather of "running the vcpu".

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-30 22:50:06 +11:00
Glauber de Oliveira Costa
4dcc53da49 lguest: initialize vcpu
This patch initializes the first vcpu in the initialize() routine,
which is responsible for starting the process of bringing the guest up.
Right now, as most of the fields are still not per-vcpu, it does not
do much.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-30 22:50:06 +11:00
Glauber de Oliveira Costa
e3283fa0cc lguest: adapt launcher to per-cpuness
This patch makes use of pread() and pwrite() in the lguest launcher
to communicate the vcpu id to the lguest driver. The id is kept in
a thread variable, which means that in the future we'll spawn vcpus as
threads. But right now, only the infrastructure is in place.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-30 22:50:05 +11:00
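The offset-as-vcpu-id convention described in e3283fa0cc can be sketched roughly as follows. This is a minimal userspace illustration, not the launcher's actual code: the wrapper names vcpu_write() and vcpu_read() are hypothetical, and the layout of one launcher thread per vcpu is the assumption the commit message hints at.

```c
#include <sys/types.h>
#include <unistd.h>

/* Per-thread vcpu id: each launcher thread would own one vcpu (assumed). */
static __thread unsigned int cpu_id;

/* Hypothetical wrapper: the file offset carries the vcpu id, not a position. */
static ssize_t vcpu_write(int fd, const void *buf, size_t len)
{
    return pwrite(fd, buf, len, (off_t)cpu_id);
}

/* Hypothetical wrapper: same convention on the read side. */
static ssize_t vcpu_read(int fd, void *buf, size_t len)
{
    return pread(fd, buf, len, (off_t)cpu_id);
}
```

Because pread() and pwrite() take the offset as an explicit argument, no shared file position has to be updated, so threads acting for different vcpus can use the same descriptor without interfering with each other.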
Glauber de Oliveira Costa
badb1e0402 lguest: introduce vcpu struct
This patch introduces a vcpu struct for lguest. In upcoming patches,
more and more fields will be moved from the lguest struct to the vcpu struct.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-30 22:50:04 +11:00
Balaji Rao
ec04b13f67 lguest: Reboot support
Reboot Implemented

(Prevent fd leak, fix style and fix documentation --RR)

Signed-off-by: Balaji Rao <balajirrao@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-30 22:50:04 +11:00
Glauber de Oliveira Costa
5c55841d16 lguest: remove pv_info dependency
Currently, the lguest module can't be compiled without the PARAVIRT flag being
on. This is a fake dependency, since the module itself shouldn't need any
paravirt override. The cause is the reference to the pv_info structure
in the initial loading tests.

This patch removes it in favour of a more generic error message.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-30 22:50:03 +11:00
Glauber de Oliveira Costa
7ea08093e0 lguest: fix drivers/lguest Makefile entry
Parts depend on CONFIG_LGUEST, not just CONFIG_LGUEST_GUEST

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-30 22:50:02 +11:00
Linus Torvalds
85004cc367 Merge git://git.linux-nfs.org/pub/linux/nfs-2.6
* git://git.linux-nfs.org/pub/linux/nfs-2.6: (118 commits)
  NFSv4: Iterate through all nfs_clients when the server recalls a delegation
  NFSv4: Deal more correctly with duplicate delegations
  NFS: Fix a potential race between umount and nfs_access_cache_shrinker()
  NFS: Add an asynchronous delegreturn operation for use in nfs_clear_inode
  nfs: convert NFS_*(inode) helpers to static inline
  nfs: obliterate NFS_FLAGS macro
  NFS: Address memory leaks in the NFS client mount option parser
  nfs4: allow nfsv4 acls on non-regular-files
  NFS: Optimise away the sigmask code in aio/dio reads and writes
  SUNRPC: Don't bother changing the sigmask for asynchronous RPC calls
  SUNRPC: rpcb_getport_sync() passes incorrect address size to rpc_create()
  SUNRPC: Clean up block comment preceding rpcb_getport_sync()
  SUNRPC: Use appropriate argument types in rpcb client
  SUNRPC: rpcb_getport_sync() should use built-in hostname generator
  SUNRPC: Clean up functions that free address_strings array
  NFS: NFS version number is unsigned
  NLM: Fix a bogus 'return' in nlmclnt_rpc_release
  NLM: Introduce an arguments structure for nlmclnt_init()
  NLM/NFS: Use cached nlm_host when calling nlmclnt_proc()
  NFS: Invoke nlmclnt_init during NFS mount processing
  ...
2008-01-30 19:54:24 +11:00
Jens Axboe
149a051f82 as-iosched: fix double locking bug in as_merged_requests()
If the two requests belong to the same io context, we will attempt
to lock the same lock twice. But swapping contexts is pointless in
that case, so just check for rioc == nioc before doing the double
lock and copy.

Tested-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-30 09:11:10 +01:00
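The guard described in 149a051f82 can be illustrated with a toy model. This is a sketch under stated assumptions: struct io_ctx, ctx_lock(), ctx_unlock() and copy_ctx_state() are hypothetical stand-ins, the lock is a plain flag that asserts where a real non-recursive spinlock would deadlock, and lock ordering between two distinct contexts is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an io context guarded by a non-recursive lock. */
struct io_ctx {
    bool locked;
    unsigned long last_end_request;
};

static void ctx_lock(struct io_ctx *c)
{
    /* A real spinlock would spin forever here; the toy version asserts. */
    assert(!c->locked);
    c->locked = true;
}

static void ctx_unlock(struct io_ctx *c)
{
    assert(c->locked);
    c->locked = false;
}

/* Copy state from src to dst; returns true if a copy was performed. */
static bool copy_ctx_state(struct io_ctx *dst, struct io_ctx *src)
{
    /* Same context: nothing to copy, and locking it twice would deadlock. */
    if (dst == src)
        return false;

    ctx_lock(dst);
    ctx_lock(src);
    dst->last_end_request = src->last_end_request;
    ctx_unlock(src);
    ctx_unlock(dst);
    return true;
}
```

Checking for pointer equality before taking the locks avoids both the self-deadlock and the pointless copy.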
Trond Myklebust
3fbd67ad61 NFSv4: Iterate through all nfs_clients when the server recalls a delegation
The same delegation may have been handed out to more than one nfs_client.
Ensure that if a recall occurs, we return all instances.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:12 -05:00
Trond Myklebust
57bfa89171 NFSv4: Deal more correctly with duplicate delegations
If a (broken?) server hands out two different delegations for the same
file, then we should return one of them.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:12 -05:00
Trond Myklebust
6f23e3872c NFS: Fix a potential race between umount and nfs_access_cache_shrinker()
Thanks to Yawei Niu for spotting the race.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:12 -05:00
Trond Myklebust
e6f8107595 NFS: Add an asynchronous delegreturn operation for use in nfs_clear_inode
Otherwise, there is a potential deadlock if the last dput() from an NFSv4
close() or other asynchronous operation leads to nfs_clear_inode calling
the synchronous delegreturn.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:12 -05:00
Benny Halevy
99fadcd764 nfs: convert NFS_*(inode) helpers to static inline
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:11 -05:00
Benny Halevy
3a10c30acc nfs: obliterate NFS_FLAGS macro
use NFS_I(inode)->flags instead

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:11 -05:00
Chuck Lever
fc6014771b NFS: Address memory leaks in the NFS client mount option parser
David Howells noticed that repeating the same mount option twice during an
NFS mount request can result in orphaned memory in certain cases.

Only the client_address and mount_server.hostname strings are initialized
in the mount parsing loop, so those appear to be the only two pointers that
might be written over by repeating a mount option.  The strings in the
nfs_server section of the nfs_parsed_mount_data structure are set only once
after the options are parsed, thus these are not susceptible to being
overwritten.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:11 -05:00
J. Bruce Fields
3d1c550874 nfs4: allow nfsv4 acls on non-regular-files
The rfc doesn't give any reason it shouldn't be possible to set an
attribute on a non-regular file.  And if the server supports it, then it
shouldn't be up to us to prevent it.

Thanks to Erez for the report and Trond for further analysis.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Tested-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:10 -05:00
Trond Myklebust
f3c391e89c NFS: Optimise away the sigmask code in aio/dio reads and writes
There are no interruptible waits for asynchronous RPC tasks, so we don't
need to wrap calls to rpc_run_task() with an
rpc_clnt_sigmask/rpc_clnt_unsigmask pair.

Instead we can wrap the wait_for_completion_interruptible() in
nfs_direct_wait(). This means that we completely optimise away sigmask
setting for the case of non-blocking aio/dio.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:10 -05:00
Trond Myklebust
34f5b4662b SUNRPC: Don't bother changing the sigmask for asynchronous RPC calls
The caller will never sleep in rpc_execute, so don't bother setting the
sigmask.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:10 -05:00
Chuck Lever
afc881124b SUNRPC: rpcb_getport_sync() passes incorrect address size to rpc_create()
The variable "sin" is a pointer, so sizeof(sin) is the size of a pointer,
not the size of the thing that sin points to.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:09 -05:00
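The sizeof pitfall fixed in afc881124b is easy to reproduce outside the kernel. The sketch below is illustrative only: struct addr is a hypothetical 28-byte record standing in for a socket address, and the two helper names are made up for the example.

```c
#include <stddef.h>

/* Hypothetical 28-byte address record standing in for a socket address. */
struct addr {
    unsigned char bytes[28];
};

/* Wrong: yields the size of the pointer itself (typically 4 or 8 bytes). */
static size_t len_of_pointer(const struct addr *sin)
{
    return sizeof(sin);
}

/* Right: yields the size of the object the pointer refers to. */
static size_t len_of_pointee(const struct addr *sin)
{
    return sizeof(*sin);
}
```

Passing the first value as an address length truncates the address, which is the kind of error the commit message describes.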
Chuck Lever
67d6021362 SUNRPC: Clean up block comment preceding rpcb_getport_sync()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:09 -05:00
Chuck Lever
f1ec08cb94 SUNRPC: Use appropriate argument types in rpcb client
Clean up: Follow recommendations of Chapter 5 of Documentation/CodingStyle
and use "u32" instead of "__u32" for types in definitions that are not
shared with user space.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:09 -05:00
Chuck Lever
b91e101fca SUNRPC: rpcb_getport_sync() should use built-in hostname generator
rpc_create() can already fill in the hostname with a string representation
of the server's IP address, so remove redundant logic in
rpcb_getport_sync() that does that.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:09 -05:00
Chuck Lever
33e01dc7f5 SUNRPC: Clean up functions that free address_strings array
Clean up: document the rule (kfree) and the exceptions
(RPC_DISPLAY_PROTO and RPC_DISPLAY_NETID) when freeing the objects in
a transport's address_strings array.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:08 -05:00
Chuck Lever
c0e07cb68d NFS: NFS version number is unsigned
RPC protocol version numbers are unsigned.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:08 -05:00
Trond Myklebust
65fdf7d264 NLM: Fix a bogus 'return' in nlmclnt_rpc_release
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:08 -05:00
Chuck Lever
883bb163f8 NLM: Introduce an arguments structure for nlmclnt_init()
Clean up: pass 5 arguments to nlmclnt_init() in a structure similar to the
new nfs_client_initdata structure.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2008-01-30 02:06:07 -05:00
Chuck Lever
1093a60ef3 NLM/NFS: Use cached nlm_host when calling nlmclnt_proc()
Now that each NFS mount point caches its own nlm_host structure, it can be
passed to nlmclnt_proc() for each lock request.  By pinning an nlm_host for
each mount point, we trade the overhead of looking up or creating a fresh
nlm_host struct during every NLM procedure call for a little extra memory.

We also restrict the nlmclnt_proc symbol to limit the use of this call to
in-tree modules.

Note that nlm_lookup_host() (just removed from the client's per-request
NLM processing) could also trigger an nlm_host garbage collection.  Now
client-side nlm_host garbage collection occurs only during NFS mount
processing.  Since the NFS client now holds a reference on these nlm_host
structures, they wouldn't have been affected by garbage collection
anyway.

Given that nlm_lookup_host() reorders the global nlm_host chain after
every successful lookup, and that a garbage collection could be triggered
during the call, we've removed a significant amount of per-NLM-request
CPU processing overhead.

Sidebar: there are only a few remaining references to the internals of
NFS inodes in the client-side NLM code.  The only references I found are
related to extracting or comparing the inode's file handle via NFS_FH().
One is in nlmclnt_grant(); the other is in nlmclnt_setlockargs().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:07 -05:00
Chuck Lever
9289e7f91a NFS: Invoke nlmclnt_init during NFS mount processing
Cache an appropriate nlm_host structure in the NFS client's mount point
metadata for later use.

Note that there is no need to set NFS_MOUNT_NONLM in the error case -- if
nfs_start_lockd() returns a non-zero value, its callers ensure that the
mount request fails outright.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:07 -05:00
Chuck Lever
52c4044d00 NLM: Introduce external nlm_host set-up and tear-down functions
We would like to remove the per-lock-operation nlm_lookup_host() call from
nlmclnt_proc().

The new architecture pins an nlm_host structure to each NFS client
superblock that has the "lock" mount option set.  The NFS client passes
in the pinned nlm_host structure during each call to nlmclnt_proc().  NFS
client unmount processing "puts" the nlm_host so it can be garbage-
collected later.

This patch introduces externally callable NLM functions that handle
mount-time nlm_host set up and tear-down.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:06 -05:00
Trond Myklebust
86d61d8638 SUNRPC: Fix up constant string declarations in struct rpcbind_args
...and eliminate an unnecessary cast.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:06 -05:00
Chuck Lever
b454ae9060 SUNRPC: fewer conditionals in the format_ip_address routines
Clean up: have the set up routines explicitly pass the strings to be used
for the transport name and NETID.  This removes a number of conditionals
and dependencies on rpc_xprt.prot, which is overloaded.

Tighten up type checking on the address_strings array while we're at it.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:04 -05:00
Chuck Lever
cab6fc1b77 lockd: Eliminate harmless mixed sign comparison in nlmdbg_cookie2a()
The cookie->len field is unsigned, so the loop index variable in
nlmdbg_cookie2a() should also be unsigned.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:02 -05:00
Chuck Lever
3d509e5454 NFS: nfs_write_end clean up
Clean up: commit 4899f9c8 added nfs_write_end(), which introduces a
conditional expression that returns an unsigned integer in one arm and
a signed integer in the other.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:02 -05:00
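The mixed-sign conditional that 3d509e5454 cleans up can be sketched as below. The function names are hypothetical, and the return type is deliberately widened to long long so that the conversion becomes visible; that widening is an assumption made for illustration and is not claimed to match nfs_write_end() itself.

```c
/*
 * Mixed arms: 'status' is int, 'copied' is unsigned int. The usual
 * arithmetic conversions give the whole expression type unsigned int,
 * so a negative status is converted to a large positive value before
 * it is returned.
 */
static long long result_mixed(int status, unsigned int copied)
{
    return status < 0 ? status : copied;
}

/* Split form: each arm is converted to the return type on its own. */
static long long result_split(int status, unsigned int copied)
{
    if (status < 0)
        return status;
    return copied;
}
```

Splitting the expression into two return statements keeps the negative error code intact regardless of the return type.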
Chuck Lever
bf4285e75c NFS: Fix minor mixed sign comparison in NFS client's write logic
Clean up: PAGE_CACHE_SIZE is unsigned, and nfs_pageio_init() takes a size_t.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:01 -05:00
Chuck Lever
d24aae41b4 NFS: Use size_t for storing name lengths
Clean up: always use the same type when handling buffer lengths.  As a
bonus, this prevents a mixed sign comparison in idmap_lookup_name.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:01 -05:00
Chuck Lever
a661b77fc1 NFS: Fix use of copy_to_user() in idmap_pipe_upcall
The idmap_pipe_upcall() function expects the copy_to_user() function to
return a negative error value if the call fails, but copy_to_user()
returns an unsigned long number of bytes that couldn't be copied.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:01 -05:00
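The return-value convention behind a661b77fc1 can be modelled in userspace. This is a sketch, not the kernel code: mock_copy_to_user() is a hypothetical stand-in that copies at most `limit` bytes and, like copy_to_user(), returns the number of bytes it could not copy; upcall_copy() is likewise a made-up name.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in: returns the number of bytes NOT copied. */
static unsigned long mock_copy_to_user(void *dst, const void *src,
                                       unsigned long n, unsigned long limit)
{
    unsigned long done = n < limit ? n : limit;

    memcpy(dst, src, done);
    return n - done;
}

/* Returns the byte count on success, or -EFAULT on a short copy. */
static long upcall_copy(void *dst, const void *src,
                        unsigned long n, unsigned long limit)
{
    unsigned long left = mock_copy_to_user(dst, src, n, limit);

    /* Testing 'left < 0' could never fire, because the value is unsigned. */
    if (left != 0)
        return -EFAULT;
    return (long)n;
}
```

The caller has to translate a non-zero remainder into an error code itself, since the copy routine never returns a negative value.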
Chuck Lever
369af0f116 NFS: Clean up fs/nfs/idmap.c
Clean up white space damage and use standard kernel coding conventions for
return statements.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:00 -05:00
Chuck Lever
7df089952f SUNRPC: Fix use of copy_to_user() in gss_pipe_upcall()
The gss_pipe_upcall() function expects the copy_to_user() function to
return a negative error value if the call fails, but copy_to_user()
returns an unsigned long number of bytes that couldn't be copied.

Can rpc_pipefs actually retry a partially completed upcall read?  If
not, then gss_pipe_upcall() should punt any partial read, just like the
upcall logic in net/sunrpc/cache.c.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:00 -05:00
Trond Myklebust
59dca3b28c NFS: Fix the 'proto=' mount option
Currently, if you have a server mounted using one networking protocol, you
cannot specify a different value using the 'proto=' option on another
mountpoint.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:00 -05:00