Commit graph

19415 commits

Author SHA1 Message Date
Eric Sesterhenn
c5e3d98c56 [PATCH] alpha show_interrupts() trashes argument
This is a bug found by cpminer.  The show_interrupts function reuses i as a
for loop counter, and therefore trashes its contents, which are needed
later.

(akpm: rename local `i' to `irq', use for_each_online_cpu())
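
The failure mode described above can be reproduced in a few lines. This is a simplified sketch, not the alpha code itself: `DEMO_NR_CPUS` and both function names are hypothetical, and the header loop only stands in for printing the per-CPU columns.

```c
#include <assert.h>

#define DEMO_NR_CPUS 4 /* hypothetical CPU count for this sketch */

/* Buggy shape: the argument `i` doubles as the CPU loop counter. */
static int demo_irq_after_buggy_header(int i)
{
    assert(i >= 0);
    for (i = 0; i < DEMO_NR_CPUS; i++) {
        /* a real version would print one "CPUn" column here */
    }
    return i; /* always DEMO_NR_CPUS, whatever the caller passed */
}

/* Fixed shape: the argument keeps its own name, the loop gets its own. */
static int demo_irq_after_fixed_header(int irq)
{
    int cpu;

    assert(irq >= 0);
    for (cpu = 0; cpu < DEMO_NR_CPUS; cpu++) {
        /* print one "CPUn" column */
    }
    return irq; /* still the value the caller asked about */
}
```

Giving the loop its own variable is the whole fix; the rename to `irq` mainly makes a later reuse harder to write by accident.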

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:21 -08:00
Eric W. Biederman
9a5e733990 [PATCH] alpha: Fix getxpid on alpha so it works for threads
While looking in the code I discovered that alpha has fallen behind because
it doesn't use sys_getppid.  The problem is that it doesn't follow the task
struct to the task_group_leader.

Acked-by: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:21 -08:00
Mark Lord
975b3d3d5b [PATCH] VMSPLIT config options
Enable selection of different user/kernel VM splits for i386, including an
optimized mode for 1GB physical RAM, which gives the kernel a direct (non
HIGHMEM) mapping to the entire 1GB rather than just the first 896MB.

There is also a similarly optimized mode for machines with exactly 2GB
of physical RAM.

This can speed up the kernel by avoiding having to create/destroy temporary
HIGHMEM mappings, and by not having to include HIGHMEM support at all on such
machines.  The flip side is that there's less virtual addressing left for
userspace in these alternatives, and some binary-only kernel modules may
misbehave unless rebuilt with the same VMSPLIT option as the main kernel
image.
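
The 896MB figure falls out of simple arithmetic, sketched here. The split addresses used in the usage note (0xC0000000, 0xB0000000, 0x78000000) and the 128MB vmalloc reserve are assumptions recalled from the i386 configuration of that era, not values stated in this message, and `direct_map_mb()` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MB              (1024ULL * 1024ULL)
#define DEMO_ADDRESS_SPACE   (4096ULL * DEMO_MB) /* 32-bit virtual address space */
#define DEMO_VMALLOC_RESERVE (128ULL * DEMO_MB)  /* assumed default reserve */

/* Megabytes of RAM the kernel can map directly for a given split point. */
static unsigned long long direct_map_mb(uint32_t page_offset)
{
    unsigned long long kernel_space = DEMO_ADDRESS_SPACE - page_offset;

    assert(kernel_space > DEMO_VMALLOC_RESERVE);
    return (kernel_space - DEMO_VMALLOC_RESERVE) / DEMO_MB;
}
```

Under those assumptions, `direct_map_mb(0xC0000000u)` gives 896, a split at 0xB0000000 gives 1152 (enough to cover 1GB directly), and a split at 0x78000000 gives exactly 2048.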

Original idea/patch from Jens Axboe, modified based on suggestions from Linus
et al.

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:21 -08:00
Alexey Dobriyan
4940fb4412 [PATCH] arch/sh64/kernel/time.c: add module.h
It uses EXPORT_SYMBOL.

arch/sh64/kernel/time.c:254: warning: type defaults to `int' in declaration of `EXPORT_SYMBOL'
arch/sh64/kernel/time.c:254: warning: parameter names (without types) in function declaration
arch/sh64/kernel/time.c:254: warning: data definition has no type or storage class

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:20 -08:00
Paul Mundt
87f55e67dc [PATCH] sh/sh64: Fix bogus TIOCGICOUNT definitions
As reported by Russell King, sh and sh64 currently have bogus definitions for
TIOCGICOUNT, particularly referencing a kernel only structure.  Switch to
using a sensible ioctl value.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:20 -08:00
Paul Mundt
a3310bbd3a [PATCH] sh: machine_halt()/machine_power_off() cleanups
machine_halt() managed to trigger the soft lockup detection due to not
disabling interrupts before going to sleep, so correct that.

machine_power_off() should be using pm_power_off, which lets us drop the
board-specific hacks from here.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:20 -08:00
Paul Mundt
6c80a1f888 [PATCH] sh: Add missing timers directory rule to build
This should have been part of the timer framework support that was merged
earlier, but looks to have been accidentally omitted.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:20 -08:00
Paul Mundt
b7a76e4b4e [PATCH] sh: sh-sci clock framework updates
A couple of updates for the sh-sci serial driver:

	- Update for clock framework on sh.
	- Fix a compile error introduced by some h8300 changes.
	- Add SH7770/SH7780 subtype support.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:20 -08:00
Paul Mundt
37cc794378 [PATCH] sh: convert voyagergx to platform device, drop sh-bus
Trivial patch updating the voyagergx cchip code to reference a platform device
instead, now that the dma mask is taken care of.  Given this, there's no
longer any reason to drag around the SH-bus code, so kill that off entirely.

Signed-off-by: Manuel Lauss <mano@roarinelk.homelinux.net>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:20 -08:00
Paul Mundt
8d27e08191 [PATCH] sh: drop maskpos from make_ipr_irq(), remove duplicate irq definitions
Clean up some of the subtype IRQ definitions for IPR IRQ, and consolidate the
make_ipr_irq() definitions by dropping maskpos.  SH-4A was the only thing
interested in the maskpos, and this should be handled through INTC2 rather
than IPR.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:20 -08:00
Paul Mundt
50373c1b7e [PATCH] sh: unknown mach-type updates
Trivial cleanup of the unknown machine type for some of the recent machvec
changes.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:19 -08:00
Paul Mundt
de02797aa7 [PATCH] sh: Cleanup struct sh_cpuinfo for clock framework changes
Now that the clock framework changes have been integrated, the manual clock
accounting that was done in sh_cpuinfo can be dropped.

Also correct a bug with running past the end of the CPU flags when there's a
mismatch between the added flags and printed ones.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:19 -08:00
Paul Mundt
091904ae5f [PATCH] sh: Move TRA/EXPEVT/INTEVT definitions for reuse
Currently entry.S is home to these definitions, so we move them somewhere more
sensible.  IPR IRQ handling depends on being able to read from INTEVT.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:19 -08:00
Paul Mundt
134ed1420e [PATCH] sh: Make peripheral clock frequency setting mandatory
Pretty much every subtype does this now anyways, and as we depend on it in a
few places being set to something sensible quite early on, it's better for a
new subtype to simply set a sensible default.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:19 -08:00
Paul Mundt
740172947b [PATCH] sh: SH4-202 microdev updates
A few trivial updates for the microdev board support code:

	- Update for __IO_PREFIX changes.
	- Consolidate headers into a single microdev.h.
	- Update the microdev_defconfig.
	- Add init values for the S1D13806 used by s1d13xxxfb.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:19 -08:00
Heiko Carstens
4a41cdf978 [PATCH] powerpc: Fix sigmask handling in sys_sigsuspend.
Better save the sigmask instead of throwing it away so it can be restored.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:19 -08:00
Olaf Hering
e61997881e [PATCH] MODALIAS= for macio
Provide a MODALIAS= environment variable for devices on the mac-io bus.
Change the buffer length counter to not waste memory by advancing the
pointer for the next string too far.  Tested on an ibook1 with modular
pmac_zilog.

Signed-off-by: Olaf Hering <olh@suse.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:19 -08:00
Albert Herranz
39931e41be [PATCH] powerpc: fix for kexec ppc32
- kexec.h is included from assembly code, thus C code must be properly
  protected.

- (embedded) ppc32 systems use machine_kexec_simple whose declaration
  vanished during a recent powerpc merge change.
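
The first point is the usual guard for headers shared between C and assembly. A minimal sketch follows; only `__ASSEMBLY__` is the kernel's convention, while the `demo_` names are hypothetical and do not come from kexec.h.

```c
/* Plain constants are safe for both C and assembly includers. */
#define DEMO_CONTROL_CODE_SIZE 4096

#ifndef __ASSEMBLY__
/* C-only section: an assembler would reject everything in here. */
#include <assert.h>

struct demo_kimage {
    unsigned long start;
};

/* Number of control pages needed to hold `bytes` bytes. */
static inline unsigned long demo_control_pages(unsigned long bytes)
{
    assert(bytes > 0);
    return (bytes + DEMO_CONTROL_CODE_SIZE - 1) / DEMO_CONTROL_CODE_SIZE;
}
#endif /* __ASSEMBLY__ */
```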

Signed-off-by: Albert Herranz <albert_herranz@yahoo.es>
Cc: <fastboot@osdl.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:19 -08:00
Stephen Smalley
9ac49d2213 [PATCH] selinux: remove security struct magic number fields and tests
Remove the SELinux security structure magic number fields and tests, along
with some unnecessary tests for NULL security pointers.  These fields and
tests are leftovers from the early attempts to support SELinux as a
loadable module during LSM development.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:19 -08:00
Stephen Smalley
26d2a4be6a [PATCH] selinux: change file_alloc_security to use GFP_KERNEL
This patch changes the SELinux file_alloc_security function to use
GFP_KERNEL rather than GFP_ATOMIC; the use of GFP_ATOMIC appears to be a
remnant of when this function was being called with the files_lock spinlock
held, and is no longer necessary.  Please apply.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:18 -08:00
Stephen Smalley
db4c9641de [PATCH] selinux: fix and cleanup mprotect checks
Fix the SELinux mprotect checks on executable mappings so that they are not
re-applied when the mapping is already executable as well as cleaning up
the code.  This avoids a situation where e.g.  an application is prevented
from removing PROT_WRITE on an already executable mapping previously
authorized via execmem permission due to an execmod denial.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:18 -08:00
Randy Dunlap
ee13d785ea [PATCH] slab: fix sparse warning
mm/slab.c:1522:13: error: incompatible types for operation (&)

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:18 -08:00
Randy.Dunlap
a70773ddb9 [PATCH] mm/slab: add kernel-doc for one function
Fix kernel-doc for calculate_slab_order().

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:18 -08:00
Pekka Enberg
7fd6b14130 [PATCH] slab: fix kzalloc and kstrdup caller report for CONFIG_DEBUG_SLAB
Fix kzalloc() and kstrdup() caller report for CONFIG_DEBUG_SLAB.  We must
pass the caller to __cache_alloc() instead of directly doing
__builtin_return_address(0) there; otherwise kzalloc() and kstrdup() are
reported as the allocation site instead of the real one.
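
A user-space sketch of the idea, assuming GCC or Clang for `__builtin_return_address()`. The `demo_` functions are hypothetical stand-ins, and `malloc()` replaces the slab allocator: each wrapper captures its own caller and hands it down, so the inner function never has to guess.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Last allocation site recorded by the debug bookkeeping. */
static void *demo_last_caller;

/* Inner allocator: records the caller it is given, not its own. */
static void *demo_cache_alloc(size_t size, void *caller)
{
    assert(caller != NULL);
    demo_last_caller = caller;
    return malloc(size);
}

/* Each wrapper passes down the address it will return to. */
__attribute__((noinline)) void *demo_alloc(size_t size)
{
    return demo_cache_alloc(size, __builtin_return_address(0));
}

__attribute__((noinline)) void *demo_zalloc(size_t size)
{
    void *p = demo_cache_alloc(size, __builtin_return_address(0));

    if (p)
        memset(p, 0, size);
    return p;
}
```

Had `demo_cache_alloc()` called `__builtin_return_address(0)` itself, every zeroed allocation would be attributed to `demo_zalloc()` rather than to the code that called it.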

Thanks to Valdis Kletnieks for reporting the problem and Steven Rostedt for
the original idea.

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:18 -08:00
Andrew Morton
b958f7d9f3 [PATCH] dump_stack() in oom handler
Sometimes it's nice to know who's calling.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:18 -08:00
Pekka Enberg
343e0d7a93 [PATCH] slab: replace kmem_cache_t with struct kmem_cache
Replace uses of kmem_cache_t with proper struct kmem_cache in mm/slab.c.

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:18 -08:00
Pekka Enberg
9a2dba4b49 [PATCH] slab: rename ac_data to cpu_cache_get
Rename the ac_data() function to more descriptive cpu_cache_get().

Acked-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:18 -08:00
Pekka Enberg
6ed5eb2211 [PATCH] slab: extract virt_to_{cache|slab}
Introduce virt_to_cache() and virt_to_slab() functions to reduce duplicate
code and introduce a proper abstraction should we want to support other kinds
of mapping for address to slab and cache (e.g.  for vmalloc() or I/O memory).

Acked-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:18 -08:00
Pekka Enberg
5295a74cc0 [PATCH] slab: reduce inlining
From: Manfred Spraul <manfred@colorfullife.com>

Reduce the number of inline functions in slab to the functions that
are used in the hot path:

  - no inline for debug functions
  - no __always_inline, inline is already __always_inline
  - remove inline from a few numa support functions.

Before:

   text    data     bss     dec     hex filename
  13588     752      48   14388    3834 mm/slab.o (defconfig)
  16671    2492      48   19211    4b0b mm/slab.o (numa)

After:

   text    data     bss     dec     hex filename
  13366     752      48   14166    3756 mm/slab.o (defconfig)
  16230    2492      48   18770    4952 mm/slab.o (numa)

Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:17 -08:00
Matthew Dobson
78d382d77c [PATCH] slab: extract slab_{put|get}_obj
Create two helper functions slab_get_obj() and slab_put_obj() to replace
duplicated code in mm/slab.c

Signed-off-by: Matthew Dobson <colpatch@us.ibm.com>
Acked-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:17 -08:00
Matthew Dobson
12dd36faec [PATCH] slab: extract slab_destroy_objs()
Create a helper function, slab_destroy_objs(), which is called from
slab_destroy().  This makes slab_destroy() smaller and more readable, and
moves ifdefs outside the function body.

Signed-off-by: Matthew Dobson <colpatch@us.ibm.com>
Acked-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:17 -08:00
Steven Rostedt
fbaccacff1 [PATCH] slab: cache_estimate cleanup
Clean up cache_estimate() in mm/slab.c and improve the algorithm from O(n) to
O(1).  We first calculate the maximum number of objects a slab can hold after
struct slab and kmem_bufctl_t for each object has been given enough space.
After that, to respect alignment rules, we decrease the number of objects if
necessary.  As required padding is at most align-1 and memory of obj_size is
at least align, it is always enough to decrease number of objects by one.
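
The calculation described above can be sketched as follows for the on-slab case. This is an illustration under stated assumptions, not the code from mm/slab.c: `hdr_size` stands in for sizeof(struct slab), `idx_size` for sizeof(kmem_bufctl_t), and all `demo_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct demo_estimate {
    size_t num;       /* objects that fit in the slab */
    size_t left_over; /* bytes wasted at the end */
};

/* Round x up to the next multiple of a. */
static size_t demo_align_up(size_t x, size_t a)
{
    return (x + a - 1) / a * a;
}

static struct demo_estimate demo_cache_estimate(size_t slab_size, size_t hdr_size,
                                                size_t idx_size, size_t obj_size,
                                                size_t align)
{
    struct demo_estimate e;
    size_t mgmt;

    assert(align > 0 && obj_size >= align && slab_size > hdr_size);

    /* Upper bound, ignoring alignment padding after the management area. */
    e.num = (slab_size - hdr_size) / (obj_size + idx_size);
    mgmt = demo_align_up(hdr_size + e.num * idx_size, align);

    /* Padding is below align <= obj_size, so dropping one object suffices. */
    if (e.num > 0 && mgmt + e.num * obj_size > slab_size) {
        e.num--;
        mgmt = demo_align_up(hdr_size + e.num * idx_size, align);
    }

    assert(mgmt + e.num * obj_size <= slab_size);
    e.left_over = slab_size - mgmt - e.num * obj_size;
    return e;
}
```

With a 4096-byte slab, a 32-byte header, 4-byte indices and 64-byte objects aligned to 64, the first guess of 59 objects already fits. With 123-byte objects the first guess of 32 overshoots once padding is added, and the single decrement to 31 is enough.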

The optimization was originally made by Balbir Singh with more improvements
from Steven Rostedt.  Manfred Spraul provided further modifications: no loop
at all for the off-slab case and added comments to explain the background.

Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:17 -08:00
Steven Rostedt
5ec8a847bb [PATCH] slab: have index_of bug at compile time
I noticed the code for index_of is a creative way of finding the cache
index using the compiler to optimize to a single hard coded number.  But
I couldn't help noticing that it uses two methods to let you know that
someone used it wrong.  One is at compile time (the correct way), and
the other is at run time (not good).
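
One way to get a compile-time-only failure in plain C is the negative-array-size trick, sketched here. It is a swapped-in technique for illustration and not necessarily the mechanism index_of uses; the size classes and `DEMO_` names are hypothetical, and the macro is only meant for constant arguments.

```c
#include <assert.h>

/* Hypothetical size classes: 32, 64 and 128 bytes; -1 means "no class". */
#define DEMO_CLASS_OF(sz) \
    ((sz) <= 32 ? 0 : (sz) <= 64 ? 1 : (sz) <= 128 ? 2 : -1)

/*
 * For a constant `sz` with no class, the array type below has size -1,
 * which the compiler rejects, so misuse never reaches run time.
 * For a valid `sz` the sizeof term contributes 0 and folds away.
 */
#define DEMO_INDEX_OF(sz) \
    ((int)(0 * sizeof(char[1 - 2 * (DEMO_CLASS_OF(sz) < 0)])) + DEMO_CLASS_OF(sz))

/* Example use: the whole expression folds to a single constant. */
static int demo_index_for_48_bytes(void)
{
    int idx = DEMO_INDEX_OF(48);

    assert(idx == 1);
    return idx;
}
```

Writing `DEMO_INDEX_OF(4096)` fails to compile instead of needing a run-time check.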

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:17 -08:00
Christoph Lameter
18f820f655 [PATCH] slab: minor cleanup to kmem_cache_alloc_node
Clean up kmem_cache_alloc_node a bit.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Acked-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:17 -08:00
Manfred Spraul
3dafccf227 [PATCH] slab: distinguish between object and buffer size
An object cache has two different object lengths:

  - the amount of memory available for the user (object size)
  - the amount of memory allocated internally (buffer size)

This patch does some renames to make the code reflect that better.

Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:17 -08:00
Christoph Lameter
e965f9630c [PATCH] Direct Migration V9: Avoid writeback / page_migrate() method
Migrate a page with buffers without requiring writeback

This introduces a new address space operation migratepage() that may be used
by a filesystem to implement its own version of page migration.

A version is provided that migrates buffers attached to pages.  Some
filesystems (ext2, ext3, xfs) are modified to utilize this feature.

The swapper address space operations are modified so that a regular
migrate_page() will occur for anonymous pages without writeback (migrate_pages
forces every anonymous page to have a swap entry).

Signed-off-by: Mike Kravetz <kravetz@us.ibm.com>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:17 -08:00
Christoph Lameter
7e2ab150d1 [PATCH] Direct Migration V9: upgrade MPOL_MF_MOVE and sys_migrate_pages()
Modify policy layer to support direct page migration

- Add migrate_pages_to() allowing the migration of a list of pages to a
  specified node or to vma with a specific allocation policy in sets of
  MIGRATE_CHUNK_SIZE pages

- Modify do_migrate_pages() to do a staged move of pages from the source
  nodes to the target nodes.

Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:16 -08:00
Christoph Lameter
a3351e525e [PATCH] Direct Migration V9: remove_from_swap() to remove swap ptes
Add remove_from_swap

remove_from_swap() allows the restoration of the pte entries that existed
before page migration occurred for anonymous pages by walking the reverse
maps.  This reduces swap use and establishes regular pte's without the need
for page faults.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:16 -08:00
Christoph Lameter
a48d07afdf [PATCH] Direct Migration V9: migrate_pages() extension
Add direct migration support with fall back to swap.

Direct migration support on top of the swap based page migration facility.

This allows the direct migration of anonymous pages and the migration of file
backed pages by dropping the associated buffers (requires writeout).

Fall back to swap out if necessary.

The patch is based on lots of patches from the hotplug project but the code
was restructured, documented and simplified as much as possible.

Note that an additional patch that defines the migrate_page() method for
filesystems is necessary in order to avoid writeback for anonymous and file
backed pages.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Mike Kravetz <kravetz@us.ibm.com>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:16 -08:00
Christoph Lameter
b16664e44c [PATCH] Direct Migration V9: PageSwapCache checks
Check for PageSwapCache after looking up and locking a swap page.

The page migration code may change a swap pte to point to a different page
under lock_page().

If that happens then the vm must retry the lookup operation in the swap space
to find the correct page number.  There are a couple of locations in the VM
where a lock_page() is done on a swap page.  In these locations we need to
check afterwards if the page was migrated.  If the page was migrated then the
old page that was looked up before was freed and no longer has the
PageSwapCache bit set.
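
The lookup, lock, recheck, retry pattern can be sketched in user space. Everything here is a hypothetical stand-in: `demo_page` replaces struct page, a plain flag replaces lock_page(), and `demo_maybe_migrate()` simulates a migration that lands in the window between lookup and lock.

```c
#include <assert.h>
#include <stdbool.h>

struct demo_page {
    bool locked;
    bool in_swap_cache; /* stand-in for the PageSwapCache bit */
};

static struct demo_page demo_old = { false, true };
static struct demo_page demo_new = { false, false };
static struct demo_page *demo_slot = &demo_old; /* one-entry "swap space" */
static int demo_migrations_left = 1;

/* Simulated migration racing with the caller: move the slot to a new page. */
static void demo_maybe_migrate(void)
{
    if (demo_migrations_left > 0) {
        demo_migrations_left--;
        demo_old.in_swap_cache = false; /* old page is freed */
        demo_new.in_swap_cache = true;
        demo_slot = &demo_new;
    }
}

/* Look up and lock the page for the slot, retrying if it was migrated. */
static struct demo_page *demo_lookup_locked(void)
{
    for (;;) {
        struct demo_page *page = demo_slot; /* lookup */

        demo_maybe_migrate();               /* race window before the lock */
        assert(!page->locked);
        page->locked = true;                /* lock_page() stand-in */
        if (page->in_swap_cache)
            return page;                    /* still the right page */
        page->locked = false;               /* stale: unlock and retry */
    }
}
```

The first pass locks the old page, finds the bit cleared, and unlocks it; the second pass finds and returns the new page.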

Signed-off-by: Hirokazu Takahashi <taka@valinux.co.jp>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:16 -08:00
Christoph Lameter
2a16e3f4b0 [PATCH] Reclaim slab during zone reclaim
If large amounts of zone memory are used by empty slabs then zone_reclaim
becomes ineffective.  This patch shakes the slab a bit.

The problem with this patch is that the slab reclaim is not containable to a
zone.  Thus slab reclaim may affect the whole system and be extremely slow.
This also means that we cannot determine how many pages were freed in this
zone.  Thus we need to go off node for at least one allocation.

The functionality is disabled by default.

We could modify the shrinkers to take a zone parameter but that would be quite
invasive.  Better ideas are welcome.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:16 -08:00
Christoph Lameter
1b2ffb7896 [PATCH] Zone reclaim: Allow modification of zone reclaim behavior
In some situations one may want zone_reclaim to behave differently.  For
example a process writing large amounts of memory will spew onto other nodes
to cache the writes if many pages in a zone become dirty.  This may impact the
performance of processes running on other nodes.

Allowing writes during reclaim puts a stop to that behavior and throttles the
process by restricting the pages to the local zone.

Similarly one may want to contain processes to local memory by enabling
regular swap behavior during zone_reclaim.  Off node memory allocation can
then be controlled through memory policies and cpusets.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:16 -08:00
Christoph Lameter
2a11ff06d7 [PATCH] zone_reclaim: configurable off node allocation period.
Currently the zone_reclaim code has a fixed window of 30 seconds of off node
allocations should a local zone have no unused pagecache pages left.  Reclaim
will be attempted again after this timeout period to avoid repeated useless
scans for memory.  This is also useful to establish sufficiently large off
node allocation chunks to relieve the local node.

It may be beneficial to adjust that time period for some special situations.
For example if memory use was exceeding node capacity one may want to give up
for longer periods of time.  If memory spikes intermittently then one may want
to shorten the time period to reduce the number of off node allocations.

This patch allows just that....
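
A minimal sketch of the back-off check the message describes, with times in seconds. The struct, the field names and `demo_reclaim_interval` are hypothetical; only the 30-second default comes from the text above.

```c
#include <assert.h>
#include <stdbool.h>

struct demo_zone {
    unsigned long last_failed_reclaim; /* time of the last useless scan */
    bool has_failed;                   /* false until a scan comes up empty */
};

/* Hypothetical tunable; 30 matches the previously fixed window. */
static unsigned long demo_reclaim_interval = 30;

/* Should reclaim be attempted now, or should we allocate off node? */
static bool demo_should_try_reclaim(const struct demo_zone *z, unsigned long now)
{
    assert(now >= z->last_failed_reclaim);
    if (!z->has_failed)
        return true;
    return now - z->last_failed_reclaim >= demo_reclaim_interval;
}
```

Raising the interval keeps a saturated node allocating off node for longer; lowering it makes the zone retry reclaim sooner after a short spike.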

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:16 -08:00
Christoph Lameter
a92f71263a [PATCH] zone_reclaim: partial scans instead of full scan
Instead of scanning all the pages in a zone, imitate real swap and scan
only a portion of the pages and gradually scan more if we do not free up
enough pages.  This avoids a zone suddenly losing all unused pagecache
pages (we may after all access some of these again so they deserve another
chance) but it still frees up large chunks of memory if a zone only
contains unused pagecache pages.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:16 -08:00
Christoph Lameter
aa3f18b339 [PATCH] zone_reclaim: do not unmap file backed pages
zone_reclaim should leave that to the real swapper.  We are only interested
in evicting unmapped pages.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:15 -08:00
Hugh Dickins
4e6a510a74 [PATCH] mm: hugepage accounting fix
2.6.15's hugepage faulting introduced huge_pages_needed accounting into
hugetlbfs: to count how many pages are already in cache, for spot check on
how far a new mapping may be allowed to extend the file.  But it's muddled:
each hugepage found covers HPAGE_SIZE, not PAGE_SIZE.  Once pages were
already in cache, it would overshoot, wrap its hugepages count backwards,
and so fail a harmless repeat mapping with -ENOMEM.  Fixes the problem
found by Don Dupuis.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Acked-by: Adam Litke <agl@us.ibm.com>
Acked-by: William Irwin <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:15 -08:00
Benjamin LaHaise
9884fd8df1 [PATCH] Use 32 bit division in slab_put_obj()
Improve the performance of slab_put_obj().  Without the cast, gcc considers
ptrdiff_t a 64 bit signed integer and ends up emitting code to use a full
signed 128 bit divide on EM64T, which is substantially slower than a 32 bit
unsigned divide.
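
A sketch of the cast in question. The function and parameter names are hypothetical, and the sketch assumes a slab far smaller than 4GB so the offset always fits in 32 bits.

```c
#include <assert.h>
#include <stddef.h>

/* Index of the object at `objp` within a slab that starts at `slab_base`. */
static unsigned int demo_obj_index(const char *slab_base, const char *objp,
                                   unsigned int buffer_size)
{
    ptrdiff_t offset = objp - slab_base;

    assert(buffer_size > 0 && offset >= 0);
    /* Narrowing the dividend lets the compiler emit a 32-bit unsigned divide. */
    return (unsigned int)offset / buffer_size;
}
```

Without the cast the division is carried out on the signed pointer difference, which is the wider and slower operation the message refers to.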

I noticed this when looking at the profile of a case where the slab balance
is just on edge and thrashes back and forth freeing a block.

Signed-off-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:15 -08:00
Christoph Lameter
c84db23c6e [PATCH] zone_reclaim: minor fixes
- If we only reclaim nr_pages then it's okay to stay on node.
  Switch from > to >= for the comparison.

- vm_table[] entry for zone_reclaim_mode is a bit screwed up.

- Add empty lines around shrink_zone to show that this is the
  central function to be called.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:15 -08:00
Christoph Lameter
52a8363eae [PATCH] mm: improve function of sc->may_writepage
Make sc->may_writepage control the writeout behavior of shrink_list.

Remove the laptop_mode trick from shrink_list and instead set may_writepage
in try_to_free_pages properly.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:15 -08:00
Andy Whitcroft
ce2ea89ba1 [PATCH] GFP_ZONETYPES: calculate from GFP_ZONEMASK
GFP_ZONETYPES calculate from GFP_ZONEMASK

GFP_ZONETYPES's value is directly related to the value of GFP_ZONEMASK.  It
takes one of two forms depending whether the top bit of GFP_ZONEMASK is a
'loner'.  Supply both forms, enabling the loner.
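
The two forms can be written out as follows. The formulas are my recollection of how the relationship is usually stated and should be treated as an assumption rather than a quotation of the patch; the function names are hypothetical.

```c
#include <assert.h>

/* Plain form: every value of the mask is a valid zone type index. */
static unsigned int demo_zonetypes_plain(unsigned int zonemask)
{
    assert(((zonemask + 1) & zonemask) == 0); /* mask must be 2^n - 1 */
    return zonemask + 1;
}

/*
 * Loner form: the top bit is only valid on its own, so the lower bits give
 * (mask + 1) / 2 combinations and the top bit adds exactly one more.
 */
static unsigned int demo_zonetypes_loner(unsigned int zonemask)
{
    assert(((zonemask + 1) & zonemask) == 0); /* mask must be 2^n - 1 */
    return (zonemask + 1) / 2 + 1;
}
```

For a mask of 0x07 the plain form yields 8 zone types and the loner form yields 5.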

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:15 -08:00