This is a bug found by cpminer. The show_interrupts function reuses i as a
for loop counter, and therefore trashes its contents, which are needed
later.
(akpm: rename local `i' to `irq', use for_each_inline_cpu())
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
While looking in the code I discovered that alpha has fallen behind because
it doesn't use sys_getppid. The problem is that it doesn't follow the task
struct to the task_group_leader.
Acked-by: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Enable selection of different user/kernel VM splits for i386, including an
optimized mode for 1GB physical RAM, which gives the kernel a direct (non
HIGHMEM) mapping to the entire 1GB rather than just the first 896MB.
There is a similarly a similarly optimized mode for machines with exactly 2GB
of physical RAM.
This can speed up the kernel by avoiding having to create/destroy temporary
HIGHMEM mappings, and by not having to include HIGHMEM support at all on such
machines. The flip side is that there's less virtual addressing left for
userspace in these alternatives, and some binary-only kernel modules may
misbehave unless rebuilt with the same VMSPLIT option as the main kernel
image.
Original idea/patch from Jens Axboe, modified based on suggestions from Linus
et al.
Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It uses EXPORT_SYMBOL.
arch/sh64/kernel/time.c:254: warning: type defaults to `int' in declaration of `EXPORT_SYMBOL'
arch/sh64/kernel/time.c:254: warning: parameter names (without types) in function declaration
arch/sh64/kernel/time.c:254: warning: data definition has no type or storage class
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
As reported by Russell King, sh and sh64 currently have bogus definitions for
TIOCGICOUNT, particularly referencing a kernel only structure. Switch to
using a sensible ioctl value.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
machine_halt() managed to trigger the soft lockup detection due to not
disabling interrupts before going to sleep, so correct that.
machine_power_off() should be using pm_power_off, which lets us drop the
board-specific hacks from here.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This should have been part of the timer framework support that was merged
earlier, but looks to have been accidentally omitted.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
A couple of updates for the sh-sci serial driver:
- Update for clock framework on sh.
- Fix a compile error introduced by some h8300 changes.
- Add SH7770/SH7780 subtype support.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Trivial patch updating the voyagergx cchip code to reference a platform device
instead, now that the dma mask is taken care of. Given this, there's no
longer any reason to drag around the SH-bus code, so kill that off entirely.
Signed-off-by: Manuel Lauss <mano@roarinelk.homelinux.net>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Clean up some of the subtype IRQ definitions for IPR IRQ, and consolidate the
make_ipr_irq() definitions by dropping maskpos. SH-4A was the only thing
interested in the maskpos, and this should be handled through INTC2 rather
than IPR.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Trivial cleanup of the unknown machine type for some of the recent machvec
changes.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Now that the clock framework changes have been integrated, the manual clock
accounting that was done in sh_cpuinfo can be dropped.
Also correct a bug with running past the end of the CPU flags when there's a
mismatch between the added flags and printed ones.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Currently entry.S is home to these definitions, so we move them somewhere more
sensible. IPR IRQ handling depends on being to read from INTEVT.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Pretty much every subtype does this now anyways, and as we depend on it in a
few places being set to something sensible quite early on, it's better for a
new subtype to simply set a sensible default.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
A few trivial updates for the microdev board support code:
- Update for __IO_PREFIX changes.
- Consolidate headers into a single microdev.h.
- Update the microdev_defconfig.
- Add init values for the S1D13806 used by s1d13xxxfb.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Better save the sigmask instead of throwing it away so it can be restored.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Prodive a MODALIAS= enviroment variable for devices on the mac-io bus.
Change the buffer length counter to not waste memory by advancing the
pointer for the next string too far. Tested on an ibook1 with modular
pmac_zilog.
Signed-off-by: Olaf Hering <olh@suse.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- kexec.h is included from assembly code, thus C code must be properly
protected.
- (embedded) ppc32 systems use machine_kexec_simple whose declaration
vanished during a recent powerpc merge change.
Signed-off-by: Albert Herranz <albert_herranz@yahoo.es>
Cc: <fastboot@osdl.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove the SELinux security structure magic number fields and tests, along
with some unnecessary tests for NULL security pointers. These fields and
tests are leftovers from the early attempts to support SELinux as a
loadable module during LSM development.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch changes the SELinux file_alloc_security function to use
GFP_KERNEL rather than GFP_ATOMIC; the use of GFP_ATOMIC appears to be a
remnant of when this function was being called with the files_lock spinlock
held, and is no longer necessary. Please apply.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix the SELinux mprotect checks on executable mappings so that they are not
re-applied when the mapping is already executable as well as cleaning up
the code. This avoids a situation where e.g. an application is prevented
from removing PROT_WRITE on an already executable mapping previously
authorized via execmem permission due to an execmod denial.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix kzalloc() and kstrdup() caller report for CONFIG_DEBUG_SLAB. We must
pass the caller to __cache_alloc() instead of directly doing
__builtin_return_address(0) there; otherwise kzalloc() and kstrdup() are
reported as the allocation site instead of the real one.
Thanks to Valdis Kletnieks for reporting the problem and Steven Rostedt for
the original idea.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Replace uses of kmem_cache_t with proper struct kmem_cache in mm/slab.c.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Rename the ac_data() function to more descriptive cpu_cache_get().
Acked-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Introduce virt_to_cache() and virt_to_slab() functions to reduce duplicate
code and introduce a proper abstraction should we want to support other kind
of mapping for address to slab and cache (eg. for vmalloc() or I/O memory).
Acked-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
From: Manfred Spraul <manfred@colorfullife.com>
Reduce the amount of inline functions in slab to the functions that
are used in the hot path:
- no inline for debug functions
- no __always_inline, inline is already __always_inline
- remove inline from a few numa support functions.
Before:
text data bss dec hex filename
13588 752 48 14388 3834 mm/slab.o (defconfig)
16671 2492 48 19211 4b0b mm/slab.o (numa)
After:
text data bss dec hex filename
13366 752 48 14166 3756 mm/slab.o (defconfig)
16230 2492 48 18770 4952 mm/slab.o (numa)
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Create two helper functions slab_get_obj() and slab_put_obj() to replace
duplicated code in mm/slab.c
Signed-off-by: Matthew Dobson <colpatch@us.ibm.com>
Acked-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Create a helper function, slab_destroy_objs() which called from
slab_destroy(). This makes slab_destroy() smaller and more readable, and
moves ifdefs outside the function body.
Signed-off-by: Matthew Dobson <colpatch@us.ibm.com>
Acked-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Clean up cache_estimate() in mm/slab.c and improves the algorithm from O(n) to
O(1). We first calculate the maximum number of objects a slab can hold after
struct slab and kmem_bufctl_t for each object has been given enough space.
After that, to respect alignment rules, we decrease the number of objects if
necessary. As required padding is at most align-1 and memory of obj_size is
at least align, it is always enough to decrease number of objects by one.
The optimization was originally made by Balbir Singh with more improvements
from Steven Rostedt. Manfred Spraul provider further modifications: no loop
at all for the off-slab case and added comments to explain the background.
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I noticed the code for index_of is a creative way of finding the cache
index using the compiler to optimize to a single hard coded number. But
I couldn't help noticing that it uses two methods to let you know that
someone used it wrong. One is at compile time (the correct way), and
the other is at run time (not good).
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Clean up kmem_cache_alloc_node a bit.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Acked-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
An object cache has two different object lengths:
- the amount of memory available for the user (object size)
- the amount of memory allocated internally (buffer size)
This patch does some renames to make the code reflect that better.
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Migrate a page with buffers without requiring writeback
This introduces a new address space operation migratepage() that may be used
by a filesystem to implement its own version of page migration.
A version is provided that migrates buffers attached to pages. Some
filesystems (ext2, ext3, xfs) are modified to utilize this feature.
The swapper address space operation are modified so that a regular
migrate_page() will occur for anonymous pages without writeback (migrate_pages
forces every anonymous page to have a swap entry).
Signed-off-by: Mike Kravetz <kravetz@us.ibm.com>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Modify policy layer to support direct page migration
- Add migrate_pages_to() allowing the migration of a list of pages to a a
specified node or to vma with a specific allocation policy in sets of
MIGRATE_CHUNK_SIZE pages
- Modify do_migrate_pages() to do a staged move of pages from the source
nodes to the target nodes.
Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add remove_from_swap
remove_from_swap() allows the restoration of the pte entries that existed
before page migration occurred for anonymous pages by walking the reverse
maps. This reduces swap use and establishes regular pte's without the need
for page faults.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add direct migration support with fall back to swap.
Direct migration support on top of the swap based page migration facility.
This allows the direct migration of anonymous pages and the migration of file
backed pages by dropping the associated buffers (requires writeout).
Fall back to swap out if necessary.
The patch is based on lots of patches from the hotplug project but the code
was restructured, documented and simplified as much as possible.
Note that an additional patch that defines the migrate_page() method for
filesystems is necessary in order to avoid writeback for anonymous and file
backed pages.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Mike Kravetz <kravetz@us.ibm.com>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Check for PageSwapCache after looking up and locking a swap page.
The page migration code may change a swap pte to point to a different page
under lock_page().
If that happens then the vm must retry the lookup operation in the swap space
to find the correct page number. There are a couple of locations in the VM
where a lock_page() is done on a swap page. In these locations we need to
check afterwards if the page was migrated. If the page was migrated then the
old page that was looked up before was freed and no longer has the
PageSwapCache bit set.
Signed-off-by: Hirokazu Takahashi <taka@valinux.co.jp>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Christoph Lameter <clameter@@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If large amounts of zone memory are used by empty slabs then zone_reclaim
becomes uneffective. This patch shakes the slab a bit.
The problem with this patch is that the slab reclaim is not containable to a
zone. Thus slab reclaim may affect the whole system and be extremely slow.
This also means that we cannot determine how many pages were freed in this
zone. Thus we need to go off node for at least one allocation.
The functionality is disabled by default.
We could modify the shrinkers to take a zone parameter but that would be quite
invasive. Better ideas are welcome.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In some situations one may want zone_reclaim to behave differently. For
example a process writing large amounts of memory will spew unto other nodes
to cache the writes if many pages in a zone become dirty. This may impact the
performance of processes running on other nodes.
Allowing writes during reclaim puts a stop to that behavior and throttles the
process by restricting the pages to the local zone.
Similarly one may want to contain processes to local memory by enabling
regular swap behavior during zone_reclaim. Off node memory allocation can
then be controlled through memory policies and cpusets.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Currently the zone_reclaim code has a fixed window of 30 seconds of off node
allocations should a local zone have no unused pagecache pages left. Reclaim
will be attempted again after this timeout period to avoid repeated useless
scans for memory. This is also useful to established sufficiently large off
node allocation chunks to relieve the local node.
It may be beneficial to adjust that time period for some special situations.
For example if memory use was exceeding node capacity one may want to give up
for longer periods of time. If memory spikes intermittendly then one may want
to shorten the time period to reduce the number of off node allocations.
This patch allows just that....
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Instead of scanning all the pages in a zone, imitate real swap and scan
only a portion of the pages and gradually scan more if we do not free up
enough pages. This avoids a zone suddenly loosing all unused pagecache
pages (we may after all access some of these again so they deserve another
chance) but it still frees up large chunks of memory if a zone only
contains unused pagecache pages.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
zone_reclaim should leave that to the real swapper. We are only interested
in evicting unmapped pages.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2.6.15's hugepage faulting introduced huge_pages_needed accounting into
hugetlbfs: to count how many pages are already in cache, for spot check on
how far a new mapping may be allowed to extend the file. But it's muddled:
each hugepage found covers HPAGE_SIZE, not PAGE_SIZE. Once pages were
already in cache, it would overshoot, wrap its hugepages count backwards,
and so fail a harmless repeat mapping with -ENOMEM. Fixes the problem
found by Don Dupuis.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Acked-By: Adam Litke <agl@us.ibm.com>
Acked-by: William Irwin <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Improve the performance of slab_put_obj(). Without the cast, gcc considers
ptrdiff_t a 64 bit signed integer and ends up emitting code to use a full
signed 128 bit divide on EM64T, which is substantially slower than a 32 bit
unsigned divide.
I noticed this when looking at the profile of a case where the slab balance
is just on edge and thrashes back and forth freeing a block.
Signed-off-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- If we only reclaim nr_pages then its okay to stay on node.
Switch from > to >= for the comparison.
- vm_table[] entry for zone_reclaim_mode is a bit screwed up.
- Add empty lines around shrink_zone to show that this is the
central function to be called.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make sc->may_writepage control the writeout behavior of shrink_list.
Remove the laptop_mode trick from shrink_list and instead set may_writepage
in try_to_free_pages properly.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
GFP_ZONETYPES calculate from GFP_ZONEMASK
GFP_ZONETYPES's value is directly related to the value of GFP_ZONEMASK. It
takes one of two forms depending whether the top bit of GFP_ZONEMASK is a
'loner'. Supply both forms, enabling the loner.
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>