Commit graph

2751 commits

Author SHA1 Message Date
Hugh Dickins
c34d1b4d16 [PATCH] mm: kill check_user_page_readable
check_user_page_readable is a problematic variant of follow_page.  It's used
only by oprofile's i386 and arm backtrace code, at interrupt time, to
establish whether a userspace stackframe is currently readable.

This is problematic, because we want to push the page_table_lock down inside
follow_page, and later split it; whereas oprofile is doing a spin_trylock on
it (in the i386 case, forgotten in the arm case), and needs that to pin
perhaps two pages spanned by the stackframe (which might be covered by
different locks when we split).

I think oprofile is going about this in the wrong way: it doesn't need to know
the area is readable (neither i386 nor arm uses read protection of user
pages), it doesn't need to pin the memory, it should simply
__copy_from_user_inatomic, and see if that succeeds or not.  Sorry, but I've
not got around to devising the sparse __user annotations for this.

Then we can eliminate check_user_page_readable, and return to a single
follow_page without the __follow_page variants.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:41 -07:00
Hugh Dickins
663b97f7ef [PATCH] mm: flush_tlb_range outside ptlock
There was one small but very significant change in the previous patch:
mprotect's flush_tlb_range fell outside the page_table_lock: as it is in 2.4,
but that doesn't prove it safe in 2.6.

On some architectures flush_tlb_range comes to the same as flush_tlb_mm, which
has always been called from outside page_table_lock in dup_mmap, and is so
proved safe.  Others required a deeper audit: I could find no reliance on
page_table_lock in any; but in ia64 and parisc found some code which looks a
bit as if it might want preemption disabled.  That won't do any actual harm,
so pending a decision from the maintainers, disable preemption there.

Remove comments on page_table_lock from flush_tlb_mm, flush_tlb_range and
flush_tlb_page entries in cachetlb.txt: they were rather misleading (what
generic code does is different from what usually happens), the rules are now
changing, and it's not yet clear where we'll end up (will the generic
tlb_flush_mmu happen always under lock?  never under lock?  or sometimes under
and sometimes not?).

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:40 -07:00
Hugh Dickins
b462705ac6 [PATCH] mm: arches skip ptlock
Convert those few architectures which are calling pud_alloc, pmd_alloc,
pte_alloc_map on a user mm, not to take the page_table_lock first, nor drop it
after.  Each of these can continue to use pte_alloc_map, no need to change
over to pte_alloc_map_lock, they're neither racy nor swappable.

In the sparc64 io_remap_pfn_range, flush_tlb_range then falls outside of the
page_table_lock: that's okay, on sparc64 it's like flush_tlb_mm, and that has
always been called from outside of page_table_lock in dup_mmap.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:40 -07:00
Hugh Dickins
872fec16d9 [PATCH] mm: init_mm without ptlock
First step in pushing down the page_table_lock.  init_mm.page_table_lock has
been used throughout the architectures (usually for ioremap): not to serialize
kernel address space allocation (that's usually vmlist_lock), but because
pud_alloc, pmd_alloc and pte_alloc_kernel expect the caller to hold it.

Reverse that: don't lock or unlock init_mm.page_table_lock in any of the
architectures; instead rely on pud_alloc, pmd_alloc and pte_alloc_kernel to take
and drop it when allocating a new one, to check lest a racing task already
did.  Similarly no page_table_lock in vmalloc's map_vm_area.

Some temporary ugliness in __pud_alloc and __pmd_alloc: since they also handle
user mms, which are converted only by a later patch, for now they have to lock
differently according to whether or not it's init_mm.

If sources get muddled, there's a danger that an arch source taking
init_mm.page_table_lock will be mixed with common source also taking it (or
neither take it).  So break the rules and make another change, which should
break the build for such a mismatch: remove the redundant mm arg from
pte_alloc_kernel (ppc64 scrapped its distinct ioremap_mm in 2.6.13).

Exceptions: arm26 used pte_alloc_kernel on user mm, now pte_alloc_map; ia64
used pte_alloc_map on init_mm, now pte_alloc_kernel; parisc had bad args to
pmd_alloc and pte_alloc_kernel in unused USE_HPPA_IOREMAP code; ppc64
map_io_page forgot to unlock on failure; ppc mmu_mapin_ram and ppc64 im_free
took page_table_lock for no good reason.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:40 -07:00
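The "take and drop it when allocating a new one, to check lest a racing task already did" pattern from the init_mm commit above can be illustrated in userspace. This is only a sketch under stated assumptions, not the kernel code: `struct dir_slot`, `slot_populate` and the pthread mutex are hypothetical stand-ins for a page-table entry, the allocator helper and init_mm.page_table_lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for one upper-level page-table entry. */
struct dir_slot {
    pthread_mutex_t lock;  /* plays the role of init_mm.page_table_lock */
    void *table;           /* NULL until a lower-level table is installed */
};

/*
 * Populate slot->table. The caller holds no lock: the allocation is done
 * unlocked, and the lock is taken only long enough to install the new
 * table or to notice that a racing task already installed one.
 * Returns 0 on success, -1 if the allocation failed.
 */
int slot_populate(struct dir_slot *slot, size_t size)
{
    void *new_table;

    assert(slot != NULL);

    new_table = calloc(1, size);   /* potentially slow, so done unlocked */
    if (!new_table)
        return -1;

    pthread_mutex_lock(&slot->lock);
    if (!slot->table) {
        slot->table = new_table;   /* we won: install our table */
        new_table = NULL;
    }
    pthread_mutex_unlock(&slot->lock);

    free(new_table);               /* we lost the race: discard ours (free(NULL) is a no-op) */
    return 0;
}
```

Calling `slot_populate` twice on the same slot leaves the first table in place, which is the property that lets callers stop taking the lock themselves.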
Hugh Dickins
46dea3d092 [PATCH] mm: ia64 use expand_upwards
ia64 has expand_backing_store function for growing its Register Backing Store
vma upwards.  But more complete code for this purpose is found in the
CONFIG_STACK_GROWSUP part of mm/mmap.c.  Uglify its #ifdefs further to provide
expand_upwards for ia64 as well as expand_stack for parisc.

The Register Backing Store vma should be marked VM_ACCOUNT.  Implement the
intention of growing it only a page at a time, instead of passing an address
outside of the vma to handle_mm_fault, with unknown consequences.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:39 -07:00
Nick Piggin
b5810039a5 [PATCH] core remove PageReserved
Remove PageReserved() calls from core code by tightening VM_RESERVED
handling in mm/ to cover PageReserved functionality.

PageReserved special casing is removed from get_page and put_page.

All setting and clearing of PageReserved is retained, and it is now flagged
in the page_alloc checks to help ensure we don't introduce any refcount
based freeing of Reserved pages.

MAP_PRIVATE, PROT_WRITE of VM_RESERVED regions is tentatively being
deprecated.  We never completely handled it correctly anyway, and it can be
reintroduced in future if required (Hugh has a proof of concept).

Once PageReserved() calls are removed from kernel/power/swsusp.c, and all
arch/ and driver code, the Set and Clear calls, and the PG_reserved bit can
be trivially removed.

Last real user of PageReserved is swsusp, which uses PageReserved to
determine whether a struct page points to valid memory or not.  This still
needs to be addressed (a generic page_is_ram() should work).

A last caveat: the ZERO_PAGE is now refcounted and managed with rmap (and
thus mapcounted and count towards shared rss).  These writes to the struct
page could cause excessive cacheline bouncing on big systems.  There are a
number of ways this could be addressed if it is an issue.

Signed-off-by: Nick Piggin <npiggin@suse.de>

Refcount bug fix for filemap_xip.c

Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:39 -07:00
Hugh Dickins
f9c98d0287 [PATCH] mm: m68k kill stram swap
Please, please now delete the Atari CONFIG_STRAM_SWAP code.  It may be
excellent and ingenious code, but its reference to swap_vfsmnt betrays that it
hasn't been built since 2.5.1 (four years old come December), it's delving
deep into matters which are the preserve of core mm code, its only purpose is
to give the more conscientious mm guys an anxiety attack from time to time;
yet we keep on breaking it more and more.

If you want to use RAM for swap, then if the MTD driver does not already
provide just what you need, I'm sure David could be persuaded to add the
extra.  But you'd also like to be able to allocate extents of that swap for
other use: we can give you a core interface for that if you need.  But unbuilt
for four years suggests to me that there's no need at all.

I cannot swear the patch below won't break your build, but believe so.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:38 -07:00
Hugh Dickins
147efea8eb [PATCH] mm: sh64 hugetlbpage.c
The sh64 hugetlbpage.c seems to be erroneous, left over from a bygone age,
clashing with the common hugetlb.c.  Replace it by a copy of the sh
hugetlbpage.c.  Except, delete that mk_pte_huge macro neither uses.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:38 -07:00
Hugh Dickins
404351e67a [PATCH] mm: mm_init set_mm_counters
How is anon_rss initialized?  In dup_mmap, and by mm_alloc's memset; but
that's not so good if an mm_counter_t is a special type.  And how is rss
initialized?  By set_mm_counter, all over the place.  Come on, we just need to
initialize them both at once by set_mm_counter in mm_init (which follows the
memcpy when forking).

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:38 -07:00
Hugh Dickins
fc2acab31b [PATCH] mm: tlb_finish_mmu forget rss
zap_pte_range has been counting the pages it frees in tlb->freed, then
tlb_finish_mmu has used that to update the mm's rss.  That got stranger when I
added anon_rss, yet updated it by a different route; and stranger when rss and
anon_rss became mm_counters with special access macros.  And it would no
longer be viable if we're relying on page_table_lock to stabilize the
mm_counter, but calling tlb_finish_mmu outside that lock.

Remove the mmu_gather's freed field, let tlb_finish_mmu stick to its own
business, just decrement the rss mm_counter in zap_pte_range (yes, there was
some point to batching the update, and a subsequent patch restores that).  And
forget the anal paranoia of first reading the counter to avoid going negative
- if rss does go negative, just fix that bug.

Remove the mmu_gather's flushes and avoided_flushes from arm and arm26: no use
was being made of them.  But arm26 alone was actually using the freed, in the
way some others use need_flush: give it a need_flush.  arm26 seems to prefer
spaces to tabs here: respect that.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:37 -07:00
Hugh Dickins
4d6ddfa924 [PATCH] mm: tlb_is_full_mm was obscure
tlb_is_full_mm?  What does that mean?  The TLB is full?  No, it means that the
mm's last user has gone and the whole mm is being torn down.  And it's an
inline function because sparc64 uses a different (slightly better)
"tlb_frozen" name for the flag others call "fullmm".

And now the ptep_get_and_clear_full macro used in zap_pte_range refers
directly to tlb->fullmm, which would be wrong for sparc64.  Rather than
correct that, I'd prefer to scrap tlb_is_full_mm altogether, and change
sparc64 to just use the same poor name as everyone else - is that okay?

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:37 -07:00
Hugh Dickins
ab50b8ed81 [PATCH] mm: vm_stat_account unshackled
The original vm_stat_account has fallen into disuse, with only one user, and
only one user of vm_stat_unaccount.  It's easier to keep track if we convert
them all to __vm_stat_account, then free it from its __shackles.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:37 -07:00
Linus Torvalds
be15cd72d2 Merge master.kernel.org:/home/rmk/linux-2.6-arm 2005-10-29 14:02:16 -07:00
Nicolas Pitre
37d07b72ef [ARM] 3061/1: cleanup the XIP link address mess
Patch from Nicolas Pitre

Since vmlinux.lds.S is preprocessed, we can use the defines already
present in asm/memory.h (allowed by patch #3060) for the XIP kernel link
address instead of relying on a duplicated Makefile hardcoded value, and
also get rid of its dependency on awk to handle it at the same time.

While at it let's clean XIP stuff even further and make things clearer
in head.S with a nice code reduction.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-10-29 21:44:56 +01:00
Nicolas Pitre
f09b997999 [ARM] 3060/1: allow constants found in asm/memory.h to be used in asm code
Patch from Nicolas Pitre

This patch allows for assorted type of cleanups by letting assembly code
use the same set of defines for constant values and avoid duplicated
definitions that might not always be in sync, or that might simply be
confusing due to the different names for the same thing.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-10-29 21:44:55 +01:00
Ralf Baechle
09af7b443c Update MIPS defconfig files.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:54 +01:00
Arthur Othieno
5ef66935c1 prom_free_prom_memory() returns unsigned long
Some boards declare prom_free_prom_memory as a void function but the
caller free_initmem() expects a return value.

Fix those up and return 0 instead, just like everyone else does.

Signed-off-by: Arthur Othieno <a.othieno@bluewin.ch>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:53 +01:00
Ralf Baechle
4b724efdde Get rid of SINGLE_ONLY_FPU. Linux does not support half FPU other than
by emulation of a full FPU.
    
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:52 +01:00
Ralf Baechle
3fccc0150e Fix all the get_user / put_user related sparse warnings.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:52 +01:00
Ralf Baechle
3c5c8f6748 Delete unused ieee754_cname[] and declaration.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:52 +01:00
Ralf Baechle
efec3c4e96 Include for prototypes.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:51 +01:00
Ralf Baechle
a663bf906d Protect against multiple inclusion.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:51 +01:00
Ralf Baechle
030274ae03 Remove useless casts of kmalloc return values.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:50 +01:00
Ralf Baechle
e5adb8770e Hack to resolve longstanding prefetch issue
Prefetching may be fatal on some systems if we're prefetching beyond the
end of memory.  It's also a seriously bad idea on non dma-coherent
systems.
    
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:50 +01:00
Ralf Baechle
7cf8053b8e More foolproofing of the CPU configuration.
Limit the number of cpu type options in the cpu menu to just those
types that are actually available for the selected platform.
    
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:49 +01:00
Andrew Isaacson
cb4262481f pci-expmem-hack
CFE 1.2.5 and earlier fails to turn on the ExpMemEn bit in the
PCIFeatureControl register, which means that DMA does not work
beyond physical address 01_0000_0000, ergo to DRAM beyond 1GB.
    
With ExpMemEn turned on, 01_0000_0000-0f_ffff_ffff is mapped,
so DMA works for up to 61 GB of DRAM.
    
Will be fixed in CFE 1.2.6 (yet to be released).
    
Signed-Off-By: Andy Isaacson <adi@broadcom.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:49 +01:00
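The "61 GB" figure in the pci-expmem-hack commit above follows from the address range it quotes. A small worked calculation, assuming (as the message implies) that 1 GB of DRAM is reachable without ExpMemEn; the macro and function names are hypothetical, chosen only for this illustration:

```c
#include <assert.h>
#include <stdint.h>

#define GIB          (UINT64_C(1) << 30)
#define EXPMEM_START UINT64_C(0x0100000000)  /* 01_0000_0000 from the message */
#define EXPMEM_END   UINT64_C(0x0fffffffff)  /* 0f_ffff_ffff from the message */
#define LOW_DRAM_GIB UINT64_C(1)             /* assumption: DRAM reachable without ExpMemEn */

/* Size in GiB of the window that ExpMemEn maps (inclusive range). */
uint64_t expmem_window_gib(void)
{
    assert(EXPMEM_END >= EXPMEM_START);
    return (EXPMEM_END - EXPMEM_START + 1) / GIB;
}

/* Total DRAM in GiB that DMA can reach once ExpMemEn is turned on. */
uint64_t dma_reachable_dram_gib(void)
{
    return LOW_DRAM_GIB + expmem_window_gib();
}
```

The window spans 0x0f_0000_0000 bytes, that is 60 GiB; adding the 1 GB that already worked gives the 61 GB quoted in the message.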
Andrew Isaacson
8a1417de9e BCM1480 HT support
PCI support code for PLX 7250 PCI-X tunnel on BCM91480B BigSur board.
    
Signed-Off-By: Andy Isaacson <adi@broadcom.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:49 +01:00
Andrew Isaacson
dc41f94f77 Support for the BCM1480 on-chip PCI-X bridge.
Signed-Off-By: Andy Isaacson <adi@broadcom.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:48 +01:00
Andrew Isaacson
a4b5bd9abc SB1 cache exception handling.
Expand SB1 cache error handling by adding SB1_CEX_ALWAYS_FATAL and
SB1_CEX_STALL, allowing configurable behavior on cache errors.
    
Signed-Off-By: Andy Isaacson <adi@broadcom.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:48 +01:00
Andrew Isaacson
9a6dcea103 Support for BigSur board.
Signed-Off-By: Andy Isaacson <adi@broadcom.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:47 +01:00
Andrew Isaacson
f137e463b5 Add support for BCM1480 family of chips.
- Kconfig and Makefile changes
- arch/mips/sibyte/bcm1480/
- changes to sibyte common code to support 1480
    
Signed-Off-By: Andy Isaacson <adi@broadcom.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:47 +01:00
Andrew Isaacson
93ce2f524e Add support for SB1A CPU.
Signed-Off-By: Andy Isaacson <adi@broadcom.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:46 +01:00
Andrew Isaacson
d121ced21d Sibyte fixes
Fix typo in cpu_probe_sibyte.
    
Signed-Off-By: Andy Isaacson <adi@broadcom.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:45 +01:00
Atsushi Nemoto
750ccf687f Fix zero length sys_cacheflush
Cacheflush(0, 0, 0) was crashing the system.  This is because
flush_icache_range(start, end) tries to flush the whole address space
(0 - ~0UL) if both start and end are zero.
    
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:44 +01:00
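One way to guard against the zero-length case described in the sys_cacheflush commit above is to return early before calling the range flush. The following is a hedged userspace sketch, not the actual patch: `cacheflush_sketch` and `mock_flush_icache_range` are hypothetical names, and the mock only counts calls instead of touching any cache.

```c
#include <assert.h>

static unsigned long flush_calls;  /* how many times the mock flush ran */

/*
 * Mock stand-in for flush_icache_range(start, end). Per the commit message,
 * the real routine treats start == end == 0 as "flush the whole address
 * space", so the mock asserts that this pair never reaches it.
 */
static void mock_flush_icache_range(unsigned long start, unsigned long end)
{
    assert(!(start == 0 && end == 0));
    flush_calls++;
}

/* Hypothetical wrapper shaped like sys_cacheflush(addr, bytes, cache). */
int cacheflush_sketch(unsigned long addr, unsigned long bytes, unsigned int cache)
{
    (void)cache;                   /* cache selector is ignored in this sketch */

    if (bytes == 0)
        return 0;                  /* nothing to flush: never pass (0, 0) down */

    mock_flush_icache_range(addr, addr + bytes);
    return 0;
}
```

With this guard, `cacheflush_sketch(0, 0, 0)` returns success without reaching the range flush at all.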
Ralf Baechle
f4c72cc737 Get 64-bit right in the kgdb stub.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:43 +01:00
Ralf Baechle
0d507d61cd Sys_lookup_dcookie arguments occupy 4 argument slots.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:43 +01:00
Ralf Baechle
12616ed202 FPU emulator garbage collection.
First argument of fpu_emulator_cop1Handler() was unused.
    
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:43 +01:00
Ralf Baechle
178086c86a Don't print file name and line in die and die_if_kernel.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:42 +01:00
Ralf Baechle
6ec25809c1 Rename page argument of flush_cache_page to something more descriptive.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:42 +01:00
Ralf Baechle
5e83d43054 Slice up Kconfig; it's grown too large.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:41 +01:00
Ralf Baechle
9383292f17 Locking cleanups.
Date: Fri Jan 14 03:03:23 2005 +0000

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:39 +01:00
Ralf Baechle
dbc571690e Fix wrong comment.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:38 +01:00
Ralf Baechle
ec917c2c1a Fix up a few loose ends in explicit support for MIPS R1/R2.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:37 +01:00
Ralf Baechle
f92c1759a4 Document the meaning of CPU_MIPS32, CPU_MIPS64, CPU_MIPSR1 and
CPU_MIPSR2.
    
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:37 +01:00
Ralf Baechle
101b3531a6 Protect manipulation of c0_status against preemption and multithreading.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:36 +01:00
Ralf Baechle
8afcb5d829 Detect 4KSD and treat it like 4KSc.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:36 +01:00
Ralf Baechle
2f69ddccb0 Convert the remaining SPIN_LOCK_UNLOCKED instances to DEFINE_SPINLOCK.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:35 +01:00
Ralf Baechle
57468af326 Define and initialize kdb_lock using DEFINE_SPINLOCK.
Convert kgdb_cpulock into a raw_spinlock_t.
    
SPIN_LOCK_UNLOCKED is deprecated and its replacement DEFINE_SPINLOCK is
not suitable for arrays of spinlocks.
    
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:35 +01:00
Ralf Baechle
f8bb3af924 Make kgdb_wait static.
Nothing outside gdb-stub.c uses kgdb_wait, so change its definition to
static.
    
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:34 +01:00
Ralf Baechle
65f1f5a2c3 Don't copy SB1 cache error handler to uncached memory.
This may have made sense on a paranoid day with pass 1 BCM1250 processors
that were throwing cache error exceptions left and right for no good
reason.  On modern silicon that hardly makes sense and the code had
become just an obscurity ...
    
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:32:34 +01:00