- When setting a signal handler with sigaction(), if the SA_ONSTACK flag
is set and no alternate stack has been provided via sigaltstack(), the
kernel still tries to install the alternate stack. This behavior is the
opposite of what is documented in the Single UNIX Specification V3.
- Also, when setting an alternate stack using sigaltstack() with the
SS_DISABLE flag, the kernel tries to install the alternate stack on signal
delivery.
These two use cases make the process crash at signal delivery.
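The two cases above can be exercised from user space with a small sketch
like the following. The helper name deliver_with_onstack() is made up for
illustration; on a kernel with the fix the handler is expected to run on the
normal stack in both cases.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700	/* for sigaltstack() and SA_ONSTACK */
#endif
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t handler_ran;

static void on_usr1(int sig)
{
	(void)sig;
	handler_ran = 1;
}

/*
 * Install a SIGUSR1 handler with SA_ONSTACK and deliver the signal.
 * If disable_altstack is non-zero, first call sigaltstack() with SS_DISABLE;
 * otherwise the alternate stack is left untouched (none is ever provided).
 * Returns 1 if the handler ran, 0 if it did not, -1 on setup failure.
 */
static int deliver_with_onstack(int disable_altstack)
{
	struct sigaction sa;

	if (disable_altstack) {
		stack_t ss;

		memset(&ss, 0, sizeof(ss));
		ss.ss_flags = SS_DISABLE;
		if (sigaltstack(&ss, NULL) != 0)
			return -1;
	}

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_usr1;
	sa.sa_flags = SA_ONSTACK;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGUSR1, &sa, NULL) != 0)
		return -1;

	handler_ran = 0;
	if (raise(SIGUSR1) != 0)
		return -1;
	return handler_ran;
}
```

Calling deliver_with_onstack(0) covers the first case and
deliver_with_onstack(1) the second.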
Signed-off-by: Laurent Meyer <meyerlau@fr.ibm.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Cc: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use the new LED infrastructure to support the 6 LEDs present on the Amstrad
Delta.
[akpm@osdl.org: cleanup]
Signed-off-by: Jonathan McDowell <noodles@earth.li>
Acked-by: Richard Purdie <rpurdie@rpsys.net>
Cc: Ben Dooks <ben@fluff.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The hardirq_ctx and softirq_ctx variables are written to on init only,
Signed-off-by: Andreas Mohr <andi@lisas.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Default values for boolean and tristate options can only be 'y', 'm' or 'n'.
This patch removes wrong default for SYSCALL_DEBUG.
Signed-off-by: Jean-Luc Leger <jean-luc.leger@dspnet.fr.eu.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Default values for boolean and tristate options can only be 'y', 'm' or 'n'.
This patch removes wrong default for SCHED_SMT.
Signed-off-by: Jean-Luc Leger <jean-luc.leger@dspnet.fr.eu.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Pass the POSIX lock owner ID to the flush operation.
This is useful for filesystems which don't want to store any locking state
in inode->i_flock but want to handle locking/unlocking POSIX locks
internally. FUSE is one such filesystem but I think it possible that some
network filesystems would need this also.
Also add a flag to indicate that a POSIX locking request was generated by
close(), so filesystems using the above feature won't send an extra locking
request in this case.
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
On zSeries machines there exists an interface which allows the operating
system to retrieve LPAR hypervisor accounting data. For example, it is
possible to get usage data for physical and virtual cpus. In order to
provide this information to user space programs, I implemented a new
virtual Linux file system named 's390_hypfs' using the Linux 2.6 libfs
framework. The name 's390_hypfs' stands for 'S390 Hypervisor Filesystem'.
All the accounting information is put into different virtual files which
can be accessed from user space. All data is represented as ASCII strings.
When the file system is mounted the accounting information is retrieved and
a file system tree is created with the attribute files containing the cpu
information. The content of the files remains unchanged until a new update
is made. An update can be triggered from user space through writing
'something' into a special purpose update file.
We create the following directory structure:
<mount-point>/
    update
    cpus/
        <cpu-id>
            type
            mgmtime
        <cpu-id>
            ...
    hyp/
        type
    systems/
        <lpar-name>
            cpus/
                <cpu-id>
                    type
                    mgmtime
                    cputime
                    onlinetime
                <cpu-id>
                    ...
        <lpar-name>
            cpus/
                ...
- update: File to trigger update
- cpus/: Directory for all physical cpus
- cpus/<cpu-id>/: Directory for one physical cpu.
- cpus/<cpu-id>/type: Type name of physical zSeries cpu.
- cpus/<cpu-id>/mgmtime: Physical-LPAR-management time in microseconds.
- hyp/: Directory for hypervisor information
- hyp/type: Type of hypervisor (currently only 'LPAR Hypervisor')
- systems/: Directory for all LPARs
- systems/<lpar-name>/: Directory for one LPAR.
- systems/<lpar-name>/cpus/<cpu-id>/: Directory for the virtual cpus
- systems/<lpar-name>/cpus/<cpu-id>/type: Type of cpu.
- systems/<lpar-name>/cpus/<cpu-id>/mgmtime:
Accumulated number of microseconds during which a physical
CPU was assigned to the logical cpu and the cpu time was
consumed by the hypervisor and was not provided to
the LPAR (LPAR overhead).
- systems/<lpar-name>/cpus/<cpu-id>/cputime:
Accumulated number of microseconds during which a physical CPU
was assigned to the logical cpu and the cpu time was consumed
by the LPAR.
- systems/<lpar-name>/cpus/<cpu-id>/onlinetime:
Accumulated number of microseconds during which the logical CPU
has been online.
The directory /sys/hypervisor/s390 is created as the mount point for the
filesystem.
The update process is triggered when writing 'something' into the
'update' file at the top level hypfs directory. You can do this e.g.
with 'echo 1 > update'. During the update the whole directory structure
is deleted and built up again.
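A user space consumer of this layout might look roughly like the sketch
below. The helper names hypfs_read_attr() and hypfs_trigger_update() are
invented for illustration and are not part of the patch; the caller passes
the mount point as root.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Read one attribute file (e.g. "cpus/0/mgmtime") below the hypfs root.
 * The first line is copied into buf with its trailing newline stripped.
 * Returns 0 on success, -1 on failure.
 */
static int hypfs_read_attr(const char *root, const char *rel,
			   char *buf, size_t len)
{
	char path[512];
	FILE *f;
	int n;

	assert(root && rel && buf && len > 0);
	n = snprintf(path, sizeof(path), "%s/%s", root, rel);
	if (n < 0 || n >= (int)sizeof(path))
		return -1;
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/* Request a refresh by writing an arbitrary string to <root>/update. */
static int hypfs_trigger_update(const char *root)
{
	char path[512];
	FILE *f;
	int n;

	assert(root);
	n = snprintf(path, sizeof(path), "%s/update", root);
	if (n < 0 || n >= (int)sizeof(path))
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	if (fputs("1\n", f) == EOF) {
		fclose(f);
		return -1;
	}
	return fclose(f) == 0 ? 0 : -1;
}
```

Since the whole tree is rebuilt on update, a reader should call
hypfs_trigger_update() first and only then open the attribute files.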
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Ingo Oeser <ioe-lkml@rameria.de>
Cc: Joern Engel <joern@wohnheim.fh-wedel.de>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Michael Holzheu <holzheu@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
verify_area() is still alive on xtensa in 2.6.17-rc3-git13. It would be nice
to finally be rid of that function across the board.
Signed-off-by: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Cast is not an lvalue; =r constraint wants an lvalue and really couldn't
care whether it's void * or other pointer type.
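The point can be illustrated outside the kernel with GCC extended asm and an
empty template. This is only a sketch: copy_through_register() is a made-up
helper and not the m68k code that the patch touches.

```c
/*
 * Copy a pointer through a register. The "=r" output operand must be an
 * lvalue, so a plain local variable is used; writing (void *)other there
 * instead would be a cast expression, which is not an lvalue.
 */
static void *copy_through_register(void *src)
{
	void *dst;

	/* "0" ties the input to the same register as output operand 0. */
	__asm__ ("" : "=r" (dst) : "0" (src));
	return dst;
}
```

The caller converts the result to whatever pointer type it needs after the
asm statement, so the operand itself never needs a cast.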
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This uninlines a few large functions in uaccess.h and cleans up the rest.
It includes a (hopefully temporary) workaround for the broken typeof of
gcc-4.1.
Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Some fixes and cleanups from the linux-mac68k repo. Fix mac_esp by clearing
the VIA2 SCSI IRQ flag before the SCSI IRQ handler is invoked. Also fix a
race condition caused by unmasking a nubus slot IRQ then setting the relevant
nubus_active bit.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Adjust entry.S to the changed HARDIRQ_MASK, add a check to prevent it from
silently breaking again.
Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
MAX_NR_ZONES changed, so use correct defines now.
Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Move do_suspend_lowlevel to correct segment. If it is in the same hugepage
with ro data, mark_rodata_ro will make it unexecutable.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
flush_tlb_all uses on_each_cpu, which disables and then re-enables interrupts.
During suspend/resume this leaves interrupts wrongly enabled.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Pages (Reserved/ACPI NVS/ACPI Data) below end_pfn will be saved/restored by S4
currently. We should mark 'Reserved' pages not saveable.
Pages (Reserved/ACPI NVS/ACPI Data) above end_pfn will not be saved/restored
by S4 currently. We should save the 'ACPI NVS/ACPI Data' pages.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Nigel Cunningham <nigel@suspend2.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Pages (Reserved/ACPI NVS/ACPI Data) below max_low_pfn will be saved/restored
by S4 currently. We should mark 'Reserved' pages not saveable.
Pages (Reserved/ACPI NVS/ACPI Data) above max_low_pfn will not be
saved/restored by S4 currently. We should save the 'ACPI NVS/ACPI Data'
pages.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Nigel Cunningham <nigel@suspend2.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
New CPU flags for next generation of crypto engine as found in VIA C7
processors.
Signed-off-by: Michal Ludvig <michal@logix.cz>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Sometimes thread_info and task_struct get out-of-sync with each other.
Printing task.thread_info in show_registers() can help spot this. And when
task_struct is corrupt then task.comm can contain garbage, so only print as
many characters as it can hold.
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Never allow int3 traps from V8086 mode to enter the kprobes handler.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We need to check for vm86 mode first before looking at selector privilege
bits.
Segment limit is always base + 64k and only the low 16 bits of EIP are
significant in vm86 mode.
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Rohit Seth <rohitseth@google.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use proper defines instead of open-coded values.
Signed-off-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
constify structs and add one __initdata.
Signed-off-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
PCI code was outside of CONFIG_PCI, add __initdata at cyrix_55x0 (since
accessed within __init function only).
Signed-off-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The i386 page fault handler does not allow enough slack when checking for
userspace access below the current stack pointer. This prevents use of the
enter instruction by user code. Fix this by allowing enough slack for
"enter $65535,$31" to execute.
Problem reported by Tomasz Malesinski <tmal@mimuw.edu.pl>
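The amount of slack needed can be worked out as in the sketch below. The
names and the rounded-up constant (65536 bytes of frame plus 32 dwords of
pushes) are assumptions made for illustration, not text taken from the patch.

```c
#include <stdint.h>

/*
 * Worst case from the text: "enter $65535,$31" allocates a 65535-byte frame
 * and pushes up to 32 dwords. Both are rounded up here (assumption).
 */
#define ENTER_MAX_FRAME		65536u
#define ENTER_MAX_PUSHES	32u
#define STACK_SLACK \
	(ENTER_MAX_FRAME + ENTER_MAX_PUSHES * (uint32_t)sizeof(uint32_t))

/*
 * Return 1 if an access at 'address' below the stack pointer 'sp' is close
 * enough to be treated as stack growth, 0 otherwise. The sum is done in
 * 64 bits so that it cannot wrap.
 */
static int below_sp_access_ok(uint32_t address, uint32_t sp)
{
	return (uint64_t)address + STACK_SLACK >= sp;
}
```

With these values the slack is 65664 bytes, so an access exactly that far
below the stack pointer is accepted and one byte further is rejected.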
Tested using this program, based on the original from Tomasz:
        .file "ovflow.S"
        .version "01.01"
gcc2_compiled.:
.section .rodata
.LC0:
        .string "asdf\n"
.text
        .align 4
.globl main
        .type main,@function
main:
        nest_level=0
        .rept 30
        enter $0,$nest_level
        nest_level=nest_level+1
        .endr
        enter $65535,$30
        enter $65535,$31
        addl $-12,%esp
        pushl $.LC0
        call printf
        addl $16,%esp
.L2:
        .rept 32
        leave
        .endr
        ret
.Lfe1:
        .size main,.Lfe1-main
        .ident "GCC: (GNU) 2.95.4 20011002 (Debian prerelease)"
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
On i386, kernel irq balance doesn't work.
1) In function do_irq_balance, after kernel finds the min_loaded cpu but
before calling set_pending_irq to really pin the selected_irq to the
target cpu, kernel does a cpus_and with irq_affinity[selected_irq].
Later on, when the irq is acked, kernel would call
move_native_irq=>desc->handler->set_affinity to change the irq affinity.
However, every function pointed to by
hw_interrupt_type->set_affinity(unsigned int irq, cpumask_t cpumask)
always changes irq_affinity[irq] to cpumask. The next time do_irq_balance
is called, it does the cpus_and again with irq_affinity[selected_irq],
but irq_affinity[selected_irq] has already become the one cpu selected
by the first irq balance.
2) Function balance_irq in file arch/i386/kernel/io_apic.c has the same
issue.
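The effect described in 1) can be modelled with plain bitmasks. Everything
in this sketch is hypothetical (types, field names, and the idea of keeping a
separate mask for the balancer); it shows one possible way out and does not
claim to be what the patch does.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t cpumask_sketch_t;	/* one bit per CPU, stand-in for cpumask_t */

struct irq_sketch {
	cpumask_sketch_t irq_affinity;	  /* overwritten by every set_affinity */
	cpumask_sketch_t balance_allowed; /* hypothetical: mask for the balancer */
};

/* Models set_affinity() as described above: irq_affinity is overwritten. */
static void set_affinity_sketch(struct irq_sketch *irq, cpumask_sketch_t mask)
{
	irq->irq_affinity = mask;
}

/* Problematic variant: restrict the target cpu by irq_affinity itself. */
static int balance_buggy(struct irq_sketch *irq, unsigned int target_cpu)
{
	cpumask_sketch_t allowed;

	assert(target_cpu < 32);
	allowed = (1u << target_cpu) & irq->irq_affinity;
	if (!allowed)
		return 0;		/* irq cannot be moved */
	set_affinity_sketch(irq, allowed);
	return 1;
}

/* Variant using a mask that set_affinity_sketch() does not clobber. */
static int balance_separate_mask(struct irq_sketch *irq, unsigned int target_cpu)
{
	cpumask_sketch_t allowed;

	assert(target_cpu < 32);
	allowed = (1u << target_cpu) & irq->balance_allowed;
	if (!allowed)
		return 0;
	set_affinity_sketch(irq, allowed);
	return 1;
}
```

With four cpus allowed, balance_buggy() moves the irq once and then refuses
every other target, while balance_separate_mask() keeps working.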
[akpm@osdl.org: cleanups]
Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If CONFIG_FRAME_POINTER is enabled, and one does a dump_stack() during
early SMP init, an infinite stackdump and a bootup hang happens:
[<c0104e7f>] show_trace+0xd/0xf
[<c0104e96>] dump_stack+0x15/0x17
[<c01440df>] save_trace+0xc3/0xce
[<c014527d>] mark_lock+0x8c/0x4fe
[<c0145df5>] __lockdep_acquire+0x44e/0xaa5
[<c0146798>] lockdep_acquire+0x68/0x84
[<c1048699>] _spin_lock+0x21/0x2f
[<c010d918>] prepare_set+0xd/0x5d
[<c010daa8>] generic_set_all+0x1d/0x201
[<c010ca9a>] mtrr_ap_init+0x23/0x3b
[<c010ada8>] identify_cpu+0x2a7/0x2af
[<c01192a7>] smp_store_cpu_info+0x2f/0xb4
[<c01197d0>] start_secondary+0xb5/0x3ec
[<c104ec11>] end_of_stack_stop_unwind_function+0x1/0x4
[<c104ec11>] end_of_stack_stop_unwind_function+0x1/0x4
[<c104ec11>] end_of_stack_stop_unwind_function+0x1/0x4
[<c104ec11>] end_of_stack_stop_unwind_function+0x1/0x4
[<c104ec11>] end_of_stack_stop_unwind_function+0x1/0x4
[<c104ec11>] end_of_stack_stop_unwind_function+0x1/0x4
[<c104ec11>] end_of_stack_stop_unwind_function+0x1/0x4
[<c104ec11>] end_of_stack_stop_unwind_function+0x1/0x4
[...]
This is due to "end_of_stack_stop_unwind_function" recursing back to itself
in the EBP stackframe-walker. So avoid this type of recursion when walking
the stack.
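A frame-pointer walker that guards against this kind of self-reference could
be sketched as follows. The structure and function names are invented, and
the exact condition used by the patch may differ; the sketch assumes a stack
that grows downwards, so each caller's frame must sit at a higher address.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified frame record: saved frame pointer, then return address. */
struct frame_sketch {
	struct frame_sketch *next;
	void *ret;
};

/*
 * Walk the frame chain and return the number of frames visited. The walk
 * stops when the saved frame pointer does not lie strictly above the
 * current frame, which also catches a frame that points back to itself.
 */
static size_t walk_frames(const struct frame_sketch *fp, size_t max_frames)
{
	size_t n = 0;

	while (fp && n < max_frames) {
		const struct frame_sketch *next = fp->next;

		n++;	/* a real walker would print fp->ret here */
		if ((uintptr_t)next <= (uintptr_t)fp)
			break;
		fp = next;
	}
	return n;
}
```

The max_frames bound is an extra safety net for chains that are corrupt in
some other way.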
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When multiple updates matching a given CPU are found in the update file, the
action taken by the microcode update driver was inappropriate:
- when lower revision microcode was found before matching or higher revision
one, the driver would needlessly complain that it would not downgrade the
CPU
- when microcode matching the currently installed revision was found before
newer revision code, no update would actually take place
To change this behavior, the driver now draws conclusions about possible
updates and issues messages only once the entire input has been parsed.
Additionally, this adds back (in different places, and conditionalized upon
a new module option) some messages removed by a previous patch.
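The "decide only after parsing everything" idea can be sketched like this.
pick_microcode_update() is a hypothetical helper that works on a plain array
of revision numbers; the real driver parses microcode headers instead.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Scan all candidate revisions for a CPU and return the index of the
 * highest one that is strictly newer than current_rev, or -1 if none is.
 * Lower or equal revisions seen early do not end the search.
 */
static int pick_microcode_update(unsigned int current_rev,
				 const unsigned int *revs, size_t n)
{
	unsigned int best_rev = current_rev;
	int best = -1;
	size_t i;

	assert(revs || n == 0);
	for (i = 0; i < n; i++) {
		if (revs[i] > best_rev) {
			best_rev = revs[i];
			best = (int)i;
		}
	}
	return best;
}
```

Any "will not downgrade" or "already up to date" message would be issued
only when this function returns -1.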
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: Tigran Aivazian <tigran_aivazian@symantec.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Only drm, framebuffer, mtrr parts + misc files here and there.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- avoid expensive modulo (integer division) which happened
since APM_MAX_EVENTS is 20 (non-power-of-2)
- kill compiler warnings by initializing two variables
- add __read_mostly to some important static variables that are read often
(by idle loop etc.)
- constify several structures
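The first item can be illustrated with a small sketch; next_event_index() is
a made-up name, and the actual queue code in the driver may be shaped
differently.

```c
#include <assert.h>

#define APM_MAX_EVENTS 20	/* value quoted above; not a power of two */

/*
 * Advance a ring index without a division. For idx < APM_MAX_EVENTS this
 * gives the same result as (idx + 1) % APM_MAX_EVENTS.
 */
static unsigned int next_event_index(unsigned int idx)
{
	assert(idx < APM_MAX_EVENTS);
	if (++idx >= APM_MAX_EVENTS)
		idx = 0;
	return idx;
}
```

A compare and a conditional store are typically much cheaper than an integer
division on the CPUs this driver runs on.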
Signed-off-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use the x86 cache-bypassing copy instructions for copy_from_user().
Some performance data are
Total of GLOBAL_POWER_EVENTS (CPU cycle samples)
2.6.12.4.orig 1921587
2.6.12.4.nt 1599424
1599424/1921587=83.23% (16.77% reduction)
BSQ_CACHE_REFERENCE (L3 cache miss)
2.6.12.4.orig 57427
2.6.12.4.nt 20858
20858/57427=36.32% (63.7% reduction)
L3 cache miss reduction of __copy_from_user_ll
samples %
37408 65.1412 vmlinux __copy_from_user_ll
23 0.1103 vmlinux __copy_user_zeroing_intel_nocache
23/37408=0.061% (99.94% reduction)
Top 5 of 2.6.12.4.nt
Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) with a unit mask of 0x01 (mandatory) count 100000
samples % app name symbol name
128392 8.0274 vmlinux __copy_user_zeroing_intel_nocache
64206 4.0143 vmlinux journal_add_journal_head
59746 3.7355 vmlinux do_get_write_access
47674 2.9807 vmlinux journal_put_journal_head
46021 2.8774 vmlinux journal_dirty_metadata
pattern9-0-cpu4-0-09011728/summary.out
Counted BSQ_CACHE_REFERENCE events (cache references seen by the bus unit) with a unit mask of 0x3f (multiple flags) count 3000
samples % app name symbol name
69755 4.2861 vmlinux __copy_user_zeroing_intel_nocache
55685 3.4215 vmlinux journal_add_journal_head
52371 3.2179 vmlinux __find_get_block
45504 2.7960 vmlinux journal_put_journal_head
36005 2.2123 vmlinux journal_stop
pattern9-0-cpu4-0-09011744/summary.out
Counted BSQ_CACHE_REFERENCE events (cache references seen by the bus unit) with a unit mask of 0x200 (read 3rd level cache miss) count 3000
samples % app name symbol name
1147 5.4994 vmlinux journal_add_journal_head
881 4.2240 vmlinux journal_dirty_data
872 4.1809 vmlinux blk_rq_map_sg
734 3.5192 vmlinux journal_commit_transaction
617 2.9582 vmlinux radix_tree_delete
pattern9-0-cpu4-0-09011731/summary.out
iozone results are
original 2.6.12.4 CPU time = 207.768 sec
cache aware CPU time = 184.783 sec
(three times run)
184.783/207.768=88.94% (11.06% reduction)
original:
pattern9-0-cpu4-0-08191720/iozone.out: CPU Utilization: Wall time 45.997 CPU time 64.527 CPU utilization 140.28 %
pattern9-0-cpu4-0-08191741/iozone.out: CPU Utilization: Wall time 46.878 CPU time 71.933 CPU utilization 153.45 %
pattern9-0-cpu4-0-08191743/iozone.out: CPU Utilization: Wall time 45.152 CPU time 71.308 CPU utilization 157.93 %
cache aware:
pattern9-0-cpu4-0-09011728/iozone.out: CPU Utilization: Wall time 44.842 CPU time 62.465 CPU utilization 139.30 %
pattern9-0-cpu4-0-09011731/iozone.out: CPU Utilization: Wall time 44.718 CPU time 59.273 CPU utilization 132.55 %
pattern9-0-cpu4-0-09011744/iozone.out: CPU Utilization: Wall time 44.367 CPU time 63.045 CPU utilization 142.10 %
Signed-off-by: Hiro Yoshioka <hyoshiok@miraclelinux.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove duplicate EXPORT_SYMBOL annotations from the FRV arch.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The FRV arch should use fstatat64 not newfstatat.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add annotations to the FRV signal handling for sparse.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add annotations to the FRV I/O handling functions for sparse.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add general annotations to the FRV arch for sparse.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
sys_move_pages() support for 32bit (i386 plus x86_64 compat layer)
Add support for move_pages() on i386 and also add the compat functions
necessary to run 32 bit binaries on x86_64.
Add compat_sys_move_pages to the x86_64 32bit binary layer. Note that it is
not up to date so I added the missing pieces. Not sure if this is done the
right way.
[akpm@osdl.org: compile fix]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
move_pages() is used to move individual pages of a process. The function can
be used to determine the location of pages and to move them onto the desired
node. move_pages() returns status information for each page.
long move_pages(pid, number_of_pages_to_move,
addresses_of_pages[],
nodes[] or NULL,
status[],
flags);
The addresses_of_pages argument is an array of void * pointing to the
pages to be moved.
The nodes array contains the node numbers that the pages should be moved
to. If a NULL is passed instead of an array then no pages are moved but
the status array is updated. The status request may be used to determine
the page state before issuing another move_pages() to move pages.
The status array will contain the state of all individual page migration
attempts when the function terminates. The status array is only valid if
move_pages() completed successfully.
Possible page states in status[]:
0..MAX_NUMNODES The page is now on the indicated node.
-ENOENT Page is not present
-EACCES Page is mapped by multiple processes and can only
be moved if MPOL_MF_MOVE_ALL is specified.
-EPERM The page has been mlocked by a process/driver and
cannot be moved.
-EBUSY Page is busy and cannot be moved. Try again later.
-EFAULT Invalid address (no VMA or zero page).
-ENOMEM Unable to allocate memory on target node.
-EIO Unable to write back page. The page must be written
back in order to move it since the page is dirty and the
filesystem does not provide a migration function that
would allow the moving of dirty pages.
-EINVAL A dirty page cannot be moved. The filesystem does not provide
a migration function and has no ability to write back pages.
The flags parameter indicates what types of pages to move:
MPOL_MF_MOVE Move pages that are only mapped by the process.
MPOL_MF_MOVE_ALL Also move pages that are mapped by multiple processes.
Requires sufficient capabilities.
Possible return codes from move_pages()
-ENOENT No pages found that would require moving. All pages
are either already on the target node, not present, had an
invalid address or could not be moved because they were
mapped by multiple processes.
-EINVAL Flags other than MPOL_MF_MOVE(_ALL) specified or an attempt
to migrate pages in a kernel thread.
-EPERM MPOL_MF_MOVE_ALL specified without sufficient privileges,
or an attempt to move a process belonging to another user.
-EACCES One of the target nodes is not allowed by the current cpuset.
-ENODEV One of the target nodes is not online.
-ESRCH Process does not exist.
-E2BIG Too many pages to move.
-ENOMEM Not enough memory to allocate control array.
-EFAULT Parameters could not be accessed.
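A status-only query (nodes == NULL) from user space might look like the
sketch below. It calls the system call directly through syscall() and
assumes the C library headers define SYS_move_pages; the helper names are
invented, and pid 0 is used to address the calling process.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for syscall() */
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Ask which node holds the page containing addr without moving it.
 * Returns the per-page status (node number or negative errno value), or
 * -errno if the call itself failed.
 */
static int query_page_node(void *addr)
{
	void *pages[1] = { addr };
	int status[1] = { 0 };
	long rc;

	rc = syscall(SYS_move_pages, 0 /* pid: calling process */, 1UL,
		     pages, NULL /* nodes: query only */, status, 0 /* flags */);
	if (rc < 0)
		return -errno;
	return status[0];
}

/* Translate a status[] entry into the wording of the table above. */
static const char *page_status_text(int status)
{
	switch (status) {
	case -ENOENT:	return "page is not present";
	case -EACCES:	return "page is mapped by multiple processes";
	case -EPERM:	return "page has been mlocked";
	case -EBUSY:	return "page is busy, try again later";
	case -EFAULT:	return "invalid address";
	case -ENOMEM:	return "no memory on target node";
	case -EIO:	return "unable to write back page";
	case -EINVAL:	return "dirty page cannot be moved";
	default:
		return status >= 0 ? "page is on the indicated node"
				   : "unknown status";
	}
}
```

A caller would typically run the query first and then issue a second
move_pages() call with a nodes array for the pages it wants to relocate.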
A test program for move_pages() may be found with the patches
on ftp.kernel.org:/pub/linux/kernel/people/christoph/pmig/patches-2.6.17-rc4-mm3
From: Christoph Lameter <clameter@sgi.com>
Detailed results for sys_move_pages()
Pass a pointer to an integer to get_new_page() that may be used to
indicate where the completion status of a migration operation should be
placed. This allows sys_move_pages() to report back exactly what happened to
each page.
Wish there would be a better way to do this. Looks a bit hacky.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Jes Sorensen <jes@trained-monkey.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Modify the gen_pool allocator (lib/genalloc.c) to utilize a bitmap scheme
instead of the buddy scheme. The purpose of this change is to eliminate
the touching of the actual memory being allocated.
Since the change modifies the interface, a change to the uncached allocator
(arch/ia64/kernel/uncached.c) is also required.
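The idea of tracking allocations in a bitmap, so that the managed memory is
never written, can be sketched as follows. This is a simplified first-fit
model with invented names and a fixed pool size; it is not the interface of
lib/genalloc.c.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define POOL_CHUNKS	64	/* hypothetical pool size, in chunks */
#define BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)

struct bitmap_pool {
	unsigned long base;	/* start of managed region, never dereferenced */
	size_t chunk_size;	/* allocation granularity in bytes */
	unsigned long bits[(POOL_CHUNKS + BITS_PER_WORD - 1) / BITS_PER_WORD];
};

static int chunk_used(const struct bitmap_pool *p, size_t i)
{
	return (int)((p->bits[i / BITS_PER_WORD] >> (i % BITS_PER_WORD)) & 1UL);
}

static void mark_chunks(struct bitmap_pool *p, size_t start, size_t n, int used)
{
	size_t i;

	for (i = start; i < start + n; i++) {
		unsigned long bit = 1UL << (i % BITS_PER_WORD);

		if (used)
			p->bits[i / BITS_PER_WORD] |= bit;
		else
			p->bits[i / BITS_PER_WORD] &= ~bit;
	}
}

/* First-fit allocation; returns an address inside the pool, or 0. */
static unsigned long pool_alloc(struct bitmap_pool *p, size_t size)
{
	size_t need, run = 0, i;

	assert(p->chunk_size > 0);
	need = (size + p->chunk_size - 1) / p->chunk_size;
	if (need == 0 || need > POOL_CHUNKS)
		return 0;
	for (i = 0; i < POOL_CHUNKS; i++) {
		run = chunk_used(p, i) ? 0 : run + 1;
		if (run == need) {
			size_t start = i + 1 - need;

			mark_chunks(p, start, need, 1);
			return p->base + start * p->chunk_size;
		}
	}
	return 0;
}

/* The caller passes the size again, since no header is stored in the memory. */
static void pool_free(struct bitmap_pool *p, unsigned long addr, size_t size)
{
	size_t need = (size + p->chunk_size - 1) / p->chunk_size;

	mark_chunks(p, (addr - p->base) / p->chunk_size, need, 0);
}
```

Because all bookkeeping lives in the bitmap, the base address may refer to
memory that is not even mapped by the code doing the allocation.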
Both Andrey Volkov and Jes Sorensen have expressed a desire that the
gen_pool allocator not write to the memory being managed. See the
following:
http://marc.theaimsgroup.com/?l=linux-kernel&m=113518602713125&w=2
http://marc.theaimsgroup.com/?l=linux-kernel&m=113533568827916&w=2
Signed-off-by: Dean Nelson <dcn@sgi.com>
Cc: Andrey Volkov <avolkov@varma-el.com>
Acked-by: Jes Sorensen <jes@trained-monkey.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Consolidate the various arch-specific implementations of pxm_to_node() and
node_to_pxm() into a single generic version.
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Give the statfs superblock operation a dentry pointer rather than a superblock
pointer.
This complements the get_sb() patch. That reduced the significance of
sb->s_root, allowing NFS to place a fake root there. However, NFS does
require a dentry to use as a target for the statfs operation. This permits
the root in the vfsmount to be used instead.
linux/mount.h has been added where necessary to make allyesconfig build
successfully.
Interest has also been expressed for use with the FUSE and XFS filesystems.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Nathan Scott <nathans@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Extend the get_sb() filesystem operation to take an extra argument that
permits the VFS to pass in the target vfsmount that defines the mountpoint.
The filesystem is then required to manually set the superblock and root dentry
pointers. For most filesystems, this should be done with simple_set_mnt()
which will set the superblock pointer and then set the root dentry to the
superblock's s_root (as per the old default behaviour).
The get_sb() op now returns an integer as there's now no need to return the
superblock pointer.
This patch permits a superblock to be implicitly shared amongst several mount
points, such as can be done with NFS to avoid potential inode aliasing. In
such a case, simple_set_mnt() would not be called, and instead the mnt_root
and mnt_sb would be set directly.
The patch also makes the following changes:
(*) the get_sb_*() convenience functions in the core kernel now take a vfsmount
pointer argument and return an integer, so most filesystems have to change
very little.
(*) If one of the convenience functions is not used, then get_sb() should
normally call simple_set_mnt() to instantiate the vfsmount. This will
always return 0, and so can be tail-called from get_sb().
(*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the
dcache upon superblock destruction rather than shrink_dcache_anon().
This is required because the superblock may now have multiple trees that
aren't actually bound to s_root, but that still need to be cleaned up. The
currently called functions assume that the whole tree is rooted at s_root,
and that anonymous dentries are not the roots of trees which results in
dentries being left unculled.
However, with the way NFS superblock sharing is currently set to be
implemented, these assumptions are violated: the root of the filesystem is
simply a dummy dentry and inode (the real inode for '/' may well be
inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries
with child trees.
[*] Anonymous until discovered from another tree.
(*) The documentation has been adjusted, including the additional bit of
changing ext2_* into foo_* in the documentation.
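The contract described above can be modelled in user space as follows. The
structures are minimal stand-ins that carry only the fields named in this
description, the _sketch suffixes mark everything as illustrative, and
reference counting is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-ins for the kernel structures (illustration only). */
struct dentry { int unused; };
struct super_block { struct dentry *s_root; };
struct vfsmount {
	struct super_block *mnt_sb;
	struct dentry *mnt_root;
};

/*
 * Model of simple_set_mnt(): set the superblock pointer, then point the
 * root dentry at the superblock's s_root. Always returns 0.
 */
static int simple_set_mnt_sketch(struct vfsmount *mnt, struct super_block *sb)
{
	assert(mnt && sb);
	mnt->mnt_sb = sb;
	mnt->mnt_root = sb->s_root;
	return 0;
}

/*
 * Hypothetical get_sb() of a filesystem "foo" that does not use one of the
 * get_sb_*() helpers. Finding or creating the superblock is elided, so it
 * is passed in; the vfsmount is filled in by a tail call.
 */
static int foo_get_sb_sketch(struct super_block *sb, struct vfsmount *mnt)
{
	return simple_set_mnt_sketch(mnt, sb);
}
```

A filesystem that shares one superblock between several mounts would skip
simple_set_mnt_sketch() and set mnt_sb and mnt_root itself, with mnt_root
pointing at something other than s_root.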
[akpm@osdl.org: convert ipath_fs, do other stuff]
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Nathan Scott <nathans@sgi.com>
Cc: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm: (21 commits)
[ARM] 3629/1: S3C24XX: fix missing bracket in regs-dsc.h
[ARM] 3537/1: Rework DMA-bounce locking for finer granularity
[ARM] 3601/1: i.MX/MX1 DMA error handling for signaled channels only
[ARM] 3597/1: ixp4xx/nslu2: Board support for new LED subsystem
[ARM] 3595/1: ixp4xx/nas100d: Board support for new LED subsystem
[ARM] 3626/1: ARM EABI: fix syscall restarting
[ARM] 3628/1: S3C24XX: add get_rate call to struct clk
[ARM] 3627/1: S3C24XX: split s3c2410 clocks from core clocks
[ARM] 3613/1: S3C2410: Add sysdev and sysclass
[ARM] 3624/1: Report true modem control line states
[ARM] 3620/2: ixp23xx: add uengine loader support
[ARM] 3618/1: add defconfig for logicpd pxa270 card engine
[ARM] 3617/1: ep93xx: fix slightly incorrect timer tick rate
[ARM] 3616/1: fix timer handler wrap logic for a number of platforms
[ARM] 3615/1: ixp23xx: use platform devices for physmap flash
[ARM] 3614/1: ep93xx: use platform devices for physmap flash
[ARM] 3621/1: fix compilation breakage for pnx4008
[ARM] 3623/1: pnx4008: move GPIO-related defines to gpio.h
[ARM] 3622/1: pnx4008: remove clk_use/clk_unuse
[ARM] Enable VFP to be built when non-VFP capable CPUs are selected
...
* master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq:
[CPUFREQ] Fix ondemand vs suspend deadlock
[CPUFREQ] Fix powernow-k8 SMP kernel on UP hardware bug.
[PATCH] redirect speedstep-centrino maintainer mail to cpufreq list
[CPUFREQ] correct powernow-k8 fid/vid masks for extended parts
[CPUFREQ] Clarify powernow-k8 cpu_family statements
* git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (139 commits)
[POWERPC] re-enable OProfile for iSeries, using timer interrupt
[POWERPC] support ibm,extended-*-frequency properties
[POWERPC] Extra sanity check in EEH code
[POWERPC] Dont look for class-code in pci children
[POWERPC] Fix mdelay badness on shared processor partitions
[POWERPC] disable floating point exceptions for init
[POWERPC] Unify ppc syscall tables
[POWERPC] mpic: add support for serial mode interrupts
[POWERPC] pseries: Print PCI slot location code on failure
[POWERPC] spufs: one more fix for 64k pages
[POWERPC] spufs: fail spu_create with invalid flags
[POWERPC] spufs: clear class2 interrupt status before wakeup
[POWERPC] spufs: fix Makefile for "make clean"
[POWERPC] spufs: remove stop_code from struct spu
[POWERPC] spufs: fix spu irq affinity setting
[POWERPC] spufs: further abstract priv1 register access
[POWERPC] spufs: split the Cell BE support into generic and platform dependant parts
[POWERPC] spufs: dont try to access SPE channel 1 count
[POWERPC] spufs: use kzalloc in create_spu
[POWERPC] spufs: fix initial state of wbox file
...
Manually resolved conflicts in:
drivers/net/phy/Makefile
include/asm-powerpc/spu.h
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: (27 commits)
[PATCH] PCI: nVidia quirk to make AER PCI-E extended capability visible
[PATCH] PCI: fix issues with extended conf space when MMCONFIG disabled because of e820
[PATCH] PCI: Bus Parity Status sysfs interface
[PATCH] PCI: fix memory leak in MMCONFIG error path
[PATCH] PCI: fix error with pci_get_device() call in the mpc85xx driver
[PATCH] PCI: MSI-K8T-Neo2-Fir: run only where needed
[PATCH] PCI: fix race with pci_walk_bus and pci_destroy_dev
[PATCH] PCI: clean up pci documentation to be more specific
[PATCH] PCI: remove unneeded msi code
[PATCH] PCI: don't move ioapics below PCI bridge
[PATCH] PCI: cleanup unused variable about msi driver
[PATCH] PCI: disable msi mode in pci_disable_device
[PATCH] PCI: Allow MSI to work on kexec kernel
[PATCH] PCI: AMD 8131 MSI quirk called too late, bus_flags not inherited ?
[PATCH] PCI: Move various PCI IDs to header file
[PATCH] PCI Bus Parity Status-broken hardware attribute, EDAC foundation
[PATCH] PCI: i386/x86_84: disable PCI resource decode on device disable
[PATCH] PCI ACPI: Rename the functions to avoid multiple instances.
[PATCH] PCI: don't enable device if already enabled
[PATCH] PCI: Add a "enable" sysfs attribute to the pci devices to allow userspace (Xorg) to enable devices without doing foul direct access
...
The AGP default doesn't work well with other selects, so use a select for
GART_IOMMU as well. Remove a redundant default for SWIOTLB as well.
Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@muc.de>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Dave Airlie <airlied@linux.ie>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Upgrade the zlib_inflate implementation in the kernel from a patched
version 1.1.3/4 to a patched 1.2.3.
The code in the kernel is about seven years old and I noticed that the
external zlib library's inflate performance was significantly faster (~50%)
than the code in the kernel on ARM (and faster again on x86_32).
For comparison the newer deflate code is 20% slower on ARM and 50% slower
on x86_32 but gives an approx 1% compression ratio improvement. I don't
consider this to be an improvement for kernel use so have no plans to
change the zlib_deflate code.
Various changes have been made to the zlib code in the kernel, the most
significant being the extra functions/flush option used by ppp_deflate.
This update reimplements the features PPP needs to ensure it continues to
work.
This code has been tested on ARM under both JFFS2 (with zlib compression
enabled) and ppp_deflate and on x86_32. JFFS2 sees an approx. 10% real
world file read speed improvement.
This patch also removes ZLIB_VERSION as it no longer has a correct value.
We don't need version checks anyway as the kernel's module handling will
take care of that for us. This removal is also more in keeping with the
zlib author's wishes (http://www.zlib.net/zlib_faq.html#faq24) and I've
added something to the zlib.h header to note its a modified version.
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
Acked-by: Joern Engel <joern@wh.fh-wedel.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
rd_prompt et al. depend on CONFIG_BLK_DEV_RAM, not CONFIG_BLK_INITRD; now
that those are independent, setup.c blows up with INITRD on and BLK_DEV_RAM
off.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Change a variable from unsigned to signed in order to get sign-extension
when the thing is negated. Without this, uptime is horribly confused.
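The effect can be reproduced in user space. A minimal sketch, assuming a
32-bit quantity that is negated and then widened to 64 bits; the function
names are hypothetical and this is not the UML code itself:

```c
#include <stdint.h>

/* Negate a 32-bit unsigned value, then widen: the result is zero-extended. */
static int64_t negate_unsigned(uint32_t delta)
{
    return (int64_t)(-delta);   /* -delta is computed modulo 2^32 */
}

/* Negate a 32-bit signed value, then widen: the result is sign-extended. */
static int64_t negate_signed(int32_t delta)
{
    return (int64_t)(-delta);
}
```

With delta = 5 the unsigned variant yields a large positive number rather
than -5, which is the kind of confusion described above.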
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patch from Kevin Hilman
This time with IRQ versions of locks.
Rework also enables compatibility with the realtime-preemption patch.
With the current locking via interrupt disabling, under RT,
potentially sleeping functions can be called with interrupts
disabled.
Signed-off-by: Kevin Hilman <khilman@mvista.com>
Signed-off-by: Deepak Saxena <dsaxena@plexity.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Pavel Pisa
There was a bug in which dma_err_handler() touched even
channels that were not signaling an error condition.
Problem noticed by Andrea Paterniani.
Signed-off-by: Pavel Pisa <pisa@cmp.felk.cvut.cz>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Rod Whitby
This patch implements NEW_LEDS support for the Linksys NSLU2. The
NSLU2 has four LED indicators, which are the only form of output for
an unmodified device - there is no keyboard or display on an NSLU2.
For an NSLU2 which has been modified to bring out the serial port
console, it is important to register that device first separately, to
enable debugging of other device support.
Signed-off-by: John Bowler <jbowler@acm.org>
Signed-off-by: Rod Whitby <rod@whitby.id.au>
Signed-off-by: Deepak Saxena <dsaxena@plexity.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Rod Whitby
This patch implements NEW_LEDS support for the IOMega NAS100d. The
NAS100d has three LED indicators, which are the only form of output
for an unmodified device - there is no keyboard or display on an
NAS100d. For an NAS100d which has been modified to bring out the
serial port console, it is important to register that device first
separately, to enable debugging of other device support.
Signed-off-by: John Bowler <jbowler@acm.org>
Signed-off-by: Rod Whitby <rod@whitby.id.au>
Signed-off-by: Deepak Saxena <dsaxena@plexity.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Nicolas Pitre
The RESTARTBLOCK case currently stores some code on the stack to invoke
sys_restart_syscall. However this is ABI dependent and there is a
mismatch with the way __NR_restart_syscall gets defined when the kernel
is compiled for EABI.
There is also a long standing bug in the thumb case since with OABI the
__NR_restart_syscall value includes __NR_SYSCALL_BASE which should not
be the case for Thumb syscalls.
Credits to Yauheni Kaliuta <yauheni.kaliuta@gmail.com> for finding the
EABI bug.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Ben Dooks
Add a get_rate call to allow a given clock
to override the clk_get_rate() call.
This provides support for clocks which rely on
division of their parent to correctly report
their frequency when the parent can also change.
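A minimal user-space sketch of the idea, assuming a heavily simplified
struct clk; the field names, the divider and divided_get_rate() are
hypothetical and only illustrate an optional per-clock hook:

```c
#include <stddef.h>

/* Hypothetical, simplified clock structure for illustration only. */
struct clk {
    struct clk *parent;
    unsigned long rate;                          /* cached fixed rate */
    unsigned long divider;                       /* used by the example hook */
    unsigned long (*get_rate)(struct clk *clk);  /* optional override */
};

/* Report the rate, deferring to the per-clock hook when one is set. */
static unsigned long clk_get_rate(struct clk *clk)
{
    if (clk->get_rate != NULL)
        return clk->get_rate(clk);
    return clk->rate;
}

/* Example hook: derive the rate from the parent's current rate. */
static unsigned long divided_get_rate(struct clk *clk)
{
    return clk_get_rate(clk->parent) / clk->divider;
}
```

Because the child computes its rate on demand, a later change of the
parent rate is reflected without updating the child.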
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Ben Dooks
Split the s3c2410 specific clocks from the core
clock code, as part of the work to support more
of the Samsung line of SoCs.
The patch does not use the sysdev mechanism as
the clocks are needed for the timer init, which
is very early in the kernel init sequence.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Ben Dooks
The S3C2440 and S3C2442 both have their own sysdev
and sysclass for differentiating them from the
currently default S3C2410.
Add a sysdev for the S3C2410 as part of the work
to make the code not dependent on the S3C2410.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Lennert Buytenhek
This patch allows the ixp2000 uengine loader that is already in the
tree to also be used on the ixp23xx.
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Lennert Buytenhek
As it's slightly nontrivial to make it possible to build a single
kernel image for both the mainstone and the logicpd pxa270 card engine,
add a separate defconfig for the logicpd pxa270 card engine for now.
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Lennert Buytenhek
The tick rate of timers 1-3 isn't exactly 508 kHz as some parts of the
relevant documentation claim, but more like 508.469 kHz (14.7456 MHz
divided by 29.)
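The arithmetic, as a small sketch (the macro and function names are
hypothetical; only the 14.7456 MHz input and the divisor of 29 come from
the text above):

```c
/* 14.7456 MHz input clock divided by 29, rounded to the nearest Hz. */
#define INPUT_CLOCK_HZ  14745600UL
#define TIMER_DIVISOR   29UL

static unsigned long timer_tick_rate_hz(void)
{
    return (INPUT_CLOCK_HZ + TIMER_DIVISOR / 2) / TIMER_DIVISOR;
}
```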
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Lennert Buytenhek
A couple of platforms aren't using the right comparison type in their
timer interrupt handlers (as we're comparing two wrapping timestamps,
we need a bmi/bpl-type comparison, not an unsigned comparison) -- this
patch fixes them up.
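The difference between the two comparison types can be sketched in C,
assuming 32-bit free-running counters. The function names are
hypothetical, and the cast to int32_t relies on the two's complement
conversion that mainstream compilers provide:

```c
#include <stdint.h>

/* Wrong for wrapping counters: plain unsigned comparison. */
static int reached_unsigned(uint32_t now, uint32_t deadline)
{
    return now >= deadline;
}

/* Wrap-safe: test the sign of the difference (the bmi/bpl-style test). */
static int reached_wrapping(uint32_t now, uint32_t deadline)
{
    return (int32_t)(uint32_t)(now - deadline) >= 0;
}
```

When the deadline has wrapped past zero but the counter has not yet, the
unsigned test reports the deadline as reached while the signed-difference
test does not.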
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Lennert Buytenhek
Now that the physmap platform device rewrite is in, make the ixp23xx
boards use platform devices for physmap flash.
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Lennert Buytenhek
Now that the physmap platform device rewrite is in, make the ep93xx
boards use platform devices for physmap flash.
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Vitaly Wool
This patch moves GPIO-related defines and static inline funcs from include/asm-arm/arch-pnx4008/pm.h to include/asm-arm/arch-pnx4008/gpio.h.
Also, some more GPIO-related defines are added to include/asm-arm/arch-pnx4008/gpio.h as they are needed for the USB host driver (coming soon...)
Signed-off-by: Vitaly Wool <vwool@ru.mvista.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Vitaly Wool
The clk_use/clk_unuse functions are no longer needed, so remove them from arch/arm/mach-pnx4008/clock.c.
Also, the order of functions is rearranged a bit, to avoid forward declarations.
Signed-off-by: Vitaly Wool <vwool@ru.mvista.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Since we pass flags to the compiler to control code generation based
on the least capable selected CPU, if we want to include VFP support,
we must tweak the assembler flags to allow the VFP instructions.
Moreover, we must not use the mrrc/mcrr versions since these will not
be recognised by the assembler.
We do not convert all instructions to the VFP-equivalent (yet) since
binutils appears to barf on "fmrx rn, fpinst" and doesn't provide any
other way (other than using the mrc equivalent) to encode this
instruction - which is rather a problem when you have a VFP
implementation which requires these instructions.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Some machine classes need to allow VFP support to be built into the
kernel, but still allow the kernel to run even though VFP isn't
present. Unfortunately, the kernel hard-codes VFP instructions
into the thread switch, which prevents this being run-time selectable.
Solve this by introducing a notifier which things such as VFP can
hook into to be informed of events which affect the VFP subsystem
(eg, creation and destruction of threads, switches between threads.)
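A minimal user-space sketch of such a notifier, assuming a simple singly
linked list of callbacks. All names here (thread_event,
thread_register_notifier, thread_notify, vfp_notify) are hypothetical and
are not taken from the patch:

```c
#include <stddef.h>

/* Hypothetical event codes; not the kernel's actual interface. */
enum thread_event { THREAD_EVENT_CREATE, THREAD_EVENT_EXIT, THREAD_EVENT_SWITCH };

struct thread_notifier {
    int (*call)(enum thread_event event, void *thread);
    struct thread_notifier *next;
};

static struct thread_notifier *thread_notifier_head;

/* A subsystem such as VFP registers only if its hardware is present. */
static void thread_register_notifier(struct thread_notifier *n)
{
    n->next = thread_notifier_head;
    thread_notifier_head = n;
}

/* Called by the core; returns how many subscribers were informed. */
static int thread_notify(enum thread_event event, void *thread)
{
    struct thread_notifier *n;
    int calls = 0;

    for (n = thread_notifier_head; n != NULL; n = n->next) {
        n->call(event, thread);
        calls++;
    }
    return calls;
}

/* Example subscriber: counts thread switches. */
static int vfp_switch_count;

static int vfp_notify(enum thread_event event, void *thread)
{
    (void)thread;
    if (event == THREAD_EVENT_SWITCH)
        vfp_switch_count++;
    return 0;
}
```

With no subscriber registered the core path does nothing VFP-specific,
which is what allows the decision to be made at run time.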
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
In some systems we may have both a platform EHCI controller and PCI EHCI
controller. Previously we couldn't build the EHCI support as a module due
to conflicting module_init() calls in the code.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Sorry I didn't notice earlier, but that BUG_ON triggers for me on the
simulator. AFAICS the mask for itv is set in cpu_init(), which comes
after sal_init(). Consequently on the simulator the itv still has its
start value of zero. I've probably missed something, but I wonder why
at this stage of the boot you even need to save and restore the itv?
Signed-off-by: Ian Wienand <ianw@gelato.unsw.edu.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The following patch fixes a bug in the SGI Altix tioce_bus_fixup()
code. ce_dre_comp_err_addr needs to be zeroed out, not set to ~0ULL. As
a result, completion errors weren't being captured.
Signed-off-by: Mike Habeck <habeck@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
struct ia64_sal_os_state has three semi-independent sections. The code
in mca_asm.S assumes that these three sections are contiguous, which
makes it very awkward to add new data to this structure. Remove the
assumption that the sections are contiguous. Define a macro to shorten
references to offsets in ia64_sal_os_state.
This patch does not change the way that the code behaves. It just
makes it easier to update the code in future and to add fields to
ia64_sal_os_state when debugging the MCA/INIT handlers.
Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
When I tried to use PCI Express Hotplug driver on my ia64 box, I
noticed that "PCI Express support" is not even selectable on ia64.
This patch makes PCI Express support selectable.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
I haven't really maintained this driver for a while, and I'm not
keeping up with the latest in Intel power management. I get a steady
stream of mail which I don't really do anything useful with; the
cpufreq list seems like a better destination, unless someone wants to
get the mail directly.
Also clean up a couple of ancient comments which don't really apply
anymore (as far as I know, nobody has ever damaged a CPU with this
driver).
Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Dave Jones <davej@redhat.com>
There is an SN bug in sn_hwperf.c that affects systems with 1024n or 1024p.
The bug manifests itself in two ways: 1) IO interrupts are not always
targeted to the nearest node, and 2) the "cat /proc/sgi_sn/sn_topology"
command fails with "cannot allocate memory".
The code is using the wrong macros for validating node numbers.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
One more trivial, stand-alone patch from the Xen/ia64 review. Sanity
check usage of the reserved region numbers.
Signed-off-by: Alex Williamson <alex.williamson@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This is a trivial stand-alone patch out of the Xen/ia64 patches. Add
a vmlinuz build target to be more compatible with x86-ish targets.
Signed-off-by: Alex Williamson <alex.williamson@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
On 15 Jun 2006 03:45:10 +0200, Andi Kleen wrote:
> Anyways I would say that if the BIOS can't get MCFG right then
> it's likely not been validated on that board and shouldn't be used.
According to Petr Vandrovec:
... "What is important (and checked) is address of MMCONFIG reported by MCFG
table... Unfortunately code does not bother with printing that address :-(
"Another problem is that code has hardcoded that MMCONFIG area is 256MB large.
Unfortunately for the code PCI specification allows any power of two between 2MB
and 256MB if vendor knows that such amount of busses (from 2 to 128) will be
sufficient for system. With notebook it is quite possible that not full 8 bits
are implemented for MMCONFIG bus number."
So here is a patch. Unfortunately my system still fails the test because
it doesn't reserve any part of the MMCONFIG area, but this may fix others.
Booted on x86_64, only compiled on i386. x86_64 still remaps the max area
(256MB) even though only 2MB is checked... but 2.6.16 had no check at all
so it is still better.
PCI: reduce size of x86 MMCONFIG reserved area check
1. Print the address of the MMCONFIG area when the test for that area
being reserved fails.
2. Only check if the first 2MB is reserved, as that is the minimum.
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This is a bit late (your patch was posted about a year ago), but
a co-worker of mine spotted part of the code that looks like a memory
leak. Looking at the code, it seems that pci_mmcfg_config should
be freed if MMCONFIG is above 4GB.
From: Konrad Rzeszutek <konradr@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When a PCI device is disabled via pci_disable_device(), it's still
left decoding its BAR resource ranges even though its driver
will have likely released those regions (and may even have
unloaded). pci_enable_device() already explicitly enables
BAR resource decode for the device being enabled. This patch
disables resource decode for the PCI device being disabled,
making it symmetric with the enable call.
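As a rough illustration, and assuming that "resource decode" refers to the
I/O and memory enable bits of the PCI command register (the patch itself
is not shown in this log), the disable step amounts to clearing those two
bits. The helper name is hypothetical:

```c
#include <stdint.h>

/* Standard PCI command register bits. */
#define PCI_COMMAND_IO      0x1   /* respond to I/O space accesses */
#define PCI_COMMAND_MEMORY  0x2   /* respond to memory space accesses */

/* Return the command word with BAR decode switched off. */
static uint16_t disable_resource_decode(uint16_t cmd)
{
    return cmd & (uint16_t)~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY);
}
```

Other bits, such as bus mastering (bit 2), are left untouched in this
sketch.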
I saw this while doing something else, not because of a
problem report. Still, seems to be the correct thing to do.
Signed-off-by: Rajesh Shah <rajesh.shah@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
MSI callouts for altix. Involves a fair amount of code reorg in sn irq.c
code as well as adding some extensions to the altix PCI provider abstraction.
Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Abstract IA64_FIRST_DEVICE_VECTOR/IA64_LAST_DEVICE_VECTOR since SN platforms
use a subset of the IA64 range. Implement this by making the above macros
global variables which the platform can override in its setup code.
Also add a reserve_irq_vector() routine used by SN to mark vectors as
in use when they weren't allocated through assign_irq_vector().
Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch removes the changes from an earlier patch (submitted Feb 23,
2006) that disabled oProfile for iSeries within the oProfile Kconfig.
Instead, the arch init now checks for iSeries (using
firmware_has_feature), still allowing profiling via timer interrupts.
Signed-off-by: Kelly Daly <kelly@au.ibm.com>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Support the ibm,extended-*-frequency properties found in recent POWER5
firmware:
cpus/PowerPC,POWER5@0/clock-frequency
59aa5880 (1504336000)
cpus/PowerPC,POWER5@0/ibm,extended-clock-frequency
00000000 59aa5880
cpus/PowerPC,POWER5@0/timebase-frequency
0b354b10 (188042000)
cpus/PowerPC,POWER5@0/ibm,extended-timebase-frequency
00000000 0b354b10
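The extended properties are two 32-bit cells, most significant first. A
minimal sketch of combining them, assuming the cells have already been
converted to host byte order (the helper name is hypothetical):

```c
#include <stdint.h>

/* Combine a two-cell property value (most significant cell first). */
static uint64_t cells_to_u64(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}
```

For the values listed above, cells_to_u64(0x00000000, 0x59aa5880) gives
1504336000, matching the 32-bit clock-frequency property.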
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Don't dereference a device node that isn't there. A "shouldn't
happen" case, but someone ran into it with a possibly misconfigured
device tree.
Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Looking for class-code in PCI children breaks with direct slots. Let's
just count all children.
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Avoid duplication of the syscall table for the cell platform. Based on an
idea from David Woodhouse.
Signed-off-by: Andreas Schwab <schwab@suse.de>
Acked-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
On Tue, Jun 20, 2006 at 02:01:26PM +1000, Benjamin Herrenschmidt wrote:
> On Mon, 2006-06-19 at 13:08 -0700, Mark A. Greer wrote:
> > MPC10x-style interrupt controllers have a serial mode that allows
> > several interrupts to be clocked in through one INT signal.
> >
> > This patch adds the software support for that mode.
>
> You hard code the clock ratio... why not add a separate call to be
> called after mpic_init,
> something like mpic_set_serial_int(int mpic, int enable, int
> clock_ratio) ?
How's this?
--
MPC10x-style interrupt controllers have a serial mode that allows
several interrupts to be clocked in through one INT signal.
This patch adds the software support for that mode.
Signed-off-by: Mark A. Greer <mgreer@mvista.com>
--
arch/powerpc/sysdev/mpic.c | 20 ++++++++++++++++++++
include/asm-powerpc/mpic.h | 10 ++++++++++
2 files changed, 30 insertions(+)
--
Signed-off-by: Paul Mackerras <paulus@samba.org>
The PCI error recovery code will printk diagnostic info when
a PCI error event occurs. Change the messages to include the slot
location code, which is how most sysadmins will know the device.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The SPU context save/restore code is currently built
for a 4k page size and we provide a _shipped version
of it since most people don't have the spu toolchain
that is needed to rebuild that code.
This patch hardcodes the data structures to a 64k
page alignment, which also guarantees 4k alignment
but unfortunately wastes 60k of memory per SPU
context that is created in the running system.
We will follow up on this with another patch to
reduce that overhead or maybe redo the context
save/restore logic to do this part entirely differently,
but for now it should make experimental systems
work with either page size.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
At this time, all flags are invalid. Since we are
planning to actually add valid flags in the future,
we better check if any were passed by the user.
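A minimal sketch of such a check, assuming an empty set of valid flag
bits; the macro and function names are hypothetical:

```c
#include <errno.h>

/* No flags are defined yet, so the set of valid bits is empty. */
#define CREATE_FLAGS_VALID 0u   /* hypothetical name */

/* Reject any flag bit outside the valid set. */
static int check_create_flags(unsigned int flags)
{
    if (flags & ~CREATE_FLAGS_VALID)
        return -EINVAL;
    return 0;
}
```

Rejecting unknown bits now means that later additions to the valid set
cannot change the behaviour of callers that already pass zero.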
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The SPU interrupt status must be cleared before the interrupt is handled.
Otherwise, the kernel may drop some interrupt packets.
Currently, a class2 interrupt is handled like this:
1) call callback to wake up waiting process
2) mask raised mailbox interrupt
3) clear interrupt status
I changed it to:
1) mask raised mailbox interrupt
2) clear interrupt status
3) call callback to wake up waiting process
Clearing the status before masking would cause a spurious interrupt,
so I think the steps must be performed in the order described above.
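The new ordering can be modelled in user space. This is only a sketch:
the structure, its fields and the step trace are hypothetical and stand in
for the real mailbox registers and wakeup callback:

```c
/* Hypothetical model: a status bit, a mask bit and a trace of the steps. */
struct mbox_irq {
    unsigned int status;    /* 1 = interrupt pending */
    unsigned int enabled;   /* 1 = interrupt unmasked */
    char trace[4];          /* order of steps taken, e.g. "MCW" */
    int nsteps;
};

static void record_step(struct mbox_irq *irq, char c)
{
    irq->trace[irq->nsteps++] = c;
}

static void handle_class2(struct mbox_irq *irq)
{
    irq->enabled = 0;           /* 1) mask the raised mailbox interrupt */
    record_step(irq, 'M');
    irq->status = 0;            /* 2) clear the interrupt status */
    record_step(irq, 'C');
    record_step(irq, 'W');      /* 3) wake up the waiting process */
}
```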
Signed-off-by: Masato Noguchi <Masato.Noguchi@jp.sony.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch removes 'stop_code', a discarded member of struct spu.
It is written at initialization and on interrupt, but never read
in the current implementation.
Signed-off-by: Masato Noguchi <Masato.Noguchi@jp.sony.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This changes the hypervisor abstraction of setting cpu affinity to a
higher level to avoid platform dependent interrupt controller
routines. I replaced spu_priv1_ops:spu_int_route_set() with a
new routine spu_priv1_ops:spu_cpu_affinity_set().
As a by-product, this change eliminated what looked like an
existing bug in the set affinity code where spu_int_route_set()
mistakenly called int_stat_get().
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
To support multi-platform binaries the spu hypervisor accessor
routines must have runtime binding.
I removed the existing statically linked routines in spu.h
and spu_priv1_mmio.c and created new accessor routines in spu_priv1.h
that operate indirectly through an ops struct spu_priv1_ops.
spu_priv1_mmio.c contains the instance of the accessor routines
for running on raw hardware.
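The indirection can be sketched as follows. Only the struct name
spu_priv1_ops comes from the text above; the reduced struct spu, the two
accessors and the priv1_ops pointer are hypothetical, and the "hardware"
is modelled with a plain field:

```c
#include <stdint.h>

struct spu { uint64_t int_mask; };   /* hypothetical, heavily reduced */

/* Table of accessors; each platform supplies its own instance. */
struct spu_priv1_ops {
    void (*int_mask_set)(struct spu *spu, uint64_t mask);
    uint64_t (*int_mask_get)(struct spu *spu);
};

/* "Raw hardware" instance, modelled here with a struct field. */
static void mmio_int_mask_set(struct spu *spu, uint64_t mask)
{
    spu->int_mask = mask;
}

static uint64_t mmio_int_mask_get(struct spu *spu)
{
    return spu->int_mask;
}

static const struct spu_priv1_ops spu_priv1_mmio_ops = {
    .int_mask_set = mmio_int_mask_set,
    .int_mask_get = mmio_int_mask_get,
};

/* Bound at run time, for example during platform setup. */
static const struct spu_priv1_ops *priv1_ops = &spu_priv1_mmio_ops;

/* Callers use thin wrappers instead of a statically linked routine. */
static void spu_int_mask_set(struct spu *spu, uint64_t mask)
{
    priv1_ops->int_mask_set(spu, mask);
}

static uint64_t spu_int_mask_get(struct spu *spu)
{
    return priv1_ops->int_mask_get(spu);
}
```

A hypervisor-backed platform would point priv1_ops at a different table
without any change to the callers.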
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Creates new config variables PPC_CELL_NATIVE and PPC_IBM_CELL_BLADE.
The existing CONFIG_PPC_CELL is now used to denote the generic
Cell processor support.
PPC_CELL = make descends into platforms/cell
PPC_CELL_NATIVE = add bare metal support
PPC_IBM_CELL_BLADE = add blade device drivers, etc.
Also renames spu_priv1.c to spu_priv1_mmio.c.
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The save/restore sequence for SPE contexts currently attempts to save
and restore the channel count for SPE channel 1 (the SPU_WriteEventMask
channel). But the CBE architecture (section 9.11.2) clearly states
that this channel does not have an associated count. Hardware simply
ignores the attempt to write this count, but the simulator generates
a warning message.
WARNING: 279721590: SPE7: Attempt to write channel count for CH 1 with
no associated count is ignored.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Clean up create_spu() a little by using kzalloc instead of kmalloc +
assignments.
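A user-space analogue of the cleanup, with calloc() standing in for
kzalloc() and malloc() for kmalloc(). The structure and function names are
hypothetical:

```c
#include <stdlib.h>

struct spu_like {          /* hypothetical structure */
    int number;
    void *local_store;
    unsigned long flags;
};

/* Before: allocate, then clear each field by hand. */
static struct spu_like *create_with_malloc(void)
{
    struct spu_like *s = malloc(sizeof(*s));

    if (s == NULL)
        return NULL;
    s->number = 0;
    s->local_store = NULL;
    s->flags = 0;
    return s;
}

/* After: a single zeroing allocation replaces the assignments. */
static struct spu_like *create_with_calloc(void)
{
    return calloc(1, sizeof(struct spu_like));
}
```

The second form also keeps newly added fields zeroed without further
edits to the constructor.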
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The wbox channel count of an spu is now initialized
to four for the saved context. This makes it possible
to write to the mailbox right away without waiting
for the SPE to become scheduled first.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
For performance analysis, it is often interesting to know
which physical SPE a thread is currently running on, and,
more importantly, if it is running at all.
This patch adds a simple attribute to each SPU directory
with that information.
The attribute is read-only and called 'phys-id'. It contains
an ascii string with the number of the physical SPU (e.g.
"0x5"), or alternatively the string "0xffffffff" (32 bit -1)
when it is not running at all at the time that the file
is read.
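A minimal sketch of how such a string could be produced, assuming a 32-bit
unsigned int and no trailing newline (the text above does not say whether
the real attribute adds one). The function name and the 'running' argument
are hypothetical:

```c
#include <stdio.h>

/* Format the attribute text; running == 0 means "not scheduled". */
static int format_phys_id(char *buf, size_t len, int running,
                          unsigned int number)
{
    unsigned int id = running ? number : 0xffffffffu;   /* 32-bit -1 */

    return snprintf(buf, len, "0x%x", id);
}
```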
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
spufs currently knows only 4k pages and 16M hugetlb
pages. Make it use the regular methods for deciding on
the SLB bits.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
spufs_rmdir tries to acquire the spufs root
i_mutex, which is already held by spufs_create_thread.
This was tracked as Bug #H9512.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
A recent change to the way that the mfc file gets mapped made it
impossible to map the SPE Multi-Source Synchronization register
into user space, but that may be needed by some applications.
This restores the missing functionality.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The spu_base module is rather deeply intermixed with the
core kernel, so it makes sense to have that built-in.
This will let us extend the base in the future without
having to export more core symbols just for it.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
SPUs are registered as system devices, exposing attributes through
sysfs. Since the sysdev includes a kref, we can remove the one in
struct spu (it isn't used at the moment anyway).
Currently only the interrupt source and numa node attributes are added.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Enable some of the most requested features in defconfig
and refresh with the latest powerpc.git Kconfig files.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Checking the priority field to test for irq validity is
completely bogus and breaks with future external interrupt
controllers.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This is a first version of support for the Cell BE "Reliability,
Availability and Serviceability" features.
It doesn't yet handle some of the RAS interrupts (the ones described in
iic_is/iic_irr), I'm still working on a proper way to expose these. They
are essentially a cascaded controller by themselves (sic !) though I may
just handle them locally to the iic driver. I need also to sync with
David Erb on the way he hooked in the performance monitor interrupt.
So that's all for 2.6.17 and I'll do more work on that with my rework of
the powerpc interrupt layer that I'm hacking on at the moment.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Clear the high BATS during load_up_mmu if FTR_HAS_HIGH_BATS.
Allow just a bit more time for secondary CPUs to phone home.
Signed-off-by: Wei Zhang <Wei.Zhang@freescale.com>
Signed-off-by: Haiying Wang <Haiying.Wang@freescale.com>
Signed-off-by: Jon Loeliger <jdl@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Jon Loeliger <jdl@freescale.com>
Signed-off-by: Haiying Wang <Haiying.Wang@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Export both new RTAS delay functions, and change the scanlog module to
use the new delay functions.
Signed-off-by: John Rose <johnrose@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
* git://git.infradead.org/~dwmw2/rbtree-2.6:
[RBTREE] Switch rb_colour() et al to en_US spelling of 'color' for consistency
Update UML kernel/physmem.c to use rb_parent() accessor macro
[RBTREE] Update hrtimers to use rb_parent() accessor macro.
[RBTREE] Add explicit alignment to sizeof(long) for struct rb_node.
[RBTREE] Merge colour and parent fields of struct rb_node.
[RBTREE] Remove dead code in rb_erase()
[RBTREE] Update JFFS2 to use rb_parent() accessor macro.
[RBTREE] Update eventpoll.c to use rb_parent() accessor macro.
[RBTREE] Update key.c to use rb_parent() accessor macro.
[RBTREE] Update ext3 to use rb_parent() accessor macro.
[RBTREE] Change rbtree off-tree marking in I/O schedulers.
[RBTREE] Add accessor macros for colour and parent fields of rb_node
* master.kernel.org:/home/rmk/linux-2.6-arm: (22 commits)
[ARM] 3559/1: S3C2442: core and serial port
[ARM] 3557/1: S3C24XX: centralise and cleanup uart registration
[ARM] 3558/1: SMDK24XX: LED platform devices
[ARM] 3534/1: add spi support to lubbock platform
[ARM] 3554/1: ARM: Fix dyntick locking
[ARM] 3553/1: S3C24XX: earlier print of cpu idcode info
[ARM] 3552/1: S3C24XX: Move VA of GPIO for low-level debug
[ARM] 3551/1: S3C24XX: PM code failes to compile with CONFIG_DCACHE_WRITETHROUGH
[ARM] 3550/1: OSIRIS: fix serial port map for 1:1
[ARM] 3548/1: Fix the ARMv6 CPU id in compressed/head.S
[ARM] 3335/1: Old-abi Thumb sys_syscall broken
[ARM] 3467/1: [3/3] Support for Philips PNX4008 platform: defconfig
[ARM] 3466/1: [2/3] Support for Philips PNX4008 platform: chip support
[ARM] 3465/1: [1/3] Support for Philips PNX4008 platform: headers
[ARM] 3407/1: lpd7x: documetation update
[ARM] 3406/1: lpd7x: compilation fix for smc91x
[ARM] 3405/1: lpd7a40x: CPLD ssp driver
[ARM] 3404/1: lpd7a40x: AMBA CLCD support
[ARM] 3403/1: lpd7a40x: updated default configurations
[ARM] 3402/1: lpd7a40x: serial driver bug fix
...
Patch from Deepak Saxena
This patch makes soft reboot work on the Versatile board. Thanks to
Catalin Marinas @ ARM for pointing out the proper way to do this.
Signed-off-by: Deepak Saxena <dsaxena@plexity.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Andrew Victor
This patch removes some now unnecessary global variables -
at91_master_clock, at91_serial_map, at91_console_port.
Signed-off-by: Andrew Victor <andrew@sanpeople.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Ben Dooks
Update s3c2410_defconfig to latest kernel with the
latest patches
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Andrew Victor
This final patch includes some general fixes.
1. Link in pm.o if CONFIG_PM is enabled. [Should have been included in
patch 3605/1].
2. Use __raw_readl()/__raw_writel() when accessing System Peripheral
registers.
3. Removed some unnecessary includes
Signed-off-by: Andrew Victor <andrew@sanpeople.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Andrew Victor
This patch adds the core Power Management support for the AT91RM9200
processor. It will support suspend-to-RAM and standby modes.
The suspend-to-RAM functionality is not 100% complete. The code that
needs to be executed from the internal SRAM to restore the system is
outstanding. For now we just fall through to Standby mode.
The AT91-specific at91_suspend_entering_slow_clock() function will
eventually be replaced by clk_must_disable() once that functionality is
added to mainline clock API.
Patch from David Brownell.
Signed-off-by: Andrew Victor <andrew@sanpeople.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>