Commit graph

516 commits

Author SHA1 Message Date
Randy Dunlap
ebba5f9fcb [PATCH] consistently use MAX_ERRNO in __syscall_return
Consistently use MAX_ERRNO when checking for errors in __syscall_return().

[ralf@linux-mips.org: build fix]
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:18 -07:00
Linus Torvalds
b278240839 Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: (225 commits)
  [PATCH] Don't set calgary iommu as default y
  [PATCH] i386/x86-64: New Intel feature flags
  [PATCH] x86: Add a cumulative thermal throttle event counter.
  [PATCH] i386: Make the jiffies compares use the 64bit safe macros.
  [PATCH] x86: Refactor thermal throttle processing
  [PATCH] Add 64bit jiffies compares (for use with get_jiffies_64)
  [PATCH] Fix unwinder warning in traps.c
  [PATCH] x86: Allow disabling early pci scans with pci=noearly or disallowing conf1
  [PATCH] x86: Move direct PCI scanning functions out of line
  [PATCH] i386/x86-64: Make all early PCI scans dependent on CONFIG_PCI
  [PATCH] Don't leak NT bit into next task
  [PATCH] i386/x86-64: Work around gcc bug with noreturn functions in unwinder
  [PATCH] Fix some broken white space in ia32_signal.c
  [PATCH] Initialize argument registers for 32bit signal handlers.
  [PATCH] Remove all traces of signal number conversion
  [PATCH] Don't synchronize time reading on single core AMD systems
  [PATCH] Remove outdated comment in x86-64 mmconfig code
  [PATCH] Use string instructions for Core2 copy/clear
  [PATCH] x86: - restore i8259A eoi status on resume
  [PATCH] i386: Split multi-line printk in oops output.
  ...
2006-09-26 13:07:55 -07:00
Jeff Dike
70e0eb8ef1 [PATCH] Split i386 and x86_64 ptrace.h
The use of SEGMENT_RPL_MASK in the i386 ptrace.h introduced by
x86-allow-a-kernel-to-not-be-in-ring-0.patch broke the UML build, as UML
includes the underlying architecture's ptrace.h, but has no easy access to the
x86 segment definitions.

Rather than kludging around this, as in the past, this patch splits the
userspace-usable parts, which are the bits that UML needs, of ptrace.h into
ptrace-abi.h, which is included back into ptrace.h.  Thus, there is no net
effect on i386.

As a side-effect, this creates a ptrace header which is close to being usable
in /usr/include.

x86_64 is also treated in this way for consistency.  There was some trailing
whitespace there, which is cleaned up.

Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:49:10 -07:00
Rusty Russell
2965a0e6da [PATCH] x86: trivial move of ptep_set_access_flags
Move ptep_set_access_flags to be closer to the other ptep accessors, and make
the indentation standard.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:56 -07:00
Rusty Russell
6049742dbc [PATCH] x86: trivial move of __HAVE macros in i386 pagetable headers
Move the __HAVE_ARCH_PTEP defines to accompany the function definitions.
Anything else is just a complete nightmare to track through the 2/3-level
paging code, and this caused duplicate definitions to be needed (pte_same),
which could have easily been taken care of with the asm-generic pgtable
functions.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:56 -07:00
Jeremy Fitzhardinge
052e79941a [PATCH] x86: make __FIXADDR_TOP variable to allow it to make space for a hypervisor
Make __FIXADDR_TOP a variable, so that it can be set to not get in the way of
address space a hypervisor may want to reserve.

Original patch by Gerd Hoffmann <kraxel@suse.de>

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Gerd Hoffmann <kraxel@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:55 -07:00
Rusty Russell
9f093394d7 [PATCH] x86: roll all the cpuid asm into one __cpuid call
It's a little neater, and also means only one place to patch for
paravirtualization.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:55 -07:00
Chris Wright
027a8c7e60 [PATCH] x86: implement always-locked bit ops, for memory shared with an SMP hypervisor
Add "always lock'd" implementations of set_bit, clear_bit and change_bit and
the corresponding test_and_ functions.  Also add "always lock'd"
implementation of cmpxchg.  These give guaranteed strong synchronisation and
are required for non-SMP kernels running on an SMP hypervisor.

Signed-off-by: Ian Pratt <ian.pratt@xensource.com>
Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:55 -07:00
Rolf Eike Beer
3a750363e6 [PATCH] Use BUG_ON(foo) instead of "if (foo) BUG()" in include/asm-i386/dma-mapping.h
Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:55 -07:00
Dave McCracken
46a82b2d55 [PATCH] Standardize pxx_page macros
One of the changes necessary for shared page tables is to standardize the
pxx_page macros.  pte_page and pmd_page have always returned the struct
page associated with their entry, while pte_page_kernel and pmd_page_kernel
have returned the kernel virtual address.  pud_page and pgd_page, on the
other hand, return the kernel virtual address.

Shared page tables needs pud_page and pgd_page to return the actual page
structures.  There are very few actual users of these functions, so it is
simple to standardize their usage.

Since this is basic cleanup, I am submitting these changes as a standalone
patch.  Per Hugh Dickins' comments about it, I am also changing the
pxx_page_kernel macros to pxx_page_vaddr to clarify their meaning.

Signed-off-by: Dave McCracken <dmccr@us.ibm.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:51 -07:00
keith mannthey
9102330005 [PATCH] convert i386 NUMA KVA space to bootmem
Address a long standing issue of booting with an initrd on an i386 numa
system.  Currently (and always) the numa kva area is mapped into low memory
by finding the end of low memory and moving that mark down (thus creating
space for the kva).  The issue with this is that Grub loads initrds into
this similar space so when the kernel check the initrd it finds it outside
max_low_pfn and disables it (it thinks the initrd is not mapped into usable
memory) thus initrd enabled kernels can't boot i386 numa :(

My solution to the problem just converts the numa kva area to use the
bootmem allocator to save it's area (instead of moving the end of low
memory).  Using bootmem allows the kva area to be mapped into more diverse
addresses (not just the end of low memory) and enables the kva area to be
mapped below the initrd if present.

I have tested this patch on numaq(no initrd) and summit(initrd) i386 numa
based systems.

[akpm@osdl.org: cleanups]
Signed-off-by: Keith Mannthey <kmannth@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:45 -07:00
Dmitriy Zavin
3222b36f46 [PATCH] x86: Add a cumulative thermal throttle event counter.
The counter is exported to /sys that keeps track of the
number of thermal events, such that the user knows how bad the
thermal problem might be (since the logging to syslog and mcelog
is rate limited).

AK: Fixed cpu hotplug locking

Signed-off-by: Dmitriy Zavin <dmitriyz@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:42 +02:00
Dmitriy Zavin
15d5f83983 [PATCH] x86: Refactor thermal throttle processing
Refactor the event processing (syslog messaging and rate limiting)
into separate file therm_throt.c. This allows consistent reporting
of CPU thermal throttle events.

After ACK'ing the interrupt, if the event is current, the user
(p4.c/mce_intel.c) calls therm_throt_process to log (and rate limit)
the event. If that function returns 1, the user has the option to log
things further (such as to mce_log in x86_64).

AK: minor cleanup

Signed-off-by: Dmitriy Zavin <dmitriyz@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:42 +02:00
Jan Beulich
adf1423698 [PATCH] i386/x86-64: Work around gcc bug with noreturn functions in unwinder
Current gcc generates calls not jumps to noreturn functions. When that happens the
return address can point to the next function, which confuses the unwinder.

This patch works around it by marking asynchronous exception
frames in contrast normal call frames in the unwind information.  Then teach
the unwinder to decode this.

For normal call frames the unwinder now subtracts one from the address which avoids
this problem.  The standard libgcc unwinder uses the same trick.

It doesn't include adjustment of the printed address (i.e. for the original
example, it'd still be kernel_math_error+0 that gets displayed, but the
unwinder wouldn't get confused anymore.

This only works with binutils 2.6.17+ and some versions of H.J.Lu's 2.6.16
unfortunately because earlier binutils don't support .cfi_signal_frame

[AK: added automatic detection of the new binutils and wrote description]

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:41 +02:00
Jeremy Fitzhardinge
2817716ace [PATCH] i386: Fix pack_descriptor()
Fix pack_descriptor:
 1. flags are bits 20-23 in the high word
 2. limit's 4 msb are bits 16-19 in the high word

These haven't mattered so far, because all users have had small limits
and a flags setting of 0.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>

===================================================================
2006-09-26 10:52:40 +02:00
Rusty Russell
78be3706b2 [PATCH] i386: Allow a kernel not to be in ring 0
We allow for the fact that the guest kernel may not run in ring 0.  This
requires some abstraction in a few places when setting %cs or checking
privilege level (user vs kernel).

This is Chris' [RFC PATCH 15/33] move segment checks to subarch, except rather
than using #define USER_MODE_MASK which depends on a config option, we use
Zach's more flexible approach of assuming ring 3 == userspace.  I also used
"get_kernel_rpl()" over "get_kernel_cs()" because I think it reads better in
the code...

1) Remove the hardcoded 3 and introduce #define SEGMENT_RPL_MASK 3 2) Add a
get_kernel_rpl() macro, and don't assume it's zero.

And:

Clean up of patch for letting kernel run other than ring 0:

a. Add some comments about the SEGMENT_IS_*_CODE() macros.
b. Add a USER_RPL macro.  (Code was comparing a value to a mask
   in some places and to the magic number 3 in other places.)
c. Add macros for table indicator field and use them.
d. Change the entry.S tests for LDT stack segment to use the macros

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:39 +02:00
Rusty Russell
0da5db3133 [PATCH] i386: Abstract sensitive instructions
Abstract sensitive instructions in assembler code, replacing them with macros
(which currently are #defined to the native versions).  We use long names:
assembler is case-insensitive, so if something goes wrong and macros do not
expand, it would assemble anyway.

Resulting object files are exactly the same as before.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:39 +02:00
Magnus Damm
3566561bfa [PATCH] i386: Avoid overwriting the current pgd (V4, i386)
kexec: Avoid overwriting the current pgd (V4, i386)

This patch upgrades the i386-specific kexec code to avoid overwriting the
current pgd. Overwriting the current pgd is bad when CONFIG_CRASH_DUMP is used
to start a secondary kernel that dumps the memory of the previous kernel.

The code introduces a new set of page tables. These tables are used to provide
an executable identity mapping without overwriting the current pgd.

Signed-off-by: Magnus Damm <magnus@valinux.co.jp>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:38 +02:00
Andi Kleen
80d2679cbc [PATCH] x86: Remove incorrect comment about ACPI e820 entries
They cannot be actually freed because the FACS table has a
shared-with-the-BIOS lock.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:38 +02:00
Chuck Ebbert
a549b86dd0 [PATCH] i386: annotate FIX_STACK() and the rest of nmi()
In i386's entry.S, FIX_STACK() needs annotation because it
replaces the stack pointer.  And the rest of nmi() needs
annotation in order to compile with these new annotations.

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:37 +02:00
Dave Jones
f704cb9350 [PATCH] x86: remove config.h includes from asm-i386 & asm-x86_64
This is now automatically included by kbuild.

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:36 +02:00
Ashok Raj
73fea17530 [PATCH] i386: Support physical cpu hotplug for x86_64
This patch enables ACPI based physical CPU hotplug support for x86_64.
Implements acpi_map_lsapic() and acpi_unmap_lsapic() to support physical cpu
hotplug.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@muc.de>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-09-26 10:52:35 +02:00
Rusty Russell
522e93e3fc [PATCH] i386: Descriptor and trap table cleanups.
The implementation comes from Zach's [RFC, PATCH 10/24] i386 Vmi
descriptor changes:

Descriptor and trap table cleanups.  Add cleanly written accessors for
IDT and GDT gates so the subarch may override them.  Note that this
allows the hypervisor to transparently tweak the DPL of the descriptors
as well as the RPL of segments in those descriptors, with no unnecessary
kernel code modification.  It also allows the hypervisor implementation
of the VMI to tweak the gates, allowing for custom exception frames or
extra layers of indirection above the guest fault / IRQ handlers.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:35 +02:00
Adrian Bunk
3d08a256da [PATCH] i386: Make enable_local_apic static
enable_local_apic can now become static.

Cc: len.brown@intel.com

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:35 +02:00
Andi Kleen
a32cf3975b [PATCH] i386: Get ebp from unwinder state when continuing fallback backtrace
Cc: jbeulich@novell.com
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:34 +02:00
Andi Kleen
2b14a78cd0 [PATCH] i386: Do stacktracer conversion too
Following x86-64 patches. Reuses code from them in fact.

Convert the standard backtracer to do all output using
callbacks.   Use the x86-64 stack tracer implementation
that uses these callbacks to implement the stacktrace interface.

This allows to use the new dwarf2 unwinder for stacktrace
and get better backtraces.

Cc: mingo@elte.hu

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:34 +02:00
Rusty Russell
1a3f239ddf [PATCH] i386: Replace i386 open-coded cmdline parsing with
This patch replaces the open-coded early commandline parsing
throughout the i386 boot code with the generic mechanism (already used
by ppc, powerpc, ia64 and s390).  The code was inconsistent with
whether it deletes the option from the cmdline or not, meaning some of
these will get passed through the environment into init.

This transformation is mainly mechanical, but there are some notable
parts:

1) Grammar: s/linux never set's it up/linux never sets it up/

2) Remove hacked-in earlyprintk= option scanning.  When someone
   actually implements CONFIG_EARLY_PRINTK, then they can use
   early_param().
[AK: actually it is implemented, but I'm adding the early_param it in the next
x86-64 patch]

3) Move declaration of generic_apic_probe() from setup.c into asm/apic.h

4) Various parameters now moved into their appropriate files (thanks Andi).

5) All parse functions which examine arg need to check for NULL,
   except one where it has subtle humor value.

AK: readded acpi_sci handling which was completely dropped
AK: moved some more variables into acpi/boot.c

Cc: len.brown@intel.com

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:32 +02:00
Andi Kleen
fb2e284856 [PATCH] i386: Clean up spin/rwlocks
- Inline spinlock strings into their inline functions
- Convert macros to typesafe inlines
- Replace some leftover __asm__ __volatile__s with asm volatile

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:32 +02:00
Andi Kleen
7ca2b49b06 [PATCH] i386: Remove lock section support in semaphore.h
Lock sections don't work the new dwarf2 unwinder
This generates slightly smaller code. It adds one more taken
jump to the fast path.

Cc: jbeulich@novell.com

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:32 +02:00
Andi Kleen
add659bf8a [PATCH] i386: Remove lock section support in rwsem.h
Lock sections don't work the new dwarf2 unwinder
This generates slightly smaller code. It adds one more taken
jump to the fast path.

Also move the trampolines into semaphore.S and add proper CFI
annotations.

Cc: jbeulich@novell.com

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:31 +02:00
Andi Kleen
01215ad8d8 [PATCH] i386: Remove lock section support in mutex.h
Lock sections don't work the new dwarf2 unwinder
This generates slightly smaller code. It adds one more taken
jump to the fast path.

Cc: jbeulich@novell.com

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:31 +02:00
Andi Kleen
107878bb14 [PATCH] i386: Minor fixes & cleanup to tlb flush
(based on x86-64 changes)
- Add a proper memory clobber to invlpg
- Remove an unused extern

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:29 +02:00
Andi Kleen
ecaf45ee5c [PATCH] i386: Redo semaphore and rwlock assembly helpers
- Move them to a pure assembly file. Previously they were in
a C file that only consisted of inline assembly. Doing it in pure
assembler is much nicer.
- Add a frame.i include with FRAME/ENDFRAME macros to easily
add frame pointers to assembly functions
- Add dwarf2 annotation to them so that the new dwarf2 unwinder
doesn't get stuck on them
- Random cleanups

Includes feedback from Jan Beulich and a UML build fix from Andrew
Morton.

Cc: jbeulich@novell.com
Cc: jdike@addtoit.com
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:29 +02:00
Andi Kleen
07c9819b31 [PATCH] i386: add alternative-asm.h to allow LOCK_PREFIX replacement in .S files
LOCK_PREFIX is replaced by nops on UP systems, so it has to be a special
macro.  Previously this was only possible from C. Allow it for pure
assembly files too. Similar to earlier x86-64 patch.
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:29 +02:00
Andi Kleen
1a015b5644 [PATCH] i386: Remove const case for rwlocks
rwlocks are now out of line, so it near never triggers.  Also it was
incompatible with the new dwarf2 unwinder because it had unannotiatable
push/pops.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:29 +02:00
Andi Kleen
0cb91a2293 [PATCH] i386: Account spinlocks to the caller during profiling for !FP kernels
This ports the algorithm from x86-64 (with improvements) to i386.
Previously this only worked for frame pointer enabled kernels.
But spinlocks have a very simple stack frame that can be manually
analyzed. Do this.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:28 +02:00
Andi Kleen
3cfc348bf9 [PATCH] x86: Add portable getcpu call
For NUMA optimization and some other algorithms it is useful to have a fast
to get the current CPU and node numbers in user space.

x86-64 added a fast way to do this in a vsyscall. This adds a generic
syscall for other architectures to make it a generic portable facility.

I expect some of them will also implement it as a faster vsyscall.

The cache is an optimization for the x86-64 vsyscall optimization. Since
what the syscall returns is an approximation anyways and user space
often wants very fast results it can be cached for some time.  The norma
methods to get this information in user space are relatively slow

The vsyscall is in a better position to manage the cache because it has direct
access to a fast time stamp (jiffies). For the generic syscall optimization
it doesn't help much, but enforce a valid argument to keep programs
portable

I only added an i386 syscall entry for now. Other architectures can follow
as needed.

AK: Also added some cleanups from Andrew Morton

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:28 +02:00
Venkatesh Pallipadi
248dcb2fff [PATCH] x86: i386/x86-64 Add nmi watchdog support for new Intel CPUs
AK: This redoes the changes I temporarily reverted.

Intel now has support for Architectural Performance Monitoring Counters
( Refer to IA-32 Intel Architecture Software Developer's Manual
http://www.intel.com/design/pentium4/manuals/253669.htm ). This
feature is present starting from Intel Core Duo and Intel Core Solo processors.

What this means is, the performance monitoring counters and some performance
monitoring events are now defined in an architectural way (using cpuid).
And there will be no need to check for family/model etc for these architectural
events.

Below is the patch to use this performance counters in nmi watchdog driver.
Patch handles both i386 and x86-64 kernels.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:27 +02:00
Shaohua Li
4038f901cf [PATCH] i386/x86-64: Fix NMI watchdog suspend/resume
Making NMI suspend/resume work with SMP. We use CPU hotplug to offline
APs in SMP suspend/resume. Only BSP executes sysdev's .suspend/.resume
method. APs should follow CPU hotplug code path.

And:

+From: Don Zickus <dzickus@redhat.com>

Makes the start/stop paths of nmi watchdog more robust to handle the
suspend/resume cases more gracefully.

AK: I merged the two patches together

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-09-26 10:52:27 +02:00
Don Zickus
407984f1af [PATCH] x86: Add abilty to enable/disable nmi watchdog with sysctl
Adds a new /proc/sys/kernel/nmi call that will enable/disable the nmi
watchdog.

Signed-off-by:  Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:27 +02:00
Don Zickus
2fbe7b25c8 [PATCH] i386/x86-64: Remove un/set_nmi_callback and reserve/release_lapic_nmi functions
Removes the un/set_nmi_callback and reserve/release_lapic_nmi functions as
they are no longer needed.  The various subsystems are modified to register
with the die_notifier instead.

Also includes compile fixes by Andrew Morton.

Signed-off-by:  Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:27 +02:00
Don Zickus
3adbbcce9a [PATCH] x86: Cleanup NMI interrupt path
This patch cleans up the NMI interrupt path.  Instead of being gated by if
the 'nmi callback' is set, the interrupt handler now calls everyone who is
registered on the die_chain and additionally checks the nmi watchdog,
reseting it if enabled.  This allows more subsystems to hook into the NMI if
they need to (without being block by set_nmi_callback).

Signed-off-by:  Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:26 +02:00
Don Zickus
b7471c6da9 [PATCH] i386: Add SMP support on i386 to reservation framework
This patch includes the changes to make the nmi watchdog on i386 SMP aware.
A bunch of code was moved around to make it simpler to read.  In addition,
it is now possible to determine if a particular NMI was the result of the
watchdog or not.  This feature allows the kernel to filter out unknown NMIs
easier.

Signed-off-by:  Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:26 +02:00
Don Zickus
828f0afda1 [PATCH] x86: Add performance counter reservation framework for UP kernels
Adds basic infrastructure to allow subsystems to reserve performance
counters on the x86 chips.  Only UP kernels are supported in this patch to
make reviewing easier.  The SMP portion makes a lot more changes.

Think of this as a locking mechanism where each bit represents a different
counter.  In addition, each subsystem should also reserve an appropriate
event selection register that will correspond to the performance counter it
will be using (this is mainly neccessary for the Pentium 4 chips as they
break the 1:1 relationship to performance counters).

This will help prevent subsystems like oprofile from interfering with the
nmi watchdog.

Signed-off-by:  Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:26 +02:00
Andi Kleen
b07f8915cd [PATCH] x86: Temporarily revert parts of the Core 2 nmi nmi watchdog support
This makes merging easier.  They are readded a few patches later.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:26 +02:00
Andi Kleen
874c4fe389 [PATCH] i386: Allow to use GENERICARCH for UP kernels
There are some machines around (large xSeries or Unisys ES7000) that
need physical IO-APIC destination mode to access all of their IO
devices. This currently doesn't work in UP kernels as used in
distribution installers.

This patch allows to compile even UP kernels as GENERICARCH which
allows to use physical or clustered APIC mode.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:26 +02:00
Jeff Garzik
a6d967a485 [libata] No need for all those arch libata-portmap.h headers
They all contain the same thing.  Instead, have a single generic one in
include/asm-generic, and permit an arch to override as needed.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-09-25 15:33:09 -04:00
Jeff Garzik
23930fa1ce Merge branch 'master' into upstream 2006-09-24 01:52:47 -04:00
David Woodhouse
fadcfa33b6 [HEADERS] One line per header in Kbuild files to reduce conflicts
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-09-19 12:43:58 +01:00
Jeff Garzik
4a3381feb8 Merge branch 'master' into upstream 2006-09-19 00:42:13 -04:00
Linus Torvalds
47a5c6fa0e x86: save/restore eflags in context switch
(And reset it on new thread creation)

It turns out that eflags is important to save and restore not just
because of iopl, but due to the magic bits like the NT bit, which we
don't want leaking between different threads.

Tested-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-18 16:20:40 -07:00
David Woodhouse
e5fa6d7031 [PATCH] Fix 'make headers_check' on i386
This brings i386 asm/unistd.h into consistency with other architectures by not
exporting functionality which is not necessary.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-16 12:54:32 -07:00
David Woodhouse
f01f0f052d [PATCH] headers_check: don't expose PFN stuff to userspace in <asm-i386/setup.h>
The header file <linux/pfn.h> doesn't exist in userspace and probably
shouldn't -- but it's used unconditionally in <asm-i386/setup.h>.  Protect it
with #ifdef __KERNEL__ and move setup.h from $(header-y) to $(unifdef-y) in
Kbuild accordingly.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-13 07:32:16 -07:00
David Woodhouse
651c923a44 [PATCH] headers_check: move kernel-only #includes within <asm-i386/elf.h>
Some files which don't exist in userspace were being included unconditionally
in asm-i386/elf.h.  Move the offending #includes down a few lines so that
they're protected by #ifdef __KERNEL__

In fact, we probably want to kill off all userspace use of asm/elf.h -- but we
aren't there yet, so we should at least make it possible to include it for
now.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-13 07:32:15 -07:00
David Woodhouse
b40c274a03 [PATCH] headers_check: move inclusion of <linux/linkage.h> in <asm-i386/signal.h>
Because <linux/linkage.h> doesn't exist in userspace, it should be only
included from within #ifdef __KERNEL__.  Move the corresponding #include

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-13 07:32:15 -07:00
Jeff Garzik
97148ba223 Merge branch 'master' into upstream 2006-09-12 12:03:21 -04:00
David Woodhouse
b4a228346c [PATCH] Remove unneeded asm-i386/cpufeature.h from user visibility.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-06 11:00:02 -07:00
Jeff Garzik
f9bcda7760 Merge branch 'master' into upstream 2006-09-04 06:41:37 -04:00
Andrew Morton
a9aa141cfc [PATCH] x86: increase MAX_MP_BUSSES on default arch
Vitezslav Samel <samel@mail.cz> reports that an HP DL380 g4 fails using the
default arch due to the ISA bus having an ID of 32.

It would have worked OK with the generic arch - for some reason the default
arch doesn't support as many busses.

So bump that up to support 256 busses, but leave it at 32 if we're building a
tiny system to save a bit of memory.

Cc: Vitezslav Samel <samel@mail.cz>
Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-01 11:39:08 -07:00
Chris Wright
22db37ec5f [PATCH] i386: rwlock.h fix smp alternatives fix
Commit 8c74932779 ("i386: Remove
alternative_smp") did not actually compile on x86 with CONFIG_SMP.

This fixes the __build_read/write_lock helpers.  I've boot tested on
SMP.

[ Andi: "Oops, I think that was a quilt unrefreshed patch.  Sorry.  I
  fixed those before testing, but then still send out the old patch." ]

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Gerd Hoffmann <kraxel@suse.de>
Acked-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-31 10:46:07 -07:00
Andi Kleen
386dcafaac [PATCH] i386: Remove __KERNEL__ ifdef around _syscall*()
After all their only point is having them in user space.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:16 -07:00
Andi Kleen
8c74932779 [PATCH] i386: Remove alternative_smp
The .fill causes miscompilations with some binutils version.

Instead just patch the lock prefix in the lock constructs. That is the
majority of the cost and should be good enough.

Cc: Gerd Hoffmann <kraxel@suse.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:15 -07:00
Jan Beulich
ea424055b7 [PATCH] x86: Make backtracer fallback logic more bullet-proof
The unwinder fallback logic still had potential for falling through to
the legacy stack trace code without printing an indication (at once
serving as a separator) of this.

Further, the stack pointer retrieval for the fallback should be as
restrictive as possible (in order to avoid having the legacy stack
tracer try to access invalid memory). The patch tightens that, but
this could certainly be further improved.

Also making the call_trace command line option now conditional upon
CONFIG_STACK_UNWIND (as it's meaningless otherwise).

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:15 -07:00
Jan Beulich
61171b8dbd [PATCH] x86: fix x86 cpuid keys used in alternative_smp()
By hard-coding the cpuid keys for alternative_smp() rather than using
the symbolic constant it turned out that incorrect values were used on
both i386 (0x68 instead of 0x69) and x86-64 (0x66 instead of 0x68).

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:15 -07:00
Jeff Garzik
b01e86fee6 Merge /spare/repo/linux-2.6 into upstream 2006-08-29 17:55:59 -04:00
KAMEZAWA Hiroyuki
4e54bdaa9c [PATCH] CONFIG_ACPI_SRAT NUMA build fix
In file included from include/asm/mmzone.h:18,
                   from include/linux/mmzone.h:439,
  <snip>
  include/asm/srat.h:31:2: error: #error CONFIG_ACPI_SRAT not defined, and srat.h header has been included
  make[1]: *** [arch/i386/kernel/asm-offsets.s] Error 1

This can happen with CONFIG_NUMA && !CONFIG_ACPI && !CONFIG_X86_NUMAQ

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-27 11:01:32 -07:00
Alan Cox
2ec7df0457 [PATCH] libata: rework legacy handling to remove much of the cruft
Kill host_set->next
Fix simplex support
Allow per platform setting of IDE legacy bases

Some of this can be tidied further later on, in particular all the
legacy port gunge belongs as a PCI quirk/PCI header decode to understand
the special legacy IDE rules in the PCI spec.

Longer term Jeff also wants to move the request_irq/free_irq out of core
which will make this even cleaner.

tj: folded in three followup patches - ata_piix-fix, broken-arch-fix
and fix-new-legacy-handling, and separated per-dev xfermask into
separate patch preceding this one.  Folded in fixes are...

* ata_piix-fix: fix build failure due to host_set->next removal
* broken-arch-fix: add missing include/asm-*/libata-portmap.h
* fix-new-legacy-handling:
	* In ata_pci_init_legacy_port(), probe_num was incorrectly
          incremented during initialization of the secondary port and
          probe_ent->n_ports was incorrectly fixed to 1.

	* Both legacy ports ended up having the same hard_port_no.

	* When printing port information, both legacy ports printed
	  the first irq.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-08-10 16:59:10 +09:00
bibo, mao
a9ad965ea9 [PATCH] IA64: kprobe invalidate icache of jump buffer
Kprobe inserts breakpoint instruction in probepoint and then jumps to
instruction slot when breakpoint is hit, the instruction slot icache must
be consistent with dcache.  Here is the patch which invalidates instruction
slot icache area.

Without this patch, in some machines there will be fault when executing
instruction slot where icache content is inconsistent with dcache.

Signed-off-by: bibo,mao <bibo.mao@intel.com>
Acked-by: "Luck, Tony" <tony.luck@intel.com>
Acked-by: Keshavamurthy Anil S <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31 13:28:38 -07:00
Steven Rostedt
52393ccc0a [PATCH] remove set_wmb - arch removal
set_wmb should not be used in the kernel because it just confuses the
code more and has no benefit.  Since it is not currently used in the
kernel this patch removes it so that new code does not include it.

All archs define set_wmb(var, value) to do { var = value; wmb(); }
while(0) except ia64 and sparc which use a mb() instead.  But this is
still moot since it is not used anyway.

Hasn't been tested on any archs but x86 and x86_64 (and only compiled
tested)

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-14 21:56:14 -07:00
Chuck Ebbert
b43c7cec6b [PATCH] i386: system.h: remove extra semicolons and fix order
include/asm-i386/system.h has trailing semicolons in some of the
macros that cause legitimate code to fail compilation, so remove
them. Also remove extra blank lines within one group of macros.

And put stts() and clts() back together; they got separated somehow.

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-13 07:48:28 -07:00
Stephane Eranian
b3cf257623 [PATCH] i386: use thread_info flags for debug regs and IO bitmaps
Use thread info flags to track use of debug registers and IO bitmaps.

 - add TIF_DEBUG to track when debug registers are active
 - add TIF_IO_BITMAP to track when I/O bitmap is used
 - modify __switch_to() to use the new TIF flags

Performance tested on Pentium II, ten runs of LMbench context switch
benchmark (smaller is better:)

	before	after
avg	3.65	3.39
min	3.55	3.33

Signed-off-by: Stephane Eranian <eranian@hpl.hp.com>
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Acked-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-09 18:47:12 -07:00
Linus Torvalds
b862f3b099 i386: improve and correct inline asm memory constraints
Use "+m" rather than a combination of "=m" and "m" for improved clarity
and consistency.

This also fixes some inlines that incorrectly didn't tell the compiler
that they read the old value at all, potentially causing the compiler to
generate bogus code.  It appear that all of those potential bugs were
hidden by the use of extra "volatile" specifiers on the data structures
in question, though.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-08 15:24:18 -07:00
Linus Torvalds
6fa0cb1141 Merge git://git.infradead.org/hdrinstall-2.6
* git://git.infradead.org/hdrinstall-2.6:
  Remove export of include/linux/isdn/tpam.h
  Remove <linux/i2c-id.h> and <linux/i2c-algo-ite.h> from userspace export
  Restrict headers exported to userspace for SPARC and SPARC64
  Add empty Kbuild files for 'make headers_install' in remaining arches.
  Add Kbuild file for Alpha 'make headers_install'
  Add Kbuild file for SPARC 'make headers_install'
  Add Kbuild file for IA64 'make headers_install'
  Add Kbuild file for S390 'make headers_install'
  Add Kbuild file for i386 'make headers_install'
  Add Kbuild file for x86_64 'make headers_install'
  Add Kbuild file for PowerPC 'make headers_install'
  Add generic Kbuild files for 'make headers_install'
  Basic implementation of 'make headers_check'
  Basic implementation of 'make headers_install'
2006-07-04 12:55:45 -07:00
Ingo Molnar
8a25d5debf [PATCH] lockdep: prove spinlock rwlock locking correctness
Use the lock validator framework to prove spinlock and rwlock locking
correctness.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:04 -07:00
Ingo Molnar
4ea2176dfa [PATCH] lockdep: prove rwsem locking correctness
Use the lock validator framework to prove rwsem locking correctness.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:04 -07:00
Ingo Molnar
c8558fcdec [PATCH] lockdep: irqtrace cleanup of include/asm-i386/irqflags.h
Clean up the x86 irqflags.h file:

 - macro => inline function transformation
 - simplifications
 - style fixes

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:03 -07:00
Ingo Molnar
55f327fa9e [PATCH] lockdep: irqtrace subsystem, i386 support
Add irqflags-tracing support to i386.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:03 -07:00
Ingo Molnar
c4e05116a2 [PATCH] lockdep: clean up rwsems
Clean up rwsems.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:01 -07:00
Thomas Gleixner
4879d77c4c [PATCH] irq-flags: i386: Use the new IRQF_ constants
Use the new IRQF_ constants and remove the SA_INTERRUPT define

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-02 13:58:47 -07:00
Gerd Hoffmann
8ec4d41f88 [PATCH] SMP alternatives: skip with UP kernels
Hide the magic in alternative.h and provide some dummy inline functions
for the UP case (gcc should manage to optimize away these calls).  No
changes in module.c.

Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-01 09:56:02 -07:00
Catherine Zhang
877ce7c1b3 [AF_UNIX]: Datagram getpeersec
This patch implements an API whereby an application can determine the
label of its peer's Unix datagram sockets via the auxiliary data mechanism of
recvmsg.

Patch purpose:

This patch enables a security-aware application to retrieve the
security context of the peer of a Unix datagram socket.  The application
can then use this security context to determine the security context for
processing on behalf of the peer who sent the packet.

Patch design and implementation:

The design and implementation is very similar to the UDP case for INET
sockets.  Basically we build upon the existing Unix domain socket API for
retrieving user credentials.  Linux offers the API for obtaining user
credentials via ancillary messages (i.e., out of band/control messages
that are bundled together with a normal message).  To retrieve the security
context, the application first indicates to the kernel such desire by
setting the SO_PASSSEC option via getsockopt.  Then the application
retrieves the security context using the auxiliary data mechanism.

An example server application for Unix datagram socket should look like this:

toggle = 1;
toggle_len = sizeof(toggle);

setsockopt(sockfd, SOL_SOCKET, SO_PASSSEC, &toggle, &toggle_len);
recvmsg(sockfd, &msg_hdr, 0);
if (msg_hdr.msg_controllen > sizeof(struct cmsghdr)) {
    cmsg_hdr = CMSG_FIRSTHDR(&msg_hdr);
    if (cmsg_hdr->cmsg_len <= CMSG_LEN(sizeof(scontext)) &&
        cmsg_hdr->cmsg_level == SOL_SOCKET &&
        cmsg_hdr->cmsg_type == SCM_SECURITY) {
        memcpy(&scontext, CMSG_DATA(cmsg_hdr), sizeof(scontext));
    }
}

sock_setsockopt is enhanced with a new socket option SOCK_PASSSEC to allow
a server socket to receive security context of the peer.

Testing:

We have tested the patch by setting up Unix datagram client and server
applications.  We verified that the server can retrieve the security context
using the auxiliary data mechanism of recvmsg.

Signed-off-by: Catherine Zhang <cxzhang@watson.ibm.com>
Acked-by: Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29 16:58:06 -07:00
Ingo Molnar
c0ad90a32f [PATCH] genirq: add ->retrigger() irq op to consolidate hw_irq_resend()
Add ->retrigger() irq op to consolidate hw_irq_resend() implementations.
(Most architectures had it defined to NOP anyway.)

NOTE: ia64 needs testing. i386 and x86_64 tested.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29 10:26:23 -07:00
Adrian Bunk
0686cd8fbe [PATCH] fix sgivwfb compile
drivers/built-in.o: In function `sgivwfb_set_par':
sgivwfb.c:(.text+0x88583): undefined reference to `sgivwfb_mem_phys'
sgivwfb.c:(.text+0x88596): undefined reference to `sgivwfb_mem_phys'
sgivwfb.c:(.text+0x885a8): undefined reference to `sgivwfb_mem_phys'
drivers/built-in.o: In function `sgivwfb_check_var':
sgivwfb.c:(.text+0x88ad0): undefined reference to `sgivwfb_mem_size'
drivers/built-in.o: In function `sgivwfb_mmap':
sgivwfb.c:(.text+0x88c75): undefined reference to `sgivwfb_mem_size'
sgivwfb.c:(.text+0x88c7f): undefined reference to `sgivwfb_mem_phys'
drivers/built-in.o: In function `sgivwfb_probe':
sgivwfb.c:(.init.text+0x4060): undefined reference to `sgivwfb_mem_size'
sgivwfb.c:(.init.text+0x4065): undefined reference to `sgivwfb_mem_phys'
sgivwfb.c:(.init.text+0x4076): undefined reference to `sgivwfb_mem_phys'
sgivwfb.c:(.init.text+0x409c): undefined reference to `sgivwfb_mem_size'
sgivwfb.c:(.init.text+0x410e): undefined reference to `sgivwfb_mem_size'
sgivwfb.c:(.init.text+0x4113): undefined reference to `sgivwfb_mem_phys'
sgivwfb.c:(.init.text+0x4162): undefined reference to `sgivwfb_mem_size'
sgivwfb.c:(.init.text+0x4168): undefined reference to `sgivwfb_mem_phys'
make: *** [.tmp_vmlinux1] Error 1

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29 10:26:19 -07:00
Siddha, Suresh B
5c45bf279d [PATCH] sched: mc/smt power savings sched policy
sysfs entries 'sched_mc_power_savings' and 'sched_smt_power_savings' in
/sys/devices/system/cpu/ control the MC/SMT power savings policy for the
scheduler.

Based on the values (1-enable, 0-disable) for these controls, sched groups
cpu power will be determined for different domains.  When power savings
policy is enabled and under light load conditions, scheduler will minimize
the physical packages/cpu cores carrying the load and thus conserving
power(with a perf impact based on the workload characteristics...  see OLS
2005 CMP kernel scheduler paper for more details..)

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Con Kolivas <kernel@kolivas.org>
Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:45 -07:00
Ingo Molnar
e6e5494cb2 [PATCH] vdso: randomize the i386 vDSO by moving it into a vma
Move the i386 VDSO down into a vma and thus randomize it.

Besides the security implications, this feature also helps debuggers, which
can COW a vma-backed VDSO just like a normal DSO and can thus do
single-stepping and other debugging features.

It's good for hypervisors (Xen, VMWare) too, which typically live in the same
high-mapped address space as the VDSO, hence whenever the VDSO is used, they
get lots of guest pagefaults and have to fix such guest accesses up - which
slows things down instead of speeding things up (the primary purpose of the
VDSO).

There's a new CONFIG_COMPAT_VDSO (default=y) option, which provides support
for older glibcs that still rely on a prelinked high-mapped VDSO.  Newer
distributions (using glibc 2.3.3 or later) can turn this option off.  Turning
it off is also recommended for security reasons: attackers cannot use the
predictable high-mapped VDSO page as syscall trampoline anymore.

There is a new vdso=[0|1] boot option as well, and a runtime
/proc/sys/vm/vdso_enabled sysctl switch, that allows the VDSO to be turned
on/off.

(This version of the VDSO-randomization patch also has working ELF
coredumping, the previous patch crashed in the coredumping code.)

This code is a combined work of the exec-shield VDSO randomization
code and Gerd Hoffmann's hypervisor-centric VDSO patch. Rusty Russell
started this patch and i completed it.

[akpm@osdl.org: cleanups]
[akpm@osdl.org: compile fix]
[akpm@osdl.org: compile fix 2]
[akpm@osdl.org: compile fix 3]
[akpm@osdl.org: revernt MAXMEM change]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Cc: Gerd Hoffmann <kraxel@suse.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:38 -07:00
Chuck Ebbert
c723e08460 [PATCH] i386: use C code for current_thread_info()
Using C code for current_thread_info() lets the compiler optimize it.
With gcc 4.0.2, kernel is smaller:

    text           data     bss     dec     hex filename
 3645212         555556  312024 4512792  44dc18 2.6.17-rc6-nb-post/vmlinux
 3647276         555556  312024 4514856  44e428 2.6.17-rc6-nb/vmlinux
 -------
   -2064

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:37 -07:00
Rohit Seth
4b89aff930 [PATCH] i386: move phys_proc_id and cpu_core_id to cpuinfo_x86
Move the phys_core_id and cpu_core_id to cpuinfo_x86 structure.  Similar
patch for x86_64 is already accepted by Andi earlier this week.

[akpm@osdl.org: fix warning]
Signed-off-by: Rohit Seth <rohitseth@google.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:37 -07:00
Yasunori Goto
0fc44159bf [PATCH] Register sysfs file for hotplugged new node
When new node becomes enable by hot-add, new sysfs file must be created for
new node.  So, if new node is enabled by add_memory(), register_one_node() is
called to create it.  In addition, I386's arch_register_node() and a part of
register_nodes() of powerpc are consolidated to register_one_node() as a
generic_code().

This is tested by Tiger4(IPF) with node hot-plug emulation.

Signed-off-by: Keiichiro Tokunaga <tokuanga.keiich@jp.fujitsu.com>
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:36 -07:00
Linus Torvalds
da206c9e68 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial:
  typo fixes
  Clean up 'inline is not at beginning' warnings for usb storage
  Storage class should be first
  i386: Trivial typo fixes
  ixj: make ixj_set_tone_off() static
  spelling fixes
  fix paniced->panicked typos
  Spelling fixes for Documentation/atomic_ops.txt
  move acknowledgment for Mark Adler to CREDITS
  remove the bouncing email address of David Campbell
2006-06-26 13:33:14 -07:00
Linus Torvalds
81a07d7588 Merge branch 'x86-64'
* x86-64: (83 commits)
  [PATCH] x86_64: x86_64 stack usage debugging
  [PATCH] x86_64: (resend) x86_64 stack overflow debugging
  [PATCH] x86_64: msi_apic.c build fix
  [PATCH] x86_64: i386/x86-64 Add nmi watchdog support for new Intel CPUs
  [PATCH] x86_64: Avoid broadcasting NMI IPIs
  [PATCH] x86_64: fix apic error on bootup
  [PATCH] x86_64: enlarge window for stack growth
  [PATCH] x86_64: Minor string functions optimizations
  [PATCH] x86_64: Move export symbols to their C functions
  [PATCH] x86_64: Standardize i386/x86_64 handling of NMI_VECTOR
  [PATCH] x86_64: Fix modular pc speaker
  [PATCH] x86_64: remove sys32_ni_syscall()
  [PATCH] x86_64: Do not use -ffunction-sections for modules
  [PATCH] x86_64: Add cpu_relax to apic_wait_icr_idle
  [PATCH] x86_64: adjust kstack_depth_to_print default
  [PATCH] i386/x86-64: adjust /proc/interrupts column headings
  [PATCH] x86_64: Fix race in cpu_local_* on preemptible kernels
  [PATCH] x86_64: Fix fast check in safe_smp_processor_id
  [PATCH] x86_64: x86_64 setup.c - printing cmp related boottime information
  [PATCH] i386/x86-64/ia64: Move polling flag into thread_info_status
  ...

Manual resolve of trivial conflict in arch/i386/kernel/Makefile
2006-06-26 10:51:09 -07:00
Venkatesh Pallipadi
0080e66755 [PATCH] x86_64: i386/x86-64 Add nmi watchdog support for new Intel CPUs
Intel now has support for Architectural Performance Monitoring Counters
( Refer to IA-32 Intel Architecture Software Developer's Manual
http://www.intel.com/design/pentium4/manuals/253669.htm ). This
feature is present starting from Intel Core Duo and Intel Core Solo processors.

What this means is, the performance monitoring counters and some performance
monitoring events are now defined in an architectural way (using cpuid).
And there will be no need to check for family/model etc for these architectural
events.

Below is the patch to use this performance counters in nmi watchdog driver.
Patch handles both i386 and x86-64 kernels.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:22 -07:00
Keith Owens
e77deacb7b [PATCH] x86_64: Avoid broadcasting NMI IPIs
On some i386/x86_64 systems, sending an NMI IPI as a broadcast will
reset the system.  This seems to be a BIOS bug which affects machines
where one or more cpus are not under OS control.  It occurs on HT
systems with a version of the OS that is not compiled without HT
support.  It also occurs when a system is booted with max_cpus=n where
2 <= n < cpus known to the BIOS.  The fix is to always send NMI IPI as
a mask instead of as a broadcast.

Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:22 -07:00
Keith Owens
45486f81c9 [PATCH] x86_64: Standardize i386/x86_64 handling of NMI_VECTOR
x86_64 and i386 behave inconsistently when sending an IPI on vector 2
(NMI_VECTOR).  Make both behave the same, so IPI 2 is sent as NMI.

The crash code was abusing send_IPI_allbutself() by passing a code
instead of a vector, it only worked because crash knew about the
internal code of send_IPI_allbutself().  Change crash to use NMI_VECTOR
instead, and remove the comment about how crash was abusing the function.

This patch is a pre-requisite for fixing the problem where sending an
IPI as NMI would reboot some Dell Xeon systems.  I cannot fix that
problem while crash continus to abuse send_IPI_allbutself().

It also removes the inconsistency between i386 and x86_64 for
NMI_VECTOR.  That will simplify all the RAS code that needs to bring
all the cpus to a clean stop, even when one or more cpus are spinning
disabled.

Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:22 -07:00
Andi Kleen
da5311258d [PATCH] x86_64: Fix race in cpu_local_* on preemptible kernels
When a process changes CPUs while doing the non atomic cpu_local_*
operations it might operate on the local_t of a different CPUs.

Fix that by disabling preemption.

Pointed out by Christopher Lameter

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:21 -07:00
Andi Kleen
495ab9c045 [PATCH] i386/x86-64/ia64: Move polling flag into thread_info_status
During some profiling I noticed that default_idle causes a lot of
memory traffic. I think that is caused by the atomic operations
to clear/set the polling flag in thread_info. There is actually
no reason to make this atomic - only the idle thread does it
to itself, other CPUs only read it. So I moved it into ti->status.

Converted i386/x86-64/ia64 for now because that was the easiest
way to fix ACPI which also manipulates these flags in its idle
function.

Cc: Nick Piggin <npiggin@novell.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:21 -07:00
Jan Beulich
c33bd9aac0 [PATCH] i386/x86-64: fall back to old-style call trace if no unwinding
If no unwinding is possible at all for a certain exception instance,
fall back to the old style call trace instead of not showing any trace
at all.

Also, allow setting the stack trace mode at the command line.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:18 -07:00
Jan Beulich
fe7cacc1c2 [PATCH] i386: reliable stack trace support i386 entry.S
To increase the usefulness of reliable stack unwinding, this adds CFI
unwind annotations to many low-level i386 routines.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:17 -07:00
Jan Beulich
176a2718f4 [PATCH] i386: reliable stack trace support (i386)
These are the i386-specific pieces to enable reliable stack traces. This is
going to be even more useful once CFI annotations get added to he assembly
code, namely to entry.S.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:17 -07:00
Don Zickus
3e4ff11574 [PATCH] x86_64: nmi watchdog header cleanup
Misc header cleanup for nmi watchdog.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:16 -07:00
Andi Kleen
a32073bffc [PATCH] x86_64: Clean and enhance up K8 northbridge access code
- Factor out the duplicated access/cache code into a single file
   * Shared between i386/x86-64.
 - Share flush code between AGP and IOMMU
   * Fix a bug: AGP didn't wait for end of flush before
 - Drop 8 northbridges limit and allocate dynamically
 - Add lock to serialize AGP and IOMMU GART flushes
 - Add PCI ID for next AMD northbridge
 - Random related cleanups

The old K8 NUMA discovery code is unchanged. New systems
should all use SRAT for this.

Cc: "Navin Boppuri" <navin.boppuri@newisys.com>
Cc: Dave Jones <davej@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:15 -07:00
Gerd Hoffmann
d167a51877 [PATCH] x86_64: x86_64 version of the smp alternative patch.
Changes are largely identical to the i386 version:

 * alternative #define are moved to the new alternative.h file.
 * one new elf section with pointers to the lock prefixes which can be
   nop'ed out for non-smp.
 * two new elf sections simliar to the "classic" alternatives to
   replace SMP code with simpler UP code.
 * fixup headers to use alternative.h instead of defining their own
   LOCK / LOCK_PREFIX macros.

The patch reuses the i386 version of the alternatives code to avoid code
duplication.  The code in alternatives.c was shuffled around a bit to
reduce the number of #ifdefs needed.  It also got some tweaks needed for
x86_64 (vsyscall page handling) and new features (noreplacement option
which was x86_64 only up to now).  Debug printk's are changed from
compile-time to runtime.

Loosely based on a early version from Bastian Blank <waldi@debian.org>

Signed-off-by: Gerd Hoffmann <kraxel@suse.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:14 -07:00
Andi Kleen
240cd6a806 [PATCH] i386/x86-64: Emulate CPUID4 on AMD
Intel systems report the cache level data from CPUID 4 in sysfs.
Add a CPUID 4 emulation for AMD CPUs to report the same
information for them. This allows programs to read this
information in a uniform way.

The AMD way to report this is less flexible so some assumptions
are hardcoded (e.g. no L3)

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:14 -07:00
Anil S Keshavamurthy
e6f47f978b [PATCH] Notify page fault call chain
With this patch Kprobes now registers for page fault notifications only when
their is an active probe registered.  Once all the active probes are
unregistered their is no need to be notified of page faults and kprobes
unregisters itself from the page fault notifications.  Hence we will have ZERO
side effects when no probes are active.

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:22 -07:00
Anil S Keshavamurthy
b71b5b6528 [PATCH] Notify page fault call chain for i386
Overloading of page fault notification with the notify_die() has performance
issues(since the only interested components for page fault is kprobes and/or
kdb) and hence this patch introduces the new notifier call chain exclusively
for page fault notifications their by avoiding notifying unnecessary
components in the do_page_fault() code path.

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:22 -07:00
john stultz
6f84fa2f3e [PATCH] Time: i386 Conversion - part 3: Enable Generic Timekeeping
This converts the i386 arch to use the generic timeofday subsystem.  It
enabled the GENERIC_TIME option, disables the timer_opts code and other arch
specific timekeeping code and reworks the delay code.

While this patch enables the generic timekeeping, please note that this patch
does not provide any i386 clocksource.  Thus only the jiffies clocksource will
be available.  To get full replacements for the code being disabled here, the
timeofday-clocks-i386 patch will needed.

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:21 -07:00
john stultz
539eb11e6e [PATCH] Time: i386 Conversion - part 2: Rework TSC Support
As part of the i386 conversion to the generic timekeeping infrastructure, this
introduces a new tsc.c file.  The code in this file replaces the TSC
initialization, management and access code currently in timer_tsc.c (which
will be removed) that we want to preserve.

The code also introduces the following functionality:

o tsc_khz: like cpu_khz but stores the TSC frequency on systems that do not
  change TSC frequency w/ CPU frequency

o check/mark_tsc_unstable: accessor/modifier flag for TSC timekeeping
  usability

o minor cleanups to calibration math.

This patch also includes a one line __cpuinitdata fix from Zwane Mwaikambo.

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:21 -07:00
Andreas Mohr
d6e05edc59 spelling fixes
acquired (aquired)
contiguous (contigious)
successful (succesful, succesfull)
surprise (suprise)
whether (weather)
some other misspellings

Signed-off-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-26 18:35:02 +02:00
NeilBrown
7c12d81134 [PATCH] Make copy_from_user_inatomic NOT zero the tail on i386
As described in a previous patch and documented in mm/filemap.h,
copy_from_user_inatomic* shouldn't zero out the tail of the buffer after an
incomplete copy.

This patch implements that change for i386.

For the _nocache version, a new __copy_user_intel_nocache is defined similar
to copy_user_zeroio_intel_nocache, and this is ultimately used for the copy.

For the regular version, __copy_from_user_ll_nozero is defined which uses
__copy_user and __copy_user_intel - the later needs casts to reposition the
__user annotations.

If copy_from_user_atomic is given a constant length of 1, 2, or 4, then we do
still zero the destintion on failure.  This didn't seem worth the effort of
fixing as the places where it is used really don't care.

Signed-off-by: Neil Brown <neilb@suse.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:09 -07:00
NeilBrown
01408c4939 [PATCH] Prepare for __copy_from_user_inatomic to not zero missed bytes
The problem is that when we write to a file, the copy from userspace to
pagecache is first done with preemption disabled, so if the source address is
not immediately available the copy fails *and* *zeros* *the* *destination*.

This is a problem because a concurrent read (which admittedly is an odd thing
to do) might see zeros rather that was there before the write, or what was
there after, or some mixture of the two (any of these being a reasonable thing
to see).

If the copy did fail, it will immediately be retried with preemption
re-enabled so any transient problem with accessing the source won't cause an
error.

The first copying does not need to zero any uncopied bytes, and doing so
causes the problem.  It uses copy_from_user_atomic rather than copy_from_user
so the simple expedient is to change copy_from_user_atomic to *not* zero out
bytes on failure.

The first of these two patches prepares for the change by fixing two places
which assume copy_from_user_atomic does zero the tail.  The two usages are
very similar pieces of code which copy from a userspace iovec into one or more
page-cache pages.  These are changed to remove the assumption.

The second patch changes __copy_from_user_inatomic* to not zero the tail.
Once these are accepted, I will look at similar patches of other architectures
where this is important (ppc, mips and sparc being the ones I can find).

This patch:

There is a problem with __copy_from_user_inatomic zeroing the tail of the
buffer in the case of an error.  As it is called in atomic context, the error
may be transient, so it results in zeros being written where maybe they
shouldn't be.

In the usage in filemap, this opens a window for a well timed read to see data
(zeros) which is not consistent with any ordering of reads and writes.

Most cases where __copy_from_user_inatomic is called, a failure results in
__copy_from_user being called immediately.  As long as the latter zeros the
tail, the former doesn't need to.  However in *copy_from_user_iovec
implementations (in both filemap and ntfs/file), it is assumed that
copy_from_user_inatomic will zero the tail.

This patch removes that assumption, so that after this patch it will
be safe for copy_from_user_inatomic to not zero the tail.

This patch also adds some commentary to filemap.h and asm-i386/uaccess.h.

After this patch, all architectures that might disable preempt when
kmap_atomic is called need to have their __copy_from_user_inatomic* "fixed".
This includes
 - powerpc
 - i386
 - mips
 - sparc

Signed-off-by: Neil Brown <neilb@suse.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Anton Altaparmakov <aia21@cantab.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:09 -07:00
Matt Mackall
afedfd016a [PATCH] random: remove SA_SAMPLE_RANDOM from floppy driver
The floppy driver is already calling add_disk_randomness as it should, so this
was redundant.

Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:00 -07:00
Jeremy Fitzhardinge
e75eac33b5 [PATCH] Clean up and refactor i386 sub-architecture setup
Clean up and refactor i386 sub-architecture setup.

This change moves all the code from the
asm-i386/mach-*/setup_arch_pre/post.h headers, into
arch/i386/mach-*/setup.c.  mach-*/setup_arch_pre.h is renamed to
setup_arch.h, and contains only things which should be in header files.  It
is purely code-motion; there should be no functional changes at all.

Several functions in arch/i386/kernel/setup.c needed to be made non-static
so that they're visible to the code in mach-*/setup.c.  asm-i386/setup.h is
used to hold the prototypes for these functions.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Christian Limpach <Christian.Limpach@cl.cam.ac.uk>
Cc: Martin Bligh <mbligh@google.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Andrey Panin <pazke@donpac.ru>
Cc: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:00:55 -07:00
Linus Torvalds
37224470c8 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (65 commits)
  ACPI: suppress power button event on S3 resume
  ACPI: resolve merge conflict between sem2mutex and processor_perflib.c
  ACPI: use for_each_possible_cpu() instead of for_each_cpu()
  ACPI: delete newly added debugging macros in processor_perflib.c
  ACPI: UP build fix for bugzilla-5737
  Enable P-state software coordination via _PDC
  P-state software coordination for speedstep-centrino
  P-state software coordination for acpi-cpufreq
  P-state software coordination for ACPI core
  ACPI: create acpi_thermal_resume()
  ACPI: create acpi_fan_suspend()/acpi_fan_resume()
  ACPI: pass pm_message_t from acpi_device_suspend() to root_suspend()
  ACPI: create acpi_device_suspend()/acpi_device_resume()
  ACPI: replace spin_lock_irq with mutex for ec poll mode
  ACPI: Allow a WAN module enable/disable on a Thinkpad X60.
  sem2mutex: acpi, acpi_link_lock
  ACPI: delete unused acpi_bus_drivers_lock
  sem2mutex: drivers/acpi/processor_perflib.c
  ACPI add ia64 exports to build acpi_memhotplug as a module
  ACPI: asus_acpi_init(): propagate correct return value
  ...

Manual resolve of conflicts in:

	arch/i386/kernel/cpu/cpufreq/acpi-cpufreq.c
	arch/i386/kernel/cpu/cpufreq/speedstep-centrino.c
	include/acpi/processor.h
2006-06-23 07:52:36 -07:00
Kirill Smelkov
30343d6c3d [PATCH] x86: compile fix for asm-i386/alternatives.h
compile fix:  <asm-i386/alternative.h>  needs  <asm/types.h> for 'u8' --
just look at struct alt_instr.

My module includes <asm/bitops.h> as the first header, and as of 2.6.17 this
leads to compilation errors.

Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:59 -07:00
Michal Ludvig
224f611c16 [PATCH] x86: VIA C7 CPU flags
New CPU flags for next generation of crypto engine as found in VIA C7
processors.

Signed-off-by: Michal Ludvig <michal@logix.cz>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:59 -07:00
Roman Zippel
722f4f5b26 [PATCH] x86: fix __range_ok constraint
An immediate operand can't be the destination of the cmpl instruction,
so exclude it.

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Cc: Mattia Dongili <malattia@linux.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:58 -07:00
Alexey Dobriyan
a03a3e287b [PATCH] Don't trigger full rebuild via CONFIG_X86_MCE
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:56 -07:00
Alexey Dobriyan
27b07da733 [PATCH] Don't trigger full rebuild via CONFIG_MTRR
Only drm, framebuffer, mtrr parts + misc files here and there.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:56 -07:00
Adrian Bunk
a0b4da91f4 [PATCH] arch/i386/kernel/apic.c: make modern_apic() static
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:56 -07:00
Hiro Yoshioka
c22ce143d1 [PATCH] x86: cache pollution aware __copy_from_user_ll()
Use the x86 cache-bypassing copy instructions for copy_from_user().

Some performance data are

Total of GLOBAL_POWER_EVENTS (CPU cycle samples)

2.6.12.4.orig    1921587
2.6.12.4.nt      1599424
1599424/1921587=83.23% (16.77% reduction)

BSQ_CACHE_REFERENCE (L3 cache miss)
2.6.12.4.orig      57427
2.6.12.4.nt        20858
20858/57427=36.32% (63.7% reduction)

L3 cache miss reduction of __copy_from_user_ll
samples  %
37408    65.1412  vmlinux                  __copy_from_user_ll
23        0.1103  vmlinux                  __copy_user_zeroing_intel_nocache
23/37408=0.061% (99.94% reduction)

Top 5 of 2.6.12.4.nt
Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) with a unit mask of 0x01 (mandatory) count 100000
samples  %        app name                 symbol name
128392    8.0274  vmlinux                  __copy_user_zeroing_intel_nocache
64206     4.0143  vmlinux                  journal_add_journal_head
59746     3.7355  vmlinux                  do_get_write_access
47674     2.9807  vmlinux                  journal_put_journal_head
46021     2.8774  vmlinux                  journal_dirty_metadata
pattern9-0-cpu4-0-09011728/summary.out

Counted BSQ_CACHE_REFERENCE events (cache references seen by the bus unit) with a unit mask of 0x3f (multiple flags) count 3000
samples  %        app name                 symbol name
69755     4.2861  vmlinux                  __copy_user_zeroing_intel_nocache
55685     3.4215  vmlinux                  journal_add_journal_head
52371     3.2179  vmlinux                  __find_get_block
45504     2.7960  vmlinux                  journal_put_journal_head
36005     2.2123  vmlinux                  journal_stop
pattern9-0-cpu4-0-09011744/summary.out

Counted BSQ_CACHE_REFERENCE events (cache references seen by the bus unit) with a unit mask of 0x200 (read 3rd level cache miss) count 3000
samples  %        app name                 symbol name
1147      5.4994  vmlinux                  journal_add_journal_head
881       4.2240  vmlinux                  journal_dirty_data
872       4.1809  vmlinux                  blk_rq_map_sg
734       3.5192  vmlinux                  journal_commit_transaction
617       2.9582  vmlinux                  radix_tree_delete
pattern9-0-cpu4-0-09011731/summary.out

iozone results are

original 2.6.12.4 CPU time = 207.768 sec
cache aware       CPU time = 184.783 sec
(three times run)
184.783/207.768=88.94% (11.06% reduction)

original:
pattern9-0-cpu4-0-08191720/iozone.out:  CPU Utilization: Wall time   45.997    CPU time   64.527    CPU utilization 140.28 %
pattern9-0-cpu4-0-08191741/iozone.out:  CPU Utilization: Wall time   46.878    CPU time   71.933    CPU utilization 153.45 %
pattern9-0-cpu4-0-08191743/iozone.out:  CPU Utilization: Wall time   45.152    CPU time   71.308    CPU utilization 157.93 %

cache awre:
pattern9-0-cpu4-0-09011728/iozone.out:  CPU Utilization: Wall time   44.842    CPU time   62.465    CPU utilization 139.30 %
pattern9-0-cpu4-0-09011731/iozone.out:  CPU Utilization: Wall time   44.718    CPU time   59.273    CPU utilization 132.55 %
pattern9-0-cpu4-0-09011744/iozone.out:  CPU Utilization: Wall time   44.367    CPU time   63.045    CPU utilization 142.10 %

Signed-off-by: Hiro Yoshioka <hyoshiok@miraclelinux.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:56 -07:00
Christoph Lameter
1b2db9fb7a [PATCH] sys_move_pages: 32bit support (i386, x86_64)
sys_move_pages() support for 32bit (i386 plus x86_64 compat layer)

Add support for move_pages() on i386 and also add the compat functions
necessary to run 32 bit binaries on x86_64.

Add compat_sys_move_pages to the x86_64 32bit binary layer.  Note that it is
not up to date so I added the missing pieces.  Not sure if this is done the
right way.

[akpm@osdl.org: compile fix]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:53 -07:00
Linus Torvalds
6c763eb9ea Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: (27 commits)
  [PATCH] PCI: nVidia quirk to make AER PCI-E extended capability visible
  [PATCH] PCI: fix issues with extended conf space when MMCONFIG disabled because of e820
  [PATCH] PCI: Bus Parity Status sysfs interface
  [PATCH] PCI: fix memory leak in MMCONFIG error path
  [PATCH] PCI: fix error with pci_get_device() call in the mpc85xx driver
  [PATCH] PCI: MSI-K8T-Neo2-Fir: run only where needed
  [PATCH] PCI: fix race with pci_walk_bus and pci_destroy_dev
  [PATCH] PCI: clean up pci documentation to be more specific
  [PATCH] PCI: remove unneeded msi code
  [PATCH] PCI: don't move ioapics below PCI bridge
  [PATCH] PCI: cleanup unused variable about msi driver
  [PATCH] PCI: disable msi mode in pci_disable_device
  [PATCH] PCI: Allow MSI to work on kexec kernel
  [PATCH] PCI: AMD 8131 MSI quirk called too late, bus_flags not inherited ?
  [PATCH] PCI: Move various PCI IDs to header file
  [PATCH] PCI Bus Parity Status-broken hardware attribute, EDAC foundation
  [PATCH] PCI: i386/x86_84: disable PCI resource decode on device disable
  [PATCH] PCI ACPI: Rename the functions to avoid multiple instances.
  [PATCH] PCI: don't enable device if already enabled
  [PATCH] PCI: Add a "enable" sysfs attribute to the pci devices to allow userspace (Xorg) to enable devices without doing foul direct access
  ...
2006-06-22 15:07:59 -07:00
Bjorn Helgaas
4f1bcaf094 [PATCH] vgacon: make VGA_MAP_MEM take size, remove extra use
VGA_MAP_MEM translates to ioremap() on some architectures.  It makes sense
to do this to vga_vram_base, because we're going to access memory between
vga_vram_base and vga_vram_end.

But it doesn't really make sense to map starting at vga_vram_end, because
we aren't going to access memory starting there.  On ia64, which always has
to be different, ioremapping vga_vram_end gives you something completely
incompatible with ioremapped vga_vram_start, so vga_vram_size ends up being
nonsense.

As a bonus, we often know the size up front, so we can use ioremap()
correctly, rather than giving it a zero size.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-22 15:05:58 -07:00
bibo,mao
b209a6ee49 [PATCH] PCI: cleanup unused variable about msi driver
In IA64 platform, msi driver does not use irq_vector variable, and in
x86 platform LAST_DEVICE_VECTOR should one before FIRST_SYSTEM_VECTOR,
this patch modify this.

Signed-off-by: bibo, mao <bibo.mao@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21 12:00:00 -07:00
Mark Maule
fd58e55fcf [PATCH] PCI: msi abstractions and support for altix
Abstract portions of the MSI core for platforms that do not use standard
APIC interrupt controllers.  This is implemented through a new arch-specific
msi setup routine, and a set of msi ops which can be set on a per platform
basis.

Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21 11:59:58 -07:00
David Woodhouse
c509075e71 Add Kbuild file for i386 'make headers_install'
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-06-18 12:19:46 +01:00
Len Brown
63518472c0 Pull trivial1 into release branch 2006-06-15 15:37:09 -04:00
David Woodhouse
66643de455 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:

	include/asm-powerpc/unistd.h
	include/asm-sparc/unistd.h
	include/asm-sparc64/unistd.h

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-24 09:22:21 +01:00
Kimball Murray
e0c1e9bf81 [PATCH] x86_64: avoid IRQ0 ioapic pin collision
The patch addresses a problem with ACPI SCI interrupt entry, which gets
re-used, and the IRQ is assigned to another unrelated device.  The patch
corrects the code such that SCI IRQ is skipped and duplicate entry is
avoided.  Second issue came up with VIA chipset, the problem was caused by
original patch assigning IRQs starting 16 and up.  The VIA chipset uses
4-bit IRQ register for internal interrupt routing, and therefore cannot
handle IRQ numbers assigned to its devices.  The patch corrects this
problem by allowing PCI IRQs below 16.

Cc: len.brown@intel.com

Signed-off by: Natalie Protasevich <Natalie.Protasevich@unisys.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-08 09:34:56 -07:00
David Woodhouse
b07019f293 Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-04-30 20:34:39 +01:00
Chuck Ebbert
543f2a3382 [PATCH] i386: fix broken FP exception handling
The FXSAVE information leak patch introduced a bug in FP exception
handling: it clears FP exceptions only when there are already
none outstanding.  Mikael Pettersson reported that causes problems
with the Erlang runtime and has tested this fix.

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Acked-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-29 14:13:16 -07:00
David Woodhouse
5614253686 Remove unneeded _syscallX macros from user view in asm-*/unistd.h
These aren't needed by glibc or klibc, and they're broken in some cases
anyway. The uClibc folks are apparently switching over to stop using
them too (now that we agreed that they should be dropped, at least).

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-04-29 01:51:47 +01:00
David Woodhouse
d6754b401a Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2006-04-29 01:42:26 +01:00
Zachary Amsden
6e5882cfa2 [PATCH] x86/PAE: Fix pte_clear for the >4GB RAM case
Proposed fix for ptep_get_and_clear_full PAE bug.  Pte_clear had the same bug,
so use the same fix for both.  Turns out pmd_clear had it as well, but pgds
are not affected.

The problem is rather intricate.  Page table entries in PAE mode are 64-bits
wide, but the only atomic 8-byte write operation available in 32-bit mode is
cmpxchg8b, which is expensive (at least on P4), and thus avoided.  But it can
happen that the processor may prefetch entries into the TLB in the middle of an
operation which clears a page table entry.  So one must always clear the P-bit
in the low word of the page table entry first when clearing it.

Since the sequence *ptep = __pte(0) leaves the order of the write dependent on
the compiler, it must be coded explicitly as a clear of the low word followed
by a clear of the high word.  Further, there must be a write memory barrier
here to enforce proper ordering by the compiler (and, in the future, by the
processor as well).

On > 4GB memory machines, the implementation of pte_clear for PAE was clearly
deficient, as it could leave virtual mappings of physical memory above 4GB
aliased to memory below 4GB in the TLB.  The implementation of
ptep_get_and_clear_full has a similar bug, although not nearly as likely to
occur, since the mappings being cleared are in the process of being destroyed,
and should never be dereferenced again.

But, as luck would have it, it is possible to trigger bugs even without ever
dereferencing these bogus TLB mappings, even if the clear is followed fairly
soon after with a TLB flush or invalidation.  The problem is that memory above
4GB may now be aliased into the first 4GB of memory, and in fact, may hit a
region of memory with non-memory semantics.  These regions include AGP and PCI
space.  As such, these memory regions are not cached by the processor.  This
introduces the bug.

The processor can speculate memory operations, including memory writes, as long
as they are committed with the proper ordering.  Speculating a memory write to
a linear address that has a bogus TLB mapping is possible.  Normally, the
speculation is harmless.  But for cached memory, it does leave the falsely
speculated cacheline unmodified, but in a dirty state.  This cache line will be
eventually written back.  If this cacheline happens to intersect a region of
memory that is not protected by the cache coherency protocol, it can corrupt
data in I/O memory, which is generally a very bad thing to do, and can cause
total system failure or just plain undefined behavior.

These bugs are extremely unlikely, but the severity is of such magnitude, and
the fix so simple that I think fixing them immediately is justified.  Also,
they are nearly impossible to debug.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-27 12:00:59 -07:00
David Woodhouse
cd469e0cc6 Exclude asm-generic/{page,memory_model}.h from user bits of i386/x86_64 page.h
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-04-27 15:48:08 +01:00
David Woodhouse
62c4f0a2d5 Don't include linux/config.h from anywhere else in include/
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-04-26 12:56:16 +01:00
Jens Axboe
912d35f867 [PATCH] Add support for the sys_vmsplice syscall
sys_splice() moves data to/from pipes with a file input/output. sys_vmsplice()
moves data to a pipe, with the input being a user address range instead.

This uses an approach suggested by Linus, where we can hold partial ranges
inside the pages[] map. Hopefully this will be useful for network
receive support as well.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-26 10:59:21 +02:00
Andi Kleen
18bd057b14 [PATCH] i386/x86-64: Fix x87 information leak between processes
AMD K7/K8 CPUs only save/restore the FOP/FIP/FDP x87 registers in FXSAVE
when an exception is pending.  This means the value leak through
context switches and allow processes to observe some x87 instruction
state of other processes.

This was actually documented by AMD, but nobody recognized it as
being different from Intel before.

The fix first adds an optimization: instead of unconditionally
calling FNCLEX after each FXSAVE test if ES is pending and skip
it when not needed. Then do a x87 load from a kernel variable to
clear FOP/FIP/FDP.

This means other processes always will only see a constant value
defined by the kernel in their FP state.

I took some pain to make sure to chose a variable that's already
in L1 during context switch to make the overhead of this low.

Also alternative() is used to patch away the new code on CPUs
who don't need it.

Patch for both i386/x86-64.

The problem was discovered originally by Jan Beulich. Richard
Brunner provided the basic code for the workarounds, with contribution
from Jan.

This is CVE-2006-1056

Cc: richard.brunner@amd.com
Cc: jbeulich@novell.com

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-20 07:58:11 -07:00
lepton
1bb858f27e [PATCH] asm-i386/atomic.h: local_irq_save should be used instead of local_irq_disable
atomic_add_return() if CONFIG_M386 can accidentally enable local interrupts.

Signed-off-by: Lepton Wu <ytht.net@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-19 09:13:50 -07:00
Jens Axboe
70524490ee [PATCH] splice: add support for sys_tee()
Basically an in-kernel implementation of tee, which uses splice and the
pipe buffers as an intelligent way to pass data around by reference.

Where the user space tee consumes the input and produces a stdout and
file output, this syscall merely duplicates the data inside a pipe to
another pipe. No data is copied, the output just grabs a reference to the
input pipe data.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11 15:51:17 +02:00
Linus Torvalds
88dd9c16ce Merge branch 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block
* 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block:
  [PATCH] vfs: add splice_write and splice_read to documentation
  [PATCH] Remove sys_ prefix of new syscalls from __NR_sys_*
  [PATCH] splice: warning fix
  [PATCH] another round of fs/pipe.c cleanups
  [PATCH] splice: comment styles
  [PATCH] splice: add Ingo as addition copyright holder
  [PATCH] splice: unlikely() optimizations
  [PATCH] splice: speedups and optimizations
  [PATCH] pipe.c/fifo.c code cleanups
  [PATCH] get rid of the PIPE_*() macros
  [PATCH] splice: speedup __generic_file_splice_read
  [PATCH] splice: add direct fd <-> fd splicing support
  [PATCH] splice: add optional input and output offsets
  [PATCH] introduce a "kernel-internal pipe object" abstraction
  [PATCH] splice: be smarter about calling do_page_cache_readahead()
  [PATCH] splice: optimize the splice buffer mapping
  [PATCH] splice: cleanup __generic_file_splice_read()
  [PATCH] splice: only call wake_up_interruptible() when we really have to
  [PATCH] splice: potential !page dereference
  [PATCH] splice: mark the io page as accessed
2006-04-11 06:34:02 -07:00
Andrew Morton
491d4bed80 [PATCH] sys_kexec_load() naming fixups
__NR_sys_kexec_load should be __NR_kexec_load.  Mainly affects users of the
_syscallN() macros, and glibc is already checking for __NR_kexec_load.

Cc: Ulrich Drepper <drepper@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:42 -07:00
Yasunori Goto
c80d79d746 [PATCH] Configurable NODES_SHIFT
Current implementations define NODES_SHIFT in include/asm-xxx/numnodes.h for
each arch.  Its definition is sometimes configurable.  Indeed, ia64 defines 5
NODES_SHIFT values in the current git tree.  But it looks a bit messy.

SGI-SN2(ia64) system requires 1024 nodes, and the number of nodes already has
been changeable by config.  Suitable node's number may be changed in the
future even if it is other architecture.  So, I wrote configurable node's
number.

This patch set defines just default value for each arch which needs multi
nodes except ia64.  But, it is easy to change to configurable if necessary.

On ia64 the number of nodes can be already configured in generic ia64 and SN2
config.  But, NODES_SHIFT is defined for DIG64 and HP'S machine too.  So, I
changed it so that all platforms can be configured via CONFIG_NODES_SHIFT.  It
would be simpler.

See also: http://marc.theaimsgroup.com/?l=linux-kernel&m=114358010523896&w=2

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:39 -07:00
Randy Dunlap
dc8cbaed57 [PATCH] mptspec: remove duplicate #include
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:34 -07:00
OGAWA Hirofumi
7519fdc90f [PATCH] Remove sys_ prefix of new syscalls from __NR_sys_*
On i386, we don't use sys_ prefix for __NR_*. This patch removes it
[FWIW, _syscall*() macros will generate foo() instead of sys_foo().]

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11 14:00:04 +02:00
Jordan Hargrave
b20367a6c2 [PATCH] x86_64: Fix drift with HPET timer enabled
If the HPET timer is enabled, the clock can drift by ~3 seconds a day.
This is due to the HPET timer not being initialized with the correct
setting (still using PIT count).

If HZ changes, this drift can become even more pronounced.

HPET patch initializes tick_nsec with correct tick_nsec settings for
HPET timer.

Vojtech comments:

  "It's not entirely correct (it assumes the HPET ticks totally
   exactly), but it's significantly better than assuming the PIT error
   there."

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-09 11:53:53 -07:00
Andi Kleen
95d769aaf4 [PATCH] i386: Consolidate modern APIC handling
AMD systems have a modern APIC that supports 8 bit IDs, but
don't have a XAPIC version number.  Add a new "modern_apic"
subfunction that handles this correctly and use it (nearly)
everywhere where XAPIC is tested for.

I removed one wart: the code specified that external APICs
would use an 8bit APIC ID. But I checked a real 82093 data sheet
and it says clearly that they only use 4bit. So I removed
this special case since it would a bit awkward to implement now.

I removed the valid APIC tests in mptable parsing completely. On any modern
system they only check against the full field width (8bit) anyways
and are no-ops. This also fixes them doing the wrong thing
on >8 core Opterons.

This makes i386 boot again on 16 core Opterons.

Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-09 11:53:51 -07:00
Arjan van de Ven
952223683e [PATCH] x86_64: Introduce e820_all_mapped
Introduce a e820_all_mapped() function which checks if the entire range
<start,end> is mapped with type.

This is done by moving the local start variable to the end of each
known-good region; if at the end of the function the start address is
still before end, there must be a part that's not of the correct type;
otherwise it's a good region.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-09 11:53:50 -07:00
Ashok Raj
c12ea918ee x86_64: Remove stale lapic definition from apicdef.h
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2006-04-01 22:50:03 -05:00
Andrew Morton
2cf8d82d63 [PATCH] make local_t signed
local_t's were defined to be unsigned.  This increases confusion because
atomic_t's are signed.  The patch goes through and changes all implementations
to use signed longs throughout.

Also, x86-64 was using 32-bit quantities for the value passed into local_add()
and local_sub().  Fixed.

All (actually, both) existing users have been audited.

(Also s/__inline__/inline/ in x86_64/local.h)

Cc: Andi Kleen <ak@muc.de>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Kyle McMartin <kyle@parisc-linux.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:55 -08:00
Andrew Morton
f79e2abb9b [PATCH] sys_sync_file_range()
Remove the recently-added LINUX_FADV_ASYNC_WRITE and LINUX_FADV_WRITE_WAIT
fadvise() additions, do it in a new sys_sync_file_range() syscall instead.
Reasons:

- It's more flexible.  Things which would require two or three syscalls with
  fadvise() can be done in a single syscall.

- Using fadvise() in this manner is something not covered by POSIX.

The patch wires up the syscall for x86.

The sycall is implemented in the new fs/sync.c.  The intention is that we can
move sys_fsync(), sys_fdatasync() and perhaps sys_sync() into there later.

Documentation for the syscall is in fs/sync.c.

A test app (sync_file_range.c) is in
http://www.zip.com.au/~akpm/linux/patches/stuff/ext3-tools.tar.gz.

The available-to-GPL-modules do_sync_file_range() is for knfsd: "A COMMIT can
say NFS_DATA_SYNC or NFS_FILE_SYNC.  I can skip the ->fsync call for
NFS_DATA_SYNC which is hopefully the more common."

Note: the `async' writeout mode SYNC_FILE_RANGE_WRITE will turn synchronous if
the queue is congested.  This is trivial to fix: add a new flag bit, set
wbc->nonblocking.  But I'm not sure that we want to expose implementation
details down to that level.

Note: it's notable that we can sync an fd which wasn't opened for writing.
Same with fsync() and fdatasync()).

Note: the code takes some care to handle attempts to sync file contents
outside the 16TB offset on 32-bit machines.  It makes such attempts appear to
succeed, for best 32-bit/64-bit compatibility.  Perhaps it should make such
requests fail...

Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:54 -08:00