Commit graph

1021 commits

Author SHA1 Message Date
Andrew Morton
b912a1c737 x86: arch/x86/kernel/cpu/mcheck/ checkpatch fixes
#40: FILE: arch/x86/kernel/cpu/mcheck/k7.c:46:
+				snprintf (misc, 20, "[%08x%08x]", ahigh, alow);

WARNING: line over 80 characters
#45: FILE: arch/x86/kernel/cpu/mcheck/k7.c:50:
+				snprintf (addr, 24, " at %08x%08x", ahigh, alow);

WARNING: no space between function name and open parenthesis '('
#45: FILE: arch/x86/kernel/cpu/mcheck/k7.c:50:
+				snprintf (addr, 24, " at %08x%08x", ahigh, alow);

WARNING: no space between function name and open parenthesis '('
#48: FILE: arch/x86/kernel/cpu/mcheck/k7.c:52:
+			printk (KERN_EMERG "CPU %d: Bank %d: %08x%08x%s%s\n",

WARNING: no space between function name and open parenthesis '('
#65: FILE: arch/x86/kernel/cpu/mcheck/p4.c:161:
+		printk (KERN_DEBUG "CPU %d: EIP: %08x EFLAGS: %08x\n"

WARNING: no space between function name and open parenthesis '('
#88: FILE: arch/x86/kernel/cpu/mcheck/p4.c:182:
+				snprintf (misc, 20, "[%08x%08x]", ahigh, alow);

WARNING: line over 80 characters
#93: FILE: arch/x86/kernel/cpu/mcheck/p4.c:186:
+				snprintf (addr, 24, " at %08x%08x", ahigh, alow);

WARNING: no space between function name and open parenthesis '('
#93: FILE: arch/x86/kernel/cpu/mcheck/p4.c:186:
+				snprintf (addr, 24, " at %08x%08x", ahigh, alow);

WARNING: no space between function name and open parenthesis '('
#96: FILE: arch/x86/kernel/cpu/mcheck/p4.c:188:
+			printk (KERN_EMERG "CPU %d: Bank %d: %08x%08x%s%s\n",

WARNING: no space between function name and open parenthesis '('
#120: FILE: arch/x86/kernel/cpu/mcheck/p6.c:46:
+				snprintf (misc, 20, "[%08x%08x]", ahigh, alow);

WARNING: line over 80 characters
#125: FILE: arch/x86/kernel/cpu/mcheck/p6.c:50:
+				snprintf (addr, 24, " at %08x%08x", ahigh, alow);

WARNING: no space between function name and open parenthesis '('
#125: FILE: arch/x86/kernel/cpu/mcheck/p6.c:50:
+				snprintf (addr, 24, " at %08x%08x", ahigh, alow);

WARNING: no space between function name and open parenthesis '('
#128: FILE: arch/x86/kernel/cpu/mcheck/p6.c:52:
+			printk (KERN_EMERG "CPU %d: Bank %d: %08x%08x%s%s\n",

total: 0 errors, 13 warnings, 100 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Min Zhang <mzhang@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:32:12 +01:00
Min Zhang
9e8b6d90ac arch/x86/kernel/cpu/mcheck/p4.c: cleanups
SMP, the machine check exception dispatches all logical processors within a
physical package to the machine-check exception handler, so the printk
within each handler outputs concurrently and makes the output unreadable.
Refer to Intel system programming guide Part 1 Section 7.8.5
http://developer.intel.com/design/processor/manuals/253668.pdf

Signed-off-by: Min Zhang <mzhang@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:32:11 +01:00
Huang, Ying
8b2cb7a8f5 x86: 32-bit EFI runtime service support: fixes in sync with 64-bit support
support according to fixes of x86_64 support.

- Delete efi_rt_lock because it is used during system early boot,
  before SMP is initialized.

- Change local_flush_tlb() to __flush_tlb_all() to flush global page
  mapping.

- Clean up includes.

- Revise Kconfig description.

- Enable noefi kernel parameter on i386.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:32:11 +01:00
Glauber de Oliveira Costa
bfd074e05b replace x86_read/write_per_cpu with a common function.
x86_read_per_cpu() and its writeish sister are not present in x86_64. So in
this patch, we replace them with __get_cpu_var(), which is present in both

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:32:11 +01:00
Glauber de Oliveira Costa
53fd13cff0 x86: patching functions on 64-bit
Like i386, x86_64 also need to include its own patching function.
(Well, if you're not in a hurry, and don't care about speed, you don't
really _need_ ;-))

So here they are. Not much different in essence from i386

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:32:10 +01:00
Glauber de Oliveira Costa
2f485ef568 x86: move patching code to arch-specific file.
The core patching code for paravirt is sufficiently different
among i386 and x86_64, and we move them to specific files.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:32:10 +01:00
Glauber de Oliveira Costa
72fe485854 x86: replace privileged instructions with paravirt macros
The assembly code in entry_64.S issues a bunch of privileged instructions,
like cli, sti, swapgs, and others. Paravirt guests are forbidden to do so,
and we then replace them with macros that will do the right thing.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:32:08 +01:00
Glauber de Oliveira Costa
e801f864ec x86: adds paravirt hook for swapgs
This patch adds paravirt hook for swapgs operation, which is a privileged
operation in x86_64.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:32:08 +01:00
Glauber de Oliveira Costa
e5aaac4436 x86: provide paravirtualized hook for rdtscp
This patch adds a field in pv_cpu_ops for a paravirtualized hook
for rdtscp, needed for x86_64.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:32:05 +01:00
Glauber de Oliveira Costa
b1df07bd66 x86: change paravirt_32.c name
This patch changes paravirt_32.c to paravirt.c. The goal
is to have paravirt support in x86_64, so we do it in a common file

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:32:04 +01:00
Markus Metzger
cba4b65d35 x86, ptrace: add buffer size checks
Pass the buffer size for (most) ptrace commands that pass user-allocated buffers and check that size before accessing the buffer. Unfortunately, PTRACE_BTS_GET already uses all 4 parameters.
Commands that access user buffers return the number of bytes or records read or written.

Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:32:03 +01:00
Markus Metzger
e6ae5d9540 x86, ptrace: support 32bit-cross-64bit BTS recording
Support BTS recording of 32bit and 64bit tasks from 32bit or 64bit tasks.

Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:32:03 +01:00
Markus Metzger
da35c37198 x86, ptrace: rlimit BTS buffer allocation
Check the rlimit of the tracing task for total and locked memory when allocating the BTS buffer.

Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:32:03 +01:00
Masami Hiramatsu
59e87cdcd2 x86: move deeply indented code to reenter_kprobe
Move some deeply indented code related to re-entrance processing
from kprobe_handler() to reenter_kprobe().

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:32:02 +01:00
Harvey Harrison
40102d4a41 x86: add reenter_kprobe helper
[ mhiramat@redhat.com: updated it to latest x86.git ]

Factor common X86_32, X86_64 kprobe reenter logic from deeply
indented section to helper function.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Jim Keniston <jkenisto@us.ibm.com>
2008-01-30 13:32:02 +01:00
Masami Hiramatsu
ddc66df876 x86: fix kprobe_handler reenable preemption
Fix a preemption bug in kprobe_handler(). It has to call preempt_enable()
before returning.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:32:01 +01:00
Venki Pallipadi
bde6f5f59c x86: voluntary leave_mm before entering ACPI C3
Aviod TLB flush IPIs during C3 states by voluntary leave_mm()
before entering C3.

The performance impact of TLB flush on C3 should not be significant with
respect to C3 wakeup latency. Also, CPUs tend to flush TLB in hardware while in
C3 anyways.

On a 8 logical CPU system, running make -j2, the number of tlbflush IPIs goes
down from 40 per second to ~ 0. Total number of interrupts during the run
of this workload was ~1200 per second, which makes it ~3% savings in wakeups.

There was no measurable performance or power impact however.

[ akpm@linux-foundation.org: symbol export fixes. ]

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:32:01 +01:00
Ingo Molnar
7d409d6057 x86: add some pirq debugging
we use a few static mapping rules in our pirq routing functions,
and for example regression f3ac84324f was due to the pirq
being out of range of the remapping array. Put in a few
WARN_ON_ONCE() lines so that we get notified about any such
out-of-bound incidents.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:32:00 +01:00
Gary Hade
9f8daccaa0 PCI: remove default PCI expansion ROM memory allocation
increasing number of PCI slots in large multi-node systems.  The kernel
currently attempts by default to allocate memory for all PCI expansion
ROMs so there has also been an increasing number of PCI memory
allocation failures seen on these systems.  This occurs because the BIOS
either (1) provides insufficient PCI memory resource for all the
expansion ROMs or (2) provides adequate PCI memory resource for
expansion ROMs but provides the space in kernel unexpected BIOS assigned
P2P non-prefetch windows.

The resulting PCI memory allocation failures may be benign when related
to memory requests for expansion ROMs themselves but in some cases they
can occur when attempting to allocate space for more critical BARs.
This can happen when a successful expansion ROM allocation request
consumes memory resource that was intended for a non-ROM BAR.  We have
seen this happen during PCI hotplug of an adapter that contains a P2P
bridge where successful memory allocation for an expansion ROM BAR on
device behind the bridge consumed memory that was intended for a non-ROM
BAR on the P2P bridge.  In all cases the allocation failure messages can
be very confusing for users.

This patch addresses the issue by changing the kernel default behavior
so that expansion ROM memory allocations are no longer attempted by
default when the BIOS has not assigned a specific address range to the
expansion ROM BAR.  This was done by changing the 'pci=rom' boot option
behavior for BIOS unassigned expansion ROMs to actually match it's
current kernel-parameters.txt description which already implies "off" by
default. Behavior for BIOS assigned expansion ROMs implemented in
pcibios_assign_resources() [arch/x86/pci/i386.c] is unchanged.

Signed-off-by: Gary Hade <garyhade@us.ibm.com>
Cc: Greg KH <greg@kroah.com>
Cc: Jan Beulich <jbeulich@novell.com>
Acked-by: "Jun'ichi Nomura" <j-nomura@ce.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:59 +01:00
Roland McGrath
fdadd54db5 x86: x86 ptrace generic requests
This removes duplicated code by calling the generic ptrace_request and
compat_ptrace_request functions for the things they already handle.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:56 +01:00
Roland McGrath
bb61682b3f x86: x86 core dump TLS
This makes ELF core dumps of 32-bit processes include a new
note type NT_386_TLS (0x200) giving the contents of the TLS
slots in struct user_desc format.  This lets post mortem
examination figure out what the segment registers mean like
the debugger does with get_thread_area on a live process.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:56 +01:00
Roland McGrath
a06b24e8bf x86: x86 ia32_binfmt removal
Remove the old ia32_binfmt.c file, which is no longer used.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:56 +01:00
Roland McGrath
a97f52e678 x86: compat_binfmt_elf
This switches x86-64's 32-bit ELF support to use the shared
fs/compat_binfmt_elf.c code instead of our own ia32_binfmt.c.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:55 +01:00
Roland McGrath
60b3b9af35 x86: x86 user_regset cleanup
This removes a bunch of dead code that is no longer needed now
that the user_regset interfaces are being used for all these jobs.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:55 +01:00
Roland McGrath
5a4646a4ef x86: x86 ptrace user_regset
This cleans up the PTRACE_*REGS* request code so each one is just a
simple call to copy_regset_to_user or copy_regset_from_user.  The
ptrace layouts already match the user_regset formats (core dump formats).

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:54 +01:00
Roland McGrath
070459d95e x86: x86 user_regset_view
This defines task_user_regset_view and the tables
describing the x86 user_regset layouts for 32 and 64.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:53 +01:00
Roland McGrath
91e7b707a4 x86: x86 user_regset general regs
This adds accessor functions in the user_regset style for
the general registers (struct user_regs_struct).

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:52 +01:00
Roland McGrath
4c79a2d8e5 x86: x86 user_regset TLS
This adds accessor functions in the user_regset style for the TLS data.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:52 +01:00
Roland McGrath
1bd5718ce5 x86: x86 TLS desc_struct cleanup
This cleans up the TLS code to use struct desc_struct and to separate the
encoding and installation magic from the interface wrappers.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:51 +01:00
Roland McGrath
1eeaed7679 x86: x86 i387 cleanup
This removes all the old code that is no longer used after
the i387 unification and cleanup.  The i387_64.h is renamed
to i387.h with no changes, but since it replaces the nonempty
one-line stub i387.h it looks like a big diff and not a rename.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:51 +01:00
Roland McGrath
4421011120 x86: x86 i387 user_regset
This revamps the i387 code to be shared across 32-bit, 64-bit,
and 32-on-64.  It does so by consolidating the code in one place
based on the user_regset accessor interfaces.  This switches
32-bit to using the i387_64.h header and 64-bit to using the
i387.c that was previously i387_32.c, but that's what took the
least cleanup in each file.  Here i387.h is stubbed to always
include i387_64.h rather than renaming the file, to keep this
diff smaller and easier to read.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:50 +01:00
Roland McGrath
b7b71725fb x86: i387 renaming
This renames arch/x86/kernel/{i387_32.c => i387.c}.
This is a pure renaming, but paves the way for merging
the 32-bit and 64-bit versions of this code.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:50 +01:00
Roland McGrath
ff0ebb23c6 x86: x86 user_regset math_emu
This converts the ptrace/signal accessors for i387 math_emu
state to the user_regset interface style, and calls these
from the old interfaces.

It also cleans up math_emulate's ptrace check to be a
single-step check, which is what it really wants.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:49 +01:00
Roland McGrath
cc927a25bd x86: x86 i387 header cleanup
This moves some code into asm-x86/i387_64.h in preparation for
unifying this code between 32 and 64.  The 32-bit versions of
some things are copied in some existing names changed to match
32-bit names and share code.  For 64, save_i387 is moved into
an inline from i387_64.c; this matches restore_i387, which is
already an inline, and makes sense since there is exactly one
caller (in signal_64.c).  The save_i387 function could use more
cosmetic cleanup, but it is just moved verbatim in this patch.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:49 +01:00
Harvey Harrison
e7b5e11eaa x86: kprobes leftover cleanups
Eliminate __always_inline, all of these static functions are
only called once.  Minor whitespace cleanup.  Eliminate one
supefluous return at end of void function.  Change the one
#ifndef to #ifdef to match the sense of the rest of the config
tests.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:43 +01:00
Joe Perches
ab4a574ef2 arch/x86/: spelling fixes
Spelling fixes.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:42 +01:00
Harvey Harrison
85f2adf169 x86: use helper in fault_64.c
Use the fixup_exception() helper in fault_64.c

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:42 +01:00
Harvey Harrison
6d48583ba9 x86: unify extable_{32|64}.c
Introduce fixup_exception() on 64-bit and use it in kprobes to
eliminate an #ifdef.

Only 64-bit needs search_extable() due to a stepping bug.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:41 +01:00
Glauber de Oliveira Costa
1a53905add x86: move definitions to processor.h
This patch moves definitions that are present in only one of the files
(between processor_32.h and processor_64.h), to processor.h. They're mostly
structures and function definitions.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:39 +01:00
Glauber de Oliveira Costa
5300db887e x86: unify x86_cpuinfo struct.
x86_cpuinfo is one more to the family of "not fundamentally different"
structs. It's unified in processor.h, with very specific fields enclosed
around ifdefs.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:33 +01:00
Glauber de Oliveira Costa
7818a1e029 x86: provide 64-bit with a load_sp0 function.
Paravirt guests need to inform the underlying hypervisor whenever the sp0
tss field changes. i386 already has such a function, and we use it for
x86_64 too. There's an unnecessary (for 64-bit) msr handling part in the original
version, and it is placed around an ifdef. Making no more sense in
processor_32.h, it is moved to the common header

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:31 +01:00
Glauber de Oliveira Costa
ca241c7503 x86: unify tss_struct
Although slighly different, the tss_struct is very similar in x86_64 and
i386. The really different part, which matchs the hardware vision of it, is
now called x86_hw_tss, and each of the architectures provides yours.
It's then used as a field in the outter tss_struct.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:31 +01:00
Glauber de Oliveira Costa
053de04441 x86: get rid of _MASK flags
There's no need for the *_MASK flags (TF_MASK, IF_MASK, etc), found in
processor.h (both _32 and _64). They have a one-to-one mapping with the
EFLAGS value. This patch removes the definitions, and use the already
existent X86_EFLAGS_ version when applicable.

[ roland@redhat.com: KVM build fixes. ]

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:27 +01:00
Neil Horman
7bcbc78dea x86: clean up arch/x86/kernel/early-quirks.c
clean up checkpatch errors. No code changed.

      text    data     bss     dec     hex filename
       705     120       0     825     339 early-quirks.o.before
       705     120       0     825     339 early-quirks.o.after

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:26 +01:00
Cyrill Gorcunov
3b095a04e7 x86: cleanup i387_32.c according to checkpatch
clean up checkpatch warnings/errors on i387_32.c

The old and new i387_32.s (asm listings) were checked with diff to
be identical so it's safe to apply this patch.

Signed-off-by: Cyrill Gorcunov <gorunov@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:26 +01:00
Neil Horman
c6b4832432 x86, kexec: force x86 arches to boot kdump kernels on boot cpu
Recently a kdump bug was discovered in which a system would hang inside
calibrate_delay during the booting of the kdump kernel.  This was caused
by the fact that the jiffies counter was not being incremented during
timer calibration.  The root cause of this problem was found to be a
bios misconfiguration of the hypertransport bus.  On system affected by
this hang, the bios had assigned APIC ids which used extended apic bits
(more than the nominal 4 bit ids's), but failed to configure bit 17 of
the hypertransport transaction config register, which indicated that the
mask for the destination field of interrupt packets accross the ht bus
(see section 3.3.9 of
http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/26094.PDF).
If a crash occurs on a cpu with an APIC id that extends beyond 4 bits,
it will not recieve interrupts during the kdump kernel boot, and this
hang will be the result.  The fix is to add this patch, whcih add an
early pci quirk check, to forcibly enable this bit in the httcfg
register.  This enables all cpus on a system to receive interrupts, and
allows kdump kernel bootup to procede normally.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:25 +01:00
Jan Beulich
e94271017f x86: adjust enable_NMI_through_LVT0()
Its previous use in a call to on_each_cpu() was pointless, as at the
time that code gets executed only one CPU is online. Further, the
function can be __cpuinit, and for this to work without
CONFIG_HOTPLUG_CPU setup_nmi() must also get an attribute (this one
can even be __init; on 64-bits check_timer() also was lacking that
attribute).

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:24 +01:00
Jan Beulich
cae4595764 x86: make __{save,restore}_processor_state static
.. allowing to remove their declarations from a global include file
(the symbols don't exist for anything but x86).

Likewise for 64-bits' fix_processor_context(), just that that one was
properly declared in an arch-specific header.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:23 +01:00
Jan Beulich
3e7622f9d7 x86: move to .rodata/.init.data
The array is never written, and on 64-bits it's not even being used
past initial boot.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:23 +01:00
Jan Beulich
22f5991c85 x86-64: honor notify_die() returning NOTIFY_STOP
This requires making die() return a value, making its callers honor
this (and be prepared that it may return), and making oops_end() have
two additional parameters.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:23 +01:00
Jan Beulich
bdb4f15606 i386: hard_{en,dis}able_TSC can be static
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:21 +01:00
Masami Hiramatsu
d6be29b871 x86: kprobes code for x86 unification
This patch unifies kprobes code.

- Unify kprobes_*.h to kprobes.h
- Unify kprobes_*.c to kprobes.c
  (Differences are separated by ifdefs)
 - Most differences are related to REX prefix and rip relatives.
 - Two inline assembly code are different.
 - One difference in kprobe_handlre()
 - One fixup exception code is different, but it will be unified
   if mm/extable_*.c are unified.
- Merge history logs into arch/x86/kernel/kprobes.c.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:21 +01:00
Masami Hiramatsu
8533bbe9f8 x86: prepare kprobes code for x86 unification
This patch cleanup kprobes code on x86 for unification.
This patch is based on Arjan's previous work.

- Remove spurious whitespace changes
- Add harmless includes
- Make the 32/64 files more identical
 - Generalize structure fields' and local variable name.
 - Wrap accessing to stack address by macros.
 - Modify bitmap making macro.
 - Merge fixup code into is_riprel() and change its name to fix_riprel().
 - Set MAX_INSN_SIZE to 16 on both arch.
 - Use u32 for bitmaps on both architectures.
 - Clarify some comments.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:21 +01:00
Masami Hiramatsu
da07ab0375 x86: return probe-booster for x86-64
This patch adds kretprobe-booster to kprobes_64.c.

- Changes are based on x86-32.
- Rewrite register saving/restoring code

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:21 +01:00
Masami Hiramatsu
aa470140e8 x86: kprobe-booster for x86-64
This patch adds kprobe-booster to kprobes_64.c.

- Changes are based on x86-32.
- Add REX prefix checking code.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:21 +01:00
Nick Piggin
314cdbefd1 x86: FIFO ticket spinlocks
Introduce ticket lock spinlocks for x86 which are FIFO. The implementation
is described in the comments. The straight-line lock/unlock instruction
sequence is slightly slower than the dec based locks on modern x86 CPUs,
however the difference is quite small on Core2 and Opteron when working out of
cache, and becomes almost insignificant even on P4 when the lock misses cache.
trylock is more significantly slower, but they are relatively rare.

On an 8 core (2 socket) Opteron, spinlock unfairness is extremely noticable,
with a userspace test having a difference of up to 2x runtime per thread, and
some threads are starved or "unfairly" granted the lock up to 1 000 000 (!)
times. After this patch, all threads appear to finish at exactly the same
time.

The memory ordering of the lock does conform to x86 standards, and the
implementation has been reviewed by Intel and AMD engineers.

The algorithm also tells us how many CPUs are contending the lock, so
lockbreak becomes trivial and we no longer have to waste 4 bytes per
spinlock for it.

After this, we can no longer spin on any locks with preempt enabled
and cannot reenable interrupts when spinning on an irq safe lock, because
at that point we have already taken a ticket and the would deadlock if
the same CPU tries to take the lock again.  These are questionable anyway:
if the lock happens to be called under a preempt or interrupt disabled section,
then it will just have the same latency problems. The real fix is to keep
critical sections short, and ensure locks are reasonably fair (which this
patch does).

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:31:21 +01:00
Nick Piggin
95c354fe9f spinlock: lockbreak cleanup
The break_lock data structure and code for spinlocks is quite nasty.
Not only does it double the size of a spinlock but it changes locking to
a potentially less optimal trylock.

Put all of that under CONFIG_GENERIC_LOCKBREAK, and introduce a
__raw_spin_is_contended that uses the lock data itself to determine whether
there are waiters on the lock, to be used if CONFIG_GENERIC_LOCKBREAK is
not set.

Rename need_lockbreak to spin_needbreak, make it use spin_is_contended to
decouple it from the spinlock implementation, and make it typesafe (rwlocks
do not have any need_lockbreak sites -- why do they even get bloated up
with that break_lock then?).

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:20 +01:00
Markus Metzger
a95d67f87e x86, ptrace: new ptrace BTS API
Here's the new ptrace BTS API that supports two different overflow handling mechanisms (wrap-around and buffer-full-signal) to support two different use cases (debugging and profiling).

It further combines buffer allocation and configuration.

Opens:
- memory rlimit
- overflow signal

What would be the right signal to use?

Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:20 +01:00
Markus Metzger
e4811f2568 x86, ptrace: change BTS GET ptrace interface
Change the ptrace interface to mimick an array from newst to oldest.

Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:20 +01:00
Markus Metzger
3c68904fee x86, ptrace: use jiffies for BTS timestamps
Replace sched_clock() with jiffies for BTS timestamps.

Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:20 +01:00
Markus Metzger
2f4aaf53c2 x86, ptrace: remove bad comment
Remove no longer correct comment.

Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:20 +01:00
Huang, Ying
2215e69d2c x86 boot: use E820 memory map on EFI 32 platform
Because the EFI memory map are converted to e820 memory map in bootloader, the
EFI memory map handling code is removed to clean up.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:19 +01:00
Huang, Ying
e429795c68 x86: EFI runtime service support: remove duplicated code from efi_32.c
This patch removes the duplicated code between efi_32.c and efi.c.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:19 +01:00
Huang, Ying
de18c850af x86: EFI runtime service support: EFI runtime services
This patch adds support for several EFI runtime services for EFI x86_64
system.

The EFI support for emergency_restart is added.

Signed-off-by: Chandramouli Narayanan <mouli@linux.intel.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:19 +01:00
Huang, Ying
5b83683f32 x86: EFI runtime service support
This patch adds basic runtime services support for EFI x86_64 system.  The
main file of the patch is the addition of efi_64.c for x86_64.  This file is
modeled after the EFI IA32 avatar.  EFI runtime services initialization are
implemented in efi_64.c.  Some x86_64 specifics are worth noting here.  On
x86_64, parameters passed to EFI firmware services need to follow the EFI
calling convention.  For this purpose, a set of functions named efi_call<x>
(<x> is the number of parameters) are implemented.  EFI function calls are
wrapped before calling the firmware service.  The duplicated code between
efi_32.c and efi_64.c is placed in efi.c to remove them from efi_32.c.

Signed-off-by: Chandramouli Narayanan <mouli@linux.intel.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:19 +01:00
Daniel Walker
8c8b8859b6 mcheck mce_64: mce_read_sem to mutex
Converted to a mutex, and changed the name to mce_read_mutex.

Signed-off-by: Daniel Walker <dwalker@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:17 +01:00
Aaron Durbin
fa20efd2fc x86: add ACPI reboot option
Add the ability to reboot an x86_64 based machine using the RESET_REG in the
FADT ACPI table.

Signed-off-by: Aaron Durbin <adurbin@google.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:17 +01:00
Harvey Harrison
75604d7f7f x86: remove all definitions with fastcall
fastcall is always defined to be empty, remove it from arch/x86

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:17 +01:00
Harvey Harrison
6b7d190b14 x86: remove last users of FASTCALL
FASTCALL() is always empty.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:16 +01:00
Glauber de Oliveira Costa
507f90c9f9 x86: move _set_gate and its users to a common location
This patch moves _set_gate and its users to desc.h. We can now
use common code for x86_64 and i386.

[ mingo@elte.hu: set_system_gate() fixes for nasty crashes. ]

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:14 +01:00
Glauber de Oliveira Costa
cc6978528c x86: modify get_desc_base
This patch makes get_desc_base() receive a struct desc_struct,
and then uses its internal fields to compute the base address.
This is done at both i386 and x86_64, and then it is moved
to common header

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:14 +01:00
Glauber de Oliveira Costa
75b8bb3e56 x86: change write_ldt_entry signature
this patch changes the signature of write_ldt_entry.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
CC: Zachary Amsden <zach@vmware.com>
CC: Jeremy Fitzhardinge <Jeremy.Fitzhardinge.citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:13 +01:00
Glauber de Oliveira Costa
014b15be30 x86: change write_gdt_entry signature.
This patch changes the write_gdt_entry function signature.
Instead of the old "a" and "b" parameters, it now receives
a pointer to a desc_struct, and the size of the entry being
handled. This is because x86_64 can have some 16-byte entries
as well as 8-byte ones.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
CC: Zachary Amsden <zach@vmware.com>
CC: Jeremy Fitzhardinge <Jeremy.Fitzhardinge.citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:13 +01:00
Glauber de Oliveira Costa
80fbb69a8d x86: introduce fill_ldt
This patch introduces fill_ldt(), which populates a ldt descriptor
from a user_desc in once, instead of relying in the LDT_entry_a and
LDT_entry_b macros

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:13 +01:00
Glauber de Oliveira Costa
5af725026f x86: modify write_ldt function
This patch modifies the write_ldt() function to make use
of the new struct desc_struct instead of entry_1 and entry_2
entries

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:13 +01:00
Glauber de Oliveira Costa
8d947344c4 x86: change write_idt_entry signature
this patch changes write_idt_entry signature. It now takes a gate_desc
instead of the a and b parameters. It will allow it to be later unified
between i386 and x86_64.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
CC: Zachary Amsden <zach@vmware.com>
CC: Jeremy Fitzhardinge <Jeremy.Fitzhardinge.citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:12 +01:00
Glauber de Oliveira Costa
010d4f8221 x86: introduce gate_desc type.
To account for the differences in gate descriptor in i386 and x86_64
a gate_desc type is introduced.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:12 +01:00
Glauber de Oliveira Costa
f6dc247cac x86: change gdt acessor macro name
This patch changes the name of x86_64 macro used to access the per-cpu
gdt. It is now equal to the i386 version, which will allow code to be shared.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:12 +01:00
Glauber de Oliveira Costa
6b68f01baa x86: unify struct desc_ptr
This patch unifies struct desc_ptr between i386 and x86_64.
They can be expressed in the exact same way in C code, only
having to change the name of one of them. As Xgt_desc_struct
is ugly and big, this is the one that goes away.

There's also a padding field in i386, but it is not really
needed in the C structure definition.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:12 +01:00
Glauber de Oliveira Costa
6842ef0e85 x86: unify desc_struct
This patch aims to make the access of struct desc_struct variables
equal across architectures. In this patch, I unify the i386 and x86_64
versions under an anonymous union, keeping the way they are accessed
untouched (a and b for 32-bit code, individual bit-fields for 64-bit).

This solution is not beautiful, but will allow us to integrate common
code that differed by the way descriptors were used. This is to be viewed
incrementally. There's simply too much code to be fixed at once.

In the future, goal is to set up in a single way of acessing
the desc_struct fields.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:11 +01:00
Glauber de Oliveira Costa
746ef0cd0c x86: prepare 64-bit architecture initialization for paravirt
This patch prepares the x86_64 architecture initialization for
paravirt. It requires a memory initialization step, which is done
by implementing 64-bit version for machine_specific_memory_setup,
and putting an ARCH_SETUP hook, for guest-dependent initialization.
This last step is done akin to i386

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:11 +01:00
Glauber de Oliveira Costa
ba082427ae x86: tweak io_64.h for paravirt.
We need something here because we can't call in and out instructions
directly. However, we have to be careful, because no indirections are
allowed in misc_64.c , and paravirt_ops is a kind of one. So just
call it directly there

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:10 +01:00
Glauber de Oliveira Costa
ee238e5ca6 x86: prepare time related functions for paravirt
This patch add provisions for time related functions so they
can be later replaced by paravirt versions.

it basically encloses {g,s}et_wallclock inside the
already existent functions update_persistent_clock and
read_persistent_clock, and defines {s,g}et_wallclock
to the core of such functions.

it also allow for a later-on-game time initialization, as done
by i386. Paravirt guests can set a function to do their own
initialization this way.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:10 +01:00
Glauber de Oliveira Costa
49a697871e x86: turn priviled operation into a macro in head_64.S
under paravirt, read cr2 cannot be issued directly anymore.
So wrap it in a macro, defined to the operation itself in case
paravirt is off, but to something else if we have paravirt
in the game

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:10 +01:00
Glauber de Oliveira Costa
70fd93c9d9 x86: export cpu_gdt_descr
With paravirualization, hypervisors needs to handle the gdt,
that was right to this point only used at very early
inialization code. Hypervisors (lguest being the current case)
are commonly modules, so make it an export

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:10 +01:00
Glauber de Oliveira Costa
21db5584f9 x86: export math_state_restore
Export math_state_restore symbol, so it can be used for hypervisors.
They are commonly loaded as modules (lguest being an example).

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:10 +01:00
Harvey Harrison
675a081360 x86: unify mmap_{32|64}.c
mmap_is_ia32 always true for X86_32, or while emulating IA32 on X86_64

Randomization not supported on X86_32 in legacy layout.  Both layouts allow
randomization on X86_64.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:10 +01:00
Jeremy Fitzhardinge
f3f20de87c x86: clean up mm/init_32.c
Some code reformatting in init_32.c.  No functional change.

Signed-off-by: Jeremy Fitzhardinge <Jeremy.Fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:09 +01:00
Jeremy Fitzhardinge
27ec161f93 x86: kill mk_pte_huge
It only has a single use, which can be trivially replaced.

Signed-off-by: Jeremy Fitzhardinge <Jeremy.Fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:09 +01:00
Markus Metzger
eee3af4a2c x86, ptrace: support for branch trace store(BTS)
Resend using different mail client

Changes to the last version:
- split implementation into two layers: ds/bts and ptrace
- renamed TIF's
- save/restore ds save area msr in __switch_to_xtra()
- make block-stepping only look at BTF bit

Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:09 +01:00
Glauber de Oliveira Costa
d89542229b x86: put together equal pieces of system.h
This patch puts together pieces of system_{32,64}.h that
looks like the same. It's the first step towards integration
of this file.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:08 +01:00
Joerg Roedel
929fd58913 x86: some whitespace cleanups in paging code
This patch does some whitespace cleanups in the paging code to fix some
checkpatch.pl warnings of my formerly merged cleanup patches.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:08 +01:00
Andrew Morton
954683a2c1 x86: PIE executable randomization, uninlining
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Jakub Jelinek <jakub@redhat.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:07 +01:00
Andrew Morton
bb1ad8205b x86: PIE executable randomization, checkpatch fixes
#39: FILE: arch/ia64/ia32/binfmt_elf32.c:229:
+elf32_map (struct file *filep, unsigned long addr, struct elf_phdr *eppnt, int prot, int type, unsigned long unused)

WARNING: no space between function name and open parenthesis '('
#39: FILE: arch/ia64/ia32/binfmt_elf32.c:229:
+elf32_map (struct file *filep, unsigned long addr, struct elf_phdr *eppnt, int prot, int type, unsigned long unused)

WARNING: line over 80 characters
#67: FILE: arch/x86/kernel/sys_x86_64.c:80:
+			new_begin = randomize_range(*begin, *begin + 0x02000000, 0);

ERROR: use tabs not spaces
#110: FILE: arch/x86/kernel/sys_x86_64.c:185:
+ ^I        mm->cached_hole_size = 0;$

ERROR: use tabs not spaces
#111: FILE: arch/x86/kernel/sys_x86_64.c:186:
+ ^I^Imm->free_area_cache = mm->mmap_base;$

ERROR: use tabs not spaces
#112: FILE: arch/x86/kernel/sys_x86_64.c:187:
+ ^I}$

ERROR: use tabs not spaces
#141: FILE: arch/x86/kernel/sys_x86_64.c:216:
+ ^I^I/* remember the largest hole we saw so far */$

ERROR: use tabs not spaces
#142: FILE: arch/x86/kernel/sys_x86_64.c:217:
+ ^I^Iif (addr + mm->cached_hole_size < vma->vm_start)$

ERROR: use tabs not spaces
#143: FILE: arch/x86/kernel/sys_x86_64.c:218:
+ ^I^I        mm->cached_hole_size = vma->vm_start - addr;$

ERROR: use tabs not spaces
#157: FILE: arch/x86/kernel/sys_x86_64.c:232:
+  ^Imm->free_area_cache = TASK_UNMAPPED_BASE;$

ERROR: need a space before the open parenthesis '('
#291: FILE: arch/x86/mm/mmap_64.c:101:
+	} else if(mmap_is_legacy()) {

WARNING: braces {} are not necessary for single statement blocks
#302: FILE: arch/x86/mm/mmap_64.c:112:
+	if (current->flags & PF_RANDOMIZE) {
+		mm->mmap_base += ((long)rnd) << PAGE_SHIFT;
+	}

WARNING: line over 80 characters
#314: FILE: fs/binfmt_elf.c:48:
+static unsigned long elf_map (struct file *, unsigned long, struct elf_phdr *, int, int, unsigned long);

WARNING: no space between function name and open parenthesis '('
#314: FILE: fs/binfmt_elf.c:48:
+static unsigned long elf_map (struct file *, unsigned long, struct elf_phdr *, int, int, unsigned long);

WARNING: line over 80 characters
#429: FILE: fs/binfmt_elf.c:438:
+					   eppnt, elf_prot, elf_type, total_size);

ERROR: need space after that ',' (ctx:VxV)
#480: FILE: fs/binfmt_elf.c:939:
+				elf_prot, elf_flags,0);
 				                   ^

total: 9 errors, 7 warnings, 461 lines checked
Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Jakub Jelinek <jakub@redhat.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:07 +01:00
Jiri Kosina
cc503c1b43 x86: PIE executable randomization
main executable of (specially compiled/linked -pie/-fpie) ET_DYN binaries
onto a random address (in cases in which mmap() is allowed to perform a
randomization).

The code has been extraced from Ingo's exec-shield patch
http://people.redhat.com/mingo/exec-shield/

[akpm@linux-foundation.org: fix used-uninitialsied warning]
[kamezawa.hiroyu@jp.fujitsu.com: fixed ia32 ELF on x86_64 handling]

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Roland McGrath <roland@redhat.com>
Cc: Jakub Jelinek <jakub@redhat.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:07 +01:00
Glauber de Oliveira Costa
8f12dea613 x86: introduce native_read_tscp
Targetting paravirt, this patch introduces native_read_tscp, in
place of rdtscp() macro. When in a paravirt guest, this will
involve a function call, and thus, cannot be done in the vdso area.
These users then have to call the native version directly

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:06 +01:00
Glauber de Oliveira Costa
4e87173eac x86: split get_cycles_sync
This patch splits get_cycles_sync() into  __get_cycles_sync(),
and the rdtscll part. Paravirt guests cannot issue rdtscl directly,
as it involves a function call in vdso area.

So, using the __get_cycles_sync() base, we introduce vget_cycles_sync,
which then calls the native version of rdtscll. Ideally, however, a guest
should define its own clocksource, together with a vread function

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:03 +01:00
Glauber de Oliveira Costa
16e2011be6 x86: allow sched clock to be overridden by paravirt
This patch turns the sched_clock into native_sched_clock.
sched clock becomes a weak symbol, which can then give its
place to a paravirt definition.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:03 +01:00
Glauber de Oliveira Costa
fe58fc8f40 x86: wipe out traditional opt from x86_64 Makefile
Among other things, using -traditional as a gcc option stops us from
using macro token pasting, which is a feature we heavily rely on.

There was still a use of -traditional in arch/x86/kernel/Makefile_64,
which this patch removes.

I don't see any problems building kernels in my x86_64 box without
-traditional.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:03 +01:00
Harvey Harrison
96daa8cd53 x86: use def_bool where possible in Kconfig.cpu
x86: Use def_bool where possible in Kconfig.cpu

Change occurances of:
	bool
	default X

to:
	def_bool X

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:03 +01:00
Hiroshi Shimamoto
6612538ca9 x86: clean up process_32/64.c
White space and coding style clean up.
Make process_32/64.c similar.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:03 +01:00
Harvey Harrison
3c2362e629 x86: use def_bool where possible
Change occurances of:
	bool
	default X

to:
	def_bool X

Change ocurances of:
	bool "Foo"
	default X

to:
	def_bool X
	prompt "Foo"

Shows no difference in generated config for allmodconfig/allyesconfig.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:03 +01:00
Joerg Roedel
40842bf50e x86: use __PAGE_KERNEL* instead of _KERNPG_TABLE
This minor cleanup replaces _KERNPG_TABLE with the __PAGE_KERNEL* for 2MB PTEs
in the 64-bit memory initialization code. The __PAGE_KERNEL* defines are more
appropriate for PTEs.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:02 +01:00
Len Brown
2ba7deef09 x86: 32-bit IOAPIC: de-fang IRQ compression
commit c434b7a6ae
(x86: avoid wasting IRQs for PCI devices)
created a concept of "IRQ compression" on i386
to conserve IRQ numbers on systems with many
sparsely populated IO APICs.

The same scheme was also added to x86_64,
but later removed when x86_64 recieved an IRQ over-haul
that made it unnecessary -- including per-CPU
IRQ vectors that greatly increased the IRQ capacity
on the machine.

i386 has not received the analogous over-haul,
and thus a previous attempt to delete IRQ compression
from i386 was rejected on the theory that there may
exist machines that actually need it.  The fact is
that the author of IRQ compression patch was unable
to confirm the actual existence of such a system.

As a result, all i386 kernels with IOAPIC support
pay the following:

1. confusion

IRQ compression re-names the traditional IOAPIC
pin numbers (aka ACPI GSI's) into sequential IRQ #s:

ACPI: PCI Interrupt 0000:00:1c.0[A] -> GSI 20 (level, low) -> IRQ 16
ACPI: PCI Interrupt 0000:00:1c.1[B] -> GSI 21 (level, low) -> IRQ 17
ACPI: PCI Interrupt 0000:00:1c.2[C] -> GSI 22 (level, low) -> IRQ 18
ACPI: PCI Interrupt 0000:00:1c.3[D] -> GSI 23 (level, low) -> IRQ 19
ACPI: PCI Interrupt 0000:00:1c.4[A] -> GSI 20 (level, low) -> IRQ 16

This makes /proc/interrupts look different
depending on system configuration and device probe order.
It is also different than the x86_64 kernel running
on the exact same system.  As a result, programmers
get confused when comparing systems.

2. complexity

The IRQ code in Linux is already overly complex,
and IRQ compression makes it worse.  There have
already been two bug workarounds related to IRQ
compression -- the IRQ0 timer workaround and
the VIA PCI IRQ workaround.

3. size

All i386 kernels with IOAPIC support contain an int[4096] --
a 4 page array to contain the renamed IRQs.

So while the irq compression code on i386 should really
be deleted -- even before merging the x86_64 irq-overhaul,
this patch simply disables it on all high volume systems
to avoid problems #1 and #2 on most all i386 systems.

A large system with pin numbers >=64 will still have compression
to conserve limited IRQ numbers for sparse IOAPICS.  However,
the vast majority of the planet, those with only pin numbers < 64
will use an identity GSI -> IRQ mapping.

Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
2008-01-30 13:31:02 +01:00
H. Peter Anvin
faca62273b x86: use generic register name in the thread and tss structures
This changes size-specific register names (eip/rip, esp/rsp, etc.) to
generic names in the thread and tss structures.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:02 +01:00
Roland McGrath
25149b62d3 x86: x86 ptrace merge removals
This removes the old separate 64-bit and ia32 ptrace source files.
They are no longer used.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:02 +01:00
Roland McGrath
cbc9d9d982 x86: x86 ptrace merge complete
This switches over the 64-bit build to use the shared ptrace code,
instead of the old ptrace_64.c and arch/x86/ia32/ptrace32.c code.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:02 +01:00
Roland McGrath
099cd6e9da x86: x86 ia32 ptrace arch merge
This moves the sys32_ptrace code into arch/x86/kernel/ptrace.c,
verbatim except for a few hard-coded sizes replaced with sizeof.
Here this code can use the shared local functions in this file.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:01 +01:00
Roland McGrath
cb757c41f3 x86: x86 ia32 ptrace getreg/putreg merge
This reimplements the 64-bit IA32-emulation register access
functions in arch/x86/kernel/ptrace.c, where they can share
some guts with the native access functions directly.

These functions are not used yet, but this paves the way to move
IA32 ptrace support into this file to share its local functions.

[akpm@linuxfoundation.org: Build fix]

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:01 +01:00
Roland McGrath
86976cd805 x86: x86 ptrace merge syscall trace
This moves the 64-bit syscall tracing functions into ptrace.c,
so that ptrace_64.c becomes entirely obsolete.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:01 +01:00
Roland McGrath
e9c86c789f x86: x86 ptrace arch merge
This adds 64-bit support to arch_ptrace in arch/x86/kernel/ptrace.c,
so this function can be used for native ptrace on both 32 and 64.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:01 +01:00
Roland McGrath
2047b08be6 x86: x86 ptrace getreg/putreg merge
This merges 64-bit support into the low-level register access
functions in arch/x86/kernel/ptrace.c, paving the way to share
this file between 32-bit and 64-bit builds.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:01 +01:00
Roland McGrath
06ee1b687a x86: x86 ptrace getreg/putreg cleanup
This cleans up the getreg/putreg functions to move the special cases
(segment registers and eflags) out into their own subroutines.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:01 +01:00
Roland McGrath
e39c289141 x86: ptrace FLAG_MASK cleanup
This cleans up the FLAG_MASK macro to use symbolic constants instead of a
magic number.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:01 +01:00
Roland McGrath
d52e9d690f x86: ptrace_32 renamed
This renames ptrace_32.c back to ptrace.c, in preparation
for merging the 32/64 versions of these files.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:01 +01:00
Roland McGrath
0f5340933f x86: x86-32 thread_struct.debugreg
This replaces the debugreg[7] member of thread_struct with individual
members debugreg0, etc.  This saves two words for the dummies 4 and 5,
and harmonizes the code between 32 and 64.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:59 +01:00
Roland McGrath
d277fb89df x86: x86-64 ia32 ptrace get/putreg32 current task
This generalizes the getreg32 and putreg32 functions so they can be used on
the current task, as well as on a task stopped in TASK_TRACED and switched
off.  This lays the groundwork to share this code for all kinds of
user-mode machine state access, not just ptrace.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:58 +01:00
Roland McGrath
5fd4d16bd5 x86: x86-32 ptrace get/putreg current task
This generalizes the getreg and putreg functions so they can be used on the
current task, as well as on a task stopped in TASK_TRACED and switched off.
This lays the groundwork to share this code for all kinds of user-mode
machine state access, not just ptrace.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:58 +01:00
Roland McGrath
ce90f34085 x86: x86-64 ptrace get/putreg current task
This generalizes the getreg and putreg functions so they can be used on the
current task, as well as on a task stopped in TASK_TRACED and switched off.
This lays the groundwork to share this code for all kinds of user-mode
machine state access, not just ptrace.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:58 +01:00
Roland McGrath
9e714bed64 x86: x86-32 ptrace whitespace
This canonicalizes the indentation in the getreg and putreg functions.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:58 +01:00
Roland McGrath
80976c0867 x86: x86-64 ptrace whitespace
This canonicalizes the indentation in the getreg and putreg functions.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:58 +01:00
Roland McGrath
ff14c6164b x86: x86-64 ia32 ptrace pt_regs cleanup
This cleans up the getreg32/putreg32 functions to use struct pt_regs in a
straightforward fashion, instead of equivalent ugly pointer arithmetic.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:57 +01:00
Roland McGrath
a46ff73d53 x86: setup64 eflags constants
This cleans up arch/x86/kernel/setup64.c to use the X86_EFLAGS_* constants
from <asm/processor-flags.h> instead of the EF_* enum in <asm/ptrace.h>.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:57 +01:00
H. Peter Anvin
742fa54a62 x86: use generic register names in struct sigcontext
Switch struct sigcontext (defined in <asm/sigcontext*.h>) to using
register names withut e- or r-prefixes for both 32- and 64-bit x86.
This is intended as a preliminary step in unifying this code between
architectures.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:56 +01:00
H. Peter Anvin
153d5f2e57 x86: use generic register names in struct user_regs_struct
Switch struct user_regs_struct (defined in <asm/user.h>, which is no
longer exported to userspace) to using register names without e- or
r-prefixes for both 32 and 64 bit x86.  This is intended as a
preliminary step in unifying this code between architectures.

Also, be a bit more strict in truncating 32-bit "extended" segment
register values to 16 bits.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:56 +01:00
H. Peter Anvin
65ea5b0349 x86: rename the struct pt_regs members for 32/64-bit consistency
We have a lot of code which differs only by the naming of specific
members of structures that contain registers.  In order to enable
additional unifications, this patch drops the e- or r- size prefix
from the register names in struct pt_regs, and drops the x- prefixes
for segment registers on the 32-bit side.

This patch also performs the equivalent renames in some additional
places that might be candidates for unification in the future.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:56 +01:00
Jeremy Fitzhardinge
53756d3722 x86: add set/clear_cpu_cap operations
The patch to suppress bitops-related warnings added a pile of ugly
casts.  Many of these were related to the management of x86 CPU
capabilities.  Clean these up by adding specific set/clear_cpu_cap
macros, and use them consistently.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Andi Kleen <ak@suse.de>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:55 +01:00
Jeremy Fitzhardinge
5548fecdff x86: clean up bitops-related warnings
Add casts to appropriate places to silence spurious bitops warnings.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Andi Kleen <ak@suse.de>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:55 +01:00
Mike Travis
7bf0c23ed2 x86: prevent dereferencing non-allocated per_cpu variables
'for_each_possible_cpu(i)' when there's a _remote possibility_ of
dereferencing a non-allocated per_cpu variable involved.

All files except mm/vmstat.c are x86 arch.

Thanks to pageexec@freemail.hu for pointing this out.

Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: <pageexec@freemail.hu>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:55 +01:00
Roland McGrath
0fa376e027 x86: PTRACE_SINGLEBLOCK
This adds the PTRACE_SINGLEBLOCK request on x86, matching the ia64 feature.
The implementation comes from the generic ptrace code and relies on the
low-level machine support provided by arch_has_block_step() et al.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:55 +01:00
Roland McGrath
1ecc798c67 x86: debugctlmsr kprobes
This adjusts the x86 kprobes implementation to cope with per-thread
MSR_IA32_DEBUGCTLMSR being set for user mode.  I haven't delved deep
enough into the kprobes code to be really sure this covers all the
cases where the user-mode BTF setting needs to be cleared or restored.
It looks about right to me.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:54 +01:00
Roland McGrath
10faa81e10 x86: debugctlmsr arch_has_block_step
This implements user-mode step-until-branch on x86 using the BTF bit
in MSR_IA32_DEBUGCTLMSR.  It's just like single-step, only less so.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:54 +01:00
Roland McGrath
7e9916040b x86: debugctlmsr context switch
This adds low-level support for a per-thread value of MSR_IA32_DEBUGCTLMSR.
The per-thread value is switched in when TIF_DEBUGCTLMSR is set.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:54 +01:00
Roland McGrath
0a049bb0ab x86: debugctlmsr kconfig
This adds the (internal) Kconfig macro CONFIG_X86_DEBUGCTLMSR,
to be defined when configuring to support only hardware that
definitely supports MSR_IA32_DEBUGCTLMSR with the BTF flag.

The Intel documentation says "P6 family" and later processors all have it.
I think the Kconfig dependencies are right to have it set for those and
unset for others (i.e., when 586 and earlier are supported).

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:54 +01:00
Roland McGrath
d9771e8c50 x86: x86-32 ptrace debugreg cleanup
This cleans up the 32-bit ptrace code to separate the guts of the
debug register access from the implementation of PTRACE_PEEKUSR and
PTRACE_POKEUSR.  The new functions ptrace_[gs]et_debugreg match the
new 64-bit entry points for parity, but they don't need to be global.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:52 +01:00
Roland McGrath
d0f0817582 x86: x86-64 ia32 ptrace debugreg cleanup
This cleans up the ia32 compat ptrace code to use shared code from
native ptrace for the implementation guts of debug register access.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:52 +01:00
Roland McGrath
962ff3804d x86: x86-64 ptrace debugreg cleanup
This cleans up the 64-bit ptrace code to separate the guts of the
debug register access from the implementation of PTRACE_PEEKUSR and
PTRACE_POKEUSR.  The new functions ptrace_[gs]et_debugreg are made
global so that the ia32 code can later be changed to call them too.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:52 +01:00
Roland McGrath
e4aed6cc45 x86-64 ptrace: use task_pt_regs
This cleans up the 64-bit ptrace code to use task_pt_regs instead of its
own redundant code that does the same thing a different way.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:52 +01:00
Roland McGrath
62a97d447b x86-32 ptrace: use task_pt_regs
This cleans up the 32-bit ptrace code to use task_pt_regs instead of its
own redundant code that does the same thing a different way.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:52 +01:00
Roland McGrath
227195d4a6 x86-32: ptrace generic resume
This removes the handling for PTRACE_CONT et al from the 32-bit
ptrace code, so it uses the new generic code via ptrace_request.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:51 +01:00
Roland McGrath
18982c158f x86-64: ptrace generic resume
This removes the handling for PTRACE_CONT et al from the 64-bit
ptrace code, so it uses the new generic code via ptrace_request.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:51 +01:00
Roland McGrath
e1f287735c x86 single_step: TIF_FORCED_TF
This changes the single-step support to use a new thread_info flag
TIF_FORCED_TF instead of the PT_DTRACE flag in task_struct.ptrace.
This keeps arch implementation uses out of this non-arch field.

This changes the ptrace access to eflags to mask TF and maintain
the TIF_FORCED_TF flag directly if userland sets TF, instead of
relying on ptrace_signal_deliver.  The 64-bit and 32-bit kernels
are harmonized on this same behavior.  The ptrace_signal_deliver
approach works now, but this change makes the low-level register
access code reliable when called from different contexts than a
ptrace stop, which will be possible in the future.

The 64-bit do_debug exception handler is also changed not to clear TF
from user-mode registers.  This matches the 32-bit kernel's behavior.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:50 +01:00
Roland McGrath
7122ec8158 x86: single_step: share code
This removes the single-step code from ptrace_32.c and uses the step.c code
shared with the 64-bit kernel.  The two versions of the code were nearly
identical already, so the shared code has only a couple of simple #ifdef's.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:50 +01:00
Roland McGrath
5f76cb1f6c x86: single_step 0xf0
This fixes the 64-bit single-step handling code's instruction
decoder to grok the 0xf0 (lock) prefix, which the 32-bit code
already does correctly.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:50 +01:00
Roland McGrath
3f80c1adc9 x86: single_step segment macros
This cleans up the single-step code to use the asm/segment.h macros
for segment selector magic bits, rather than its own constant.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:50 +01:00
Roland McGrath
fa1e03eae2 x86: single_step moved
This moves the single-step support code from ptrace_64.c into a new file
step.c, verbatim.  This paves the way for consolidating this code between
64-bit and 32-bit versions.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:50 +01:00
Roland McGrath
7f232343e0 x86: arch_has_single_step
This defines the new standard arch_has_single_step macro.  It makes the
existing set_singlestep and clear_singlestep entry points global, and
renames them to the new standard names user_enable_single_step and
user_disable_single_step, respectively.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:48 +01:00
Roland McGrath
77c03dcd44 x86: remove TRAP_FLAG
This gets rid of the local constant macro TRAP_FLAG.
It's redundant with the public constant macro X86_EFLAGS_TF.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:48 +01:00
Andrew Morton
022eb43419 x86: kmap_atomic() debugging
[ mingo@elte.hu: cleanups and made dependent on CONFIG_DEBUG_HIGHMEM.

  this caught a handful of bugs already, so lets apply it. If it gets
  things wrong we'll disable it. ]

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:47 +01:00
Mathieu Desnoyers
2c0b8a7578 x86: fall back on interrupt disable in cmpxchg8b on 80386 and 80486
Actually, on 386, cmpxchg and cmpxchg_local fall back on
cmpxchg_386_u8/16/32: it disables interruptions around non atomic
updates to mimic the cmpxchg behavior.

The comment:
/* Poor man's cmpxchg for 386. Unsuitable for SMP */

already present in cmpxchg_386_u32 tells much about how this cmpxchg
implementation should not be used in a SMP context. However, the cmpxchg_local
can perfectly use this fallback, since it only needs to be atomic wrt the local
cpu.

This patch adds a cmpxchg_486_u64 and uses it as a fallback for cmpxchg64
and cmpxchg64_local on 80386 and 80486.

Q:
but why is it called cmpxchg_486 when the other functions are called

A:
Because the standard cmpxchg is missing only on 386, but cmpxchg8b is
missing both on 386 and 486.

Citing Intel's Instruction set reference:

cmpxchg:
This instruction is not supported on Intel processors earlier than the
Intel486 processors.

cmpxchg8b:
This instruction encoding is not supported on Intel processors earlier
than the Pentium processors.

Q:
What's the reason to have cmpxchg64_local on 32 bit architectures?
Without that need all this would just be a few simple defines.

A:
cmpxchg64_local on 32 bits architectures takes unsigned long long
parameters, but cmpxchg_local only takes longs. Since we have cmpxchg8b
to execute a 8 byte cmpxchg atomically on pentium and +, it makes sense
to provide a flavor of cmpxchg and cmpxchg_local using this instruction.

Also, for 32 bits architectures lacking the 64 bits atomic cmpxchg, it
makes sense _not_ to define cmpxchg64 while cmpxchg could still be
available.

Moreover, the fallback for cmpxchg8b on i386 for 386 and 486 is a

However, cmpxchg64_local will be emulated by disabling interrupts on all
architectures where it is not supported atomically.

Therefore, we *could* turn cmpxchg64_local into a cmpxchg_local, but it
would make the 386/486 fallbacks ugly, make its design different from
cmpxchg/cmpxchg64 (which really depends on atomic operations and cannot
be emulated) and require the __cmpxchg_local to be expressed as a macro
rather than an inline function so the parameters would not be fixed to
unsigned long long in every case.

So I think cmpxchg64_local makes sense there, but I am open to
suggestions.

Q:
Are there any callers?

A:
I am actually using it in LTTng in my timestamping code. I use it to
work around CPUs with asynchronous TSCs. I need to update 64 bits
values atomically on this 32 bits architecture.

Changelog:
- Ran though checkpatch.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:47 +01:00
Ralf Baechle
5f627f8e12 mips, x86: optimize the i8259 code a bit
The timer code always calls the clock_event_device set_net_event and
set_mode methods with interrupts disabled, so no need to use
spin_lock_irqsave / spin_unlock_irqrestore for those.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by:Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:47 +01:00
Christoph Lameter
b263295dbf x86: 64-bit, make sparsemem vmemmap the only memory model
Use sparsemem as the only memory model for UP, SMP and NUMA.  Measurements
indicate that DISCONTIGMEM has a higher overhead than sparsemem.  And
FLATMEMs benefits are minimal.  So I think its best to simply standardize
on sparsemem.

Results of page allocator tests (test can be had via git from slab git
tree branch tests)

Measurements in cycle counts. 1000 allocations were performed and then the
average cycle count was calculated.

Order	FlatMem	Discontig	SparseMem
0	  639	  665		  641
1	  567	  647		  593
2	  679	  774		  692
3	  763	  967		  781
4	  961	 1501		  962
5	 1356	 2344		 1392
6	 2224	 3982		 2336
7	 4869	 7225		 5074
8	12500	14048		12732
9	27926	28223		28165
10	58578	58714		58682

(Note that FlatMem is an SMP config and the rest NUMA configurations)

Memory use:

SMP Sparsemem
-------------

Kernel size:

   text    data     bss     dec     hex filename
3849268  397739 1264856 5511863  541ab7 vmlinux

             total       used       free     shared    buffers     cached
Mem:       8242252      41164    8201088          0        352      11512
-/+ buffers/cache:      29300    8212952
Swap:      9775512          0    9775512

SMP Flatmem
-----------

Kernel size:

   text    data     bss     dec     hex filename
3844612  397739 1264536 5506887  540747 vmlinux

So 4.5k growth in text size vs. FLATMEM.

             total       used       free     shared    buffers     cached
Mem:       8244052      40544    8203508          0        352      11484
-/+ buffers/cache:      28708    8215344

2k growth in overall memory use after boot.

NUMA discontig:

   text    data     bss     dec     hex filename
3888124  470659 1276504 5635287  55fcd7 vmlinux

             total       used       free     shared    buffers     cached
Mem:       8256256      56908    8199348          0        352      11496
-/+ buffers/cache:      45060    8211196
Swap:      9775512          0    9775512

NUMA sparse:

   text    data     bss     dec     hex filename
3896428  470659 1276824 5643911  561e87 vmlinux

8k text growth. Given that we fully inline virt_to_page and friends now
that is rather good.

             total       used       free     shared    buffers     cached
Mem:       8264720      57240    8207480          0        352      11516
-/+ buffers/cache:      45372    8219348
Swap:      9775512          0    9775512

The total available memory is increased by 8k.

This patch makes sparsemem the default and removes discontig and
flatmem support from x86.

[ akpm@linux-foundation.org: allnoconfig build fix ]

Acked-by: Andi Kleen <ak@suse.de>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:47 +01:00
Borislav Petkov
4ebd1290ba x86: vmlinux_32.lds.S: remove repeated comment from the x86-32 linker script
Remove repeated comment from the linker script for the x86-32 target.

Signed-off-by: Borislav Petkov <bbpetkov@yahoo.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:46 +01:00
Yinghai Lu
9de819fe72 x86: do not set boot cpu in cpu_online_map at x86_64_start_kernel()
In init/main.c boot_cpu_init() does that later.

Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:46 +01:00
Yinghai Lu
949ec325c7 x86: set cpu_index to nr_cpus instead of 0
Some BIOSes that support two/four dualcore/quadcore systems, will get:

ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 15:1 APIC version 16
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Processor #1 15:1 APIC version 16
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x02] enabled)
Processor #2 15:1 APIC version 16
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x03] enabled)
Processor #3 15:1 APIC version 16
ACPI: LAPIC (acpi_id[0x05] lapic_id[0x84] disabled)
ACPI: LAPIC (acpi_id[0x06] lapic_id[0x85] disabled)
ACPI: LAPIC (acpi_id[0x07] lapic_id[0x86] disabled)
ACPI: LAPIC (acpi_id[0x08] lapic_id[0x87] disabled)
ACPI: LAPIC (acpi_id[0x09] lapic_id[0x88] disabled)
ACPI: LAPIC (acpi_id[0x0a] lapic_id[0x89] disabled)
ACPI: LAPIC (acpi_id[0x0b] lapic_id[0x8a] disabled)
ACPI: LAPIC (acpi_id[0x0c] lapic_id[0x8b] disabled)
ACPI: LAPIC (acpi_id[0x0d] lapic_id[0x8c] disabled)
ACPI: LAPIC (acpi_id[0x0e] lapic_id[0x8d] disabled)
ACPI: LAPIC (acpi_id[0x0f] lapic_id[0x8e] disabled)
ACPI: LAPIC (acpi_id[0x10] lapic_id[0x8f] disabled)

SMP: Allowing 16 CPUs, 12 hotplug CPUs

the /proc/cpuinfo will show a bunch of NULL cpus with cpu_index=0

so assign impossible cpu_index value at first instead of 0.

Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:46 +01:00
Vladimir Berezniker
b3ca74a2bf x86: sanitize user specified e820 memmap values
Sanitize user specified e820 memory ranges, using the same logic that is
applied to the values returned by the BIOS.  This ensures consistent
handling regardless of the source of the memory mappings.

Allows overriding portions of the memory map without specifying one in
it's entirety (memmap=exactmap).

E.g. marking a range of bad RAM as reserved with memmap=48M$528M

BIOS supplied range

BIOS-e820: 0000000000100000 - 000000007fe80000 (usable)

becomes

user: 0000000000100000 - 0000000021000000 (usable)
user: 0000000021000000 - 0000000024000000 (reserved)
user: 0000000024000000 - 000000007fe80000 (usable)

Previously this did not work, as the original BIOS range was left
untouched while the user defined range was appended to the end of the
memory map.

[ tglx: arch/x86 adaptation ]

Signed-off-by: Vladimir Berezniker <vmpn@hitechman.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:46 +01:00
Roland McGrath
efd1ca52d0 x86: TLS cleanup
This consolidates the four different places that implemented the same
encoding magic for the GDT-slot 32-bit TLS support.  The old tls32.c was
renamed and is now only slightly modified to be the shared implementation.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Zachary Amsden <zach@vmware.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:46 +01:00
Roland McGrath
13abd0e504 x86: tls32 moved
This renames arch/x86/ia32/tls32.c to arch/x86/kernel/tls.c, which does
nothing now but paves the way to consolidate this code for 32-bit too.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Zachary Amsden <zach@vmware.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:45 +01:00
Roland McGrath
df5d438e33 x86: ptrace fs/gs_base
The fs_base and gs_base fields are available in user_regs_struct.
But reading these via ptrace (PTRACE_GETREGS or PTRACE_PEEKUSR) does
not give a reliably useful value.  The thread_struct fields are 0
when do_arch_prctl decided to use a GDT slot instead of MSR_FS_BASE,
which it does for a value under 1<<32.

This changes ptrace access to fs_base and gs_base to work like
PTRACE_ARCH_PRCTL does.  That is, it reads the base address that
user-mode memory access using the fs/gs instruction prefixes will
use, regardless of how it's being implemented in the kernel.  The
MSR vs GDT is an implementation detail that is pretty much hidden
from userland in the actual using, and there is no reason that
ptrace should give the internal implementation picture rather than
the user-mode semantic picture.  In the case of setting the value,
this can implicitly change the fsindex/gsindex value (also
separately in user_regs_struct), which is what happens when the
thread calls arch_prctl itself.  In a PTRACE_SETREGS, the fs_base
change will come after the fsindex change due to the order of the
struct, and so a change the debugger made to fs_base will have the
effect intended, another part of the user_regs_struct will now
differ when read back from what the debugger wrote.

This makes PTRACE_ARCH_PRCTL obsolete.  We could consider declaring
it deprecated and removing it one day, though there is no hurry.
For the foreseeable future, debuggers have to assume an old kernel
that does not report reliable fs_base/gs_base values in user_regs_struct
and stick to PTRACE_ARCH_PRCTL anyway.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:45 +01:00
Roland McGrath
91394eb097 x86: use get_desc_base
This changes a couple of places to use the get_desc_base function.
They were duplicating the same calculation with different equivalent code.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:45 +01:00
Roland McGrath
cadd516422 x86 vDSO: canonicalize sysenter .eh_frame
Some assembler versions automagically optimize .eh_frame contents,
changing their size.  The CFI in sysenter.S was not using optimal
formatting, so it would be changed by newer/smarter assemblers.
This ran afoul of the wired constant for padding out the other vDSO
images to match its size.  This changes the original hand-coded
source to use the optimal format encoding for its operations.  That
leaves nothing more for a fancy assembler to do, so the sizes will
match the wired-in expected size regardless of the assembler version.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:44 +01:00
Roland McGrath
16e48e7e79 x86 vDSO: makefile cleanup
This cleans up the arch/x86/vdso/Makefile rules for vdso.so to
share more code with the vdso32-*.so rules and remove old cruft.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:44 +01:00
Roland McGrath
69d0627a7f x86 vDSO: reorder vdso32 code
This reorders the code in the 32-bit vDSO images to put the signal
trampolines first and __kernel_vsyscall after them.  The order does
not matter to userland, it just uses what AT_SYSINFO or e_entry
says.  Since the signal trampolines are the same size in both
versions of the vDSO, putting them first is the simplest way to get
the addresses to line up.  This makes it work to use a more compact
layout for the vDSO.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:44 +01:00
Roland McGrath
16f4bc738d x86 vDSO: ia32 vsyscall removal
This removes all the old vsyscall code from arch/x86/ia32/ that is
no longer used because arch/x86/vdso/ code has replaced it.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:44 +01:00
Roland McGrath
af65d64845 x86 vDSO: consolidate vdso32
This makes x86_64's ia32 emulation support share the sources used in the
32-bit kernel for the 32-bit vDSO and much of its setup code.

The 32-bit vDSO mapping now behaves the same on x86_64 as on native 32-bit.
The abi.syscall32 sysctl on x86_64 now takes the same values that
vm.vdso_enabled takes on the 32-bit kernel.  That is, 1 means a randomized
vDSO location, 2 means the fixed old address.  The CONFIG_COMPAT_VDSO
option is now available to make this the default setting, the same meaning
it has for the 32-bit kernel.  (This does not affect the 64-bit vDSO.)

The argument vdso32=[012] can be used on both 32-bit and 64-bit kernels to
set this paramter at boot time.  The vdso=[012] argument still does this
same thing on the 32-bit kernel.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:43 +01:00
Roland McGrath
00f8b1bc0e x86 vDSO: ia32 vdso32-syscall build
This puts the syscall version of the 32-bit vDSO in arch/x86/vdso/vdso32/
for 64-bit IA32 support.  This is not used yet, but it paves the way for
consolidating the 32-bit vDSO source and build logic all in one place.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:43 +01:00
Roland McGrath
36197c92a2 x86 vDSO: ia32 sysenter_return
This changes the 64-bit kernel's support for the 32-bit sysenter
instruction to use stored fields rather than constants for the
user-mode return address, as the 32-bit kernel does.  This adds a
sysenter_return field to struct thread_info, as 32-bit has.  There
is no observable effect from this yet.  It makes the assembly code
independent of the 32-bit vDSO mapping address, paving the way for
making the vDSO address vary as it does on the 32-bit kernel.

[ akpm@linux-foundation.org: build fix on !CONFIG_IA32_EMULATION ]

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:43 +01:00
Roland McGrath
0aa97fb226 x86 vDSO: ia32_sysenter_target
This harmonizes the name for the entry point from the 32-bit sysenter
instruction across 32-bit and 64-bit kernels.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:43 +01:00
Roland McGrath
f288f32dc5 x86 vDSO: vdso32 setup
This moves arch/x86/kernel/sysenter_32.c to arch/x86/vdso/vdso32-setup.c,
keeping all the code relating only to vDSO magic in the vdso/ subdirectory.
This is a pure renaming, but it paves the way to consolidating the code for
dealing with 32-bit vDSOs across CONFIG_X86_32 and CONFIG_IA32_EMULATION.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:42 +01:00
Roland McGrath
85a93b95dc x86 vDSO: i386 vdso32 install
This enables 'make vdso_install' for i386 as on x86_64 and powerpc.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:42 +01:00
Roland McGrath
4707c4717a x86 vDSO: absolute relocs
This updates the exceptions for absolute relocs for the new symbol name
convention used for symbols extracted from the vDSO images.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:42 +01:00
Roland McGrath
6c3652efca x86 vDSO: i386 vdso32
This makes the i386 kernel use the new vDSO build in arch/x86/vdso/vdso32/
to replace the old one from arch/x86/kernel/.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:42 +01:00
Roland McGrath
0249c9c1e7 x86 vDSO: vdso32 build
This builds the 32-bit vDSO images in the arch/x86/vdso subdirectory.
Nothing uses the images yet, but this paves the way for consolidating
the vDSO build logic all in one place.  The new images use a linker
script sharing the layout parts from vdso-layout.lds.S with the 64-bit
vDSO.  A new vdso32-syms.lds is generated in the style of vdso-syms.lds.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:42 +01:00
Roland McGrath
0c2f51a7d2 x86 vDSO: arch/x86/vdso/vdso32
This moves the i386 vDSO sources into arch/x86/vdso/vdso32/, a
new directory.  This patch is a pure renaming, but paves the way
for consolidating the vDSO build logic.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:42 +01:00
Roland McGrath
108b545137 x86 vDSO: harmonize asm-offsets
This change harmonizes the asm-offsets macros used in the 32-bit vDSO
across 32-bit and 64-bit builds.  It's a purely cosmetic change for now,
but it paves the way for consolidating the 32-bit vDSO builds.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:41 +01:00
Roland McGrath
f6b46ebf90 x86 vDSO: new layout
This revamps the vDSO linker script to lay things out with the best
packing of the data and good, separate alignment of the code.  The
rigid layout using VDSO_TEXT_OFFSET no longer matters to the kernel.
I've moved the layout parts of the linker script into a new include
file, vdso-layout.lds.S; this is in preparation for sharing the script
for the 32-bit vDSO builds too.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:41 +01:00
Roland McGrath
2b9c97e161 x86 vDSO: remove vdso-syms.o
Get rid of vdso-syms.o from the kernel link.  We don't need it any more.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:41 +01:00
Roland McGrath
7f3646aa16 x86 vDSO: use vdso-syms.lds
This patch changes the kernel's references to addresses in the vDSO image
to be based on the symbols defined by vdso-syms.lds instead of the old
vdso-syms.o symbols.  This is all wrapped up in a macro defined by the new
asm-x86/vdso.h header; that's the only place in the kernel source that has
to know the details of the scheme for getting vDSO symbol values.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:41 +01:00
Roland McGrath
5b93049337 x86 vDSO: generate vdso-syms.lds
This patch adds a new way of extracting symbols from the built vDSO image.
This is much simpler and less fragile than using ld -R; it removes the
need to control the DSO layout quite so exactly.  I was clearly unduly
distracted by clever ld uses when I did the original vDSO implementation.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:40 +01:00
Jiri Kosina
c1d171a002 x86: randomize brk
Randomize the location of the heap (brk) for i386 and x86_64.  The range is
randomized in the range starting at current brk location up to 0x02000000
offset for both architectures.  This, together with
pie-executable-randomization.patch and
pie-executable-randomization-fix.patch, should make the address space
randomization on i386 and x86_64 complete.

Arjan says:

This is known to break older versions of some emacs variants, whose dumper
code assumed that the last variable declared in the program is equal to the
start of the dynamically allocated memory region.

(The dumper is the code where emacs effectively dumps core at the end of it's
compilation stage; this coredump is then loaded as the main program during
normal use)

iirc this was 5 years or so; we found this way back when I was at RH and we
first did the security stuff there (including this brk randomization).  It
wasn't all variants of emacs, and it got fixed as a result (I vaguely remember
that emacs already had code to deal with it for other archs/oses, just
ifdeffed wrongly).

It's a rare and wrong assumption as a general thing, just on x86 it mostly
happened to be true (but to be honest, it'll break too if gcc does
something fancy or if the linker does a non-standard order).  Still its
something we should at least document.

Note 2: afaik it only broke the emacs *build*.  I'm not 100% sure about that
(it IS 5 years ago) though.

[ akpm@linux-foundation.org: deuglification ]

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Roland McGrath <roland@redhat.com>
Cc: Jakub Jelinek <jakub@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:40 +01:00
Robert Richter
7b83dae7aa x86: extended interrupt LVT support for AMD Barcelona
Also macro definitions in apicdef.h has been updated.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:40 +01:00
Andi Kleen
739f33b38b x86: untable __init references between IO data
Earlier patch added IO APIC setup into local APIC setup. This caused
modpost warnings. Fix them by untangling setup_local_APIC() and splitting
it into smaller functions. The IO APIC initialization is only called
for the BP init.

Also removed some outdated debugging code and minor cleanup.

[ tglx: arch/x86 adaptation ]

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:40 +01:00
Yinghai Lu
3f6e5a12b8 x86: use core id bits for apicid_to_node initialization
We shoud use core id bits instead of max cores, in case later with AMD
downcores Quad core Opteron.

[ tglx: arch/x86 adaptation ]

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:39 +01:00
Yinghai Lu
a860b63c41 x86: store core id bits in cpuinfo_x8
We need to store core id bits to cpuinfo_x86 in early_identify_cpu. So we
use it to create acpiid_to_node array in k8topolgy.c

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:39 +01:00
Adrian Bunk
8c4385d7f5 x86: remove -maccumulate-outgoing-args on 32-bit
Contrary to the comment "newer gccs do it by default", newer gcc versions
default to -maccumulate-outgoing-args only with CONFIG_CC_OPTIMIZE_FOR_SIZE=n,
and then only with some CPU settings.

Measured with an i386 defconfig, gcc 4.2.1 and kernel 2.6.23-rc1 ("orig" is
the plain kernel, "changed is with -maccumulate-outgoing-args removed):

$ ls -la vmlinux*
-rwxrwxr-x 1 bunk bunk 6269713 2007-07-24 22:19 vmlinux.changed
-rwxrwxr-x 1 bunk bunk 6425361 2007-07-24 22:19 vmlinux.orig
$ size vmlinux.*
   text    data     bss     dec     hex filename
4493465  504108  614400 5611973  55a1c5 vmlinux.changed
4646160  504108  614400 5764668  57f63c vmlinux.orig
$

That's a 2.5% size increase that does for sure hurt small systems.

If the stack unwinder ever comes back and needs this as indicated in the
comment, adding it to the cflags when the user enabled the unwinder should be
a better option.

[ tglx: arch/x86 adaptation ]

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:39 +01:00
Yinghai Lu
1c69524c2e x86: clear IO_APIC before enabing apic error vector.
4 socket quad core, 8 socket quad core will do apic ID lifting for BSP.

But io-apic regs for ExtINT still use 0 as dest.

so when we enable apic error vector in BSP, we will get one APIC error.

CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 512K (64 bytes/line)
CPU 0/4 -> Node 0
CPU: Physical Processor ID: 1
CPU: Processor Core ID: 0
SMP alternatives: switching to UP code
ACPI: Core revision 20070126
enabled ExtINT on CPU#0
ESR value after enabling vector: 00000000, after 0000000c
APIC error on CPU0: 0c(08)
ENABLING IO-APIC IRQs
Synchronizing Arb IDs.

So move enable_IO_APIC from setup_IO_APIC into setup_local_APIC and call it
before enabling the ACPI error vector.

[ tglx: arch/x86 adaptation ]

Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:39 +01:00
Thomas Gleixner
04e1ba8521 x86: cleanup kernel/setup_64.c
Clean it up before applying more patches to it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:39 +01:00
Steven Rostedt
5cabbd97b1 x86: remove unused tsk_thread from asm-offsets_64.c
So this patch simply removes the "thread" from asm-offsets.c since I
can't find an owner for it.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:39 +01:00
Dave Jones
7ebad70534 x86: use CR0 defines.
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:39 +01:00
Thomas Gleixner
fe21a445b9 x86: adjust numa 32 namespace
Use the 64bit numa variable names for numa32 as well.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:38 +01:00
Thomas Gleixner
7462894a7c x86: fixup numa 64 namespace
Using a variable name, which is the same as a macro name is not
really smart. Change the variable names and fixup all users.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:38 +01:00
Thomas Gleixner
e3cfe529dd x86: cleanup numa_64.c
Clean it up before applying more patches.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:37 +01:00
Thomas Gleixner
64883ab0e3 x86: cleanup mpspec variants
Bring the mpspec variants into sync to prepare merging and
paravirt support.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:35 +01:00
Thomas Gleixner
0b9c99b6f2 x86: cleanup tlbflush.h variants
Bring the tlbflush.h variants into sync to prepare merging and
paravirt support.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:35 +01:00
Glauber de Oliveira Costa
6abcd98ffa x86: irqflags consolidation
This patch consolidates the irqflags include files containing common
paravirt definitions. The native definition for interrupt handling, halt,
and such, are the same for 32 and 64 bit, and they are kept in irqflags.h.
the differences are split in the arch-specific files.

The syscall function, irq_enable_sysexit, has a very specific i386 naming,
and its name is then changed to a more general one.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:33 +01:00
Hiroshi Shimamoto
416b72182a x86: clean up nmi_32/64.c
clean up and make nmi_32/64.c more similar.
- white space and coding style clean up.
- nmi_cpu_busy is available on CONFIG_SMP.
- move functions __acpi_nmi_enable, acpi_nmi_enable,
  __acpi_nmi_disable and acpi_nmi_disable.
- make variables name more similar.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:33 +01:00
Bernhard Walle
c9cce83dd1 x86: remove extern declarations for code, data, bss resources
This patch removes the extern struct resource declarations for
data_resource, code_resource and bss_resource on x86 and declares that
three structures as static as done on other architectures like IA64.

On i386, these structures are moved to setup_32.c (from e820_32.c) because
that's code that is not specific to e820 and also required on EFI systems.
That makes the "extern" reference superfluous.

On x86_64, data_resource, code_resource and bss_resource are passed to
e820_reserve_resources() as arguments just as done on i386 and IA64.  That
also avoids the "extern" reference and it's possible to make it static.

Signed-off-by: Bernhard Walle <bwalle@suse.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:32 +01:00
Cyrill Gorcunov
9773db2a30 x86: remove dead code in ia32-emu
Remove useless second time checking of fsave argument in save_i387_ia32()
routine.  It's possible the compiler is doing the same but that is much
better to remove the dead code explicitly.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:32 +01:00
Lucas Woods
201c19948b x86: remove duplicate includes
Signed-off-by: Lucas Woods <woodzy@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:32 +01:00
Paul Jimenez
2d2ee8de5f x86: mtrr use type bool [RESEND AGAIN]
This is a janitorish patch to 1) remove private TRUE/FALSE #def's in
favor of using the standard enum from linux/stddef.h and 2) switch the
variables holding those values to type 'bool' (from linux/types.h)
since it both seems more appropriate and allows for potentially better
optimization.

As a truly minor aside, I removed a couple of comments documenting
a 'do_safe' parameter that seems to no longer exist.

Signed-off-by: Paul Jimenez <pj@place.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:31 +01:00
Adrian Bunk
3e7593966b x86: pci-dma_64.c: cleanups
This patch contains the following cleanups:
- make the needlessly global iommu_setup() static
- remove the unused EXPORT_SYMBOL(iommu_merge)

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:31 +01:00
Adrian Bunk
ed65260bb8 x86: pci-calgary_64.c: make a variable static
"debugging" is a horrible name for a global variable - thankfully it can
become static.

Also put it out of __read_mostly so that gcc no longer has to emit it
at all.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:31 +01:00
Adrian Bunk
867ab54566 x86: nmi_64.c: make code static
This patch makes the following needlessly global code static:
- panic_on_timeout
- setup_nmi_watchdog()

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:31 +01:00
Adrian Bunk
231fd906c5 x86 mce_64.c: make struct mcelog static
This patch makes the needlessly global struct mcelog static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:30 +01:00
Hiroshi Shimamoto
52e3d90def x86: io_apic_64.c: remove unused config check
CONFIG_IRQBALANCE doesn't exist on x86_64.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:30 +01:00
Adrian Bunk
013d23e156 x86 e820_64.c: make 2 functions static
This patch makes the following needlessly global functions static:
- e820_print_map()
- early_panic()

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:30 +01:00
Paul Jimenez
2b8e05b567 x86: make i8259_64 more _32-like
Howdy! Here's a simple janitorish patch for you:

This patch mainly hinges around two includes and their ramifications:

#include <i8259.h>	which provides cached_{slave,master}_mask
#include <io_ports.h>	which provides PIC_{MASTER,SLAVE}_{IMR,CMD}

Adding these two includes and using those half dozen or so definitions
removed 140+ lines of diffs between i8259_32.c and i8259_64.c, thus
making it easier for the real substantitive differences between them to
show up, and hopefully therefore making it easier to eventually merge
the two.  All the warnings that checkpatch.pl throws (missing spaces
after commas and >80 character lines) exist intentionally to match
i8259_32.c.

Signed-off-by: Paul Jimenez <pj@place.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:29 +01:00
Thomas Gleixner
f20ebee418 x86: move 8259 defines to i8259.h
Move the i8259 defines and remove the now io_ports.h

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:29 +01:00
Adrian Bunk
f0cd0af1b0 x86: unexport __{read,write}_lock_failed
This patch removes the unused exports for __{read,write}_lock_failed.

Signed-off-by: Adrian Bunk <bunk@kernel.org>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:29 +01:00
Thomas Gleixner
3abf024d2a x86: nuke a ton of unused exports
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:28 +01:00
Thomas Gleixner
a72368dd37 x86: remove dead code and exports
No users.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:28 +01:00
Thomas Gleixner
16da2f9305 x86: smp_64.c: remove unused exports and cleanup while at it
The exports are nowhere used. There is even no reason why they were
ever introduced.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:27 +01:00
Thomas Gleixner
081e10b96e x86: clean up arch/x86/kernel/time_64.c includes
Reduce the lets include all to the minimum.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:27 +01:00
Thomas Gleixner
1122b134bc x86: share rtc code
Remove the rtc code from time_64.c and add the extra bits to the
i386 path. The ACPI century check is probably valid for i386 as
well, but this is material for a separate patch.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:27 +01:00
Thomas Gleixner
fe599f9fbc x86: isolate the rtc code for sharing
The mach-default/mach_time.h code inline is moved to arch/x86/kernel/rtc.c
and the header files are adjusted.

Shrink the 3 dozen includes to the ones we really need.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:26 +01:00
Thomas Gleixner
6ce60b07e6 x86: unify mc146818rtc.h - prepare for sharing rtc code
Unify mc146818rtc.h by adding the rtc_cmos_read/write functions to
time_64.c. This is a preparatory patch to finaly share the rtc code,
which is unsurprisingly similar.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:26 +01:00
Thomas Gleixner
4ec08da02f x86: remove the duplicated arch/x86/ia32/mmap32.c
Use mmap_32.c in arch/x86/mm instead

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:26 +01:00
Thomas Gleixner
f8eeae6821 x86: clean up arch/x86/mm/mmap_32/64.c
White space and coding style clenaup.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:25 +01:00
Thomas Gleixner
ed4aed98da x86: clean up arch/x86/kernel/vsmp_64.c
White space and coding style clenaup.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:24 +01:00
Thomas Gleixner
9a211abeaa x86: clean up ioport_32.c
Remove unused variables, rename the "unused" argument to regp. It is used !
Codingstyle fixes.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:24 +01:00
Thomas Gleixner
f2f58178f4 x86: simplify set_bitmap in ioport_32.c
Simplify set_bitmap(). This is not in a hotpath and we really can use the
straight forward loop through those bits. A similar implementation is used
in the 64 bit code as well.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:23 +01:00
Thomas Gleixner
0e078e2f50 x86: prepare merging arch/x86/kernel/apic_32/64.c
Shuffle code around, so we get a readable diff.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:20 +01:00
Thomas Gleixner
3a12d93dc0 x86: make smp_local_timer_interrupt() static
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:20 +01:00
Thomas Gleixner
87ebecf14c x86: move ack_bad_irq into irq code
Match i386, where we have this in the irq code. It belongs there.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:19 +01:00
Thomas Gleixner
3e35a0e525 x86: move ioapic code where it belongs
The commit 399287229c hacked the
ioapic resource mapping into apic.c for no good reason.
Move the code into io_apic_64.c where it belongs.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:19 +01:00
Thomas Gleixner
eaf76e8b93 x86: remove duplicate start_kernel declaration
start_kernel is already declared in a generic header file.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:19 +01:00
Thomas Gleixner
70a2002563 x86: move pmtmr related declarations
Move more stuff out of proto.h

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:18 +01:00
Thomas Gleixner
aaa64e04f9 x86: move numa related declarations
More stuff shuffeled to the correct place

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:17 +01:00
Thomas Gleixner
af7a78e925 x86: move mce related declarations
Move the mce related declarations where they belong, fix the
users and remove 32bit dependency in mce.h

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:17 +01:00
Thomas Gleixner
718fc13b46 x86: move debug related declarations to kdebug.h
Move them and fixup some users.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:17 +01:00
Thomas Gleixner
c9ff03428f x86: move k8 related declarations
Move k8 related declarations to k8.h and fix numa_64.c

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:16 +01:00
Thomas Gleixner
8c61b900eb x86: make early_indentify_cpu static
early_indentify_cpu is only used in setup_64.c

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:16 +01:00
Thomas Gleixner
376ff0352c x86: move acpi and pci declarations
Move acpi/pci related declarations to the correct headers
and remove the duplicate.

Build fix from: Andrew Morton

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:16 +01:00
Thomas Gleixner
42e0a9aa5d x86: use u32 for some lapic functions
Use u32 so 32 and 64bit have the same interface.

Andrew Morton: xen, lguest build fixes

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:15 +01:00
Thomas Gleixner
3c6bb07ac1 x86: use u32 for safe_apic_wait_icr_idle()
Preperatory patch for merging apic headers.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:15 +01:00
Thomas Gleixner
37e650c7c8 x86: rename get_maxlvt to lapic_get_maxlvt
Use the same name for the 32 and 64 bit variant.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:14 +01:00
Thomas Gleixner
77e463d104 x86: merge arch/x86/kernel/ldt_32/64.c
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:14 +01:00
Thomas Gleixner
70f5088dd5 x86: prepare arch/x86/kernel/ldt_32/64.c for merging
White space and coding style cleanups.

Change unsigned to int. There is no win when we compare mincount against pc->size,
which is an int as well. Casting pc->size to unsigned just might hide real problems.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:13 +01:00
Thomas Gleixner
fc2d625c4f x86: introduce ldt_write accessor
Create a ldt write accessor like the 32 bit one.

Preparatory patch for merging ldt.c and anyway necessary for
64bit paravirt ops.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:13 +01:00
Thomas Gleixner
78aa1f66f7 x86: clean up arch/x86/kernel/ldt_32/64.c
White space and coding style clenaup.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:13 +01:00
Thomas Gleixner
2f36fa13ce x86: clean up arch/x86/kernel/e820_64.c
White space and coding style cleanup.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:12 +01:00
Ingo Molnar
05fccb0e38 x86: code cleanups in arch/x86/kernel/pci-gart_64.c
code cleanups:

                                       errors   lines of code   errors/KLOC
 arch/x86/kernel/pci-gart_64.c            183             748         244.6
 arch/x86/kernel/pci-gart_64.c              0             790             0

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:12 +01:00
Ingo Molnar
e8d591dc71 x86: lindent arch/i386/math-emu, cleanup
manually clean up some of the damage that lindent caused.
(this is a separate commit so that in the unlikely case of
a typo we can bisect it down to the manual edits.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:12 +01:00
Ingo Molnar
3d0d14f983 x86: lindent arch/i386/math-emu
lindent these files:
                                       errors   lines of code   errors/KLOC
 arch/x86/math-emu/                      2236            9424         237.2
 arch/x86/math-emu/                       128            8706          14.7

no other changes. No code changed:

   text    data     bss     dec     hex filename
   5589802  612739 3833856 10036397         9924ad vmlinux.before
   5589802  612739 3833856 10036397         9924ad vmlinux.after

the intent of this patch is to ease the automated tracking of kernel
code quality - it's just much easier for us to maintain it if every file
in arch/x86 is supposed to be clean.

NOTE: it is a known problem of lindent that it causes some style damage
of its own, but it's a safe tool (well, except for the gcc array range
initializers extension), so we did the bulk of the changes via lindent,
and did the manual fixups in a followup patch.

the resulting math-emu code has been tested by Thomas Gleixner on a real
386 DX CPU as well, and it works fine.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:11 +01:00
Ingo Molnar
a4ec1effce x86: mach-voyager, lindent
lindent the mach-voyager files to get rid of more than 300 style errors:

                                       errors   lines of code   errors/KLOC
 arch/x86/mach-voyager/   [old]           409            3729         109.6
 arch/x86/mach-voyager/   [new]            71            3678          19.3

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:10 +01:00
Ingo Molnar
31183ba8fd x86: clean up arch/x86/kernel/aperture_64.c printk()s
clean up arch/x86/kernel/aperture_64.c printk()s.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:10 +01:00
Ingo Molnar
c140df973c x86: clean up arch/x86/kernel/aperture_64.c
whitespace cleanup. No code changed:

   text    data     bss     dec     hex filename
   2080      76       4    2160     870 aperture_64.o.before
   2080      76       4    2160     870 aperture_64.o.after

                                       errors   lines of code   errors/KLOC
 arch/x86/kernel/aperture_64.c            114             299         381.2
 arch/x86/kernel/aperture_64.c              0             315             0

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:09 +01:00
Thomas Gleixner
5bafb671e2 x86: clean up arch/x86/ia32/mmap32.c
White space and coding style clenaup.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:09 +01:00
Thomas Gleixner
6ec875666d x86: clean up arch/x86/ia32/syscall32.c
White space and coding style clenaup.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:08 +01:00
Thomas Gleixner
c202f298de x86: clean up arch/x86/ia32/sys_ia32.c
White space and coding style clenaup.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:08 +01:00
Thomas Gleixner
5de15d42e4 x86: clean up arch/x86/ia32/ptrace32.c
White space and coding style clenaup.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:08 +01:00
Thomas Gleixner
2da06b4e5d x86: clean up arch/x86/ia32/ipc32.c
White space and coding style cleanup.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:08 +01:00
Thomas Gleixner
99b9cdf758 x86: clean up arch/x86/ia32/ia32_signal.c
White space and coding style clenaup.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:07 +01:00
Thomas Gleixner
8edf8bee88 x86: clean up arch/x86/ia32/aout32.c
White space and coding style clenaup.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:07 +01:00
Thomas Gleixner
d94448b1fd x86: clean up arch/x86/ia32/fpu32.c
White space and coding style clenaup.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:07 +01:00
Hiroshi Shimamoto
39d44a5147 x86: enable irq in default_idle on 64-bit
local_irq_enable() is missing after sched_clock_idle_wakeup_event().

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:06 +01:00
Ingo Molnar
5ee613b675 x86: idle wakeup event in the HLT loop
do a proper idle-wakeup event on HLT as well - some CPUs stop the TSC
in HLT too, not just when going through the ACPI methods.

(the ACPI idle code already does this.)

[ update the 64-bit side too, as noticed by Jiri Slaby. ]

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:06 +01:00
Guillaume Chazarain
53d517cdba x86: scale cyc_2_nsec according to CPU frequency
scale the sched_clock() cyc_2_nsec scaling factor according to
CPU frequency changes.

[ mingo@elte.hu: simplified it and fixed it for SMP. ]

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:06 +01:00
Roland McGrath
83bd01024b x86: protect against sigaltstack wraparound
cf http://lkml.org/lkml/2007/10/3/41

To summarize: on Linux, SA_ONSTACK decides whether you are already on the
signal stack based on the value of the SP at the time of a signal.  If
you are not already inside the range, you are not "on the signal stack"
and so the new signal handler frame starts over at the base of the signal
stack.

sigaltstack (and sigstack before it) was invented in BSD.  There, the
SA_ONSTACK behavior has always been different.  It uses a kernel state
flag to decide, rather than the SP value.  When you first take an
SA_ONSTACK signal and switch to the alternate signal stack, it sets the
SS_ONSTACK flag in the thread's sigaltstack state in the kernel.
Thereafter you are "on the signal stack" and don't switch SP before
pushing a handler frame no matter what the SP value is.  Only when you
sigreturn from the original handler context do you clear the SS_ONSTACK
flag so that a new handler frame will start over at the base of the
alternate signal stack.

The undesireable effect of the Linux behavior is that an overflow of the
alternate signal stack can not only go undetected, but lead to a ring
buffer effect of clobbering the original handler frame at the base of the
signal stack for each successive signal that comes just after the
overflow.  This is what Shi Weihua's test case demonstrates.  Normally
this does not come up because of the signal mask, but the test case uses
SA_NODEFER for its SIGSEGV handler.

The other subtle part of the existing Linux semantics is that a simple
longjmp out of a signal handler serves to take you off the signal stack
in a safe and reliable fashion without having used sigreturn (nor having
just returned from the handler normally, which means the same).  After
the longjmp (or even informal stack switching not via any proper libc or
kernel interface), the alternate signal stack stands ready to be used
again.

A paranoid program would allocate a PROT_NONE red zone around its
alternate signal stack.  Then a small overflow would trigger a SIGSEGV in
handler setup, and be fatal (core dump) whether or not SIGSEGV is
blocked.  As with thread stack red zones, that cannot catch all overflows
(or underflows).  e.g., a local array as large as page size allocated in
a function called from a handler, but not actually touched before more
calls push more stack, could cause an overflow that silently pushes into
some unrelated allocated pages.

The BSD behavior does not do anything in particular about overflow.  But
it does at least avoid the wraparound or "ring buffer effect", so you'll
just get a straightforward all-out overflow down your address space past
the low end of the alternate signal stack.  I don't know what the BSD
behavior is for longjmp out of an SA_ONSTACK handler.

The POSIX wording relating to sigaltstack is pretty minimal.  I don't
think it speaks to this issue one way or another.  (The program that
overflows its stack is clearly in undefined behavior territory of one
sort or another anyhow.)

Given the longjmp issue and the potential for highly subtle complications
in existing programs relying on this in arcane ways deep in their code, I
am very dubious about changing the behavior to the BSD style persistent
flag.  I think Shi Weihua's patches have a similar effect by tracking the
SP used in the last handler setup.

I think it would be sensible for the signal handler setup code to detect
when it would itself be causing a stack overflow.  Maybe something like
the following patch (untested).  This issue exists in the same way on all
machines, so ideally they would all do a similar check.

When it's the handler function itself or its callees that cause the
overflow, rather than the signal handler frame setup alone crossing the
boundary, this still won't help.  But I don't see any way to distinguish
that from the valid longjmp case.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:06 +01:00
Ingo Molnar
f9fc58910e x86: add DMI quirk for io-delay hangs on Compaq Presario V6000 laptops
add the DMI strings provided by Islam Amer <pharon@gmail.com>, for
the Compaq Presario V6000 (Quanta/30B7).

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:05 +01:00
Ingo Molnar
d0049e71c6 x86: make io_delay=0xed the default
make io_delay=0xed the default. This frees up port 0x80 which is
a debug port on some machines and locks up certain laptops.

Testing only for now. Try the io_delay=0x80 boot option if this does not
work for you.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:05 +01:00
Ingo Molnar
6e7c402590 x86: various changes and cleanups to in_p/out_p delay details
various changes to the in_p/out_p delay details:

- add the io_delay=none method
- make each method selectable from the kernel config
- simplify the delay code a bit by getting rid of an indirect function call
- add the /proc/sys/kernel/io_delay_type sysctl
- change 'io_delay=standard|alternate' to io_delay=0x80 and io_delay=0xed
- make the io delay config not depend on CONFIG_DEBUG_KERNEL

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: "David P. Reed" <dpreed@reed.com>
2008-01-30 13:30:05 +01:00
Rene Herman
b02aae9cf5 x86: provide a DMI based port 0x80 I/O delay override.
x86: provide a DMI based port 0x80 I/O delay override.

Certain (HP) laptops experience trouble from our port 0x80 I/O delay
writes. This patch provides for a DMI based switch to the "alternate
diagnostic port" 0xed (as used by some BIOSes as well) for these.

David P. Reed confirmed that port 0xed works for him and provides a
proper delay. The symptoms of _not_ working are a hanging machine,
with "hwclock" use being a direct trigger.

Earlier versions of this attempted to simply use udelay(2), with the
2 being a value tested to be a nicely conservative upper-bound with
help from many on the linux-kernel mailinglist but that approach has
two problems.

First, pre-loops_per_jiffy calibration (which is post PIT init while
some implementations of the PIT are actually one of the historically
problematic devices that need the delay) udelay() isn't particularly
well-defined. We could initialise loops_per_jiffy conservatively (and
based on CPU family so as to not unduly delay old machines) which
would sort of work, but...

Second, delaying isn't the only effect that a write to port 0x80 has.
It's also a PCI posting barrier which some devices may be explicitly
or implicitly relying on. Alan Cox did a survey and found evidence
that additionally some drivers may be racy on SMP without the bus
locking outb.

Switching to an inb() makes the timing too unpredictable and as such,
this DMI based switch should be the safest approach for now. Any more
invasive changes should get more rigid testing first. It's moreover
only very few machines with the problem and a DMI based hack seems
to fit that situation.

This also introduces a command-line parameter "io_delay" to override
the DMI based choice again:

	io_delay=<standard|alternate>

where "standard" means using the standard port 0x80 and "alternate"
port 0xed.

This retains the udelay method as a config (CONFIG_UDELAY_IO_DELAY) and
command-line ("io_delay=udelay") choice for testing purposes as well.

This does not change the io_delay() in the boot code which is using
the same port 0x80 I/O delay but those do not appear to be a problem
as David P. Reed reported the problem was already gone after using the
udelay version. He moreover reported that booting with "acpi=off" also
fixed things and seeing as how ACPI isn't touched until after this DMI
based I/O port switch I believe it's safe to leave the ones in the boot
code be.

The DMI strings from David's HP Pavilion dv9000z are in there already
and we need to get/verify the DMI info from other machines with the
problem, notably the HP Pavilion dv6000z.

This patch is partly based on earlier patches from Pavel Machek and
David P. Reed.

Signed-off-by: Rene Herman <rene.herman@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:05 +01:00
Mike Galbraith
4c6b8b4d62 x86: fix: s2ram + P4 + tsc = annoyance
s2ram recently became useful here, except for the kernel's annoying
habit of disabling my P4's perfectly good TSC.

[  107.894470] CPU 1 is now offline
[  107.894474] SMP alternatives: switching to UP code
[  107.895832] CPU0 attaching sched-domain:
[  107.895836]  domain 0: span 1
[  107.895838]   groups: 1
[  107.896097] CPU1 is down
[    3.726156] Intel machine check architecture supported.
[    3.726165] Intel machine check reporting enabled on CPU#0.
[    3.726167] CPU0: Intel P4/Xeon Extended MCE MSRs (12) available
[    3.726170] CPU0: Thermal monitoring enabled
[    3.726175] Back to C!
[    3.726708] Force enabled HPET at resume
[    3.726775] Enabling non-boot CPUs ...
[    3.727049] CPU0 attaching NULL sched-domain.
[    3.727165] SMP alternatives: switching to SMP code
[    3.727858] Booting processor 1/1 eip 3000
[    3.727862] CPU 1 irqstacks, hard=b042f000 soft=b042d000
[    3.738173] Initializing CPU#1
[    3.798912] Calibrating delay using timer specific routine.. 5986.12 BogoMIPS (lpj=2993061)
[    3.798920] CPU: After generic identify, caps: bfebfbff 00000000 00000000 00000000 00004400 00000000 00000000 00000000
[    3.798931] CPU: Trace cache: 12K uops, L1 D cache: 8K
[    3.798934] CPU: L2 cache: 512K
[    3.798936] CPU: Physical Processor ID: 0
[    3.798938] CPU: After all inits, caps: bfebfbff 00000000 00000000 0000b080 00004400 00000000 00000000 00000000
[    3.798946] Intel machine check architecture supported.
[    3.798952] Intel machine check reporting enabled on CPU#1.
[    3.798955] CPU1: Intel P4/Xeon Extended MCE MSRs (12) available
[    3.798959] CPU1: Thermal monitoring enabled
[    3.799161] CPU1: Intel(R) Pentium(R) 4 CPU 3.00GHz stepping 09
[    3.799187] checking TSC synchronization [CPU#0 -> CPU#1]:
[    3.819181] Measured 63588552840 cycles TSC warp between CPUs, turning off TSC clock.
[    3.819184] Marking TSC unstable due to: check_tsc_sync_source failed.

If check_tsc_warp() is called after initial boot, and the TSC has in the
meantime been set (BIOS, user, silicon, elves) to a value lower than the
last stored/stale value, we blame the TSC.  Reset to pristine condition
after every test.

Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:04 +01:00
Rafael J. Wysocki
5c9c9bec05 x86: hibernation: document __save_processor_state() on x86
Document the fact that __save_processor_state() has to save all CPU
registers referred to by the kernel in case a different kernel is
used to load and restore a hibernation image containing it.

Sigend-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:04 +01:00
Sam Ravnborg
9484b1eb4d x86: fix make mrproper
Michael Opdenacker reported:

For backward compatibility with earlier (< 2.6.24) kernels,
arch/i386/boot/bzImage or arch/x86_64/boot/bzImage symbolic links to
arch/x86/boot/bzImage are created when you build an x86 kernel. The
arch/i386 or arch/x86_64 directories are then created for this only
purpose.

Issue: these generated directories and symbolic links are *not cleaned
up* when you run "make mrproper" (and thus "make distclean"). This
disturbs the production of patches, because the source tree is left with
generated files and directories.

Sam has an alternative fix:

The directory is killed during make clean as opposed to make mrproper.

Reported-by: Michael Opdenacker <michael-lists@free-electrons.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:04 +01:00
Balaji Rao
37a47db8d7 x86: assign IRQs to HPET timers, fix
Looks like IRQ 31 is assigned to timer 3, even without the patch!
I wonder who wrote the number 31. But the manual says that it is
zero by default.

I think we should check whether the timer has been allocated an IRQ before
proceeding to assign one to it.  Here is a patch that does this.

Signed-off-by: Balaji Rao <balajirrao@gmail.com>
Tested-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:03 +01:00
Balaji Rao
e3f37a54f6 x86: assign IRQs to HPET timers
The userspace API for the HPET (see Documentation/hpet.txt) did not work. The
HPET_IE_ON ioctl was failing as there was no IRQ assigned to the timer
device. This patch fixes it by allocating IRQs to timer blocks in the HPET.

arch/x86/kernel/hpet.c |   13 +++++--------
drivers/char/hpet.c    |   45 ++++++++++++++++++++++++++++++++++++++-------
include/linux/hpet.h   |    2 +-
3 files changed, 44 insertions(+), 16 deletions(-)

Signed-off-by: Balaji Rao <balajirrao@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:03 +01:00
Thomas Gleixner
1a0c009ac5 x86: unregister PIT clocksource when PIT is disabled
The following scenario might leave PIT as a disfunctional clock source:

    PIT is registered as clocksource
    PM_TIMER is registered as clocksource and enables highres/dyntick mode
    PIT is switched to oneshot mode
    -> now the readout of PIT is bogus, but the user might select PIT
    via the sysfs override, which would break the box as the time
    readout is unusable.

Unregister the PIT clocksource when the PIT clock event device is switched
into shutdown / oneshot mode.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:03 +01:00
Thomas Gleixner
4713e22ce8 clocksource: add unregister function to disable unusable clocksources
On x86 the PIT might become an unusable clocksource. Add an unregister
function to provide a possibilty to remove the PIT from the list of
available clock sources.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:02 +01:00
Thomas Gleixner
316da3b3fc x86: restrict PIT clocksource usage
PIT clocksource is registered unconditionally even when HPET is enabled
or when PIT is replaced by the local APIC timer. In both cases PIT can
not be used as it is stopped and the readout would be stale.

Prevent registering PIT in those cases.

patch depends on:

  x86: offer is_hpet_enabled() on !CONFIG_HPET_TIMER too

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:02 +01:00
Pavel Machek
b10db7f0d2 time: more timer related cleanups
I was confused by FSEC = 10^15 NSEC statement, plus small whitespace
fixes. When there's copyright, there should be GPL.

Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:00 +01:00
Greg KH
213eca7f48 kobj: fix threshold_init_device/kobject_uevent_env oops
the logic in this function is just crazy.  It's recursive, but we
can circumvent the creation for the kobject and whole creation of the
threshold_block if some conditions are met.  That's why we see the
allocate_threshold_blocks so many times in the callstack, yet only a few
kobjects created.

Then we blow up in kobject_uevent_env() on the first debug printk.
Which means that we are just passing in garbage.

Man, this is one time that comments in code would have been very nice to
have, and why forward goto's into major code blocks are just evil...

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:29:58 +01:00
Sam Ravnborg
01ba2bdc6b all archs: consolidate init and exit sections in vmlinux.lds.h
This patch consolidate all definitions of .init.text, .init.data
and .exit.text, .exit.data section definitions in
the generic vmlinux.lds.h.

This is a preparational patch - alone it does not buy
us much good.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-01-28 23:21:17 +01:00
Arjan van de Ven
9745512ce7 sched: latencytop support
LatencyTOP kernel infrastructure; it measures latencies in the
scheduler and tracks it system wide and per process.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25 21:08:34 +01:00
Peter Zijlstra
8f4d37ec07 sched: high-res preemption tick
Use HR-timers (when available) to deliver an accurate preemption tick.

The regular scheduler tick that runs at 1/HZ can be too coarse when nice
level are used. The fairness system will still keep the cpu utilisation 'fair'
by then delaying the task that got an excessive amount of CPU time but try to
minimize this by delivering preemption points spot-on.

The average frequency of this extra interrupt is sched_latency / nr_latency.
Which need not be higher than 1/HZ, its just that the distribution within the
sched_latency period is important.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25 21:08:29 +01:00
Gautham R Shenoy
86ef5c9a8e cpu-hotplug: replace lock_cpu_hotplug() with get_online_cpus()
Replace all lock_cpu_hotplug/unlock_cpu_hotplug from the kernel and use
get_online_cpus and put_online_cpus instead as it highlights the
refcount semantics in these operations.

The new API guarantees protection against the cpu-hotplug operation, but
it doesn't guarantee serialized access to any of the local data
structures. Hence the changes needs to be reviewed.

In case of pseries_add_processor/pseries_remove_processor, use
cpu_maps_update_begin()/cpu_maps_update_done() as we're modifying the
cpu_present_map there.

Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25 21:08:02 +01:00
Linus Torvalds
eba0e319c1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (125 commits)
  [CRYPTO] twofish: Merge common glue code
  [CRYPTO] hifn_795x: Fixup container_of() usage
  [CRYPTO] cast6: inline bloat--
  [CRYPTO] api: Set default CRYPTO_MINALIGN to unsigned long long
  [CRYPTO] tcrypt: Make xcbc available as a standalone test
  [CRYPTO] xcbc: Remove bogus hash/cipher test
  [CRYPTO] xcbc: Fix algorithm leak when block size check fails
  [CRYPTO] tcrypt: Zero axbuf in the right function
  [CRYPTO] padlock: Only reset the key once for each CBC and ECB operation
  [CRYPTO] api: Include sched.h for cond_resched in scatterwalk.h
  [CRYPTO] salsa20-asm: Remove unnecessary dependency on CRYPTO_SALSA20
  [CRYPTO] tcrypt: Add select of AEAD
  [CRYPTO] salsa20: Add x86-64 assembly version
  [CRYPTO] salsa20_i586: Salsa20 stream cipher algorithm (i586 version)
  [CRYPTO] gcm: Introduce rfc4106
  [CRYPTO] api: Show async type
  [CRYPTO] chainiv: Avoid lock spinning where possible
  [CRYPTO] seqiv: Add select AEAD in Kconfig
  [CRYPTO] scatterwalk: Handle zero nbytes in scatterwalk_map_and_copy
  [CRYPTO] null: Allow setkey on digest_null 
  ...
2008-01-25 08:38:25 -08:00
Kay Sievers
af5ca3f4ec Driver core: change sysdev classes to use dynamic kobject names
All kobjects require a dynamically allocated name now. We no longer
need to keep track if the name is statically assigned, we can just
unconditionally free() all kobject names on cleanup.

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:40 -08:00
Greg Kroah-Hartman
38a382ae5d Kobject: convert arch/* from kobject_unregister() to kobject_put()
There is no need for kobject_unregister() anymore, thanks to Kay's
kobject cleanup changes, so replace all instances of it with
kobject_put().


Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:39 -08:00
Greg Kroah-Hartman
542eb75a27 Kobject: change arch/x86/kernel/cpu/mcheck/mce_amd_64.c to use kobject_init_and_add
Stop using kobject_register, as this way we can control the sending of
the uevent properly, after everything is properly initialized.

Cc: Jacob Shin <jacob.shin@amd.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:30 -08:00
Greg Kroah-Hartman
a521cf209c Kobject: change arch/x86/kernel/cpu/mcheck/mce_amd_64.c to use kobject_create_and_add
Make this kobject dynamic and convert it to not use kobject_register,
which is going away.

Cc: Jacob Shin <jacob.shin@amd.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:30 -08:00
Greg Kroah-Hartman
5b3f355d8f Kobject: change arch/x86/kernel/cpu/intel_cacheinfo.c to use kobject_init_and_add
Stop using kobject_register, as this way we can control the sending of
the uevent properly, after everything is properly initialized.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:28 -08:00
Rafael J. Wysocki
775b64d2b6 PM: Acquire device locks on suspend
This patch reorganizes the way suspend and resume notifications are
sent to drivers.  The major changes are that now the PM core acquires
every device semaphore before calling the methods, and calls to
device_add() during suspends will fail, while calls to device_del()
during suspends will block.

It also provides a way to safely remove a suspended device with the
help of the PM core, by using the device_pm_schedule_removal() callback
introduced specifically for this purpose, and updates two drivers (msr
and cpuid) that need to use it.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:04 -08:00
Jeremy Fitzhardinge
f9c4cfe954 xen: disable vcpu_info placement for now
There have been several reports of Xen guest domains locking up when
using vcpu_info structure placement.  Disable it for now.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-23 18:04:54 -08:00
Jordan Crouse
667984d9e4 x86: GEODE fix a race condition in the MFGPT timer tick
When we set the MFGPT timer tick, there is a chance that we'll
immediately assert an event.  If for some reason the IRQ routing
for this clock has been setup for some other purpose, then we
could end up firing an interrupt into the SMM handler or worse.

This rearranges the timer tick init function to initalize the handler
before we set up the MFGPT clock to make sure that even if we get
an event, it will go to the handler.

Furthermore, in the handler we need to make sure that we clear the
event, even if the timer isn't running.

Signed-off-by: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Tested-by: Arnd Hannemann <hannemann@i4.informatik.rwth-aachen.de>
2008-01-22 23:30:16 +01:00
Thomas Gleixner
4960c9df14 Revert "x86: fix NMI watchdog & 'stopped time' problem"
This reverts commit d4d25deca4.

It tried to fix long standing bugzilla entries, but the solution was
reported to break other systems. The reporter of

http://bugzilla.kernel.org/show_bug.cgi?id=9791

tracked it down to this commit and confirmed that reverting the patch
restores the correct behaviour. It's too late in the release cycle to
find a better solution than reverting the commit to avoid regressions.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
2008-01-22 10:23:01 +01:00
Arjan van de Ven
e107ebe0e4 x86: add support for the latest Intel processors to Oprofile
The latest Intel processors (the 45nm ones) have a model number of 23
(old ones had 15); they're otherwise compatible on the oprofile side.
This patch adds the new model number to the oprofile code.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-18 22:49:33 +01:00
Peter Zijlstra
fb1dac909d lockdep: more hardirq annotations for notify_die()
On Sat, 2007-12-29 at 18:06 +0100, Marcin Slusarz wrote:
> Hi
> Today I've got this (while i was upgrading my gentoo box):
>
> WARNING: at kernel/lockdep.c:2658 check_flags()
> Pid: 21680, comm: conftest Not tainted 2.6.24-rc6 #63
>
> Call Trace:
>  [<ffffffff80253457>] check_flags+0x1c7/0x1d0
>  [<ffffffff80257217>] lock_acquire+0x57/0xc0
>  [<ffffffff8024d5c0>] __atomic_notifier_call_chain+0x60/0xd0
>  [<ffffffff8024d641>] atomic_notifier_call_chain+0x11/0x20
>  [<ffffffff8024d67e>] notify_die+0x2e/0x30
>  [<ffffffff8020da0a>] do_divide_error+0x5a/0xa0
>  [<ffffffff80522bdd>] trace_hardirqs_on_thunk+0x35/0x3a
>  [<ffffffff80255b89>] trace_hardirqs_on+0xd9/0x180
>  [<ffffffff80522bdd>] trace_hardirqs_on_thunk+0x35/0x3a
>  [<ffffffff80523c2d>] error_exit+0x0/0xa9
>
> possible reason: unannotated irqs-off.
> irq event stamp: 4693
> hardirqs last  enabled at (4693): [<ffffffff80522bdd>] trace_hardirqs_on_thunk+0x35/0x3a
> hardirqs last disabled at (4692): [<ffffffff80522c17>] trace_hardirqs_off_thunk+0x35/0x37
> softirqs last  enabled at (3546): [<ffffffff80238343>] __do_softirq+0xb3/0xd0
> softirqs last disabled at (3521): [<ffffffff8020c97c>] call_softirq+0x1c/0x30

more early fixups for notify_die()..

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-16 09:51:59 +01:00
Bernhard Walle
8ee291f87c x86: fix RTC_AIE with CONFIG_HPET_EMULATE_RTC
In the current code, RTC_AIE doesn't work if the RTC relies on
CONFIG_HPET_EMULATE_RTC because the code sets the RTC_AIE flag in
hpet_set_rtc_irq_bit().  The interrupt handles does accidentally check
for RTC_PIE and not RTC_AIE when comparing the time which was set in
hpet_set_alarm_time().

I now verified on a test system here that without the patch applied,
the attached test program fails on a system that has HPET with
2.6.24-rc7-default. That's not critical since I guess the problem has
been there for several kernel releases, but as the fix is quite
obvious.

Configuration is CONFIG_RTC=y and CONFIG_HPET_EMULATE_RTC=y.

Signed-off-by: Bernhard Walle <bwalle@suse.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-15 16:44:38 +01:00
Ingo Molnar
23be8c7ddf x86: fix boot crash on HIGHMEM4G && SPARSEMEM
Denys Fedoryshchenko reported a bootup crash when he upgraded
his system from 3GB to 4GB RAM:

   http://lkml.org/lkml/2008/1/7/9

the bug is due to HIGHMEM4G && SPARSEMEM kernels making pfn_to_page()
to return an invalid pointer when the pfn is in a memory hole. The
256 MB PCI aperture at the end of RAM was not mapped by sparsemem,
and hence the pfn was not valid. But set_highmem_pages_init() iterated
this range without checking the pfn's validity first.

this bug was probably present in the sparsemem code ever since sparsemem
has been introduced in v2.6.13. It was masked due to HIGHMEM64G using
larger memory regions in sparsemem_32.h:

 #ifdef CONFIG_X86_PAE
 #define SECTION_SIZE_BITS       30
 #define MAX_PHYSADDR_BITS       36
 #define MAX_PHYSMEM_BITS        36
 #else
 #define SECTION_SIZE_BITS       26
 #define MAX_PHYSADDR_BITS       32
 #define MAX_PHYSMEM_BITS        32
 #endif

which creates 1GB sparsemem regions instead of 64MB sparsemem regions.
So in practice we only ever created true sparsemem holes on x86 with
HIGHMEM4G - but that was rarely used by distros.

( btw., we could probably save 2MB of mem_map[]s on X86_PAE if we reduced
  the sparsemem region size to 256 MB. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-15 16:44:37 +01:00
Steven Rostedt
40d6a14662 Kick CPUS that might be sleeping in cpus_idle_wait
Sometimes cpu_idle_wait gets stuck because it might miss CPUS that are
already in idle, have no tasks waiting to run and have no interrupts going
to them.  This is common on bootup when switching cpu idle governors.

This patch gives those CPUS that don't check in an IPI kick.

 Background:
 -----------
I notice this while developing the mcount patches, that every once in a
while the system would hang. Looking deeper, the hang was always at boot
up when registering init_menu of the cpu_idle menu governor. Talking
with Thomas Gliexner, we discovered that one of the CPUS had no timer
events scheduled for it and it was in idle (running with NO_HZ). So the
CPU would not set the cpu_idle_state bit.

Hitting sysrq-t a few times would eventually route the interrupt to the
stuck CPU and the system would continue.

Note, I would have used the PDA isidle but that is set after the
cpu_idle_state bit is cleared, and would leave a window open where we
may miss being kicked.

hmm, looking closer at this, we still have a small race window between
clearing the cpu_idle_state and disabling interrupts (hence the RFC).

    CPU0:                          CPU 1:
  ---------                       ---------
 cpu_idle_wait():                 cpu_idle():
      |                           __cpu_cpu_var(is_idle) = 1;
      |                           if (__get_cpu_var(cpu_idle_state)) /* == 0 */
 per_cpu(cpu_idle_state, 1) = 1;         |
 if (per_cpu(is_idle, 1)) /* == 1 */     |
 smp_call_function(1)                    |
      |                             receives ipi and runs do_nothing.
 wait on map == empty               idle();
   /* waits forever */

So really we need interrupts off for most of this then. One might think
that we could simply clear the cpu_idle_state from do_nothing, but I'm
assuming that cpu_idle governors can be removed, and this might cause a
race that a governor might be used after the module was removed.

Venki said:

  I think your RFC patch is the right solution here.  As I see it, there is
  no race with your RFC patch.  As long as you call a dummy smp_call_function
  on all CPUs, we should be OK.  We can get rid of cpu_idle_state and the
  current wait forever logic altogether with dummy smp_call_function.  And so
  there wont be any wait forever scenario.

  The whole point of cpu_idle_wait() is to make all CPUs come out of idle
  loop atleast once.  The caller will use cpu_idle_wait something like this.

  // Want to change idle handler

  - Switch global idle handler to always present default_idle

  - call cpu_idle_wait so that all cpus come out of idle for an instant
    and stop using old idle pointer and start using default idle

  - Change the idle handler to a new handler

  - optional cpu_idle_wait if you want all cpus to start using the new
    handler immediately.

Maybe the below 1s patch is safe bet for .24.  But for .25, I would say we
just replace all complicated logic by simple dummy smp_call_function and
remove cpu_idle_state altogether.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-14 08:52:22 -08:00
Sebastian Siewior
15e7b4452b [CRYPTO] twofish: Merge common glue code
There is almost no difference between 32 & 64 bit glue code.

Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-01-14 17:07:57 +11:00
Len Brown
02d5bccf8e Pull bugzilla-9194 into release branch 2008-01-11 12:27:13 -05:00
Len Brown
9f9adecd2d PM: ACPI and APM must not be enabled at the same time
ACPI and APM used "pm_active" to guarantee that
they would not be simultaneously active.

But pm_active was recently moved under CONFIG_PM_LEGACY,
so that without CONFIG_PM_LEGACY, pm_active became a NOP --
allowing ACPI and APM to both be simultaneously enabled.
This caused unpredictable results, including boot hangs.

Further, the code under CONFIG_PM_LEGACY is scheduled
for removal.

So replace pm_active with pm_flags.
pm_flags depends only on CONFIG_PM,
which is present for both CONFIG_APM and CONFIG_ACPI.

http://bugzilla.kernel.org/show_bug.cgi?id=9194

Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2008-01-11 12:26:47 -05:00
Tan Swee Heng
9a7dafbba4 [CRYPTO] salsa20: Add x86-64 assembly version
This is the x86-64 version of the Salsa20 stream cipher algorithm. The
original assembly code came from
<http://cr.yp.to/snuffle/salsa20/amd64-3/salsa20.s>. It has been
reformatted for clarity.

Signed-off-by: Tan Swee Heng <thesweeheng@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-01-11 08:16:57 +11:00
Tan Swee Heng
974e4b752e [CRYPTO] salsa20_i586: Salsa20 stream cipher algorithm (i586 version)
This patch contains the salsa20-i586 implementation. The original
assembly code came from
<http://cr.yp.to/snuffle/salsa20/x86-pm/salsa20.s>. I have reformatted
it (added indents) so that it matches the other algorithms in
arch/x86/crypto.

Signed-off-by: Tan Swee Heng <thesweeheng@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-01-11 08:16:57 +11:00
Sebastian Siewior
06e1a8f050 [CRYPTO] aes-asm: Merge common glue code
32 bit and 64 bit glue code is using (now) the same
piece code. This patch unifies them.

Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-01-11 08:16:24 +11:00
Sebastian Siewior
5157dea813 [CRYPTO] aes-i586: Remove setkey
The setkey() function can be shared with the generic algorithm.

Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-01-11 08:16:10 +11:00
Sebastian Siewior
81190b3215 [CRYPTO] aes-x86-64: Remove setkey
The setkey() function can be shared with the generic algorithm.

Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-01-11 08:16:10 +11:00
Sebastian Siewior
89e1265431 [CRYPTO] aes: Move common defines into a header file
This three defines are used in all AES related hardware.

Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-01-11 08:16:04 +11:00
Thomas Gleixner
a2b484a29c x86: fix do_fork_idle section mismatch
With CPU_HOTPLUG=n:

WARNING: vmlinux.o(.text+0x104f8): Section mismatch: reference to .init.text:fork_idle (between
'do_fork_idle' and 'lapic_timer_broadcast')

do_fork_idle() needs to be __cpuinit. It can be static as well.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-08 16:10:35 -08:00
Rusty Russell
476c6c11a9 fix lguest rmmod "bad pgd"
After 17d57a9206 ("x86: fix x86-32 early
fixmap initialization.") removing lg.ko caused a printk from vunmap:

	mm/memory.c:115: bad pgd 004b3027.

On the second use after module load, the kernel crashes.

This fixes the immediate problem (accessed and dirty bits not set as
expected in pmd_none_or_clear_bad).  I can't see why this would cause
a crash, but I haven't been able to reproduce it once this is applied.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-01 11:30:35 -08:00
Linus Torvalds
0068441870 Revert "x86: fix show cpuinfo cpu number always zero"
This reverts commit fbdcf18df7.

As pointed out by Yanmin Zhang, the problem was already fixed
differently (and correctly), and rather than fix anything, it actually
causes us to create a sub-optimal sched-domains hierarchy (not setting
up the domain belonging to the core) when CONFIG_X86_HT=y.

Requested-by: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-25 20:16:16 -08:00
Jason Gaston
04fa11ea17 x86: intel_cacheinfo.c: cpu cache info entry for Intel Tolapai
This patch adds a cpu cache info entry for the Intel Tolapai cpu.

Signed-off-by: Jason Gaston <jason.d.gaston@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-12-21 01:27:19 +01:00
Ingo Molnar
c0a698b744 x86: fix die() to not be preemptible
Andrew "Eagle Eye" Morton noticed that we use raw_local_save_flags()
instead of raw_local_irq_save(flags) in die(). This allows the
preemption of oopsing contexts - which is highly undesirable. It also
causes CONFIG_DEBUG_PREEMPT to complain, as reported by Miles Lane.

this bug was introduced via:

  commit 39743c9ef7
  Author: Andi Kleen <ak@suse.de>
  Date:   Fri Oct 19 20:35:03 2007 +0200

      x86: use raw locks during oopses

-               spin_lock_irqsave(&die.lock, flags);
+               __raw_spin_lock(&die.lock);
+               raw_local_save_flags(flags);

that is not a correct open-coding of spin_lock_irqsave(): both the
ordering is wrong (irqs should be disabled _first_), and the wrong
flags-saving API was used.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-12-21 01:27:19 +01:00
Mike Travis
fbdcf18df7 x86: fix show cpuinfo cpu number always zero
when called by setup_arch) after smp_store_cpu_info() had set it to the
correct value.

The error shows up in 'cat /proc/cpuinfo' will all cpus = 0.

Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Suresh B Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-12-19 23:20:19 +01:00
Adrian Bunk
3d054f0fad x86_32: disable_pse must be __cpuinitdata
CONFIG_HOTPLUG_CPU=y:

WARNING: vmlinux.o(.text+0xfa52): Section mismatch: reference to .init.data:disable_pse (between 'identify_cpu' and 'identify_secondary_cpu')

[ akpm@linux-foundation.org: initializer fix. ]

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-12-19 23:20:19 +01:00
Adrian Bunk
3446fa057c x86_32: select_idle_routine() must be __cpuinit
CONFIG_HOTPLUG_CPU=y:

WARNING: vmlinux.o(.text+0x1199a): Section mismatch: reference to .init.text.5:select_idle_routine (between 'init_intel' and 'init_nexgen')

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-12-19 23:20:18 +01:00
Adrian Bunk
f2206ec92c x86 smpboot_32.c section fixes
CONFIG_HOTPLUG_CPU=y:

WARNING: vmlinux.o(.text+0x22c60): Section mismatch: reference to .init.data:cpu_idle_tasks (between 'do_boot_cpu' and 'do_warm_boot_cpu')
WARNING: vmlinux.o(.text+0x22c99): Section mismatch: reference to .init.data:cpu_idle_tasks (between 'do_boot_cpu' and 'do_warm_boot_cpu')
WARNING: vmlinux.o(.text+0x2359b): Section mismatch: reference to .init.data:smp_b_stepping (between 'smp_store_cpu_info' and 'cpu_exit_clear')
WARNING: vmlinux.o(.text+0x235a0): Section mismatch: reference to .init.data:smp_b_stepping (between 'smp_store_cpu_info' and 'cpu_exit_clear')

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-12-19 23:20:18 +01:00
Adrian Bunk
d533798326 x86 apic_32.c section fix
CONFIG_HOTPLUG_CPU=y:

WARNING: vmlinux.o(.text+0x2390d): Section mismatch: reference to .init.text.5:setup_local_APIC (between 'start_secondary' and 'check_tsc_warp')

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-12-19 23:20:18 +01:00
Ingo Molnar
4aae070252 x86: fix "Kernel panic - not syncing: IO-APIC + timer doesn't work!"
this is the tale of a full day spent debugging an ancient but elusive bug.

after booting up thousands of random .config kernels, i finally happened
to generate a .config that produced the following rare bootup failure
on 32-bit x86:

| ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
| ..MP-BIOS bug: 8254 timer not connected to IO-APIC
| ...trying to set up timer (IRQ0) through the 8259A ...  failed.
| ...trying to set up timer as Virtual Wire IRQ... failed.
| ...trying to set up timer as ExtINT IRQ... failed :(.
| Kernel panic - not syncing: IO-APIC + timer doesn't work!  Boot with apic=debug
| and send a report.  Then try booting with the 'noapic' option

this bug has been reported many times during the years, but it was never
reproduced nor fixed.

the bug that i hit was extremely sensitive to .config details.

First i did a .config-bisection - suspecting some .config detail.
That led to CONFIG_X86_MCE: enabling X86_MCE magically made the bug disappear
and the system would boot up just fine.

Debugging my way through the MCE code ended up identifying two unlikely
candidates: the thing that made a real difference to the hang was that
X86_MCE did two printks:

 Intel machine check architecture supported.
 Intel machine check reporting enabled on CPU#1.

Adding the same printks to a !CONFIG_X86_MCE kernel made the bug go away!

this left timing as the main suspect: i experimented with adding various
udelay()s to the arch/x86/kernel/io_apic_32.c:check_timer() function, and
the race window turned out to be narrower than 30 microseconds (!).

That made debugging especially funny, debugging without having printk
ability before the bug hits is ... interesting ;-)

eventually i started suspecting IRQ activities - those are pretty much the
only thing that happen this early during bootup and have the timescale of
a few dozen microseconds. Also, check_timer() changes the IRQ hardware
in various creative ways, so the main candidate became IRQ0 interaction.

i've added a counter to track timer irqs (on which core they arrived, at
what exact time, etc.) and found that no timer IRQ would arrive after the
bug condition hits - even if we re-enable IRQ0 and re-initialize the i8259A,
but that we'd get a small number of timer irqs right around the time when we
call the check_timer() function.

Eventually i got the following backtrace triggered from debug code in the
timer interrupt:

...trying to set up timer as Virtual Wire IRQ... failed.
...trying to set up timer as ExtINT IRQ...
Pid: 1, comm: swapper Not tainted (2.6.24-rc5 #57)
EIP: 0060:[<c044d57e>] EFLAGS: 00000246 CPU: 0
EIP is at _spin_unlock_irqrestore+0x5/0x1c
EAX: c0634178 EBX: 00000000 ECX: c4947d63 EDX: 00000246
ESI: 00000002 EDI: 00010031 EBP: c04e0f2e ESP: f7c41df4
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
 CR0: 8005003b CR2: ffe04000 CR3: 00630000 CR4: 000006d0
 DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
 DR6: ffff0ff0 DR7: 00000400
  [<c05f5784>] setup_IO_APIC+0x9c3/0xc5c

the spin_unlock() was called from init_8259A(). Wait ... we have an IRQ0
entry while we are in the middle of setting up the local APIC, the i8259A
and the PIT??

That is certainly not how it's supposed to work! check_timer() was supposed
to be called with irqs turned off - but this eroded away sometime in the
past. This code would still work most of the time because this code runs
very quickly, but just the right timing conditions are present and IRQ0
hits in this small, ~30 usecs window, timer irqs stop and the system does
not boot up. Also, given how early this is during bootup, the hang is
very deterministic - but it would only occur on certain machines (and
certain configs).

The fix was quite simple: disable/restore interrupts properly in this
function. With that in place the test-system now boots up just fine.

(64-bit x86 io_apic_64.c had the same bug.)

Phew! One down, only 1500 other kernel bugs are left ;-)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-12-18 18:05:58 +01:00
Masami Hiramatsu
0b0122faf4 x86: kprobes bugfix
Kprobes for x86-64 may cause a kernel crash if it inserted on "iret"
instruction. "call absolute" is invalid on x86-64, so we don't need
treat it.

 - Change the processing order as same as x86-32.
 - Add "iret"(0xcf) case.
 - Remove next_rip local variable.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-12-18 18:05:58 +01:00
Masami Hiramatsu
29b6cd794e x86: jprobe bugfix
jprobe for x86-64 may cause kernel page fault when the jprobe_return()
is called from incorrect function.

- Use jprobe_saved_regs instead getting it from stack.
  (Especially on x86-64, it may get incorrect data, because
   pt_regs can not be get by using container_of(rsp))
- Change the type of stack pointer to unsigned long *.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-12-18 18:05:58 +01:00
Barry Kasindorf
bd87f1f028 oprofile: op_model_athlon.c support for AMD family 10h barcelona performance counters
This patch is for controlling the upper 32bits of the event ctrl msrs.
This includes the upper 4 bits of the event select and the Guest Only and
Host Only bits

This patch is necessary to make Event Based Profiling work reliably on a
Family 10h processor

[akpm@linux-foundation.org: checkpatch.pl fixes]

Signed-off-by: Barry Kasindorf <barry.kasindorf@amd.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-12-18 18:05:58 +01:00
Andrew Morton
5867a78f41 revert "Hibernation: Use temporary page tables for kernel text mapping on x86_64"
Revert commit efa4d2fb04 ("Hibernation:
Use temporary page tables for kernel text mapping on x86_64") because it
causes my t61p to reboot right at the end of resume-from-disk.  For
reasons unknown at this time.

Cc: Pavel Machek <pavel@ucw.cz>
Cc: Andi Kleen <ak@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-17 19:28:15 -08:00
Jeremy Fitzhardinge
7999f4b4e5 xen: relax signature check
Some versions of Xen 3.x set their magic number to "xen-3.[12]", so
relax the test to match them.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-10 19:46:58 -08:00
Len Brown
f7a5274d7d Pull suspend-2.6.24 into release branch 2007-12-06 16:26:52 -05:00
Pavel Machek
74d0f3338f ACPI: suspend: old debugging hacks sneaked back
Old debugging hack sneaked back during x86 merge, this removes it.

Signed-off-by: Pavel Machek <pavel@suse.cz>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
2007-12-06 16:03:06 -05:00
Linus Torvalds
3743d33edf Tiny clean-up of OPROFILE/KPROBES configuration
Make the Kconfig.instrumentation file a bit easier on the eyes, and use
the new ARCH_SUPPORTS_OPROFILE for x86[-64].

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-06 09:41:12 -08:00
Andrew Morton
da54becc71 x86: arch_register_cpu() section fix
fix this on i386 allnoconfig:

 WARNING: vmlinux.o(.text+0x6f2e): Section mismatch: reference to .init.text:register_cpu (between 'arch_register_cpu' and 'text_poke')

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-12-04 17:19:07 +01:00
Adrian Bunk
f22d9bc1e8 x86: free_cache_attributes() section fix
free_cache_attributes() must be __cpuinit since it calls the
__cpuinit cache_remove_shared_cpu_map().

This patch fixes the following section mismatch reported by
Chris Clayton:

 ...
 WARNING: vmlinux.o(.text+0x90b6): Section mismatch: reference to .init.text:cache_remove_shared_cpu_map (between 'free_cache_attributes' and 'show_level')
 ...

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-12-04 17:19:07 +01:00
Don Zickus
75bc122c2d x86: add the word 'WARNING' in check_nmi_watchdog() output
Our automated test suite looks for keywords like error, fail, warning in
the boot log.  In the case when the nmi watchdog is determined to be
stuck in check_nmi_watchdog(), none of those keywords are displayed.

This patch adds a keyword, "WARNING:", so it makes it easier to notice
when the nmi watchdog isn't working correctly. Also add a proper
KERN_WARNING mark to this printout.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-12-04 17:19:07 +01:00
Adrian Bunk
ee0011a798 x86: revert CONFIG_X86_HT semantics change
The recent Kconfig changes in x86 resulted in CONFIG_X86_HT no longer
being set if (X86_32 && MK8).

After grep'ing through the tree I think the problem is that different
places have different assumptions about the semantics of CONFIG_X86_HT,
either:

- hyperthreading or
- multicore

This should be sorted out properly, but until then we should keep the
2.6.23 status quo.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-12-04 17:19:07 +01:00
Eric W. Biederman
17d57a9206 x86: fix x86-32 early fixmap initialization.
pageexec@freemail.hu writes:

> i've just noticed that the chunk in i386/kernel/head.S ended up in a
> weird place, namely, it's not going to be executed as it's just after
> a 'jmp 3f' and before startup_32_smp, probably not what you intended.
> on a sidenote, the whole thing can be done in a single insn, like:
>
> movl $(swapper_pg_pmd - __PAGE_OFFSET + 0x067), (swapper_pg_dir -
> __PAGE_OFFSET+ 4092)

Thanks for the reminder I thought we had fixed this problem a while ago.

Needed to get fixed virtual address for USB debug and earlycon with mmio.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-12-03 17:17:10 +01:00
OGAWA Hirofumi
0c1b272406 x86: disable hpet legacy replacement for kdump
we should also add hpet_disable() for kdump.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-12-03 17:17:10 +01:00
OGAWA Hirofumi
c86c7fbc82 x86: disable hpet on shutdown
If HPET was enabled by pci quirks, we use i8253 as initial clockevent
because pci quirks doesn't run until pci is initialized.

The above means the kernel (or something) is assuming HPET legacy
replacement is disabled and can use i8253 at boot.

If we used kexec, it isn't true. So, this patch disables HPET legacy
replacement for kexec in machine_shutdown().

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-12-03 17:17:10 +01:00
Linus Torvalds
e1cca7e8d4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-x86setup
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-x86setup:
  x86 setup: don't recalculate ss:esp unless really necessary
2007-11-29 16:25:29 -08:00
Jeremy Fitzhardinge
f97b895495 x86/paravirt: revert exports to restore old behaviour
Subdividing the paravirt_ops structure caused a regression in certain
non-GPL modules which try to use mmu_ops and cpu_ops.  This restores the
old behaviour, and makes it consistent with the non-CONFIG_PARAVIRT case.

Takashi Iwai <tiwai@suse.de> adds:
> I took at this problem (as I have an nvidia card on one of my
> workstations), and found out that the following suffer from
> EXPORT_SYMBOL_GPL changes:
>
> * local_disable_irq(), local_irq_save*(), etc.
> * MSR-related macros like rdmsr(), wrmsr(), read_cr0(), etc.
>   wbinvd(), too.
> * pmd_val(), pgd_val(), etc are all involved with pv_mm_ops.
>   pmd_large() and pmd_bad() is also indirectly involved.
>   __flush_tlb() and friends suffer, too.

Christoph Hellwig objects to this patch on the grounds that modules
shouldn't be using these operations anyway.  I don't think this is a
particularly good reason to reject the patch, for several reasons:

1. These operations are still available to modules when not using
   CONFIG_PARAVIRT, since they are implicitly exported as inline
   functions via the kernel headers.  Exporting the same functionality as
   GPL-only symbols just adds a gratuitious difference between
   CONFIG_PARAVIRT and non-CONFIG_PARAVIRT configurations.  If we really
   think these operations are not for module use (or non-GPL module use),
   then we should solve the problem in a general way.

2. It's a regression from previous kernels, which would work these
   modules even with CONFIG_PARAVIRT enabled.

3. The operations in question seem pretty reasonable for modules to
   use.  The control registers/MSRs can be accessed directly anyway, so there's
   no benefit in preventing modules from using standard interfaces.  And it seems
   reasonable to allow a graphics driver to create its own mappings if it wants.

Therefore, I think this patch should go in for 2.6.24.  If people
really think that these operations should not be available to modules,
then we can address that separately.

Signed-off-by: Jeremy Fitzhardinge <Jeremy.Fitzhardinge@citrix.com>
Cc: Tobias Powalowski <t.powa@gmx.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-29 09:24:55 -08:00
Randy Dunlap
b8415ec34f lguest: prevent VISWS or VOYAGER randconfigs
Keep lguest from being enabled on VISWS or VOYAGER configs, just as is
already done for VMI and XEN.  Otherwise randconfigs with VISWS and LGUEST
have this problem:

In file included from arch/x86/kernel/setup_32.c:61:
include/asm-x86/mach-visws/setup_arch.h:8:1: warning: "ARCH_SETUP" redefined
In file included from include/asm/msr.h:80,
                 from include/asm/processor_32.h:17,
                 from include/asm/processor.h:2,
                 from include/asm/thread_info_32.h:16,
                 from include/asm/thread_info.h:2,
                 from include/linux/thread_info.h:21,
                 from include/linux/preempt.h:9,
                 from include/linux/spinlock.h:49,
                 from include/linux/seqlock.h:29,
                 from include/linux/time.h:8,
                 from include/linux/timex.h:57,
                 from include/linux/sched.h:53,
                 from arch/x86/kernel/setup_32.c:24:
include/asm/paravirt.h:458:1: warning: this is the location of the previous definition

(and of course, this happens because kconfig does not follow dependencies
when [evil] select is used...)

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-29 09:24:55 -08:00
KAMEZAWA Hiroyuki
b6fd6ecb83 memory hotplug x86_64: fix section mismatch in init_memory_mapping()
Changes __meminit to __init_refok.

WARNING: vmlinux.o(.text+0x1d07c): Section mismatch: reference to
.init.text:find_e820_area (between 'init_memory_mapping' and 'arch_add_memory')

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-29 09:24:54 -08:00
Jeremy Fitzhardinge
2c80b01bea xen: mask _PAGE_PCD from ptes
_PAGE_PCD maps a page with caching disabled, which is typically used for
mapping harware registers.  Xen never allows it to be set on a mapping, and
unprivileged guests never need it since they can't see the real underlying
hardware.  However, some uncached mappings are made early when probing the
(non-existent) APIC, and its OK to mask off the PCD flag in these cases.

This became necessary because Xen started checking for this bit, rather
than silently masking it off.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-29 09:24:52 -08:00
Jens Rottmann
16252da654 x86 setup: don't recalculate ss:esp unless really necessary
In order to work around old LILO versions providing an invalid ss
register, the current setup code always sets up a new stack,
immediately following .bss and the heap. But this breaks LOADLIN.

This rewrite of the workaround checks for an invalid stack (ss!=ds)
first, and leaves ss:sp alone otherwise (apart from aligning esp).

[hpa note: LOADLIN has a number of arbitrary hard-coded limits that
are being pushed up against.  Without some major revision of LOADLIN
itself it will not be sustainable keeping it alive.  This gives it
another brief lease on life, however.  This patch also helps the
cmdline truncation problem with old versions of SYSLINUX.]

Signed-off-by: Jens Rottmann <JRottmann at LiPPERT-AT. de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2007-11-28 18:17:17 -08:00
Linus Torvalds
ff1ea52fa3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86
* git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86:
  x86: fix APIC related bootup crash on Athlon XP CPUs
  time: add ADJ_OFFSET_SS_READ
  x86: export the symbol empty_zero_page on the 32-bit x86 architecture
  x86: fix kprobes_64.c inlining borkage
  pci: use pci=bfsort for HP DL385 G2, DL585 G2
  x86: correctly set UTS_MACHINE for "make ARCH=x86"
  lockdep: annotate do_debug() trap handler
  x86: turn off iommu merge by default
  x86: fix ACPI compile for LOCAL_APIC=n
  x86: printk kernel version in WARN_ON and other dump_stack users
  ACPI: Set max_cstate to 1 for early Opterons.
  x86: fix NMI watchdog & 'stopped time' problem
2007-11-26 19:41:28 -08:00
Linus Torvalds
6b41016032 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (39 commits)
  ACPI: EC: Workaround for optimized controllers (version 3)
  ACPI: EC: use printk_ratelimit(), add some DEBUG mode messages
  Revert "ACPI: EC: Workaround for optimized controllers"
  ACPI: fix two IRQ8 issues in IOAPIC mode
  ACPI: Add missing spaces to printk format
  cpuidle: fix HP nx6125 regression
  cpuidle: add sched_clock_idle_[sleep|wakeup]_event() hooks
  cpuidle: fix C3 for no bus-master control case
  ACPI: thinkpad-acpi: fix oops when a module parameter has no value
  Revert "Fix very high interrupt rate for IRQ8 (rtc) unless pnpacpi=off"
  ACPI: EC: Don't init EC early if it has no _INI
  Revert "acpi: make ACPI_PROCFS default to y"
  Revert "ACPI: add documentation for deprecated /proc/acpi/battery in ACPI_PROCFS"
  ACPI: Split out control for /proc/acpi entries from battery, ac, and sbs.
  ACPI: Video: Increase buffer size for writes to brightness proc file.
  ACPI: EC: Workaround for optimized controllers
  ACPI: SBS: Fix retval warning
  ACPI: Enable MSR (FixedHW) support for T-States
  ACPI: Get throttling info from BIOS only after evaluating _PDC
  ACPI: Use _TSS for throttling control, when present. Add error checks.
  ...
2007-11-26 19:09:58 -08:00
Andreas Herrmann
8c6531f7a9 x86: correctly set UTS_MACHINE for "make ARCH=x86"
For a kernel built with "make ARCH=x86" the following system
information is displayed when running the new kernel

    $ uname -m
    x86

On some i386 systems (e.g. K7) we even have the following information

    $ uname -m
    x66

This is weird. The usual information for "uname -m" should be "x86_64"
on 64-bit and "i386" or "i686" on 32-bit.

This patch fixes the issue by setting UTS_MACHINE to "i386" for 32-bit
kernel builds and to "x86_64" for 64-bit kernel builds. I.e., "x86"
won't be used for UTS_MACHINE anymore.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andreas Herrmann <aherrman@arcor.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-26 17:38:53 -08:00
Ingo Molnar
f44d9efd35 x86: fix APIC related bootup crash on Athlon XP CPUs
warmbloodedcreature@gmail.com reported that an APIC-enabled
Asus a7v8x-x with an Athlon XP reboots early in the bootup:

   http://bugzilla.kernel.org/show_bug.cgi?id=8723

after a long marathon of spontaneous-reboot debugging, it turns
out to be caused by sync_Arb_ids(). AMD CPUs never really needed
this sequence anyway, so just return early if we meet an AMD CPU.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-11-26 20:42:20 +01:00
Theodore Ts'o
8232fd6252 x86: export the symbol empty_zero_page on the 32-bit x86 architecture
The latest KVM driver wants to use the empty_zero_page symbol, and it's
not exported in 32-bit x86 (although it is exported by x86_64, s390, and
uml architectures).

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: tglx@linutronix.de
Cc: linux-kernel@vger.kernel.com
Cc: kvm-devel@lists.sourceforge.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-11-26 20:42:19 +01:00
Andrew Morton
8645419cdb x86: fix kprobes_64.c inlining borkage
fix:

arch/x86/kernel/kprobes_64.c: In function 'set_current_kprobe':
arch/x86/kernel/kprobes_64.c:152: sorry, unimplemented: inlining failed in call to 'is_IF_modifier': recursive inlining
arch/x86/kernel/kprobes_64.c:166: sorry, unimplemented: called from here

Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: mingo@elte.hu
Cc: akpm@linux-foundation.org
Cc: tglx@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-11-26 20:42:19 +01:00
Michal Schmidt
c82bc5ad54 pci: use pci=bfsort for HP DL385 G2, DL585 G2
HP ProLiant systems DL385 G2 and DL585 G2 need pci=bfsort to enumerate PCI
devices in the expected order.

Matt sayeth:

  biosdevname is a userspace app I wrote to help solve this so we don't need
  to patch the kernel for future systems.  It's not integrated into any
  distributions properly yet, but is included in openSUSE 10.3 and Fedora 8
  for people who want to download and install it there.  It acts as a udev
  helper.

  For the time being, patching the kernel is necessary.  I really hope
  biosdevname eliminates that need in future distributions.

  http://linux.dell.com/biosdevname/

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Acked-by: Andy Gospodarek <andy@greyhouse.net>
Cc: mingo@elte.hu
Cc: andy@greyhouse.net
Cc: john.cagle@hp.com
Cc: Matt Domsch <Matt_Domsch@dell.com>
Cc: Greg KH <greg@kroah.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-11-26 20:42:19 +01:00
Andreas Herrmann
43517854da x86: correctly set UTS_MACHINE for "make ARCH=x86"
x86: correctly set UTS_MACHINE for "make ARCH=x86"

For a kernel built with "make ARCH=x86" the following system
information is displayed when running the new kernel

    $ uname -m
    x86

On some i386 systems (e.g. K7) we even have the following information

    $ uname -m
    x66

This is weird. The usual information for "uname -m" should be "x86_64"
on 64-bit and "i386" or "i686" on 32-bit.

This patch fixes the issue by setting UTS_MACHINE to "i386" for 32-bit
kernel builds and to "x86_64" for 64-bit kernel builds. I.e., "x86"
won't be used for UTS_MACHINE anymore.

Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andreas Herrmann <aherrman@arcor.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@redhat.com>
2007-11-26 20:42:19 +01:00
Peter Zijlstra
000f4a9e71 lockdep: annotate do_debug() trap handler
Ensure the hardirq state is consistent before using locks. Use the rare
trace_hardirqs_fixup() because the trap can happen in any context.

resolves this rare lockdep warning:

WARNING: at kernel/lockdep.c:2658 check_flags()
 [<c013571e>] check_flags+0x90/0x140
 [<c0138a69>] lock_release+0x4b/0x1d0
 [<c0507fea>] notifier_call_chain+0x2a/0x47
 [<c050806b>] __atomic_notifier_call_chain+0x64/0x6d
 [<c0508007>] __atomic_notifier_call_chain+0x0/0x6d
 [<c050808b>] atomic_notifier_call_chain+0x17/0x1a
 [<c0131802>] notify_die+0x30/0x34
 [<c0506b09>] do_debug+0x3e/0xd4
 [<c050658f>] debug_stack_correct+0x27/0x2c
 [<c04be389>] tcp_rcv_established+0x1/0x620
 [<c04c38c2>] tcp_v4_do_rcv+0x2b/0x313
 [<c04c56b6>] tcp_v4_rcv+0x467/0x85d
 [<c0505ff2>] _spin_lock_nested+0x27/0x32
 [<c04c5a4d>] tcp_v4_rcv+0x7fe/0x85d
 [<c04c560e>] tcp_v4_rcv+0x3bf/0x85d
 [<c04adbb5>] ip_local_deliver_finish+0x11b/0x1b0
 [<c04adac8>] ip_local_deliver_finish+0x2e/0x1b0
 [<c04ada7b>] ip_rcv_finish+0x27b/0x29a
 [<c04961e5>] netif_receive_skb+0xfb/0x2a6
 [<c04add0f>] ip_rcv+0x0/0x1fb
 [<c0496354>] netif_receive_skb+0x26a/0x2a6
 [<c04961e5>] netif_receive_skb+0xfb/0x2a6
 [<c049872e>] process_backlog+0x7f/0xc6
 [<c04983ba>] net_rx_action+0xb9/0x1ac
 [<c0498348>] net_rx_action+0x47/0x1ac
 [<c01376cb>] trace_hardirqs_on+0x118/0x16b
 [<c01225e2>] __do_softirq+0x49/0xa2
 [<c010595f>] do_softirq+0x60/0xdd
 [<c0506300>] _spin_unlock_irq+0x20/0x2c
 [<c0103e4f>] restore_nocheck+0x12/0x15
 [<c01440e1>] handle_fasteoi_irq+0x0/0x9b
 [<c0105a70>] do_IRQ+0x94/0xaa
 [<c0506300>] _spin_unlock_irq+0x20/0x2c
 [<c0104832>] common_interrupt+0x2e/0x34
 [<c0114703>] native_safe_halt+0x2/0x3
 [<c0102c01>] default_idle+0x44/0x65
 [<c010257f>] cpu_idle+0x42/0x50
 [<c076ea09>] start_kernel+0x26b/0x270
 [<c076e317>] unknown_bootoption+0x0/0x196
 =======================
irq event stamp: 559190
hardirqs last  enabled at (559190): [<c0507316>] kprobe_exceptions_notify+0x299/0x305
hardirqs last disabled at (559189): [<c05067bf>] do_int3+0x1d/0x95
softirqs last  enabled at (559172): [<c010595f>] do_softirq+0x60/0xdd
softirqs last disabled at (559181): [<c010595f>] do_softirq+0x60/0xdd

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-11-26 20:42:19 +01:00
Ingo Molnar
bc84cf17b5 x86: turn off iommu merge by default
revert this commit for now:

    commit 9480626830
    Author: Andi Kleen <ak@suse.de>
    Date:   Fri Oct 19 20:35:03 2007 +0200

        x86: enable iommu_merge by default

it's causing regressions:

    http://bugzilla.kernel.org/show_bug.cgi?id=9412

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-11-26 20:42:19 +01:00
Arjan van de Ven
57c351de71 x86: printk kernel version in WARN_ON and other dump_stack users
today, all oopses contain a version number of the kernel, which is nice
because the people who actually do bother to read the oops get this
vital bit of information always without having to ask the reporter in
another round trip.

However, WARN_ON() and many other dump_stack() users right now lack this
information; the patch below adds this. This information is essential
for getting people to use their time effectively when looking at these
things; in addition, it's essential for tools that try to collect
statistics about defects.

Please consider, since its so simple and important for long term kernel
quality processes.

The code is identical between 32/64 bit; a lot of this code should be
unified over time, the patch keeps the identical-ness intact.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-11-26 20:42:19 +01:00
Maciej W. Rozycki
d4d25deca4 x86: fix NMI watchdog & 'stopped time' problem
More than 3 years ago Niclas Gustafsson reported a 'stopped time'
problem:

> Watching the /proc/interrupts with 10s apart after the "stop".
>
> [root@s151 root]# more /proc/interrupts
>            CPU0
>   0:   66413955  local-APIC-edge  timer
[...]
> LOC:   67355837
> ERR:          0
> MIS:          0
> [root@s151 root]# more /proc/interrupts
>            CPU0
>   0:   66413955  local-APIC-edge  timer
[...]
> LOC:   67379568
> ERR:          0
> MIS:          0

This may be because buggy SMM firmware messes with the 8259A (configured
for a transparent mode -- yes that rare "local-APIC-edge" mode is tricky
;-) ) insanely.

this should resolve:

  http://bugzilla.kernel.org/show_bug.cgi?id=2544
  http://bugzilla.kernel.org/show_bug.cgi?id=6296

Patch-dusted-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-11-26 20:42:19 +01:00
Len Brown
e6532b8883 Pull fluff into release branch
Conflicts:

	drivers/acpi/ec.c

Signed-off-by: Len Brown <len.brown@intel.com>
2007-11-20 01:21:47 -05:00
Len Brown
614a6bbecc Pull thermal into release branch 2007-11-20 01:20:31 -05:00
Shaohua Li
61fd47e0c8 ACPI: fix two IRQ8 issues in IOAPIC mode
Use mp_irqs[] to get PNP device's interrupt polarity and trigger.
There are two reasons to do this:
1. BIOS bug for PNP interrupt
2. BIOS explictly does override
mp_irqs[] should cover all the cases.

http://bugzilla.kernel.org/show_bug.cgi?id=5243
http://bugzilla.kernel.org/show_bug.cgi?id=7679
http://bugzilla.kernel.org/show_bug.cgi?id=9153

[lenb: fixed !IOAPIC and 64-bit !SMP builds]

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2007-11-20 01:16:29 -05:00
Sam Ravnborg
80ef88d6d2 x86: simplify "make ARCH=x86" and fix kconfig all.config
Simplify "make ARCH=x86" and fix kconfig so we again
can set 64BIT in all.config.

For a fix the diffstat is nice:
 6 files changed, 3 insertions(+), 36 deletions(-)

The patch reverts these commits:
0f855aa64b
-> kconfig: add helper to set config symbol from environment variable

2a113281f5
-> kconfig: use $K64BIT to set 64BIT with all*config targets

Roman Zippel pointed out that kconfig supported string
compares so the additional complexity introduced by the
above two patches were not needed.

With this patch we have following behaviour:

# make {allno,allyes,allmod,rand}config [ARCH=...]
option \ host arch      | 32bit         | 64bit
=====================================================
./.                     | 32bit         | 64bit
ARCH=x86                | 32bit         | 32bit
ARCH=i386               | 32bit         | 32bit
ARCH=x86_64             | 64bit         | 64bit

The general rule are that ARCH= and native architecture
takes precedence over the configuration.
So make ARCH=i386 [whatever] will always build a 32-bit
kernel no matter what the configuration says.
The configuration will be updated to 32-bit if it was
configured to 64-bit and the other way around.

This behaviour is consistent with previous behaviour so
no suprises here.

make ARCH=x86 will per default result in a 32-bit kernel
but as the only ARCH= value x86 allow the user to select
between 32-bit and 64-bit using menuconfig. 

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Andreas Herrmann <aherrman@arcor.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
2007-11-17 17:21:54 +01:00
Denys
6d1b30e30c x86: reboot fixup for wrap2c board
Needed to make the wireless board, WRAP2C reboot.

Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-11-17 16:27:02 +01:00