x86 32 bit already has this feature: this patch turns the stack frames with
frame pointers into an exact stack trace, by following the frame pointer chain.
This only affects kernels built with the CONFIG_FRAME_POINTER config option
enabled, and greatly reduces the amount of noise in oopses.
This code uses the traditional method of doing backtraces, but if it
finds a valid frame pointer chain, will use that to show which parts
of the backtrace are reliable and which parts are not.
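The combined approach described above can be sketched in user space. This is
only an illustration under stated assumptions, not the kernel code: the stack
is modelled as an array of words, frame pointers are array indices, and
TEXT_START/TEXT_END, struct trace_entry and walk_stack() are hypothetical names.
Every word that looks like a text address is reported; it is marked reliable
only when it sits in the return-address slot of the current frame.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical text range standing in for the kernel's text-address check. */
#define TEXT_START ((uintptr_t)0x1000)
#define TEXT_END   ((uintptr_t)0x2000)

struct trace_entry {
    uintptr_t addr;   /* candidate return address found on the stack */
    int reliable;     /* 1 if confirmed by the frame pointer chain */
};

static int is_text_address(uintptr_t addr)
{
    return addr >= TEXT_START && addr < TEXT_END;
}

/*
 * Model: stack[bp] holds the caller's frame pointer (as an index) and
 * stack[bp + 1] holds the return address of the current frame.
 */
size_t walk_stack(const uintptr_t *stack, size_t nwords, size_t bp,
                  struct trace_entry *out, size_t max)
{
    size_t n = 0;

    assert(stack != NULL && out != NULL);

    for (size_t i = 0; i < nwords && n < max; i++) {
        if (!is_text_address(stack[i]))
            continue;

        out[n].addr = stack[i];
        if (i > 0 && bp == i - 1) {
            /* This word is the return-address slot of the current frame. */
            out[n].reliable = 1;
            /* Follow the chain; give up if it does not move up the stack. */
            uintptr_t next = stack[bp];
            bp = (next > bp && next < nwords) ? (size_t)next : nwords;
        } else {
            /* Looks like a text address, but the chain does not confirm it. */
            out[n].reliable = 0;
        }
        n++;
    }
    return n;
}
```

If the initial frame pointer is invalid, nothing is dropped in this sketch:
every candidate is still reported, just flagged as unreliable.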
Due to the fragility and importance of the backtrace code, this needs to
be well reviewed and well tested before merging into mainline.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch turns the x86 64 bit HANDLE_STACK macro in the backtrace code
into a function, just like 32 bit has. This is preparatory work needed to
get exact backtraces for CONFIG_FRAME_POINTER to work.
The function and its arguments are not the same as on 32 bit; due to the
way x86-64 handles exception/interrupt stacks there are a few differences.
This patch should not have any behavior changes, only code movement.
Due to the fragility and importance of the backtrace code, this needs to be
well reviewed and well tested before merging into mainline.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Right now, we take the stack pointer early during the backtrace path, but
only calculate bp several functions deep later, making it hard to reconcile
the stack and bp backtraces (as well as showing several internal backtrace
functions on the stack with bp based backtracing).
This patch moves the bp taking to the same place we take the stack pointer;
sadly this ripples through several layers of the back tracing stack,
but it's not all that bad in the end I hope.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The 32 bit Frame Pointer backtracer code checks if the EBP is valid
to do a backtrace; however currently on a failure it just gives up
and prints nothing. That's not very nice; we can do better and still
print a decent backtrace.
This patch changes the backtracer to use the regular backtracing algorithm
at the same time as the EBP backtracer; the EBP backtracer is basically
used to figure out which parts of the backtrace are reliable vs those
which are likely to be noise.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
To enhance the 32 bit EBP based backtracer, I need the capability
for the backtracer to tell its caller that an entry is either
reliable or unreliable, and the backtrace printing code then needs to
print the unreliable ones slightly differently.
This patch adds the basic capability; the next patch will add a user
of this capability.
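A minimal sketch of what "printing the unreliable ones slightly differently"
could look like. The "? " marker, the function name format_trace_entry() and
its signature are assumptions made for illustration; the message above does
not spell out the exact output format.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format one backtrace line into buf. Reliable entries are printed as-is;
 * unreliable ones get an assumed "? " marker in front of the symbol.
 * Returns what snprintf() returns (the length the full line needs).
 */
int format_trace_entry(char *buf, size_t len, unsigned long addr,
                       const char *symbol, int reliable)
{
    assert(buf != NULL && symbol != NULL);

    return snprintf(buf, len, " [<%08lx>] %s%s",
                    addr, reliable ? "" : "? ", symbol);
}
```

The reliability flag travels with each entry from the stack walker to the
printing code, so the walker itself never has to decide how an entry is
displayed.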
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The current x86 32 bit FRAME_POINTER chasing code has a nasty bug in
that the EBP tracer doesn't actually update the value of EBP it is
tracing, so that the code doesn't actually switch to the irq stack
properly.
The result is a truncated backtrace:
WARNING: at timeroops.c:8 kerneloops_regression_test() (Not tainted)
Pid: 0, comm: swapper Not tainted 2.6.24-0.77.rc4.git4.fc9 #1
[<c040649a>] show_trace_log_lvl+0x1a/0x2f
[<c0406d41>] show_trace+0x12/0x14
[<c0407061>] dump_stack+0x6c/0x72
[<e0258049>] kerneloops_regression_test+0x44/0x46 [timeroops]
[<c04371ac>] run_timer_softirq+0x127/0x18f
[<c0434685>] __do_softirq+0x78/0xff
[<c0407759>] do_softirq+0x74/0xf7
=======================
This patch fixes the code to update EBP properly, and to check the EIP
before printing (as the non-framepointer backtracer does) so that
the same test backtrace now looks like this:
WARNING: at timeroops.c:8 kerneloops_regression_test()
Pid: 0, comm: swapper Not tainted 2.6.24-rc7 #4
[<c0405d17>] show_trace_log_lvl+0x1a/0x2f
[<c0406681>] show_trace+0x12/0x14
[<c0406ef2>] dump_stack+0x6a/0x70
[<e01f6040>] kerneloops_regression_test+0x3b/0x3d [timeroops]
[<c0426f07>] run_timer_softirq+0x11b/0x17c
[<c04243ac>] __do_softirq+0x42/0x94
[<c040704c>] do_softirq+0x50/0xb6
[<c04242a9>] irq_exit+0x37/0x67
[<c040714c>] do_IRQ+0x9a/0xaf
[<c04057da>] common_interrupt+0x2e/0x34
[<c05807fe>] cpuidle_idle_call+0x52/0x78
[<c04034f3>] cpu_idle+0x46/0x60
[<c05fbbd3>] rest_init+0x43/0x45
[<c070aa3d>] start_kernel+0x279/0x27f
=======================
This shows that the backtrace goes all the way down to user context now.
This bug was found during the port to 64 bit of the frame pointer backtracer.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
It's not too pretty, but I found this made the "PANIC: early exception"
messages become much more reliably useful: 1. print the vector number,
2. print the %cs value, 3. handle error-code-pushing vs non-pushing vectors.
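The distinction in point 3 can be sketched in C (the patch itself works in
the early exception entry path, so this is an illustration only). The vector
list follows the architectural x86 exceptions that push an error code; the
function names are hypothetical.

```c
#include <assert.h>

/*
 * Returns 1 if the CPU pushes an error code for this exception vector:
 * #DF (8), #TS (10), #NP (11), #SS (12), #GP (13), #PF (14), #AC (17).
 */
int vector_pushes_error_code(unsigned int vector)
{
    assert(vector < 32);   /* this sketch only covers exception vectors */

    switch (vector) {
    case 8: case 10: case 11: case 12: case 13: case 14: case 17:
        return 1;
    default:
        return 0;
    }
}

/*
 * Word offset of the saved %cs from the top of the exception frame.
 * Without an error code the frame is [ip][cs][flags]; with one it is
 * [error code][ip][cs][flags], so everything shifts by one word.
 */
unsigned int saved_cs_offset(unsigned int vector)
{
    return vector_pushes_error_code(vector) ? 2 : 1;
}
```

Getting this offset wrong makes the handler print the wrong words as the
faulting address and %cs, which is why handling both cases makes the message
more useful.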
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The check for an uninitialized clock event device triggers when the local
APIC timer is registered as a dummy clock event device for broadcasting.
Preset the multiplier to avoid a false positive.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Check the APIC timer calibration result for sanity. When the frequency
is out of range, issue a warning and disable the local APIC timer.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The GDT_ENTRY() macro in pm.c would incorrectly cut the bottom 8 bits
off the base. We didn't define any bases with the bottom 8 bits
nonzero, so it is a non-manifest bug, but it's still a bug.
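For illustration, a descriptor packer that keeps all 24 low base bits, written
as a function rather than a macro. It follows the standard x86 segment
descriptor layout; the names gdt_entry() and gdt_entry_base() are hypothetical,
and this is a sketch rather than the macro from pm.c.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pack a segment descriptor:
 *   bits  0-15  limit[15:0]
 *   bits 16-39  base[23:0]   (all 24 bits, including the bottom 8)
 *   bits 40-55  flags, with limit[19:16] in bits 48-51
 *   bits 56-63  base[31:24]
 */
uint64_t gdt_entry(uint16_t flags, uint32_t base, uint32_t limit)
{
    assert(limit <= 0xfffff);   /* the limit field is 20 bits wide */

    return ((uint64_t)(base & 0xff000000u) << 32) |
           ((uint64_t)flags << 40) |
           ((uint64_t)(limit & 0x000f0000u) << 32) |
           ((uint64_t)(base & 0x00ffffffu) << 16) |
           (uint64_t)(limit & 0x0000ffffu);
}

/* Recover the 32-bit base from a packed descriptor. */
uint32_t gdt_entry_base(uint64_t desc)
{
    return (uint32_t)((desc >> 16) & 0x00ffffffu) |
           ((uint32_t)(desc >> 56) << 24);
}
```

A mask that drops the bottom 8 bits of the base is invisible for a base of 0,
which matches the observation that the bug never showed up in practice.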
Pointed out by John Smith <johnsmith9344@gmail.com>.
Cc: John Smith <johnsmith9344@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
If we use the bootloader-provided stack pointer, we might end up in a
situation where the bootloader (incorrectly) pointed the stack in the
middle of our heap. Catch this by simply comparing the computed heap
end value to the stack pointer minus the defined stack size.
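The comparison described above, as a small sketch. The function name and the
use of plain 32-bit integers are assumptions for illustration; the real setup
code works with its own heap and stack variables.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns 1 if the heap would run into the stack area, i.e. if the heap
 * end lies above the lowest address the stack is allowed to grow down to.
 */
int heap_collides_with_stack(uint32_t heap_end, uint32_t sp,
                             uint32_t stack_size)
{
    assert(sp >= stack_size);   /* avoid wrapping below address 0 */

    uint32_t stack_bottom = sp - stack_size;

    return heap_end > stack_bottom;
}
```

When the check fires, the caller can refuse to use the bootloader-provided
stack pointer instead of silently corrupting the heap.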
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Push video mode setup as late as possible; messages issued through the
BIOS interface after video mode setup will either not be seen (for
framebuffer modes) or will screw up the cursor (for text modes.)
In particular, this makes the EDD probing message show up correctly.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tell the user to specify edd=off in the case of EDD probing hangs.
Per LKML discussion.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Add prototype for cmdline_find_option_bool() missing from:
x86 setup: early cmdline parser handle boolean options
Also, fix up a minor formatting error in that patch.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Unnecessary capitals are shouting; no need for it here.
Thus, change "OK" to "ok" and add a space.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
On early boot, probing the BIOS for EDD happens without any message.
Enhanced Disk Drive Services (EDD) is a mechanism to match x86 BIOS device
names (int13 device 80h) to Linux device names (e.g. /dev/sda, /dev/hda).
There are buggy BIOSes out there that have problems with EDD. The problem
can be in the BIOS itself or in add-on cards.
This patch adds an informational message on early boot.
CONFIG_EDD is not set with defconfig, but it is with allmodconfig (i.e. CONFIG_EDD=m),
so the EDD probe may be active on early boot on many systems nowadays.
The probe is active on the SuSE distro, and with it I have seen more than
one system hang endlessly with a "black screen with a blinking cursor in
the upper left" on installation, making it difficult for the end user to
find out what the issue is.
I have definitely seen this on FujitsuSiemens PCs with i810 and i815 chipsets.
This one also honours the "quiet" bootparam.
Also see:
http://marc.info/?l=linux-kernel&m=119781937207969&w=2
http://marc.info/?l=linux-kernel&m=119783934032326&w=2
http://marc.info/?l=linux-kernel&m=119783678529100&w=2
Signed-off-by: Roland Kletzing <devzero@web.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch extends the early commandline parser to support boolean options.
The current version in mainline only supports parsing "option=arg" value pairs.
With this it should be easy making other messages like "Uncompressing kernel"
honour the "quiet" parameter, too.
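A simplified sketch of boolean option matching, to show the difference from
"option=arg" parsing. The function name find_bool_option() and its signature
are hypothetical and it takes the command line as a plain string; the parser
in the setup code is structured differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Returns 1 if `option` appears on the command line as a complete,
 * space-delimited word, 0 otherwise. "quiet" matches "ro quiet" but not
 * "quietly", and "root" does not match "root=/dev/sda1".
 */
int find_bool_option(const char *cmdline, const char *option)
{
    assert(cmdline != NULL && option != NULL);

    size_t optlen = strlen(option);
    const char *p = cmdline;

    if (optlen == 0)
        return 0;

    while (*p) {
        while (*p == ' ')           /* skip separators */
            p++;

        const char *word = p;       /* start of the next word */
        while (*p && *p != ' ')
            p++;

        if ((size_t)(p - word) == optlen &&
            memcmp(word, option, optlen) == 0)
            return 1;
    }
    return 0;
}
```

With a helper like this, an early message can be skipped when "quiet" is
present on the command line.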
Signed-off-by: Roland Kletzing <devzero@web.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Fix the operand constraints for the segment accessor functions,
{rd,wr}{fs,gs}*. In particular, the 8-bit functions used "r"
constraints instead of "q" constraints.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Display VESA graphics modes, with their mode IDs, in the vga=ask
menu. Most VESA mode numbers are platform-dependent, so it helps to
have an easy way to display them.
Based in part on a patch by Petr Vandrovec <petr@vandrovec.name>.
Cc: Petr Vandrovec <petr@vandrovec.name>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
To set CR0.PE, use the X86_CR0_PE macro defined in
<asm/processor-flags.h> instead of hardcoding it as a constant (1).
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Intel VT doesn't like to engage when the protected-mode state isn't
fully initialized. Make life easier for it by initializing LDTR (to
null) and TR (to a dummy hunk of low memory which will never actually
be touched.)
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Make the transition to protected mode more paranoid by having
back-to-back near jump (to synchronize the 386/486 prefetch queue) and
far jump (to set up the code segment.)
While we're at it, zero as many registers as practical (for future
expandability of the 32-bit entry interface) and enter 32-bit mode
with a valid stack. Note that the 32-bit code cannot rely on this
stack, or we'll break all other existing users of the 32-bit
entrypoint, but it may make debugging hacks easier to write.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
get_segment_eip has similarities to convert_rip_to_linear(),
and is used in a similar context. Move get_segment_eip to
step.c to allow easier consolidation.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Move the tick_nohz_stop_sched_tick() call out of the loop in cpu_idle(),
the same as the 32-bit version.
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Use the fixup_exception() helper instead of the open-coded
search_extable() users.
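A user-space model of what such a helper does, as opposed to each caller
searching the table and patching the instruction pointer itself. The struct
and function names carry a _sketch suffix because they are stand-ins, not the
kernel definitions; the table is assumed to be sorted by faulting address.

```c
#include <assert.h>
#include <stddef.h>

struct extable_entry_sketch {
    unsigned long insn;    /* address of an instruction that may fault */
    unsigned long fixup;   /* address to resume at if it does */
};

struct regs_sketch {
    unsigned long ip;      /* instruction pointer at the time of the fault */
};

/* Binary search of a table sorted by insn; returns NULL if not found. */
static const struct extable_entry_sketch *
search_table_sketch(const struct extable_entry_sketch *table, size_t n,
                    unsigned long ip)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (table[mid].insn == ip)
            return &table[mid];
        if (table[mid].insn < ip)
            lo = mid + 1;
        else
            hi = mid;
    }
    return NULL;
}

/* Returns 1 and redirects regs->ip if a fixup exists, 0 otherwise. */
int fixup_exception_sketch(const struct extable_entry_sketch *table,
                           size_t n, struct regs_sketch *regs)
{
    assert(regs != NULL);

    const struct extable_entry_sketch *e =
        search_table_sketch(table, n, regs->ip);

    if (!e)
        return 0;
    regs->ip = e->fixup;
    return 1;
}
```

Callers then reduce to a single "if the fixup succeeded, return" test, which
is the simplification the open-coded users were missing.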
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Should be the last of the error_code tests that could use
the PF_ defines. Makes X86_32|64 a little closer.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Small step towards unifying traps_32|64.c. No functional
changes. Pull out a small helper from an if() statement
in die().
Marked as __kprobes as eventually we will want to call this
from do_page_fault similar to how X86_64 does it.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The machine check handler registers an ioctl handler that is called
with the BKL held. Change it to register unlocked_ioctl instead.
Also, the mce ioctl handler does not seem to need any lock protection.
To: Andi Kleen <andi@firstfloor.org>
Cc: linux-kernel@vger.kernel.org
Cc: kernel-janitors@vger.kernel.org
Change the Machine check handler to use unlocked_ioctl instead of
ioctl handler. Also the mce ioctl handler does not need any lock
protection.
Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The hypervisor doesn't allow PCD or PWT to be set on guest ptes, so
make sure they're masked out. Also, fix up some previous mispatching.
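The masking step, sketched in isolation. PWT and PCD are bits 3 and 4 of an
x86 page table entry; the macro and function names here are hypothetical and
do not come from the Xen code.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t pteval_t;

/* x86 page table entry cache-control bits. */
#define PTE_PWT ((pteval_t)1 << 3)   /* page-level write-through */
#define PTE_PCD ((pteval_t)1 << 4)   /* page-level cache disable */

/* Clear the bits the hypervisor refuses to accept on guest ptes. */
pteval_t mask_guest_cache_bits(pteval_t val)
{
    pteval_t masked = val & ~(PTE_PWT | PTE_PCD);

    assert((masked & (PTE_PWT | PTE_PCD)) == 0);
    return masked;
}
```

All other bits, including the frame number and the present/writable flags,
pass through unchanged.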
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Fix various compilation problems as a result of changing pte_t.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Zachary Amsden <zach@vmware.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Make sure pte_t, whatever its definition, has a pte element with type
pteval_t. This allows common code to access it without needing to be
specifically parameterised on what pagetable mode we're compiling for.
For 32-bit, this means that pte_t becomes a union with "pte" and "{
pte_low, pte_high }" (PAE) or just "pte_low" (non-PAE).
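The PAE case could look roughly like the sketch below. It is written for user
space with fixed-width types, and make_pte() is a hypothetical helper name;
the point is only that common code can always read the "pte" member.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t pteval_t;   /* PAE: 64-bit page table entries */

/* One entry, viewable either as two 32-bit halves or as one value. */
typedef union {
    struct {
        uint32_t pte_low;
        uint32_t pte_high;
    };
    pteval_t pte;
} pte_t;

/* The union must not be larger than the raw value it wraps. */
static_assert(sizeof(pte_t) == sizeof(pteval_t), "pte_t must stay 8 bytes");

/* Accessors that work the same way for every pagetable mode. */
static inline pteval_t pte_val(pte_t p)
{
    return p.pte;
}

static inline pte_t make_pte(pteval_t v)
{
    pte_t p = { .pte = v };
    return p;
}
```

For non-PAE the same shape applies with a 32-bit pteval_t and only pte_low in
the struct, so code using pte_val() needs no changes.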
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Make users of supported_pte_mask common. This has the side-effect of
introducing the variable for 32-bit non-PAE, but I think it's a pretty
small cost to simplify the code.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Avoid a conflict between Voyager's leave_mm and asm-x86/mmu.h's leave_mm.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Fix one error reported by checkpatch,
it now reports:
total: 0 errors, 0 warnings, 42 lines checked
Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
On 32-bit NUMA, the memmap representing struct pages on each node is
allocated from node-local memory if possible. As only node-0 has memory from
ZONE_NORMAL, the memmap must be mapped into low memory. This is done by
reserving space in the Kernel Virtual Area (KVA) for the memmap belonging
to other nodes by taking pages from the end of ZONE_NORMAL and remapping
the other nodes' memmap into those virtual addresses. The node boundaries
are then adjusted so that the region of pages is not used and it is marked
as reserved in the bootmem allocator.
This reserved portion of the KVA is PMD aligned although
strictly speaking that requirement could be lifted (see thread at
http://lkml.org/lkml/2007/8/24/220). The problem is that when aligned, there
may be a portion of ZONE_NORMAL at the end that is not used for memmap and
does not have an initialised memmap nor is it marked reserved in the bootmem
allocator. Later in the boot process, these pages are freed and a storm of
Bad page state messages results.
This patch marks the pages that are wasted due to alignment as reserved
in the bootmem allocator so they are not accidentally freed. It is worth
noting that memory from node-0 is wasted where it could have been put into
ZONE_HIGHMEM on NUMA machines. Worse, the KVA is always reserved from the
location of real memory even when there is plenty of spare virtual address
space.
This patch also makes sure that reserve_bootmem() is not called with a
0-length size in numa_kva_reserve(). When this happens, it usually means
that a kernel built for Summit is being booted on a normal machine. The
resulting BUG_ON() is misleading so it is caught here.
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Return the size of bts_struct in the PTRACE_BTS_STATUS command.
Change types to u32.
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Unify arch/x86/kernel/acpi/sleep*.c
Pretty trivial unification; when two functions differed, it was
usually in error handling, and the better of the two was picked.
Signed-off-by: Pavel Machek <pavel@suse.cz>
Looks-okay-to: Rafael J. Wysocki <rjw@sisk.pl>
Tested-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The use of the __GENERIC_PERCPU is a bit problematic since arches
may want to run their own percpu setup while using the generic
percpu definitions. Replace it with a Kconfig variable.
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The boot protocol has until now required that the initrd be located in
lowmem, which makes the lowmem/highmem boundary visible to the boot
loader. This was exported to the bootloader via a compile-time
field. Unfortunately, the vmalloc= command-line option breaks this
part of the protocol; instead of adding yet another hack that affects
the bootloader, have the kernel relocate the initrd down below the
lowmem boundary inside the kernel itself.
Note that this does not rely on HIGHMEM being enabled in the kernel.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch exports the boot parameters via debugfs for debugging.
The files added are as follows:
boot_params/data : binary file for struct boot_params
boot_params/version : boot protocol version
This patch is based on 2.6.24-rc5-mm1 and has been tested on i386 and
x86_64 platforms.
This patch is based on Peter Anvin's proposal.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
reboot_{32|64}.c unification patch.
This patch unifies the code from the reboot_32.c and reboot_64.c files.
It has been tested in computers with X86_32 and X86_64 kernels and it
looks like all reboot modes work fine (EFI restart system hasn't been
tested yet).
Probably I made some mistakes (like I usually do) so I hope
we can identify and fix them soon.
Signed-off-by: Miguel Boton <mboton@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
While examining the vmlinux namelist on i386 (nm -v vmlinux) I noticed:
c01021d0 t es7000_rename_gsi
c010221a T es7000_start_cpu
<Big Hole>
c0103000 T thread_saved_pc
and
c0113218 T acpi_restore_state_mem
c0113219 T acpi_save_state_mem
<Big Hole>
c0114000 t wakeup_code
This is because arch/x86/kernel/acpi/wakeup_32.S forces a .text alignment
of 4096 bytes. (I have no idea if it is really needed, since
arch/x86/kernel/acpi/wakeup_64.S uses *only* a 16-byte alignment.)
So arch/x86/kernel/built-in.o also has this alignment:
arch/x86/kernel/built-in.o: file format elf32-i386
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00018c94 00000000 00000000 00001000 2**12
CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
But as arch/x86/kernel/acpi/wakeup_32.o is not the first object linked
into arch/x86/kernel/built-in.o, the linker had to insert several holes to meet
alignment requirements, because of .o nesting in the kbuild process.
This can be solved by using a special section, .text.page_aligned, so that
no holes are needed.
# size vmlinux.before vmlinux.after
text data bss dec hex filename
4619942 422838 458752 5501532 53f25c vmlinux.before
4610534 422838 458752 5492124 53cd9c vmlinux.after
This saves 9408 bytes of text.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Someone complained that the 32-bit defconfig contains AS as default IO
scheduler. Change that to CFQ.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>