Commit graph

68 commits

Author SHA1 Message Date
Vincent Hanquez
1cc6f12e03 [PATCH] xen: x86: Use new macro for debugreg
Make use of the 2 new macro set_debugreg and get_debugreg.

Signed-off-by: Vincent Hanquez <vincent.hanquez@cl.cam.ac.uk>
Cc: Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:13 -07:00
Andrew Morton
c92c6ffdb1 [PATCH] mtrr size-and-base debugging
Consolidate the mtrr sanity checking, add a dump_stack().

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:12 -07:00
Andrew Morton
a3a255e744 [PATCH] x86: cpu_khz type fix
x86_64's cpu_khz is unsigned int and there is no reason why x86 needs to use
unsigned long.

So make cpu_khz unsigned int on x86 as well.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:11 -07:00
Alexey Dobriyan
129f69465b [PATCH] Remove i386_ksyms.c, almost.
* EXPORT_SYMBOL's moved to other files
* #include <linux/config.h>, <linux/module.h> where needed
* #include's in i386_ksyms.c cleaned up
* After copy-paste, redundant due to Makefiles rules preprocessor directives
  removed:

	#ifdef CONFIG_FOO
	EXPORT_SYMBOL(foo);
	#endif

	obj-$(CONFIG_FOO) += foo.o

* Tiny reformat to fit in 80 columns

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:11 -07:00
Natalie Protasevich
c434b7a6ae [PATCH] x86: avoid wasting IRQs for PCI devices
I have submitted the patch for x86_64, this is submission for i386.

The patch changes the way IRQs are handed out to PCI devices.  Currently,
each I/O APIC pin gets associated with an IRQ, no matter if the pin is used
or not.  This imposes severe limitation on systems that have designs that
employ many I/O APICs, only utilizing couple lines of each, such as P64H2
chipset.  It is used in ES7000, and currently, there is no way to boot the
system with more that 9 I/O APICs.

The simple change below allows to boot a system with say 64 (or more) I/O
APICs, each providing 1 slot, which otherwise impossible because of the IRQ
gaps created for unused lines on each I/O APIC.  It does not resolve the
problem with number of devices that exceeds number of possible IRQs, but
eases up a tension for IRQs on any large system with potentually large
number of devices.

Signed-off-by: Natalie Protasevich <Natalie.Protasevich@unisys.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:10 -07:00
Jan Beulich
7fbb4f6e68 [PATCH] adjust i386 watchdog tick calculation
Get the i386 watchdog tick calculation into a state where it can also be used
on CPUs with frequencies beyond 4GHz, and it consolidates the calculation into
a single place (for potential furture adjustments).

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:09 -07:00
Natalie Protasevich
ca05fea6db [PATCH] Do not enforce unique IO_APIC_ID check for xAPIC systems (i386)
This patch is per Andi's request to remove NO_IOAPIC_CHECK from genapic and
use heuristics to prevent unique I/O APIC ID check for systems that don't
need it.  The patch disables unique I/O APIC ID check for Xeon-based and
other platforms that don't use serial APIC bus for interrupt delivery.
Andi stated that AMD systems don't need unique IO_APIC_IDs either.

Signed-off-by: Natalie Protasevich <Natalie.Protasevich@unisys.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:09 -07:00
Roland McGrath
7c1def1652 [PATCH] i386: never block forced SIGSEGV
This problem was first noticed on PPC and has already been fixed there.
But the exact same issue applies to other platforms in the same way.  The
signal blocking for sa_mask and the handled signal takes place after the
handler setup.  When the stack is bogus, the handler setup forces a
SIGSEGV.  But then this will be blocked, and returning to user mode will
fault again and iterate.  This patch fixes the problem by checking whether
signal handler setup failed, and not doing the signal-blocking if so.  This
copies what was done in the ppc code.  I think all architectures' signal
handler setup code follows this pattern and needs the change.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:09 -07:00
Venkatesh Pallipadi
8a9e1b0f56 [PATCH] Platform SMIs and their interferance with tsc based delay calibration
Issue:
Current tsc based delay_calibration can result in significant errors in
loops_per_jiffy count when the platform events like SMIs
(System Management Interrupts that are non-maskable) are present. This could
lead to potential kernel panic(). This issue is becoming more visible with 2.6
kernel (as default HZ is 1000) and on platforms with higher SMI handling
latencies. During the boot time, SMIs are mostly used by BIOS (for things
like legacy keyboard emulation).

Description:
The psuedocode for current delay calibration with tsc based delay looks like
(0) Estimate a value for loops_per_jiffy
(1) While (loops_per_jiffy estimate is accurate enough)
(2)   wait for jiffy transition (jiffy1)
(3)   Note down current tsc (tsc1)
(4)   loop until tsc becomes tsc1 + loops_per_jiffy
(5)   check whether jiffy changed since jiffy1 or not and refine
loops_per_jiffy estimate

Consider the following cases
Case 1:
If SMIs happen between (2) and (3) above, we can end up with a
loops_per_jiffy value that is too low. This results in shorted delays and
kernel can panic () during boot (Mostly at IOAPIC timer initialization
timer_irq_works() as we don't have enough timer interrupts in a specified
interval).

Case 2:
If SMIs happen between (3) and (4) above, then we can end up with a
loops_per_jiffy value that is too high. And with current i386 code, too
high lpj value (greater than 17M) can result in a overflow in
delay.c:__const_udelay() again resulting in shorter delay and panic().

Solution:
The patch below makes the calibration routine aware of asynchronous events
like SMIs. We increase the delay calibration time and also identify any
significant errors (greater than 12.5%) in the calibration and notify it to
user.

Patch below changes both i386 and x86-64 architectures to use this
new and improved calibrate_delay_direct() routine.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:08 -07:00
Andy Whitcroft
05b79bdcb4 [PATCH] sparsemem memory model for i386
Provide the architecture specific implementation for SPARSEMEM for i386 SMP
and NUMA systems.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Martin Bligh <mbligh@aracnet.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:05 -07:00
Martin Hicks
753ee72896 [PATCH] VM: early zone reclaim
This is the core of the (much simplified) early reclaim.  The goal of this
patch is to reclaim some easily-freed pages from a zone before falling back
onto another zone.

One of the major uses of this is NUMA machines.  With the default allocator
behavior the allocator would look for memory in another zone, which might be
off-node, before trying to reclaim from the current zone.

This adds a zone tuneable to enable early zone reclaim.  It is selected on a
per-zone basis and is turned on/off via syscall.

Adding some extra throttling on the reclaim was also required (patch
4/4).  Without the machine would grind to a crawl when doing a "make -j"
kernel build.  Even with this patch the System Time is higher on
average, but it seems tolerable.  Here are some numbers for kernbench
runs on a 2-node, 4cpu, 8Gig RAM Altix in the "make -j" run:

			wall  user   sys   %cpu  ctx sw.  sleeps
			----  ----   ---   ----   ------  ------
No patch		1009  1384   847   258   298170   504402
w/patch, no reclaim     880   1376   667   288   254064   396745
w/patch & reclaim       1079  1385   926   252   291625   548873

These numbers are the average of 2 runs of 3 "make -j" runs done right
after system boot.  Run-to-run variability for "make -j" is huge, so
these numbers aren't terribly useful except to seee that with reclaim
the benchmark still finishes in a reasonable amount of time.

I also looked at the NUMA hit/miss stats for the "make -j" runs and the
reclaim doesn't make any difference when the machine is thrashing away.

Doing a "make -j8" on a single node that is filled with page cache pages
takes 700 seconds with reclaim turned on and 735 seconds without reclaim
(due to remote memory accesses).

The simple zone_reclaim syscall program is at
http://www.bork.org/~mort/sgi/zone_reclaim.c

Signed-off-by: Martin Hicks <mort@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 18:46:14 -07:00
Ingo Molnar
39c715b717 [PATCH] smp_processor_id() cleanup
This patch implements a number of smp_processor_id() cleanup ideas that
Arjan van de Ven and I came up with.

The previous __smp_processor_id/_smp_processor_id/smp_processor_id API
spaghetti was hard to follow both on the implementational and on the
usage side.

Some of the complexity arose from picking wrong names, some of the
complexity comes from the fact that not all architectures defined
__smp_processor_id.

In the new code, there are two externally visible symbols:

 - smp_processor_id(): debug variant.

 - raw_smp_processor_id(): nondebug variant. Replaces all existing
   uses of _smp_processor_id() and __smp_processor_id(). Defined
   by every SMP architecture in include/asm-*/smp.h.

There is one new internal symbol, dependent on DEBUG_PREEMPT:

 - debug_smp_processor_id(): internal debug variant, mapped to
                             smp_processor_id().

Also, i moved debug_smp_processor_id() from lib/kernel_lock.c into a new
lib/smp_processor_id.c file.  All related comments got updated and/or
clarified.

I have build/boot tested the following 8 .config combinations on x86:

 {SMP,UP} x {PREEMPT,!PREEMPT} x {DEBUG_PREEMPT,!DEBUG_PREEMPT}

I have also build/boot tested x64 on UP/PREEMPT/DEBUG_PREEMPT.  (Other
architectures are untested, but should work just fine.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 18:46:13 -07:00
gregkh@suse.de
8874b414ff [PATCH] class: convert arch/* to use the new class api instead of class_simple
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:09 -07:00
Thomas Hood
92c6dc59b7 [PATCH] apm.c: ignore_normal_resume is set a bit too late
This patch causes the ignore_normal_resume flag to be set slightly earlier,
before there is a chance that the apm driver will receive the normal resume
event from the BIOS.  (Addresses Debian bug #310865)

Signed-off-by: Thomas Hood <jdthood@yahoo.co.uk>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-14 07:19:35 -07:00
Keith Owens
5754c9b649 [PATCH] Stop arch/i386/kernel/vsyscall-note.o being rebuilt every time
arch/i386/kernel/vsyscall-note.o is not listed as a target so its .cmd file
is neither considered as a target nor is it read on the next build.  This
causes vsyscall-note.o to be rebuilt every time that you run make, which
causes vmlinux to be rebuilt every time.

Signed-off-by: Keith Owens <kaos@ocs.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-08 16:21:13 -07:00
Dave Jones
f94ea640a2 [CPUFREQ] Typos.
cpfureq developers cant spel.

Signed-off-by: Dave Jones <davej@redhat.com>
2005-05-31 19:03:52 -07:00
Dave Jones
6778bae0f2 [CPUFREQ] longhaul - adjust transition latency.
From patch by: Ken Staton <ken_staton@agilent.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2005-05-31 19:03:51 -07:00
Dave Jones
1174631418 [CPUFREQ] Longhaul: Magic timer frobbing.
As mandated by the spec, disable timer around transitions.

From code by : Ken Staton <ken_staton@agilent.com
Signed-off-by: Dave Jones <davej@redhat.com>
2005-05-31 19:03:51 -07:00
Dave Jones
3be6a48f3c [CPUFREQ] longhaul - disable PCI mastering around transition.
The spec states that we have to do this, which is *horrid*.

Based on code from: Ken Staton <ken_staton@agilent.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2005-05-31 19:03:51 -07:00
Dave Jones
065b807ca1 [CPUFREQ] dual-core powernow-k8
With the release of the dual-core AMD Opterons last week,
it's high time that cpufreq supported them.  The attached
patch applies cleanly to 2.6.12-rc3 and updates powernow-k8
to support the latest Athlon 64 and Opteron processors.

Update the driver to version 1.40.0 and provide support
for dual-core processors.

Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2005-05-31 19:03:46 -07:00
Dave Jones
c5d28fb297 [CPUFREQ] Recalibrate cpu_khz [2/2]
Some cpufreq drivers (at that time, only powernow-k7) need to recalibrate the
cpu_khz at runtime.

Signed-off-by: Bruno Ducrot <ducrot@poupinou.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Dave Jones <davej@redhat.com>
2005-05-31 19:03:46 -07:00
Dave Jones
91350ed49b [CPUFREQ] Recalibrate cpu_khz [1/2]
We have to recalibrate cpu_khz in order to use the current FID instead the max
FID since some BIOS do not put the processor at maximum frequency at POST. 
Also, some BIOS will change the processor frequency at our back after cpu_khz
was calibrate.  Finally, this will fix a long standing bug when we do
something like this:

# rmmod powernow-k7
# modprobe powernow-k7

Signed-off-by: Bruno Ducrot <ducrot@poupinou.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Dave Jones <davej@redhat.com>
2005-05-31 19:03:45 -07:00
Dave Jones
bf6fc9fd2d [CPUFREQ] AMD Elan SC520 cpufreq driver.
From: Sean Young <sean@mess.org>
Signed-off-by: Dave Jones <davej@redhat.com>
2005-05-31 19:03:45 -07:00
Dave Jones
6f4095af6d [CPUFREQ] speedstep-smi: it works on at least one P4M
The speedstep-smi driver actually works on >=1 notebook with a
Pentium 4-M CPU where all other cpufreq drivers fail. Therefore,
allow speedstep-smi on P4Ms again, but warn users of likely failure

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Dave Jones <davej@redhat.com>
2005-05-31 19:03:44 -07:00
Dave Jones
8282864a96 [CPUFREQ] speedstep-centrino: Pentium 4 - M (HT) support
The Pentium 4 - Ms (HT) with CPUID 0xF34 and 0xF41 seem to support
centrino-like enhanced speedstep; however, no "table" support is possible.
Therefore, put NULL entries into speedstep-centrino.c

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Dave Jones <davej@redhat.com>
2005-05-31 19:03:43 -07:00
Dave Jones
7eb53d8823 [CPUFREQ] powernow-k7: don't print khz element of FSB.
Signed-off-by: Dave Jones <davej@redhat.com>
2005-05-31 19:03:42 -07:00
Alexander Nyberg
adaa765d76 [PATCH] acpi build fix: x86 setup.c
This is a neverending story

linux/acpi.h contains empty declarations for acpi_boot_init() &
acpi_boot_table_init() but they are nested inside #ifdef CONFIG_ACPI.

So we'll have to #ifdef in arch/i386/kernel/setup.c: setup_arch()

Signed-off-by: Alexander Nyberg <alexn@telia.com>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-31 14:54:17 -07:00
Siddha, Suresh B
49f384b82b [PATCH] x86: fix smp_num_siblings on buggy BIOSes
This fixes 'smp_num_siblings' value on the systems with a buggy bios,
which sets number of siblings to '2' even when HT is disabled.  (more
details are at http://bugzilla.kernel.org/show_bug.cgi?id=4359)

I am planning to do more cleanup in this area (like moving smp_num_siblings
to per cpuinfo) shortly.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-28 11:14:00 -07:00
Adrian Bunk
70ffc71c5c [PATCH] arch/i386/kernel/cpu/intel_cacheinfo.c: section fix
num_cache_leaves is used in __devexit cache_remove_dev() and can therefore
not be __devinit.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-28 11:14:00 -07:00
Andi Kleen
2df9fa3664 [PATCH] x86_64: i386/x86-64: Export cpu_core_map
Needed for the powernow k8 driver for dual core support.

Signed-off-by: Andi Kleen <ak@suse.de>
Cc: <mark.langsdorf@amd.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-20 15:48:21 -07:00
Andi Kleen
b41e29398a [PATCH] x86_64: 386/x86-64 Further AMD dual core fixes
- Remove duplicated ifdef
- Make core_id match what Intel uses
- Initialize phys_proc_id correctly for non DC case
- Handle non power of two core numbers.

Fixes for both i386 and x86-64

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-20 15:48:20 -07:00
Andi Kleen
a158608bf4 [PATCH] x86_64/i386: fix defaults for physical/core id in /proc/cpuinfo
Last round hopefully of cpu_core_id changes hopefully fow now:

- Always initialize cpu_core_id for all CPUs, even when no dual core setup
  is detected.  This prevents funny /proc/cpuinfo output

- Do the same with phys_proc_id[] even when no HyperThreading - dito.

- Use the CPU APIC-ID from CPUID 1 instead of the linux virtual CPU number
  to identify the core for AMD dual core setups.

Patch for i386/x86-64.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-17 07:59:13 -07:00
Linus Torvalds
22490eb80c Fix acpi_find_rsdp() - acpi_scan_rsdp takes length, not end
Noticed by Jakub Jermar <jermar@itbs.cz>
2005-05-06 15:39:23 -07:00
maximilian attems
a27e951f1e [PATCH] cyrix: eliminate bad section references
Fix cyrix section references:
 convert __initdata to __devinitdata.

Error: ./arch/i386/kernel/cpu/mtrr/cyrix.o .text refers to 00000379
R_386_32          .init.data
Error: ./arch/i386/kernel/cpu/mtrr/cyrix.o .text refers to 00000399
R_386_32          .init.data
Error: ./arch/i386/kernel/cpu/mtrr/cyrix.o .text refers to 000003b3
R_386_32          .init.data
Error: ./arch/i386/kernel/cpu/mtrr/cyrix.o .text refers to 000003b9
R_386_32          .init.data
Error: ./arch/i386/kernel/cpu/mtrr/cyrix.o .text refers to 000003bf
R_386_32          .init.data

Signed-of-by: maximilian attems <janitor@sternwelten.at>

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-05 16:36:47 -07:00
Prasanna S Panchamukhi
0b9e2cac8a [PATCH] Kprobes: Incorrect handling of probes on ret/lret instruction
Kprobes could not handle the insertion of a probe on the ret/lret
instruction and used to oops after single stepping since kprobes was
modifying eip/rip incorrectly.  Adjustment of eip/rip is not required after
single stepping in case of ret/lret instruction, because eip/rip points to
the correct location after execution of the ret/lret instruction.  This
patch fixes the above problem.

Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-05 16:36:39 -07:00
Paolo 'Blaisorblade' Giarrusso
0c28130b5c [PATCH] x86_64: make string func definition work as intended
In include/asm-x86_64/string.h there are such comments:

/* Use C out of line version for memcmp */
#define memcmp __builtin_memcmp
int memcmp(const void * cs,const void * ct,size_t count);

This would mean that if the compiler does not decide to use __builtin_memcmp,
it emits a call to memcmp to be satisfied by the C out-of-line version in
lib/string.c.  What happens is that after preprocessing, in lib/string.i you
may find the definition of "__builtin_strcmp".

Actually, by accident, in the object you will find the definition of strcmp
and such (maybe a trick intended to redirect calls to __builtin_memcmp to the
default memcmp when the definition is not expanded); however, this particular
case is not a documented feature as far as I can see.

Also, the EXPORT_SYMBOL does not work, so it's duplicated in the arch.

I simply added some #undef to lib/string.c and removed the (now duplicated)
exports in x86-64 and UML/x86_64 subarchs (the second ones are introduced by
another patch I just posted for -mm).

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
CC: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-05 16:36:33 -07:00
Alexander Nyberg
f48d9663f1 [PATCH] x86 stack initialisation fix
The recent change fix-crash-in-entrys-restore_all.patch

 	childregs->esp = esp;

 	p->thread.esp = (unsigned long) childregs;
-	p->thread.esp0 = (unsigned long) (childregs+1);
+	p->thread.esp0 = (unsigned long) (childregs+1) - 8;

 	p->thread.eip = (unsigned long) ret_from_fork;

introduces an inconsistency between esp and esp0 before the task is run the
first time.  esp0 is no longer the actual start of the stack, but 8 bytes
off.

This shows itself clearly in a scenario when a ptracer that is set to also
ptrace eventual children traces program1 which then clones thread1.  Now
the ptracer wants to modify the registers of thread1.  The x86 ptrace
implementation bases it's knowledge about saved user-space registers upon
p->thread.esp0.  But this will be a few bytes off causing certain writes to
the kernel stack to overwrite a saved kernel function address making the
kernel when actually running thread1 jump out into user-space.  Very
spectacular.

The testcase I've used is:
/* start with strace -f ./a.out */
#include <pthread.h>
#include <stdio.h>

void *do_thread(void *p)
{
	for (;;);
}

int main()
{
	pthread_t one;
	pthread_create(&one, NULL, &do_thread, NULL);
	for (;;);
	return 0;
}

So, my solution is to instead of just adjusting esp0 that creates an
inconsitent state I adjust where the user-space registers are saved with -8
bytes.  This gives us the wanted extra bytes on the start of the stack and
esp0 is now correct.  This solves the issues I saw from the original
testcase from Mateusz Berezecki and has survived testing here.  I think
this should go into -mm a round or two first however as there might be some
cruft around depending on pt_regs lying on the start of the stack.  That
however would have broken with the first change too!

It's actually a 2-line diff but I had to move the comment of why the -8 bytes
are there a few lines up. Thanks to Zwane for helping me with this.

Signed-off-by: Alexander Nyberg <alexn@telia.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-05 16:36:30 -07:00
David Woodhouse
27b030d58c Merge with master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6.git 2005-05-03 08:14:09 +01:00
Adrian Bunk
408b664a7d [PATCH] make lots of things static
Another large rollup of various patches from Adrian which make things static
where they were needlessly exported.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:59:29 -07:00
Jesper Juhl
7ed20e1ad5 [PATCH] convert that currently tests _NSIG directly to use valid_signal()
Convert most of the current code that uses _NSIG directly to instead use
valid_signal().  This avoids gcc -W warnings and off-by-one errors.

Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:59:14 -07:00
Jesper Juhl
e49332bd12 [PATCH] misc verify_area cleanups
There were still a few comments left refering to verify_area, and two
functions, verify_area_skas & verify_area_tt that just wrap corresponding
access_ok_skas & access_ok_tt functions, just like verify_area does for
access_ok - deprecate those.

There was also a few places that still used verify_area in commented-out
code, fix those up to use access_ok.

After applying this one there should not be anything left but finally
removing verify_area completely, which will happen after a kernel release
or two.

Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:59:08 -07:00
Matt Mackall
d59745ce3e [PATCH] clean up kernel messages
Arrange for all kernel printks to be no-ops.  Only available if
CONFIG_EMBEDDED.

This patch saves about 375k on my laptop config and nearly 100k on minimal
configs.

Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:59:02 -07:00
Paolo 'Blaisorblade' Giarrusso
5e7b83ffc6 [PATCH] uml: fix syscall table by including $(SUBARCH)'s one, for i386
Split the i386 entry.S files into entry.S and syscall_table.S which is
included in the previous one (so actually there is no difference between them)
and use the syscall_table.S in the UML build, instead of tracking by hand the
syscall table changes (which is inherently error-prone).

We must only insert the right #defines to inject the changes we need from the
i386 syscall table (for instance some different function names); also, we
don't implement some i386 syscalls, as ioperm(), nor some TLS-related ones
(yet to provide).

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:58:55 -07:00
Pavel Pisa
ad6714230f [PATCH] Linux 2.6.x VM86 interrupt emulation fixes
Patch solves VM86 interrupt emulation deadlock on SMP systems.  The VM86
interrupt emulation has been heavily tested and works well on UP systems
after last update, but it seems to deadlock when we have used it on SMP/HT
boxes now.

It seems, that disable_irq() cannot be called from interrupts, because it
waits until disabled interrupt handler finishes
(/kernel/irq/manage.c:synchronize_irq():while(IRQ_INPROGRESS);).  This
blocks one CPU after another.  Solved by use disable_irq_nosync.

There is the second problem.  If IRQ source is fast, it is possible, that
interrupt is sometimes processed and re-enabled by the second CPU, before
it is disabled by the first one, but negative IRQ disable depths are not
allowed.  The spinlocking and disabling IRQs over call to
disable_irq_nosync/enable_irq is the only solution found reliable till now.

Signed-off-by: Michal Sojka <sojkam1@control.felk.cvut.cz>
Signed-off-by: Pavel Pisa <pisa@cmp.felk.cvut.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:58:52 -07:00
Zwane Mwaikambo
3c3b73b6f5 [PATCH] cpuid x87 bit on AMD falsely marked as PNI
http://bugme.osdl.org/show_bug.cgi?id=4426

vendor_id       : AuthenticAMD
cpu family      : 6
model           : 10
model name      : AMD Athlon(tm) XP
stepping        : 0
cpu MHz         : 2204.807
<snipped>
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 mmx fxsr sse pni syscall mmxext 3dnowext 3dnow
bogomips        : 4358.14

We're marking bit 0 of extended function 0x80000001 cpuid as PNI support on
AMD processors, when it actually denotes x87 FPU present.  Patch for i386
and x86_64 below.

Signed-off-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:58:51 -07:00
john stultz
35492df5ae [PATCH] i386: fix hpet for systems that don't support legacy replacement
Currently the i386 HPET code assumes the entire HPET implementation from
the spec is present.  This breaks on boxes that do not implement the
optional legacy timer replacement functionality portion of the spec.

This patch, which is very similar to my x86-64 patch for the same issue,
fixes the problem allowing i386 systems that cannot use the HPET for the
timer interrupt and RTC to still use the HPET as a time source.  I've
tested this patch on a system systems without HPET, with HPET but without
legacy timer replacement, as well as HPET with legacy timer replacement.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:58:50 -07:00
Lee Revell
a6954ba2e8 [PATCH] Enable write combining for server works LE rev > 6
Enable write combining for server works LE rev > 6 per
http://www.ussg.iu.edu/hypermail/linux/kernel/0104.3/1007.html

Signed-Off-By: Lee Revell <rlrevell@joe-job.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:58:49 -07:00
Stas Sergeev
48c88211a6 [PATCH] x86: entry.S trap return fixes
do_debug() and do_int3() return void.

This patch fixes the CONFIG_KPROBES variant of do_int3() to return void too
and adjusts entry.S accordingly.

Signed-off-by: Stas Sergeev <stsp@aknet.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:58:49 -07:00
Jaya Kumar
a2f7c35415 [PATCH] x86 reboot: Add reboot fixup for gx1/cs5530a
This patch by Jaya Kumar introduces a generic infrastructure to deal with
x86 chipsets with nonstandard reset sequences, and adds support for the
Geode gx1/cs5530a chipset.

Signed-off-by: Jaya Kumar <jayalk@intworks.biz>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:58:49 -07:00
Jack F Vogel
67701ae976 [PATCH] check nmi watchdog is broken
A bug against an xSeries system showed up recently noting that the
check_nmi_watchdog() test was failing.

I have been investigating it and discovered in both i386 and x86_64 the
recent change to the routine to use the cpu_callin_map has uncovered a
problem.  Prior to that change, on an SMP box, the test was trivally
passing because all cpu's were found to not yet be online, but now with the
callin_map they are discovered, it goes on to test the counter and they
have not yet begun to increment, so it announces a CPU is stuck and bails
out.

On all the systems I have access to test, the announcement of failure is
also bougs...  by the time you can login and check /proc/interrupts, the
NMI count is happily incrementing on all CPUs.  Its just that the test is
being done too early.

I have tried moving the call to the test around a bit, and it was always
too early.  I finally hit on this proposed solution, it delays the routine
via a late_initcall(), seems like the right solution to me.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:58:48 -07:00