I'm seeing a kernel panic on an ES7000-600 when booting in virtual wire
mode. The panic happens because smp_read_mpc() is passed a physical
address, and it should be virtual. I tested the attached patch on the
ES7000-600 and on a 2 cpu Dell box, and saw no problems on either.
Signed-off-by: Dan Yeisley <dan.yeisley@unisys.com>
Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
AMD SimNow!'s JIT doesn't like them at all in the guest. For distribution
installation it's easiest if it's a boot time option.
Also I moved the variable to a more appropiate place and make
it independent from sysctl
And marked __read_mostly which it is.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add some more gitignore files for i386 architecture. This files are
created during the build process of a i386 kernel.
Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This path isn't obvious. It looks as if the kernel will be taking three
args from the user stack, but it only takes one from there.
Signed-off-by: Albert Cahalan <acahalan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Lost a few hours debugging an early-bootup fault within printk itself,
which manifested itself as a hard to debug early hang.
This patch makes it much easier by printing out early faults via
early_printk(), which function is a lot simpler than a full printk, and
hence more likely to succeed in emergencies. (We do not recover from early
faults anyway, so there's no loss from not having these messages in the
normal printk buffer.)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The *at patches introduced fstatat and, due to inusfficient research, I
used the newfstat functions generally as the guideline. The result is that
on 32-bit platforms we don't have all the information needed to implement
fstatat64.
This patch modifies the code to pass up 64-bit information if
__ARCH_WANT_STAT64 is defined. I renamed the syscall entry point to make
this clear. Other archs will continue to use the existing code. On x86-64
the compat code is implemented using a new sys32_ function. this is what
is done for the other stat syscalls as well.
This patch might break some other archs (those which define
__ARCH_WANT_STAT64 and which already wired up the syscall). Yet others
might need changes to accomodate the compatibility mode. I really don't
want to do that work because all this stat handling is a mess (more so in
glibc, but the kernel is also affected). It should be done by the arch
maintainers. I'll provide some stand-alone test shortly. Those who are
eager could compile glibc and run 'make check' (no installation needed).
The patch below has been tested on x86 and x86-64.
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Initialising cpu_possible_map to all-ones with CONFIG_HOTPLUG_CPU means that
a) All for_each_cpu() loops will iterate across all NR_CPUS CPUs, rather
than over possible ones. That can be quite expensive.
b) Soon we'll be allocating per-cpu areas only for possible CPUs. So with
CPU_MASK_ALL, we'll be wasting memory.
I also switched voyager over to not use CPU_MASK_ALL in the non-CPU-hotplug
case. Should be OK..
I note that parisc is also using CPU_MASK_ALL. Suggest that it stop doing
that.
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Paul Jackson <pj@sgi.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Zwane Mwaikambo <zwane@linuxpower.ca>
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Registers system call for the i386 architecture.
Signed-off-by: Janak Desai <janak@us.ibm.com>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: Andi Kleen <ak@muc.de>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix wrong '!' in bad apic fix
I forgot to remove the ! when moving the code from x86-64 to i386 x86-64
tested !disable_apic, but of course for cpu_has_apic it shouldn't be
negated.
Credit goes to Jan Beulich for spotting it with eagle eyes.
Cc: Jan Beulich <jbeulich@novell.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Show first field of kernel version in register dumps like x86_64 does.
Changes output from e.g.:
(2.6.16-rc1)
to:
(2.6.16-rc1 #12)
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
i386 CPU init code accesses freed init memory when booting a newly-started
processor after CPU hotplug. The cpu_devs array is searched to find the
vendor and it contains pointers to freed data.
Fix that by:
1. Zeroing entries for freed vendor data after bootup.
2. Changing Transmeta, NSC and UMC to all __init[data].
3. Printing a warning (once only) and setting this_cpu
to a safe default when the vendor is not found.
This does not change behavior for AMD systems. They were broken already
but no error was reported.
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
dump_stack() on page allocation failure presently has an irritating habit
of shouting just "====" at everyone: please stop it.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
percpu_data blindly allocates bootmem memory to store NR_CPUS instances of
cpudata, instead of allocating memory only for possible cpus.
As a preparation for changing that, we need to convert various 0 -> NR_CPUS
loops to use for_each_cpu().
(The above only applies to users of asm-generic/percpu.h. powerpc has gone it
alone and is presently only allocating memory for present CPUs, so it's
currently corrupting memory).
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Jens Axboe <axboe@suse.de>
Cc: Anton Blanchard <anton@samba.org>
Acked-by: William Irwin <wli@holomorphy.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It's bad juju to touch the APIC when it hasn't been enabled.
I also moved ack_bad_irq for x86-64 out of line following i386.
Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Some broken BIOS's had processors disabled, but
same apic id as a valid processor. This causes
acpi_processor_start() to think this disabled
cpu is ok, and croak. So we dont record bad
apicid's anymore.
http://bugzilla.kernel.org/show_bug.cgi?id=5930
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Test for old_freq equals 0 to insure not to divide by 0:
______________________________________________
Check for not initialized freq on cpufreq changes
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Dave Jones <davej@redhat.com>
Avoid lost tick compensation early in boot before the TSCs are
synchronized. Currently timekeeping is enabled before the TSCs are
synchronized, thus when the TSCs are synched (reset to zero), it appears
that a number of lost ticks have occurred. This can cause premature expiry
of timers and in extreme cases can cause the soft lockup detection to fire.
This resolves issues reported by Andy Whitcroft as well as bug #5366
reported by Tim Mann.
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Acked-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ignore clock frequencies below 2Ghz for CPU's detected with N60 errata bug.
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Dave Jones <davej@redhat.com>
This patch fixes the following compile error:
...
CC arch/i386/kernel/cpu/cpufreq/gx-suspmod.o
arch/i386/kernel/cpu/cpufreq/gx-suspmod.c: In function 'gx_detect_chipset':
arch/i386/kernel/cpu/cpufreq/gx-suspmod.c:193: error: implicit declaration of function 'pci_match_id'
arch/i386/kernel/cpu/cpufreq/gx-suspmod.c:193: warning: comparison between pointer and integer
make[3]: *** [arch/i386/kernel/cpu/cpufreq/gx-suspmod.o] Error 1
<-- snip -->
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Dave Jones <davej@redhat.com>
This is a subset of the bluesmoke project core code, stripped of the NMI work
which isn't ready to merge and some of the "interesting" proc functionality
that needs reworking or just has no place in kernel. It requires no core
kernel changes except the added scrub functions already posted.
The goal is to merge further functionality only after the core code is
accepted and proven in the base kernel, and only at the point the upstream
extras are really ready to merge.
From: doug thompson <norsk5@xmission.com>
This converts EDAC to sysfs and is the final chunk neccessary before EDAC
has a stable user space API and can be considered for submission into the
base kernel.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: doug thompson <norsk5@xmission.com>
Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add the sys_pselect6() and sys_poll() calls to the i386 syscall table.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Handle TIF_RESTORE_SIGMASK as added by David Woodhouse's patch entitled:
[PATCH] 2/3 Add TIF_RESTORE_SIGMASK support for arch/powerpc
[PATCH] 3/3 Generic sys_rt_sigsuspend
It does the following:
(1) Declares TIF_RESTORE_SIGMASK for i386.
(2) Invokes it over to do_signal() when TIF_RESTORE_SIGMASK is set.
(3) Makes do_signal() support TIF_RESTORE_SIGMASK, using the signal mask saved
in current->saved_sigmask.
(4) Discards sys_rt_sigsuspend() from the arch, using the generic one instead.
(5) Makes sys_sigsuspend() save the signal mask and set TIF_RESTORE_SIGMASK
rather than attempting to fudge the return registers.
(6) Makes sys_sigsuspend() return -ERESTARTNOHAND rather than looping
intrinsically.
(7) Makes setup_frame(), setup_rt_frame() and handle_signal() return 0 or
-EFAULT rather than true/false to be consistent with the rest of the
kernel.
Due to the fact do_signal() is then only called from one place:
(8) Makes do_signal() no longer have a return value is it was just being
ignored; force_sig() takes care of this.
(9) Discards the old sigmask argument to do_signal() as it's no longer
necessary.
(10) Makes do_signal() static.
(11) Marks the second argument to do_notify_resume() as unused. The unused
argument should remain in the middle as the arguments are passed in as
registers, and the ordering is specific in entry.S
Given the way do_signal() is now no longer called from sys_{,rt_}sigsuspend(),
they no longer need access to the exception frame, and so can just take
arguments normally.
This patch depends on sys_rt_sigsuspend patch.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Wire up the x86 syscalls
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
cpufreq init can be called when a CPU is set online.
Need to make powernow-k8's initialisation functions __cpuinit to
prevents oopses when a CPU is off/onlined on a AMD system
Cc: trenn@suse.de
Cc: mark.langsdorf@amd.com
Cc: davej@redhat.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Recent changes caused part of stack traces from SysRq-T to print at
KERN_EMERG loglevel. Also, parts of stack dump during oops were failing to
print at that level when they should.
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I tried to send the forcedeth maintainer an email, but it came back with:
"The mail address manfreds@colorfullife.com is not read anymore.
Please resent your mail to manfred@ instead of manfreds@."
This patch fixes this.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
)
From: Al Viro <viro@ftp.linux.org.uk>
task_pt_regs() needs the same offset-by-8 to match copy_thread()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
)
From: Ingo Molnar <mingo@elte.hu>
This is the latest version of the scheduler cache-hot-auto-tune patch.
The first problem was that detection time scaled with O(N^2), which is
unacceptable on larger SMP and NUMA systems. To solve this:
- I've added a 'domain distance' function, which is used to cache
measurement results. Each distance is only measured once. This means
that e.g. on NUMA distances of 0, 1 and 2 might be measured, on HT
distances 0 and 1, and on SMP distance 0 is measured. The code walks
the domain tree to determine the distance, so it automatically follows
whatever hierarchy an architecture sets up. This cuts down on the boot
time significantly and removes the O(N^2) limit. The only assumption
is that migration costs can be expressed as a function of domain
distance - this covers the overwhelming majority of existing systems,
and is a good guess even for more assymetric systems.
[ People hacking systems that have assymetries that break this
assumption (e.g. different CPU speeds) should experiment a bit with
the cpu_distance() function. Adding a ->migration_distance factor to
the domain structure would be one possible solution - but lets first
see the problem systems, if they exist at all. Lets not overdesign. ]
Another problem was that only a single cache-size was used for measuring
the cost of migration, and most architectures didnt set that variable
up. Furthermore, a single cache-size does not fit NUMA hierarchies with
L3 caches and does not fit HT setups, where different CPUs will often
have different 'effective cache sizes'. To solve this problem:
- Instead of relying on a single cache-size provided by the platform and
sticking to it, the code now auto-detects the 'effective migration
cost' between two measured CPUs, via iterating through a wide range of
cachesizes. The code searches for the maximum migration cost, which
occurs when the working set of the test-workload falls just below the
'effective cache size'. I.e. real-life optimized search is done for
the maximum migration cost, between two real CPUs.
This, amongst other things, has the positive effect hat if e.g. two
CPUs share a L2/L3 cache, a different (and accurate) migration cost
will be found than between two CPUs on the same system that dont share
any caches.
(The reliable measurement of migration costs is tricky - see the source
for details.)
Furthermore i've added various boot-time options to override/tune
migration behavior.
Firstly, there's a blanket override for autodetection:
migration_cost=1000,2000,3000
will override the depth 0/1/2 values with 1msec/2msec/3msec values.
Secondly, there's a global factor that can be used to increase (or
decrease) the autodetected values:
migration_factor=120
will increase the autodetected values by 20%. This option is useful to
tune things in a workload-dependent way - e.g. if a workload is
cache-insensitive then CPU utilization can be maximized by specifying
migration_factor=0.
I've tested the autodetection code quite extensively on x86, on 3
P3/Xeon/2MB, and the autodetected values look pretty good:
Dual Celeron (128K L2 cache):
---------------------
migration cost matrix (max_cache_size: 131072, cpu: 467 MHz):
---------------------
[00] [01]
[00]: - 1.7(1)
[01]: 1.7(1) -
---------------------
cacheflush times [2]: 0.0 (0) 1.7 (1784008)
---------------------
Here the slow memory subsystem dominates system performance, and even
though caches are small, the migration cost is 1.7 msecs.
Dual HT P4 (512K L2 cache):
---------------------
migration cost matrix (max_cache_size: 524288, cpu: 2379 MHz):
---------------------
[00] [01] [02] [03]
[00]: - 0.4(1) 0.0(0) 0.4(1)
[01]: 0.4(1) - 0.4(1) 0.0(0)
[02]: 0.0(0) 0.4(1) - 0.4(1)
[03]: 0.4(1) 0.0(0) 0.4(1) -
---------------------
cacheflush times [2]: 0.0 (33900) 0.4 (448514)
---------------------
Here it can be seen that there is no migration cost between two HT
siblings (CPU#0/2 and CPU#1/3 are separate physical CPUs). A fast memory
system makes inter-physical-CPU migration pretty cheap: 0.4 msecs.
8-way P3/Xeon [2MB L2 cache]:
---------------------
migration cost matrix (max_cache_size: 2097152, cpu: 700 MHz):
---------------------
[00] [01] [02] [03] [04] [05] [06] [07]
[00]: - 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)
[01]: 19.2(1) - 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)
[02]: 19.2(1) 19.2(1) - 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)
[03]: 19.2(1) 19.2(1) 19.2(1) - 19.2(1) 19.2(1) 19.2(1) 19.2(1)
[04]: 19.2(1) 19.2(1) 19.2(1) 19.2(1) - 19.2(1) 19.2(1) 19.2(1)
[05]: 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) - 19.2(1) 19.2(1)
[06]: 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) - 19.2(1)
[07]: 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) -
---------------------
cacheflush times [2]: 0.0 (0) 19.2 (19281756)
---------------------
This one has huge caches and a relatively slow memory subsystem - so the
migration cost is 19 msecs.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Cc: <wilder@us.ibm.com>
Signed-off-by: John Hawkes <hawkes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The explicit and implicit calls to setup_early_printk() were passing
inconsistent arguments.
Signed-Off-By: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
early_cpu_detect only runs on the BP, but this code needs to run
on all CPUs.
Looks like a mismerge somewhere. Also add a warning comment.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Currently we attempt to restore virtual wire mode on reboot, which only
works if we can figure out where the i8259 is connected. This is very
useful when we are kexec another kernel and likely helpful to an peculiar
BIOS that make assumptions about how the system is setup.
Since the acpi MADT table does not provide the location where the i8259 is
connected we have to look at the hardware to figure it out.
Most systems have the i8259 connected the local apic of the cpu so won't be
affected but people running Opteron and some serverworks chipsets should be
able to use kexec now.
In addition this patch removes the hard coded assumption that the io_apic
that delivers isa interrups is always known to the kernel as io_apic 0.
There does not appear to be anything to guarantee that assumption is true.
And From: Vivek Goyal <vgoyal@in.ibm.com>
A minor fix to the patch which remembers the location of where i8259 is
connected. Now counter i has been replaced by apic. counter i is having
some junk value which was leading to non-detection of i8259 connected to
IOAPIC.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Passing random input values in eax to cpuid is not a good idea
because the CPU will GPF for unknown ones.
Use the correct x86-64 version that exists for a longer time too.
This also adds a memory barrier to prevent the optimizer from
reordering.
Cc: tigran@veritas.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Whenever we see that a CPU is capable of C3 (during ACPI cstate init), we
disable local APIC timer and switch to using a broadcast from external timer
interrupt (IRQ 0). This is needed because Intel CPUs stop the local
APIC timer in C3. This is currently only enabled for Intel CPUs.
Patch below adds the code for i386 and also the ACPI hunk.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove the finer control of local APIC timer. We cannot provide a sub-jiffy
control like this when we use broadcast from external timer in place of
local APIC. Instead of removing this only on systems that may end up using
broadcast from external timer (due to C3), I am going the
"I'm feeling lucky" way to remove this fully. Basically, I am not sure about
usefulness of this code today. Few other architectures also don't seem to
support this today.
If you are using profiling and fine grained control and don't like this going
away in normal case, yell at me right now.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Some people need it now on 64bit so reuse the i386 code for
x86-64. This will be also useful for future bug workarounds.
It is a bit simplified there because there is no need
to do it very early on x86-64. This means it doesn't need
early ioremap et.al. We run it as a core initcall right now.
I hope it's not needed for early setup.
I added a general CONFIG_DMI symbol in case IA64 or someone
else wants to reuse the code later too.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
So why are we calling smp_send_stop from machine_halt?
We don't.
Looking more closely at the bug report the problem here
is that halt -p is called which triggers not a halt but
an attempt to power off.
machine_power_off calls machine_shutdown which calls smp_send_stop.
If pm_power_off is set we should never make it out machine_power_off
to the call of do_exit. So pm_power_off must not be set in this case.
When pm_power_off is not set we expect machine_power_off to devolve
into machine_halt.
So how do we fix this?
Playing too much with smp_send_stop is dangerous because it
must also be safe to be called from panic.
It looks like the obviously correct fix is to only call
machine_shutdown when pm_power_off is defined. Doing
that will make Andi's assumption about not scheduling
true and generally simplify what must be supported.
This turns machine_power_off into a noop like machine_halt
when pm_power_off is not defined.
If the expected behavior is that sys_reboot(LINUX_REBOOT_CMD_POWER_OFF)
becomes sys_reboot(LINUX_REBOOT_CMD_HALT) if pm_power_off is NULL
this is not quite a comprehensive fix as we pass a different parameter
to the reboot notifier and we set system_state to a different value
before calling device_shutdown().
Unfortunately any fix more comprehensive I can think of is not
obviously correct. The core problem is that there is no architecture
independent way to detect if machine_power will become a noop, without
calling it.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Print bits for RDTSCP, SVM, CR8-LEGACY.
Also now print power flags on i386 like x86-64 always did.
This will add a new line in the 386 cpuinfo, but that shouldn't
be an issue - did that in the past too and I haven't heard
of any breakage.
I shrunk some of the fields in the i386 cpuinfo_x86 to chars
to make up for the new int "x86_power" field. Overall it's
smaller than before.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
They previously tried to figure this out on their own.
Suggested by Venkatesh.
Cc: venkatesh.pallipadi@intel.com
Cc: davej@redhat.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Define it for i386 too.
This is a synthetic flag that signifies that the CPU's TSC runs
at a constant P state invariant frequency.
Fix up the logic on x86-64/i386 to set it on all known CPUs.
Use the AMD defined bit to set it on future AMD CPUs.
Cc: venkatesh.pallipadi@intel.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
arch: Use <linux/capability.h> where capable() is used.
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>