When acpi=off or there is no SRAT defined, apicid_to_node is got from K8
Northbridge PCI configuration space in k8_scan_nodes() in
arch/x86_64/mm/k8toplogy.c.
The problem is that it assumes bsp apic id is 0 at that point.
For four socket system with Quad core cpus installed, all cpus apic id
is offset by 4, and bsp apic id is 4.
For eight socket system with dual core cpus installed, all cpus apic id
is offset by 2, and bsp apic id is 2.
We need get boot_cpu_id --- bsp apic id, before k8_scan_nodes by called.
So create early_acpi_boot_init and early_get_smp_config for get boot_cpu_id.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Otherwise, enabling (or better, subsequent disabling) of single
stepping would cause a kernel oops on CPUs not having this MSR.
The patch could have been added a conditional to the MSR write in
user_disable_single_step(), but centralizing the updates seems safer
and (looking forward) better manageable.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In the x86 native_smp_send_reschedule_function(), don't send the IPI if the
cpu has gone offline already. Warn nevertheless!!
Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Fix:
ERROR: do not initialise externals to 0 or NULL
Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
otherwise Vmemmap and High Kernel Mapping string is not showing up.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
store initial_apicid from early identify. it is could be different from
phys_proc_id later.
also print it out in /proc/cpuinfo.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Fix a memcpy that should be a text_poke (in apply_alternatives).
Use kernel_wp_save/kernel_wp_restore in text_poke to support DEBUG_RODATA
correctly and so the CPU HOTPLUG special case can be removed.
Add text_poke_early, for alternatives and paravirt boot-time and module load
time patching.
Changelog:
- Fix text_set and text_poke alignment check (mixed up bitwise and and or)
- Remove text_set
- Export add_nops, so it can be used by others.
- Document text_poke_early.
- Remove clflush, since it breaks some VIA architectures and is not strictly
necessary.
- Add kerneldoc to text_poke and text_poke_early.
- Create a second vmap instead of using the WP bit to support Xen and VMI.
- Move local_irq disable within text_poke and text_poke_early to be able to
be sleepable in these functions.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Andi Kleen <andi@firstfloor.org>
CC: pageexec@freemail.hu
CC: H. Peter Anvin <hpa@zytor.com>
CC: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
for system with apicid lifting, boot cpu apicid will be 4
got:
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 512K (64 bytes/line)
CPU 0/4 -> Node 0
CPU: Physical Processor ID: 1
CPU: Processor Core ID: 0
so try to offset apicid back before get phys_proc_id with bits shift.
then we can get correct socket ID
also remove remove cpu_data(0) reference.
because cpu_data(0) only be ready after smp_prepare_cpus with the assignment
from boot_cpu_data to current_cpu_data aka cpu_data(0).
and check_bugs()==>identify_cpu(&boot_cpu_data) is quite before than
smp_prepare_cpus. So just use boot_cpu_id instead.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
wraps the busy loop for wait_for_init_deasserted() in a function,
so smp_callin in x86_64 looks like more i386
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
setup_trampoline() looks very similar between architectures, and this
patch unifies them. The i386 version allocates bootmem memory, while
the x86_64 version uses a fixed address.
In this patch, we initialize the global trampoline_base to the x86_64 version,
and i386 allocation can later override it.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
make setup_trampoline non-static. This way, it won't conflict
with the extern declaration in smp.h
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Change voyager's trampoline base to unsigned char *
instead of u32. This way, it won't conflict with
the other architectures when including smp.h
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The parameter passing parsing is done in the common smpboot.c
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
it was already cleared two lines above, and so, this removal
is bogus
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
it is already used in x86_64. In i386, it only
removes from cpu_online_map
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This lock does not protect cpu_online_map, so its
length can be shortened, and in some cases, removed.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
set_cpu_sibling_map() and remove_sibling_info() are
equal between architectures, and are now moved to common
file
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
move definitions that are now equal in type from
smpboot_{32,64}.c to smpboot.c
cpu_callin_map is put temporarily in smp_64.h (already
exists in smp_32.h), and will soon be merged.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch merges the copyright notices, and valuable
comments that were left back on smp_{32,64}.c. With that,
files are empty, and are deleted
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
this patch creates tlb_32.c and tlb_64.c, with
tlb-related functions that used to live in smp*.c files.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch moves all ipi and apic related functions
from smp_32.c to ipi.c
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
this patch moves all the functions and data structures that look
like exactly the same from smp_{32,64}.c to smp.c
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
function definition is moved to common header.
x86_64 version is now called native_smp_send_stop
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This can be safely added to i386. After that,
functions look exactly the same for both arches
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
with the hlt_works change, it is possible to have i386
and x86_64 stop_this_cpu() looking exactly the same. They
can, after that, be merged.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
the two versions (the inner version, and the outer version, that takes
the locks) of smp_call_function_mask are made into one. With the changes,
i386 and x86_64 versions look exactly the same.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This function is used in smp_send_stop(). It's like
smp_call_function_mask, but always go to all online cpus,
and does not take any locks.
It is added to x86_64, but will soon be unified in a common file
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch creates smpcommon.c with functions that are
equal between architectures. The i386-only init_gdt
is ifdef'd.
Note that smpcommon.o figures twice in the Makefile:
this is because sub-architectures like voyager that does
not use the normal smp_$(BITS) files also have to access them
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
with this removal, exports for both i386 and x86_64,
regarding the "smp_call_function" series are now the same.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
this patches moves prefill_possible_map() to smpboot.c
Right now it is x86_64-specific, but nothing intrinsically
prevents it to be used by i386
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
this patch allows a cpu to be marked as present but disabled in i386,
just as x86_64 currently does.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
definition is moved to common header. x86_64 version is now called
native_smp_cpus_done
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
definition is moved to common header. x86_64 version is now called
native_smp_prepare_cpus
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
definition is moved to common header. x86_64 version is now called
native_prepare_boot_cpu
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
function definition is moved to common header. x86_64 version
is now called native_cpu_up
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
definition is moved to common header, x86_64 function name
now is native_smp_call_function_mask
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
function definition is moved to common header, x86_64 version is now called
native_smp_send_reschedule
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
the smp_ops symbol is temporarily defined in smp_64.c, but it will soon
be unified
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Jeremy Fitzhardinge pointed out that looking at the boot_params
struct to determine if the system is running in a paravirtual
environment is not reliable for the Xen case, currently. He also
points out that there already exists a function to determine if
the system is running in a paravirtual environment. So let's use
that instead. This gets rid of the preprocessor test too.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Jeremy Fitzhardinge pointed out that looking at the boot_params
struct to determine if the system is running in a paravirtual
environment is not reliable for the Xen case, currently. He also
points out that there already exists a function to determine if
the system is running in a paravirtual environment. So let's use
that instead. This gets rid of the preprocessor test too.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The current tsc_init() clears the TSC feature bit if the TSC khz
cannot be calculated, causing us to panic in
arch/x86/kernel/cpu/bugs.c check_config(). We should simply mark it
unstable.
Frankly, someone should take an axe to this code. mark_tsc_unstable()
not only marks it unstable, but sets tsc_enabled to 0, which seems
redundant but is actually important here because means it won't be
used by sched_clock() either. Perhaps a tristate enum "UNUSABLE,
UNSTABLE, OK" would be clearer, and separate mark_tsc_unstable() and
mark_tsc_broken() functions?
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch is an add-on to the 64-bit ebda patch. It makes
the functions reserve_ebda_region (renamed from reserve_ebda)
and copy_e820_map equal to the 32-bit versions of the previous
patch.
Changes:
Use u64 and u32 for local variables in copy_e820_map.
The amount of conventional memory and the start of the EBDA are
detected by reading the BIOS data area directly. Paravirtual
environments do not provide this area, so we bail out early
in that case. They will just have to set up a correct memory
map to start with.
Add a safety net for zeroed out BIOS data area.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Explicitly reserve_early the whole address range from the end of
conventional memory as reported by the bios data area up to the
1Mb mark. Regard the info retrieved from the BIOS data area with
a bit of paranoia, though, because some biosses forget to register
the EBDA correctly.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch adds explicit detection of the EBDA and reservation
of the rom and adapter address space 0xa0000-0x100000 to the
i386 kernels. Before this patch, the EBDA size was hardcoded
as 4Kb. Also, the reservation of the adapter range was done by
modifying the e820 map which is now not necessary any longer,
and that code is removed from copy_e820_map.
The amount of conventional memory and the start of the EBDA are
detected by reading the BIOS data area directly. Paravirtual
environments do not provide this area, so we bail out early
in that case. They will just have to set up a correct memory
map to start with.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The proper dependency check uncovered a few dependency problems,
the subarchitecture used a mixture of selects and depends on SMP
and PCI dependency was messed up.
Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Change iret implementation to not be dependent on direct-access vcpu
structure.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Depends on:
[PATCH 2/3] x86: coding style fixes to arch/x86/kernel/early_printk.c
Remove two:
ERROR: do not initialise statics to 0 or NULL
paolo@paolo-desktop:/tmp/c$ size *
text data bss dec hex filename
1172 280 12 1464 5b8 early_printk.o.after
1172 280 12 1464 5b8 early_printk.o.before
This patch is changing the binary output:
paolo@paolo-desktop:/tmp/c$ md5sum *
dad9a9a881e0eeda62cc5645bd3d7cad early_printk.o.after
da32f5cd8f248970e4809e1005393e95 early_printk.o.before
because the two variables moved to another section. No
change in functionality.
Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
X86_HT is used for hyperthreading or multicore on 32-bit.
The X86_HT on 64-bit is different from 32-bit, it means hyperthreading only.
And X86_HT is not used on 64-bit except from cpu/initel_cacheinfo.c.
Unify X86_HT for hyperthreading or multicore.
Turn X86_HT on when X86_64 and SMP are enabled.
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Roland Dreier reported in http://lkml.org/lkml/2008/2/27/194
[ 8425.915139] BUG: unable to handle kernel paging request at ffffc20001a0a000
[ 8425.919087] IP: [<ffffffff8021dacc>] clflush_cache_range+0xc/0x25
[ 8425.919087] PGD 1bf80e067 PUD 1bf80f067 PMD 1bb497067 PTE 80000047000ee17b
This is on a Intel machine with 36bit physical address space. The PTE
entry references 47000ee000, which is outside of it.
Add a check for the physical address space and warn/printk about the
stupid caller.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This makes the Xen console just work. Before, you had to ask for it
on the kernel command line with console=hvc0
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
early_init_intel() on 64-bit is introduced by
commit 2b16a23538
Author: Andi Kleen <ak@suse.de>
Date: Wed Jan 30 13:32:40 2008 +0100
x86: move X86_FEATURE_CONSTANT_TSC into early cpu feature detection
sets CONSTANT_TSC for intel cpus - but it is already set in init_intel().
don't need to set that two times in early_init_intel() and init_intel().
this patch removes the init_intel() one.
Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
quad core 8 socket system will have apic id lifting.the apic id range could
be [4, 0x23]. and apic_is_clustered_box will think that need to three clusters
and that is larger than 2. So it is treated as a clustered_box.
and will get:
Marking TSC unstable due to TSCs unsynchronized
even if the CPUs have X86_FEATURE_CONSTANT_TSC set.
this quick fix will check if the cpu is from AMD.
but vsmp still needs that checking...
this patch is fix to make sure that vsmp not to be passed.
Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
when comparing the e820 direct from BIOS, and the one by kexec:
BIOS-provided physical RAM map:
- BIOS-e820: 0000000000000000 - 0000000000097400 (usable)
+ BIOS-e820: 0000000000000100 - 0000000000097400 (usable)
BIOS-e820: 0000000000097400 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e6000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 00000000dffa0000 (usable)
- BIOS-e820: 00000000dffae000 - 00000000dffb0000 type 9
+ BIOS-e820: 00000000dffae000 - 00000000dffb0000 (reserved)
BIOS-e820: 00000000dffb0000 - 00000000dffbe000 (ACPI data)
BIOS-e820: 00000000dffbe000 - 00000000dfff0000 (ACPI NVS)
BIOS-e820: 00000000dfff0000 - 00000000e0000000 (reserved)
BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
- BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
=======> that is the local apic address... somewhere we lost it
BIOS-e820: 00000000ff700000 - 0000000100000000 (reserved)
BIOS-e820: 0000000100000000 - 0000004020000000 (usable)
found one entry about reserved is missing for the kernel by kexec.
it turns out init_apic_mappings is called before e820_reserve_resources
in setup_arch. but e820_reserve_resources is using request_resource.
it will not handle the conflicts.
there are three ways to fix it:
1. change request_resource in e820_reserve_resources to to insert_resource
2. move init_apic_mappings after e820_reserve_resources
3. use late_initcall to insert lapic resource.
this patch is using method 3, that is less intrusive.
in later version could consider to use method 1.
before patch
fed20000-ffffffff : PCI Bus #00
fee00000-fee00fff : Local APIC
fefff000-feffffff : pnp 00:09
ff700000-ffffffff : reserved
with patch will get map in first kernel
fed20000-ffffffff : PCI Bus #00
fee00000-fee00fff : Local APIC
fee00000-fee00fff : reserved
fefff000-feffffff : pnp 00:09
ff700000-ffffffff : reserved
Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
e820_resource_resources could use insert_resource instead of request_resource
also move code_resource, data_resource, bss_resource, and crashk_res
out of e820_reserve_resources.
Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Copy x86_64 and add a head32.c so we can start moving early
architecture initialization out of assembly.
[ Sam Ravnborg <sam@ravnborg.org>: updated it to x86 ]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
quad core 8 socket system will have apic id lifting.the apic id range could
be [4, 0x23]. and apic_is_clustered_box will think that need to three clusters
and that is large than 2. So it is treated as clustered_box.
and will get
Marking TSC unstable due to TSCs unsynchronized
even the CPUs have X86_FEATURE_CONSTANT_TSC set.
this patch will check if the cpu is from AMD.
Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Now cpu/proc.c and cpu/proc_64.c are same.
So cpu/proc_64.c can be removed.
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Change /proc/cpuinfo on 32-bit, it will look like on 64-bit.
'power management' line is added and power management information
will be printed at the line.
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
x86 /proc/cpuinfo code can be unified.
This is the first step of unification.
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The patch fixes 33 errors and a few warnings reported by checkpatch.pl
arch/x86/oprofile/op_model_athlon.o:
text data bss dec hex filename
1691 0 32 1723 6bb op_model_athlon.o.before
1691 0 32 1723 6bb op_model_athlon.o.after
md5:
c354bc2d7140e1e626c03390eddaa0a6 op_model_athlon.o.before.asm
c354bc2d7140e1e626c03390eddaa0a6 op_model_athlon.o.after.asm
Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The patch make the file errors free.
Only 4 "WARNING: line over 80 characters" left.
arch/x86/kernel/cpu/mcheck/p5.o:
text data bss dec hex filename
452 0 4 456 1c8 p5.o.before
452 0 4 456 1c8 p5.o.after
md5:
50c945ef150aa95bf0481cc3e1dc3315 p5.o.before.asm
50c945ef150aa95bf0481cc3e1dc3315 p5.o.after.asm
Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The patch kills 45 errors and a few warnings.
The file is now error/warning free:
total: 0 errors, 0 warnings, 237 lines checked
arch/x86/lib/string_32.c has no obvious style problems and is ready for submission.
no code changed:
arch/x86/lib/string_32.o:
text data bss dec hex filename
639 0 0 639 27f string_32.o.before
639 0 0 639 27f string_32.o.after
md5:
2db1c48187cf5113bb595153ee1fc73d string_32.o.before.asm
2db1c48187cf5113bb595153ee1fc73d string_32.o.after.asm
Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Also update field names to simply payload_{offset,length} so as to not rule
out uncompressed images.
Signed-off-by: Ian Campbell <ijc@hellion.org.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
> Don't we have a special section for page-aligned data so it doesn't
> waste most of two pages?
We have .bss.page_aligned and it seems appropriate to use it.
text data bss dec hex filename
- 3388 8236 4 11628 2d6c ../build-32/arch/x86/mm/ioremap.o
+ 3388 48 4100 7536 1d70 ../build-32/arch/x86/mm/ioremap.o
Signed-off-by: Ian Campbell <ijc@hellion.org.uk>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Huang Ying <ying.huang@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Kills more than 150 errors/warnings
Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
fix code to access CMOS rtc registers so that it does not use inb_p and
outb_p routines, which are deprecated. Extensive research on all known
CMOS RTC chipset timing shows that there is no need for a delay in
accessing the registers of these chips even on old machines. These chipa
are never on an expansion bus, but have always been "motherboard"
resources, either in the processor chipset or explicitly on the
motherboard, and they are not part of the ISA/LPC or PCI buses, so
delays should not be based on bus timing. The reason to fix it:
1) port 80 writes often hang some laptops that use ENE EC chipsets,
esp. those designed and manufactured by Quanta for HP;
2) RTC accesses are timing sensitive, and extra microseconds may matter;
3) the new "io_delay" function is calibrated by expansion bus timing needs,
thus is not appropriate for access to CMOS rtc registers.
Signed-off-by: David P. Reed <dpreed@reed.com>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Replace the hardcoded list of initialization functions for each CPU
vendor by a list in an ELF section, which is read at initialization in
arch/x86/kernel/cpu/cpu.c to fill the cpu_devs[] array. The ELF
section, named .x86cpuvendor.init, is reclaimed after boot, and
contains entries of type "struct cpu_vendor_dev" which associates a
vendor number with a pointer to a "struct cpu_dev" structure.
This first modification allows to remove all the VENDOR_init_cpu()
functions.
This patch also removes the hardcoded calls to early_init_amd() and
early_init_intel(). Instead, we add a "c_early_init" member to the
cpu_dev structure, which is then called if not NULL by the generic CPU
initialization code. Unfortunately, in early_cpu_detect(), this_cpu is
not yet set, so we have to use the cpu_devs[] array directly.
This patch is part of the Linux Tiny project, and is needed for
further patch that will allow to disable compilation of unused CPU
support code.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
It becomes to early for ioremap, so we use early_ioremap
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ravikiran Thirumalai <kiran@scalemp.com>
Acked-by: Shai Fultheim <shai@scalemp.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Change Makefile so vsmp_64.o object is dependent
on PARAVIRT, rather than X86_VSMP
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ravikiran Thirumalai <kiran@scalemp.com>
Acked-by: Shai Fultheim <shai@scalemp.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
we don't need get that so early.
Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
I found it strange that the struct sk_buff definition was found
inside the DWARF debugging sections in the generated object, so I verified
and found that there is no need for the files that bring struct sk_buff
definition into this file and verified also that sk_buff is not brought
in indirectly too, thru other headers.
I went on and removed many other unneeded includes and the end
result is:
[acme@doppio net-2.6]$ l /tmp/sys_ia32.o.before /tmp/sys_ia32.o.after
-rw-rw-r-- 1 acme acme 185240 2008-02-06 19:19 /tmp/sys_ia32.o.after
-rw-rw-r-- 1 acme acme 248328 2008-02-06 19:00 /tmp/sys_ia32.o.before
Almost 64KB only on this object file!
There were no other side effects from this change:
[acme@doppio net-2.6]$ objcopy -j "text" /tmp/sys_ia32.o.before /tmp/text.before
[acme@doppio net-2.6]$ objcopy -j "text" /tmp/sys_ia32.o.after /tmp/text.after
[acme@doppio net-2.6]$ md5sum /tmp/text.before /tmp/text.after
b7ac9b17942add68494e698e4f965d36 /tmp/text.before
b7ac9b17942add68494e698e4f965d36 /tmp/text.after
One of the complaints about using tools such as systemtap is
that one has to install the huge kernel-debuginfo package:
[acme@doppio net-2.6]$ rpm -q --qf "%{size}\n" kernel-rt-debuginfo
471737710
543867594
[acme@doppio net-2.6]$
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ian Campbell <ijc@hellion.org.uk>
Cc: Ian Campbell <ijc@hellion.org.uk>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: virtualization@lists.linux-foundation.org
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[ tglx@linutronix.de: cleanup the other structs as well ]
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The extended century readout does not solve the year 2038 problem on
32bit!
v2: Fix compilation on !ACPI, pointed out by tglx
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We assume that the RTC clock is BCD, so print a warning if it claims
to be binary.
[ tglx@linutronix.de: changed to WARN_ON - we want to know that!
If no one reports it we can remove the complete if (RTC_ALWAYS_BCD)
magic, which has RTC_ALWAYS_BCD defined to 1 since Linux 1.0 ... ]
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We know it is already after 2000. Use the year 2000 offset for both 32
and 64 bit, which removes ifdefs and the 1970 magic.
[ tglx@linutronix.de: remove 1970 magic, replace bogus commit message ]
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Change size to unsigned long, becase caller and user all used unsigned long.
Also make bad_addr take an alignment parameter.
Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The AMD Fam10h CPUs support new Gigabyte page table entry for
mapping 1GB at a time. Use this for the kernel direct mapping.
Only done for 64bit because i386 does not support GB page tables.
This only applies to the data portion of the direct mapping; the
kernel text mapping stays with 2MB pages because the AMD Fam10h
microarchitecture does not support GB ITLBs and AMD recommends
against using GB mappings for code.
Can be disabled with disable_gbpages on the kernel command line
[ tglx@linutronix.de: simplify enable code ]
[ Yinghai Lu <yinghai.lu@sun.com>: boot fix on 256 GB RAM ]
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
These new controls toggle experimental support for a new CPU feature,
the straightforward extension of largepages from the pmd level to the
pud level, which allows 1GB (kernel) TLBs instead of 2MB TLBs.
Turn it off by default, as this code has not been tested well enough yet.
Use the CONFIG_DIRECT_GBPAGES=y .config option or gbpages on the
boot line can be used to enable it. If enabled in the .config then
nogbpages boot option disables it.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Clean up the page table dumper (fix boundary conditions, table driven
address ranges, some formatting changes since it is no longer using
the kernel log but a separate virtual file), and generalize to 32
bits.
[ mingo@elte.hu: x86: fix the pagetable dumper ]
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch adds code to the kernel to have an (optional)
/proc/kernel_page_tables debug file that basically dumps the kernel
pagetables; this allows us kernel developers to verify that nothing fishy is
going on and that the various mappings are set up correctly. This was quite
useful in finding various change_page_attr() bugs, and is very likely to be
useful in the future as well.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: mingo@elte.hu
Cc: tglx@tglx.de
Cc: hpa@zytor.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Unify arch/x86/mm/Makefile between 32 and 64 bits.
All configuration variables that are protected by Kconfig constraints
have been put in the common part of the Makefile; however, the NUMA
files are totally different between 32 and 64 bits and are handled via
an ifdef.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Add debug information for DEBUG_PAGEALLOC to get some statistics about
the pool usage and split status.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We map a VMA for the 32-bit vDSO even when it's disabled, which is stupid.
For the 32-bit kernel it's the vdso_enabled boot parameter/sysctl
and for the 64-bit kernel it's the vdso32 boot parameter/syscall32 sysctl.
When it's disabled, we don't pass AT_SYSINFO_EHDR so processes don't use
the vDSO for anything, but we still map it. For the non-compat vDSO,
this means we're always putting an extra VMA somewhere, maybe lousing
up the control of the address space the user was hoping for.
Honor the setting by doing nothing in arch_setup_additional_pages.
[ also see: "x86 vDSO: don't use disabled vDSO for signal trampoline" ]
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
If the vDSO was not mapped, don't use it as the "restorer" for a signal
handler. Whether we have a pointer in mm->context.vdso depends on what
happened at exec time, so we shouldn't check any global flags now.
Background:
Currently, every 32-bit exec gets the vDSO mapped even if it's disabled
(the process just doesn't get told about it). Because it's in fact
always there, the bug that this patch fixes cannot happen now. With
the second patch, it won't be mapped at all when it's disabled, which is
one of the things that people might really want when they disable it (so
nothing they didn't ask for goes into their address space).
The 32-bit signal handler setup when SA_RESTORER is not used refers to
current->mm->context.vdso without regard to whether the vDSO has been
disabled when the process was exec'd. This patch fixes this not to use
it when it's null, which becomes possible after the second patch. (This
never happens in normal use, because glibc's sigaction call uses
SA_RESTORER unless glibc detected the vDSO.)
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
people sometimes do crazy stuff like building really large static
arrays into their kernels or building allyesconfig kernels. Give
more space to the kernel and push modules up a bit: kernel has
512 MB and modules have 1.5 GB.
Should be enough for a few years ;-)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
http://bugzilla.kernel.org/show_bug.cgi?id=10124
this change:
commit 08f1c192c3
Author: Muli Ben-Yehuda <muli@il.ibm.com>
Date: Sun Jul 22 00:23:39 2007 +0300
x86-64: introduce struct pci_sysdata to facilitate sharing of ->sysdata
This patch introduces struct pci_sysdata to x86 and x86-64, and
converts the existing two users (NUMA, Calgary) to use it.
This lays the groundwork for having other users of sysdata, such as
the PCI domains work.
The Calgary bits are tested, the NUMA bits just look ok.
replaces pcibios_scan_root by pci_scan_bus_parented...
but in pcibios_scan_root we have a check about scanned busses.
Cc: <yakui.zhao@intel.com>
Cc: Stian Jordet <stian@jordet.net>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg KH <greg@kroah.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "Yinghai Lu" <yhlu.kernel@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The prevent_tail_call() macro works around the problem of the compiler
clobbering argument words on the stack, which for asmlinkage functions
is the caller's (user's) struct pt_regs. The tail/sibling-call
optimization is not the only way that the compiler can decide to use
stack argument words as scratch space, which we have to prevent.
Other optimizations can do it too.
Until we have new compiler support to make "asmlinkage" binding on the
compiler's own use of the stack argument frame, we have work around all
the manifestations of this issue that crop up.
More cases seem to be prevented by also keeping the incoming argument
variables live at the end of the function. This makes their original
stack slots attractive places to leave those variables, so the compiler
tends not clobber them for something else. It's still no guarantee, but
it handles some observed cases that prevent_tail_call() did not.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch also resolves hangs on boot:
http://lkml.org/lkml/2008/2/23/263http://bugzilla.kernel.org/show_bug.cgi?id=10093
The bug was causing once-in-few-reboots 10-15 sec wait during boot on
certain laptops.
Earlier commit 40d6a14662 added
smp_call_function in cpu_idle_wait() to kick cpus that are in tickless
idle. Looking at cpu_idle_wait code at that time, code seemed to be
over-engineered for a case which is rarely used (while changing idle
handler).
Below is a simplified version of cpu_idle_wait, which just makes a dummy
smp_call_function to all cpus, to make them come out of old idle handler
and start using the new idle handler. It eliminates code in the idle
loop to handle cpu_idle_wait.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
gcc expects all toplevel assembly to return to the original section type.
The code in alteranative.c does not do this. This caused some strange bugs
in sched-devel where code would end up in the .rodata section and when
the kernel sets the NX bit on all .rodata, the kernel would crash when
executing this code.
This patch adds a .previous marker to return the code back to the
original section.
Credit goes to Andrew Pinski for telling me it wasn't a gcc bug but a
bug in the toplevel asm code in the kernel. ;-)
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In time_cpufreq_notifier() the cpu id to act upon is held in freq->cpu. Use it
instead of smp_processor_id() in the call to set_cyc2ns_scale().
This makes the preempt_*able() unnecessary and lets set_cyc2ns_scale() update
the intended cpu's cyc2ns.
Related mail/thread: http://lkml.org/lkml/2007/12/7/130
Signed-off-by: Karsten Wiese <fzu@wemgehoertderstaat.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
revert:
| commit 47001d6033
| Author: Thomas Gleixner <tglx@linutronix.de>
| Date: Tue Apr 1 19:45:18 2008 +0200
|
| x86: tsc prevent time going backwards
it has been identified to cause suspend regression - and the
commit fixes a longstanding bug that existed before 2.6.25 was
opened - so it can wait some more until the effects are better
understood.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We handle a broken tsc these days, so no need to panic. We clear the
TSC bit when tsc_init decides it's unreliable (eg. under lguest w/ bad
host TSC), leading to bogus panic.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The commits:
commit 37a47db8d7
Author: Balaji Rao <balajirrao@gmail.com>
Date: Wed Jan 30 13:30:03 2008 +0100
x86: assign IRQs to HPET timers, fix
and
commit e3f37a54f6
Author: Balaji Rao <balajirrao@gmail.com>
Date: Wed Jan 30 13:30:03 2008 +0100
x86: assign IRQs to HPET timers
have been identified to cause a regression on some platforms due to
the assignement of legacy IRQs which makes the legacy devices
connected to those IRQs disfunctional.
Revert them.
This fixes http://bugzilla.kernel.org/show_bug.cgi?id=10382
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We already catch most of the TSC problems by sanity checks, but there
is a subtle bug which has been in the code for ever. This can cause
time jumps in the range of hours.
This was reported in:
http://lkml.org/lkml/2007/8/23/96
and
http://lkml.org/lkml/2008/3/31/23
I was able to reproduce the problem with a gettimeofday loop test on a
dual core and a quad core machine which both have sychronized
TSCs. The TSCs seems not to be perfectly in sync though, but the
kernel is not able to detect the slight delta in the sync check. Still
there exists an extremly small window where this delta can be observed
with a real big time jump. So far I was only able to reproduce this
with the vsyscall gettimeofday implementation, but in theory this
might be observable with the syscall based version as well.
CPU 0 updates the clock source variables under xtime/vyscall lock and
CPU1, where the TSC is slighty behind CPU0, is reading the time right
after the seqlock was unlocked.
The clocksource reference data was updated with the TSC from CPU0 and
the value which is read from TSC on CPU1 is less than the reference
data. This results in a huge delta value due to the unsigned
subtraction of the TSC value and the reference value. This algorithm
can not be changed due to the support of wrapping clock sources like
pm timer.
The huge delta is converted to nanoseconds and added to xtime, which
is then observable by the caller. The next gettimeofday call on CPU1
will show the correct time again as now the TSC has advanced above the
reference value.
To prevent this TSC specific wreckage we need to compare the TSC value
against the reference value and return the latter when it is larger
than the actual TSC value.
I pondered to mark the TSC unstable when the readout is smaller than
the reference value, but this would render an otherwise good and fast
clocksource unusable without a real good reason.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Fix obsolete printks in aperture-64. We used not to handle missing
agpgart, but we handle it okay now.
Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
right now if there's no CPU support for nmi_watchdog=2 we'll just
refuse it silently.
print a useful warning.
Signed-off-by: Ingo Molnar <mingo@elte.hu>