Commit graph

42519 commits

Author SHA1 Message Date
Rusty Russell
bd472c794b [PATCH] paravirt: Be careful about touching BIOS address space
BIOS ROM areas may not be mapped into the guest address space, so be careful
when touching those addresses to make sure they appear to be mapped.

[akpm@osdl.org: fix unused var warning]
AK: Changed __get_user to probe_kernel_address

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:08 +01:00
Rusty Russell
da181a8b39 [PATCH] paravirt: Add MMU virtualization to paravirt_ops
Add the three bare TLB accessor functions to paravirt-ops.  Most amusingly,
flush_tlb is redefined on SMP, so I can't call the paravirt op flush_tlb.
Instead, I chose to indicate the actual flush type, kernel (global) vs. user
(non-global).  Global in this sense means using the global bit in the page
table entry, which makes TLB entries persistent across CR3 reloads, not
global as in the SMP sense of invoking remote shootdowns, so the term is
confusingly overloaded.

AK: folded in fix from Zach for PAE compilation

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:08 +01:00
Rusty Russell
13623d7930 [PATCH] paravirt: Add APIC accessors to paravirt-ops.
Add APIC accessors to paravirt-ops.  Unfortunately, we need two write
functions, as some older broken hardware requires workarounds for
Pentium APIC errata - this is the purpose of apic_write_atomic.

AK: replaced __inline with inline

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:08 +01:00
Rusty Russell
6020c8f315 [PATCH] paravirt: Allow disable power management under hypervisor
Two legacy power management modes are much easier to just explicitly disable
when running in paravirtualized mode - neither APM nor PnP is still relevant.
The status of ACPI is still debatable, and noacpi is still a common enough
boot parameter that it is not necessary to explicitly disable ACPI.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:08 +01:00
Andi Kleen
3bbf547254 [PATCH] paravirt: Disable vdso by default when CONFIG_PARAVIRT is enabled
They don't work together and this way even glibc still works.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:08 +01:00
Rusty Russell
4f205fd45a [PATCH] paravirt: Allow selected bug checks to be
Allow selected bug checks to be skipped by paravirt kernels.  The two most
important are the F00F workaround (which is either done by the hypervisor,
or not required), and the 'hlt' instruction check, which can break under
some hypervisors.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:08 +01:00
Rusty Russell
c9ccf30d77 [PATCH] paravirt: Add startup infrastructure for paravirtualization
1) Each hypervisor writes a probe function to detect whether we are
   running under that hypervisor.  paravirt_probe() registers this
   function.

2) If vmlinux is booted with ring != 0, we call all the probe
   functions (with registers except %esp intact) in link order: the
   winner will not return.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:08 +01:00
Rusty Russell
d7cd56111f [PATCH] i386: cpu_detect extraction
Both lhype and Xen want to call the core of the x86 cpu detect code before
calling start_kernel.

(extracted from larger patch)

AK: folded in start_kernel header patch

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:08 +01:00
Rusty Russell
139ec7c416 [PATCH] paravirt: Patch inline replacements for paravirt intercepts
It turns out that the most called ops, by several orders of magnitude,
are the interrupt manipulation ops.  These are obvious candidates for
patching, so mark them up and create infrastructure for it.

The method used is that the ops structure has a patch function, which
is called for each place which needs to be patched: this returns a
number of instructions (the rest are NOP-padded).

Usually we can spare a register (%eax) for the binary patched code to
use, but in a couple of critical places in entry.S we can't: we make
the clobbers explicit at the call site, and manually clobber the
allowed registers in debug mode as an extra check.

And:

Don't abuse CONFIG_DEBUG_KERNEL, add CONFIG_DEBUG_PARAVIRT.

And:

AK:  Fix warnings in x86-64 alternative.c build

And:

AK: Fix compilation with defconfig

And:

^From: Andrew Morton <akpm@osdl.org>

Some binutlises still like to emit references to __stop_parainstructions and
__start_parainstructions.

And:

AK: Fix warnings about unused variables when PARAVIRT is disabled.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:08 +01:00
Rusty Russell
d3561b7fa0 [PATCH] paravirt: header and stubs for paravirtualisation
Create a paravirt.h header for all the critical operations which need to be
replaced with hypervisor calls, and include that instead of defining native
operations, when CONFIG_PARAVIRT.

This patch does the dumbest possible replacement of paravirtualized
instructions: calls through a "paravirt_ops" structure.  Currently these are
function implementations of native hardware: hypervisors will override the ops
structure with their own variants.

All the pv-ops functions are declared "fastcall" so that a specific
register-based ABI is used, to make inlining assember easier.

And:

+From: Andy Whitcroft <apw@shadowen.org>

The paravirt ops introduce a 'weak' attribute onto memory_setup().
Code ordering leads to the following warnings on x86:

    arch/i386/kernel/setup.c:651: warning: weak declaration of
                `memory_setup' after first use results in unspecified behavior

Move memory_setup() to avoid this.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
2006-12-07 02:14:07 +01:00
Nicolas Kaiser
db91b882aa [PATCH] i386: Fix double #includes in arch/i386
Fix double #includes in arch/i386

Signed-off-by: Nicolas Kaiser <nikai@nikai.net>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:07 +01:00
David Rientjes
3529833f1c [PATCH] i386: substitute __va lookup with pfn_to_kaddr
Substitutes allocate_pgdat virtual address lookup with pfn_to_kaddr macro.

Signed-off-by: David Rientjes <rientjes@cs.washington.edu>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:07 +01:00
Andi Kleen
b6bcc4bb1c [PATCH] x86-64: Don't force inlining of do_csum
It's two big and used by two callers. Calls should be cheap enough anyways.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:07 +01:00
Chuck Ebbert
8c89812684 [PATCH] i386: remove IOPL check on task switch
IOPL is implicitly saved and restored on task switch,
so explicit check is no longer needed.

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:07 +01:00
Andi Kleen
516d283643 [PATCH] x86-64: Fix race in IO-APIC routing entry setup.
Interrupt could happen between setting the IO-APIC entry
and setting its interrupt data.

Pointed out by Linus.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:07 +01:00
Andi Kleen
d15512f442 [PATCH] i386: Fix race in IO-APIC routing entry setup.
Interrupt could happen between setting the IO-APIC entry
and setting its interrupt data.

Pointed out by Linus.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:07 +01:00
Paolo 'Blaisorblade' Giarrusso
e6536c1262 [PATCH] x86: comment magic constants in delay.h
For both i386 and x86_64, copy from arch/$ARCH/lib/delay.c comments about the
used magic constants, plus a few other niceties.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andi Kleen <ak@suse.de>

 include/asm-i386/delay.h   |    5 ++++-
 include/asm-x86_64/delay.h |    5 ++++-
 2 files changed, 8 insertions(+), 2 deletions(-)
2006-12-07 02:14:07 +01:00
Paolo 'Blaisorblade' Giarrusso
b9a8d94a47 [PATCH] x86-64: Make x86_64 udelay() round up instead of down.
Port two patches from i386 to x86_64 delay.c to make sure all rounding is done
upward instead of downward.

There is no sign in commit messages that the mismatch was done on purpose, and
"delay() guarantees sleeping at least for the specified time" is still a valid
rule IMHO.

The original x86 patches are both from pre-GIT era, i.e.:

"[PATCH] round up  in __udelay()" in commit
54c7e1f5cc6771ff644d7bc21a2b829308bd126f

"[PATCH] add 1 in __const_udelay()" in commit
42c77a9801b8877d8b90f65f75db758822a0bccc

(both commits are from converted BK repository to x86_64).

AK: fixed gcc warning

linux/arch/x86_64/lib/delay.c:43: warning: suggest parentheses around + or - inside shift
(did this actually work?)

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:07 +01:00
Muli Ben-Yehuda
bff6547bb6 [PATCH] Calgary: allow compiling Calgary in but not using it by default
This patch makes it possible to compile Calgary in but not use it by
default. In this mode, use 'iommu=calgary' to activate it.

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:07 +01:00
Muli Ben-Yehuda
eae9375554 [PATCH] Calgary: check BBAR ioremap success when ioremapping
This patch cleans up the previous "Use BIOS supplied BBAR information"
patch. Mostly stylistic clenaups, but also check for ioremap failure
when we ioremap the BBAR rather than when trying to use it.

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Laurent Vivier <Laurent.Vivier@bull.net>
2006-12-07 02:14:06 +01:00
Laurent Vivier
b34e90b8f0 [PATCH] Calgary: use BIOS supplied BBARs and topology information
Find the BBAR register address of each Calgary using the "Extended
BIOS Data Area" rather than calculating it ourselves. Also get the bus
topology (what PHB each bus is on) from Calgary rather than
calculating it ourselves.

This patch fixes http://bugzilla.kernel.org/show_bug.cgi?id=7407.

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:06 +01:00
Muli Ben-Yehuda
58db854827 [PATCH] calgary: phb_shift can be int
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:06 +01:00
bibo,mao
cef518e88b [PATCH] i386: Move memory map printing and other code to e820.c
This patch moves e820 memory map print and memmap boot param
parsing function from setup.c to e820.c, also adds limit_regions
and print_memory_map declaration in header file.

Signed-off-by: bibo,mao <bibo.mao@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>

 arch/i386/kernel/e820.c  |  152 +++++++++++++++++++++++++++
 arch/i386/kernel/setup.c |  158 ---------------------------------
 include/asm-i386/e820.h  |    2
 arch/i386/kernel/e820.c  |  152 ++++++++++++++++++++++++++++++++++++++++++++++
 arch/i386/kernel/setup.c |  153 -----------------------------------------------
 include/asm-i386/e820.h  |    2
 3 files changed, 155 insertions(+), 152 deletions(-)
2006-12-07 02:14:06 +01:00
bibo,mao
b5b2405706 [PATCH] i386: Move e820/efi memmap walking code to e820.c
This patch moves e820/efi memmap table walking function from
setup.c to e820.c, also this patch adds extern declaration in
header file.

Signed-off-by: bibo,mao <bibo.mao@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>

 arch/i386/kernel/e820.c  |  115 +++++++++++++++++++++++++++++++++
 arch/i386/kernel/setup.c |  118 -----------------------------------
 include/asm-i386/e820.h  |    2
 arch/i386/kernel/e820.c  |  115 +++++++++++++++++++++++++++++++++++++++++++++
 arch/i386/kernel/setup.c |  118 -----------------------------------------------
 include/asm-i386/e820.h  |    2
 3 files changed, 117 insertions(+), 118 deletions(-)
2006-12-07 02:14:06 +01:00
bibo,mao
b2dff6a88c [PATCH] i386: Move find_max_pfn function to e820.c
Move more code from setup.c into e820.c

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:06 +01:00
bibo,mao
8e3342f736 [PATCH] i386: create e820.c for e820 map sanitize and copy function
This patch moves bios e820 map sanitize and copy function from
setup.c to e820.c

Signed-off-by: bibo,mao <bibo.mao@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>

 arch/i386/kernel/e820.c  |  252 +++++++++++++++++++++++++++++++++++++++++++++++
 arch/i386/kernel/setup.c |  240 --------------------------------------------
 2 files changed, 252 insertions(+), 240 deletions(-)
2006-12-07 02:14:06 +01:00
bibo,mao
269c2d81ed [PATCH] i386: i386 create e820.c to handle standard io/mem resources
This patch creates new file named e820.c to hanle standard io/mem
resources, moving request_standard_resources function from setup.c
to e820.c. Also this patch modifies Makfile to compile file e820.c.

Signed-off-by: bibo,mao <bibo.mao@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>

 Makefile |    2
 arch/i386/kernel/Makefile |    2
 arch/i386/kernel/e820.c   |  289 ++++++++++++++++++++++++++++++++++++++++++++++
 arch/i386/kernel/setup.c  |  276 -------------------------------------------
 3 files changed, 293 insertions(+), 274 deletions(-)
2006-12-07 02:14:06 +01:00
Albert Cahalan
8e3de538ee [PATCH] x86-64: Support -mregparm arguments for signals with SA_SIGINFO in compat mode
The recent change to make x86_64 support i386 binaries compiled
with -mregparm=3 only covered signal handlers without SA_SIGINFO.
(the 3-arg "real-time" ones)

To be compatible with i386, both types should be supported.

Signed-off-by: Albert Cahalan <acahalan@gmail.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:06 +01:00
Andi Kleen
b026872601 [PATCH] x86-64: Try multiple timer variants in check_timer
Instead of adding all kinds of more quirks try various timer
routing variants in check_timer.

In particular this tries to handle quirks from:
- Nvidia NF2-4 reference BIOS: wrong timer override
- Asus: Wrong timer override but no HPET table
- ATI: require timer disabled in 8259
- Some boards: require timer enabled in 8259

We just try many of the the known variants in the hopefully right order
in check_timer.

Trying pin 0/2 on Nvidia suggested by Tim Hockin.

TBD Experimental. Needs a lot of testing

Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:06 +01:00
Andi Kleen
11a4180c0b [PATCH] i386: Use probe_kernel_address instead of __get_user in fault paths
Makes the intention of the code cleaner to read and avoids
a potential deadlock on mmap_sem. Also change the types of
the arguments to not include __user because they're really
not user addresses.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:06 +01:00
Andi Kleen
ab2bf0c1c6 [PATCH] x86-64: Use probe_kernel_address in arch/x86_64/*
Instead of open coded __get_user

Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:06 +01:00
Andi Kleen
2fff0a4841 [PATCH] Generic: Move __user cast into probe_kernel_address
Caller of probe_kernel_address shouldn't need to know that
pka is internally implemented with __get_user. So move the
__user cast into pka.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:05 +01:00
Yinghai Lu
5df0287ecc [PATCH] x86-64: Extend clear_irq_vector
Clear the irq releated entries in irq_vector, irq_domain and vector_irq
instead of clearing irq_vector only. So when new irq is created, it
could reuse that vector. (actually is the second loop scanning from
FIRST_DEVICE_VECTOR+8). This could avoid the vectors are used up
with enough module inserting and removing

Cc: Eric W. Biedierman <ebiederm@xmission.com>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-By: Yinghai Lu <yinghai.lu@amd.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:05 +01:00
Andi Kleen
3760dd6efa [PATCH] i386: Use CLFLUSH instead of WBINVD in change_page_attr
CLFLUSH is a lot faster than WBINVD so try to use that.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:05 +01:00
Andi Kleen
770d132f03 [PATCH] i386: Retrieve CLFLUSH size from CPUID
Also report it in /proc/cpuinfo similar to x86-64.

Needed for followon patch

Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:05 +01:00
Andi Kleen
ea7322decb [PATCH] x86-64: Speed and clean up cache flushing in change_page_attr
CLFLUSH is a lot faster than WBINVD so avoid the later if at all
possible.

Always pass the complete list of pages to other CPUs to cut down
the number of IPIs.

Minor other cleanup and sync with i386 version.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:05 +01:00
Joe Korty
74b47a7844 [PATCH] i386: Fix entry.S code with !CONFIG_VM86
The entry.S code at work_notifysig is surely wrong.  It drops into unrelated
code if the branch to work_notifysig_v86 is taken, and CONFIG_VM86=n.

	[PATCH] Make vm86 support optional
	tree 9b5daef528
	pushed to git Jan 8, 2006, and first appears in 2.6.16

The 'fix' here is to also compile out the vm86 test & branch when
CONFIG_VM86=n.

Signed-off-by: Joe Korty <joe.korty@ccur.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:04 +01:00
Avi Kivity
249e83fe83 [PATCH] x86-64: Extract segment descriptor definitions for use outside
Code that wants to use struct desc_struct cannot do so on i386 because
desc.h contains other code that will only compile on x86_64.

So extract the structure definitions into a asm-x86_64/desc_defs.h.

Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andi Kleen <ak@suse.de>

 include/asm-x86_64/desc.h      |   53 -------------------------------
 include/asm-x86_64/desc_defs.h |   69 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 70 insertions(+), 52 deletions(-)
2006-12-07 02:14:04 +01:00
Vivek Goyal
4c7aa6c3b2 [PATCH] i386: Mark CONFIG_RELOCATABLE EXPERIMENTAL
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:04 +01:00
Vivek Goyal
be274eeaf2 [PATCH] i386: extend bzImage protocol for relocatable protected mode kernel
Extend bzImage protocol to enable bootloaders to load a completely relocatable
bzImage.  Now protected mode component of kernel is also relocatable and a
boot-loader can load the protected mode component at a differnt physical
address than 1MB.  (If kernel was built with CONFIG_RELOCATABLE)

Kexec can make use of it to load this kernel at a different physical address
to capture kernel crash dumps.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:04 +01:00
Vivek Goyal
e69f202d0a [PATCH] i386: Implement CONFIG_PHYSICAL_ALIGN
o Now CONFIG_PHYSICAL_START is being replaced with CONFIG_PHYSICAL_ALIGN.
  Hardcoding the kernel physical start value creates a problem in relocatable
  kernel context due to boot loader limitations. For ex, if somebody
  compiles a relocatable kernel to be run from address 4MB, but this kernel
  will run from location 1MB as grub loads the kernel at physical address
  1MB. Kernel thinks that I am a relocatable kernel and I should run from
  the address I have been loaded at. So somebody wanting to run kernel
  from 4MB alignment location (for improved performance regions) can't do
  that.

o Hence, Eric proposed that probably CONFIG_PHYSICAL_ALIGN will make
  more sense in relocatable kernel context. At run time kernel will move
  itself to a physical addr location which meets user specified alignment
  restrictions.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:04 +01:00
Vivek Goyal
6a044b3a0a [PATCH] i386: Warn upon absolute relocations being present
o Relocations generated w.r.t absolute symbols are not processed as by
  definition, absolute symbols are not to be relocated. Explicitly warn
  user about absolutions relocations present at compile time.

o These relocations get introduced either due to linker optimizations or
  some programming oversights.

o Also create a list of symbols which have been audited to be safe and
  don't emit warnings for these.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:04 +01:00
Eric W. Biederman
968de4f026 [PATCH] i386: Relocatable kernel support
This patch modifies the i386 kernel so that if CONFIG_RELOCATABLE is
selected it will be able to be loaded at any 4K aligned address below
1G.  The technique used is to compile the decompressor with -fPIC and
modify it so the decompressor is fully relocatable.  For the main
kernel relocations are generated.  Resulting in a kernel that is relocatable
with no runtime overhead and no need to modify the source code.

A reserved 32bit word in the parameters has been assigned
to serve as a stack so we figure out where are running.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:04 +01:00
Eric W. Biederman
fd593d1277 [PATCH] relocatable kernel: Kallsyms generate relocatable symbols
Print the addresses of non-absolute symbols relative to _text
so that ld will generate relocations.  Allowing a relocatable
kernel to relocate them.  We can't actually use the symbol names
because kallsyms includes static symbols that are not exported
from their object files.

Add the _text symbol definitions to the architectures which don't
define it otherwise linker will fail.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:04 +01:00
Eric W. Biederman
2a43f3ede4 [PATCH] i386: CONFIG_PHYSICAL_START cleanup
Defining __PHYSICAL_START and __KERNEL_START in asm-i386/page.h works but
it triggers a full kernel rebuild for the silliest of reasons.  This
modifies the users to directly use CONFIG_PHYSICAL_START and linux/config.h
which prevents the full rebuild problem, which makes the code much
more maintainer and hopefully user friendly.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:04 +01:00
Eric W. Biederman
8621b81c74 [PATCH] i386: Reserve kernel memory starting from _text
Currently when we are reserving the memory the kernel text
resides in we start at __PHYSICAL_START which happens to be
correct but not very obvious.  In addition when we start relocating
the kernel __PHYSICAL_START is the wrong value, as it is an
absolute symbol that does not get relocated.

By starting the reservation at __pa_symbol(_text)
the code is clearer and will be correct when relocated.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:03 +01:00
Eric W. Biederman
9f45accf17 [PATCH] i386: define __pa_symbol()
On x86_64 we have to be careful with calculating the physical
address of kernel symbols.  Both because of compiler odditities
and because the symbols live in a different range of the virtual
address space.

Having a defintition of __pa_symbol that works on both x86_64 and
i386 simplifies writing code that works for both x86_64 and
i386 that has these kinds of dependencies.

So this patch adds the trivial i386 __pa_symbol definition.

Added assembly magic similar to RELOC_HIDE as suggested by Andi Kleen.
Just picked it up from x86_64.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:03 +01:00
Vivek Goyal
6ed018845f [PATCH] i386: Add comment for align to vmlinux.lds
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:03 +01:00
Vivek Goyal
6569580de7 [PATCH] i386: Distinguish absolute symbols
Ld knows about 2 kinds of symbols,  absolute and section
relative.  Section relative symbols symbols change value
when a section is moved and absolute symbols do not.

Currently in the linker script we have several labels
marking the beginning and ending of sections that
are outside of sections, making them absolute symbols.
Having a mixture of absolute and section relative
symbols refereing to the same data is currently harmless
but it is confusing.

This must be done carefully as newer revs of ld do not place
symbols that appear in sections without data and instead
ld makes those symbols global :(

My ultimate goal is to build a relocatable kernel.  The
safest and least intrusive technique is to generate
relocation entries so the kernel can be relocated at load
time.  The only penalty would be an increase in the size
of the kernel binary.  The problem is that if absolute and
relocatable symbols are not properly specified absolute symbols
will be relocated or section relative symbols won't be, which
is fatal.

The practical motivation is that when generating kernels that
will run from a reserved area for analyzing what caused
a kernel panic, it is simpler if you don't need to hard code
the physical memory location they will run at, especially
for the distributions.

[AK: and merged:]

o Also put a message so that in future people can be aware of it and
  avoid introducing absolute symbols.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:03 +01:00
Andi Kleen
67fd44fea2 [PATCH] x86-64: Implement compat code for SIOCSIFHWBROADCAST
This network ioctl wasn't handled before.

Reported by Alexandra.Kossovsky@oktetlabs.ru (Alexandra Kossovsky)
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:03 +01:00