Commit graph

3656 commits

Author SHA1 Message Date
Frederic Weisbecker
1b2852b152 vtime: Warn if irqs aren't disabled on system time accounting APIs
System time accounting APIs such as vtime_account_system() and
vtime_account_idle() need to be irqsafe. Current callers include
irq entry, exit and kvm, all of which have been checked against that
requirement. Now it's better to grow that with an automatic check
in case we have further callers or we missed something.

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
2012-11-20 15:42:51 +01:00
Frederic Weisbecker
e3942ba040 vtime: Consolidate a bit the ctx switch code
On ia64 and powerpc, vtime context switch only consists
in flushing system and user pending time, plus a few
arch housekeeping.

Consolidate that into a generic implementation. s390 is
a special case because pending user and system time accounting
there is hard to dissociate. So it's keeping its own implementation.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
2012-11-19 16:41:32 +01:00
Frederic Weisbecker
bcebdf8465 vtime: Explicitly account pending user time on process tick
All vtime implementations just flush the user time on process
tick. Consolidate that in generic code by calling a user time
accounting helper. This avoids an indirect call in ia64 and
prepare to also consolidate vtime context switch code.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
2012-11-19 16:41:21 +01:00
Frederic Weisbecker
fd25b4c2f2 vtime: Remove the underscore prefix invasion
Prepending irq-unsafe vtime APIs with underscores was actually
a bad idea as the result is a big mess in the API namespace that
is even waiting to be further extended. Also these helpers
are always called from irq safe callers except kvm. Just
provide a vtime_account_system_irqsafe() for this specific
case so that we can remove the underscore prefix on other
vtime functions.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
2012-11-19 16:40:16 +01:00
Adam Buchbinder
48fc7f7e78 Fix misspellings of "whether" in comments.
"Whether" is misspelled in various comments across the tree; this
fixes them. No code changes.

Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-11-19 14:31:35 +01:00
Michael Neuling
b0302722ee powerpc: Setup relocation on exceptions for bare metal systems
This turns on MMU on execptions via AIL field in the LPCR.

Signed-off-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15 15:08:06 +11:00
Michael Neuling
f7c32c24f5 powerpc: Move initial mfspr LPCR out of __init_LPCR
We want to change what's initially set in the LPCR, so start by taking the move
from LPCR out of the function and into the caller.

Signed-off-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15 15:08:05 +11:00
Michael Neuling
c1fb6816fb powerpc: Add relocation on exception vector handlers
POWER8/v2.07 allows exceptions to be taken with the MMU still on.

A new set of exception vectors is added at 0xc000_0000_0000_4xxx.  When the HW
takes us here, MSR IR/DR will be set already and we no longer need a costly
RFID to turn the MMU back on again.

The original 0x0 based exception vectors remain for when the HW can't leave the
MMU on.  Examples of this are when we can't trust the current MMU mappings,
like when we are changing from guest to hypervisor (HV 0 -> 1) or when the MMU
was off already.  In these cases the HW will take us to the original 0x0 based
exception vectors with the MMU off as before.

This uses the new macros added previously too implement these new execption
vectors at 0xc000_0000_0000_4xxx.  We exit these exception vectors using
mflr/blr (rather than mtspr SSR0/RFID), since we don't need the costly MMU
switch anymore.

This moves the __end_interrupts marker down past these new 0x4000 vectors since
they will need to be copied down to 0x0 when the kernel is not at 0x0.

Signed-off-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15 15:08:05 +11:00
Michael Neuling
4700dfaf1e powerpc: Add new macros needed for relocation on exceptions
POWER8/v2.07 allows exceptions to be taken with the MMU still on.

A new set of exception vectors is added at 0xc000_0000_0000_4xxx.  When the HW
takes us here, MSR IR/DR will be set already and we no longer need a costly
RFID to turn the MMU back on again.

The original 0x0 based exception vectors remain for when the HW can't leave the
MMU on.  Examples of this are when we can't trust the current the MMU mappings,
like when we are changing from guest to hypervisor (HV 0 -> 1) or when the MMU
was off already.  In these cases the HW will take us to the original 0x0 based
exception vectors with the MMU off as before.

The below macros are copies of the macros used at the 0x0 offset but modified
to handle the MMU being on.  In these macros we use the link register to jump
to the secondary handlers rather than using RFID (RFID was also use to turn on
the MMU).

Signed-off-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15 15:08:04 +11:00
Michael Neuling
742415d6b6 powerpc: Turn syscall handler into macros
This turns the syscall handler into macros as we are going to want to reuse
them again later.

Signed-off-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15 15:08:04 +11:00
Michael Neuling
61e2390ede powerpc: Make load_hander handle upto 64k offset
If we change load_hander() to use an ori instead of addi, we can load handlers
upto 64k away provided we are still 64k aligned.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15 15:08:03 +11:00
Michael Neuling
faab4dd2d2 powerpc: Remove unessessary 0x3000 location enforcement
This removes the large gap between 0x1800 and 0x3000.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15 15:08:03 +11:00
Michael Neuling
278a6cdc39 powerpc: Whitespace changes in exception64s.S
Remove redundancy spaces and make tab usage consistent.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15 15:08:02 +11:00
Benjamin Herrenschmidt
de1bb03af7 Merge branch 'dt' into next 2012-11-15 15:02:44 +11:00
Anton Blanchard
11ee7e99f3 powerpc: Fix CONFIG_RELOCATABLE=y CONFIG_CRASH_DUMP=n build
If we build a kernel with CONFIG_RELOCATABLE=y CONFIG_CRASH_DUMP=n,
the kernel fails when we run at a non zero offset. It turns out
we were incorrectly wrapping some of the relocatable kernel code
with CONFIG_CRASH_DUMP.

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15 15:02:03 +11:00
Michael Neuling
c674e703cb powerpc: Add POWER8 architected mode to cputable
A PVR of 0x0F000004 means we are arch v2.07 complicate ie, POWER8.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15 15:02:00 +11:00
Michael Neuling
df77c79920 powerpc/pseries: Update ibm,architecture.vec for PAPR 2.7/POWER8
Update ibm,architecture.vec for POWER8 and allows us to support more
than one parition per core.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15 15:01:39 +11:00
Aravinda Prasad
a53fd61ac2 powerpc/ptrace: Enable hardware breakpoint upon re-registering
On powerpc, ptrace will disable hardware breakpoint request once the
breakpoint is hit. It is the responsibility of the caller to set it
again. However, when the caller sets the hardware breakpoint again
using ptrace(PTRACE_SET_DEBUGREG, child_pid, 0, addr), the hardware
breakpoint is not enabled.

While gdb's approach is to unregister and re-register the hardware
breakpoint every time the breakpoint is hit - which is working fine,
this could affect other programs trying to re-register hardware
breakpoint without unregistering.

This patch enables hardware breakpoint if the caller is re-registering.

Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15 13:01:13 +11:00
Akinobu Mita
c5a0809a24 powerpc/iommu: Use bitmap library
- Caluculate the bitmap size with BITS_TO_LONGS()
 - Use bitmap_empty() to verify that all bits are cleared

This also includes a printk to pr_warn() conversion.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15 13:01:04 +11:00
Michael Neuling
51cf2b30a5 powerpc: Fix denorm symbol name
Fix global symbol name to match actual denorm_exception_hv label.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15 13:00:48 +11:00
Michael Neuling
71e1849724 powerpc: POWER8 cputable entry
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15 13:00:45 +11:00
Michael Neuling
aec937b1ee powerpc: Add POWER8 setup code
Just a copy of POWER7 for now.  Will update with new code later.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15 13:00:42 +11:00
Michael Neuling
cd5daaf713 powerpc: make POWER7 setup code name generic
We are going to reuse this in POWER8 so make the name generic.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15 13:00:39 +11:00
Michael Neuling
ec1b33dcd2 powerpc/ptrace: Remove unused addr parameter in ppc_del_hwdebug()
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15 13:00:29 +11:00
Michael Neuling
84295dfc59 powerpc/ptrace: Fix spelling mistake
s/intruction/instruction/

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15 13:00:26 +11:00
K.Prasad
6c7a2856ad powerpc/hw-breakpoint: Use generic hw-breakpoint interfaces for new PPC ptrace flags
PPC_PTRACE_GETHWDBGINFO, PPC_PTRACE_SETHWDEBUG and PPC_PTRACE_DELHWDEBUG are
PowerPC specific ptrace flags that use the watchpoint register. While they are
targeted primarily towards BookE users, user-space applications such as GDB
have started using them for BookS too. This patch enables the use of generic
hardware breakpoint interfaces for these new flags.

Apart from the usual benefits of using generic hw-breakpoint interfaces, these
changes allow debuggers (such as GDB) to use a common set of ptrace flags for
their watchpoint needs and allow more precise breakpoint specification (length
of the variable can be specified).

Mikey added: rebased and added dbginfo.features around #ifdef
             CONFIG_HAVE_HW_BREAKPOINT

Signed-off-by: K.Prasad <prasad@linux.vnet.ibm.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15 13:00:23 +11:00
Michael Ellerman
16b86bf252 powerpc: Remove no longer used ppc_md.idle_loop()
The last user of ppc_md.idle_loop() was removed when we dropped the
legacy iSeries code, in commit 8ee3e0d.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15 13:00:20 +11:00
Li Zhong
12660b1702 powerpc: Fix MAX_STACK_TRACE_ENTRIES too low warning !
This patch tries to fix the following BUG report:

[    0.012313] BUG: MAX_STACK_TRACE_ENTRIES too low!
[    0.012318] turning off the locking correctness validator.
[    0.012321] Call Trace:
[    0.012330] [c00000017666f6d0] [c000000000012128] .show_stack+0x78/0x184 (unreliable)
[    0.012339] [c00000017666f780] [c0000000000b6348] .save_trace+0x12c/0x14c
[    0.012345] [c00000017666f800] [c0000000000b7448] .mark_lock+0x2bc/0x710
[    0.012351] [c00000017666f8b0] [c0000000000bb198] .__lock_acquire+0x748/0xaec
[    0.012357] [c00000017666f9b0] [c0000000000bb684] .lock_acquire+0x148/0x194
[    0.012365] [c00000017666fa80] [c00000000069371c] .mutex_lock_nested+0x84/0x4ec
[    0.012372] [c00000017666fb90] [c000000000096998] .smpboot_register_percpu_thread+0x3c/0x10c
[    0.012380] [c00000017666fc30] [c0000000009ba910] .spawn_ksoftirqd+0x28/0x48
[    0.012386] [c00000017666fcb0] [c00000000000a98c] .do_one_initcall+0xd8/0x1d0
[    0.012392] [c00000017666fd60] [c00000000000b1f8] .kernel_init+0x120/0x398

[    0.012398] [c00000017666fe30] [c000000000009ad4] .ret_from_kernel_thread+0x5c/0x64
[    0.012404] [c00000017666fa00] [c00000017666fb20] 0xc00000017666fb20
[    0.012410] [c00000017666fa80] [c00000000069371c] .mutex_lock_nested+0x84/0x4ec
[    0.012416] [c00000017666fb90] [c000000000096998] .smpboot_register_percpu_thread+0x3c/0x10c
[    0.012422] [c00000017666fc30] [c0000000009ba910] .spawn_ksoftirqd+0x28/0x48
[    0.012427] [c00000017666fcb0] [c00000000000a98c] .do_one_initcall+0xd8/0x1d0
[    0.012433] [c00000017666fd60] [c00000000000b1f8] .kernel_init+0x120/0x398

[    0.012439] [c00000017666fe30] [c000000000009ad4] .ret_from_kernel_thread+0x5c/0x64
.......

The reason is that the back chain of c00000017666fe30
(ret_from_kernel_thread) contains some invalid value, which might form a
loop.

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15 13:00:17 +11:00
Benjamin Herrenschmidt
ab7f961a58 powerpc/powernv: Fix OPAL debug entry
OPAL provides the firmware base/entry in registers at boot time
for debugging purposes. We had a bug in the code trying to stash
these into the appropriate kernel globals (a line of code was
probably dropped by accident back when this was merged)

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15 13:00:14 +11:00
Julia Lawall
bc26957c6c powerpc/rtas_flash: Eliminate possible double free
The function initialize_flash_pde_data is only called four times.  All four
calls are in the function rtas_flash_init, and on the failure of any of the
calls, remove_flash_pde is called on the third argument of each of the
calls.  There is thus no need for initialize_flash_pde_data to call
remove_flash_pde on the same argument.  remove_flash_pde kfrees the data
field of its argument, and does not clear that field, so this amounts ot a
possible double free.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@r@
identifier f,free,a;
parameter list[n] ps;
type T;
expression e;
@@

f(ps,T a,...) {
  ... when any
      when != a = e
  if(...) { ... free(a); ... return ...; }
  ... when any
}

@@
identifier r.f,r.free;
expression x,a;
expression list[r.n] xs;
@@

* x = f(xs,a,...);
  if (...) { ... free(a); ... return ...; }
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15 13:00:11 +11:00
Michael Ellerman
6432200aa8 powerpc/udbg: Remove unused udbg_read()
The last user of udbg_read() was removed in 2005, in commit fca5dcd
"Simplify and clean up the xmon terminal I/O".

Given we haven't needed it for 7 years we can probably drop it.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15 12:59:33 +11:00
Tony Breeds
0dc3289c79 powerpc: Add asm/debug.h to get powerpc_debugfs_root
Since the "Disintegrate asm/system.h for PowerPC"
(ae3a197e3d) This has been failing when
DEBUG is #defined.

Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15 12:59:27 +11:00
Nathan Fontenot
f459d63e16 powerpc+of: Remove the pSeries_reconfig.h file
Remove the pSeries_reconfig.h header file. At this point there is only one
definition in the file, pSeries_coalesce_init(), which can be
moved to rtas.h.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15 12:56:55 +11:00
Nathan Fontenot
79d1c71295 powerpc+of: Rename the drivers/of prom_* functions to of_*
Rename the prom_*_property routines of the generic OF code to of_*_property.
This brings them in line with the naming used by the rest of the OF code.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Acked-by: Geoff Levand <geoff@infradead.org>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15 12:56:52 +11:00
Nathan Fontenot
1cf3d8b3d2 powerpc+of: Add of node/property notification chain for adds and removes
This patch moves the notification chain for updates to the device tree
from the powerpc/pseries code to the base OF code. This makes this
functionality available to all architectures.

Additionally the notification chain is updated to allow notifications
for property add/remove/update. To make this work a pointer to a new
struct (of_prop_reconfig) is passed to the routines in the notification chain.
The of_prop_reconfig property contains a pointer to the node containing the
property and a pointer to the property itself. In the case of property
updates, the property pointer refers to the new property.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15 12:56:41 +11:00
Oleg Nesterov
65b2c8f0e5 uprobes/powerpc: Do not use arch_uprobe_*_step() helpers
No functional changes.

powerpc is the only user of arch_uprobe_enable/disable_step() helpers,
but they should die. They can not be used correctly, every arch needs
its own implementation (like x86 does). And they do not really help
even as initial-and-almost-working code, arch_uprobe_*_xol() hooks can
easily use user_enable/disable_single_step() directly.

Change arch_uprobe_*_step() to do nothing, and convert powerpc to use
ptrace helpers. This is equally wrong, powerpc needs the arch-specific
fixes.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
2012-11-03 17:15:12 +01:00
Oleg Nesterov
f57d56dd29 uprobes/powerpc: Don't clear TIF_UPROBE in do_notify_resume()
Cleanup. No need to clear TIF_UPROBE, uprobe_notify_resume() does this.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
2012-11-03 17:15:10 +01:00
Alexander Graf
0588000eac Merge commit 'origin/queue' into for-queue
Conflicts:
	arch/powerpc/include/asm/Kbuild
	arch/powerpc/include/uapi/asm/Kbuild
2012-10-31 13:36:18 +01:00
Paul Mackerras
512691d490 KVM: PPC: Book3S HV: Allow KVM guests to stop secondary threads coming online
When a Book3S HV KVM guest is running, we need the host to be in
single-thread mode, that is, all of the cores (or at least all of
the cores where the KVM guest could run) to be running only one
active hardware thread.  This is because of the hardware restriction
in POWER processors that all of the hardware threads in the core
must be in the same logical partition.  Complying with this restriction
is much easier if, from the host kernel's point of view, only one
hardware thread is active.

This adds two hooks in the SMP hotplug code to allow the KVM code to
make sure that secondary threads (i.e. hardware threads other than
thread 0) cannot come online while any KVM guest exists.  The KVM
code still has to check that any core where it runs a guest has the
secondary threads offline, but having done that check it can now be
sure that they will not come online while the guest is running.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-30 10:54:53 +01:00
Marcelo Tosatti
19bf7f8ac3 Merge remote-tracking branch 'master' into queue
Merge reason: development work has dependency on kvm patches merged
upstream.

Conflicts:
	arch/powerpc/include/asm/Kbuild
	arch/powerpc/include/asm/kvm_para.h

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-10-29 19:15:32 -02:00
Frederic Weisbecker
11113334d1 vtime: Make vtime_account_system() irqsafe
vtime_account_system() currently has only one caller with
vtime_account() which is irq safe.

Now we are going to call it from other places like kvm where
irqs are not always disabled by the time we account the cputime.

So let's make it irqsafe. The arch implementation part is now
prefixed with "__".

vtime_account_idle() arch implementation is prefixed accordingly
to stay consistent.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
2012-10-29 21:31:31 +01:00
Al Viro
ab75819d39 powerpc: make fork_idle() take the common "kernel thread" path in copy_thread()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-10-21 22:33:39 -04:00
Al Viro
ea516b1154 powerpc: put the "zero usp means using parent's stack pointer" to copy_thread()
simplifies callers, at that...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-10-21 22:28:43 -04:00
Al Viro
64c2f6596b powerpc: don't bother with CHECK_FULL_REGS in sys_fork() et.al.
copy_thread() will do it anyway.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-10-21 22:27:13 -04:00
Al Viro
9d401279d6 powerpc: don't bother with zero-extending arguments in sys_clone()
... since the syscall glue had been doing that for 9 years already.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-10-21 22:25:53 -04:00
Al Viro
53b50f9483 powerpc: take dereferencing to ret_from_kernel_thread()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-10-21 22:25:11 -04:00
Deepthi Dharwar
8ea959a17f cpuidle/powerpc: Fix smt_snooze_delay functionality.
smt_snooze_delay was designed to  delay idle loop's nap entry
in the native idle code before it got  ported over to use as part of
the cpuidle framework.

A -ve value  assigned to smt_snooze_delay should result in
busy looping, in other words disabling the entry to nap state.

	- https://lists.ozlabs.org/pipermail/linuxppc-dev/2010-May/082450.html

This particular functionality can be achieved currently by
echo 1 > /sys/devices/system/cpu/cpu*/state1/disable
but it is broken when one assigns -ve value to  the smt_snooze_delay
variable either via sysfs entry or ppc64_cpu util.

This patch aims to fix this, by disabling nap state when smt_snooze_delay
variable is set to -ve value.

Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-10-18 10:57:24 +11:00
Al Viro
40792104b2 powerpc: don't mess with r2 in copy_thread() and friends
kernel_thread() callbacks are *not* in modules and are not going to
be there.  And it's not even read in ppc32 ret_from_kernel_thread(),
so no need to bother with it there either.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-10-14 19:35:52 -04:00
Al Viro
138d1ce80e powerpc: switch to saner kernel_execve() semantics
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-10-14 19:35:44 -04:00
Linus Torvalds
03d3602a83 Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer core update from Thomas Gleixner:
 - Bug fixes (one for a longstanding dead loop issue)
 - Rework of time related vsyscalls
 - Alarm timer updates
 - Jiffies updates to remove compile time dependencies

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  timekeeping: Cast raw_interval to u64 to avoid shift overflow
  timers: Fix endless looping between cascade() and internal_add_timer()
  time/jiffies: bring back unconditional LATCH definition
  time: Convert x86_64 to using new update_vsyscall
  time: Only do nanosecond rounding on GENERIC_TIME_VSYSCALL_OLD systems
  time: Introduce new GENERIC_TIME_VSYSCALL
  time: Convert CONFIG_GENERIC_TIME_VSYSCALL to CONFIG_GENERIC_TIME_VSYSCALL_OLD
  time: Move update_vsyscall definitions to timekeeper_internal.h
  time: Move timekeeper structure to timekeeper_internal.h for vsyscall changes
  jiffies: Remove compile time assumptions about CLOCK_TICK_RATE
  jiffies: Kill unused TICK_USEC_TO_NSEC
  alarmtimer: Rename alarmtimer_remove to alarmtimer_dequeue
  alarmtimer: Remove unused helpers & defines
  alarmtimer: Use hrtimer per-alarm instead of per-base
  alarmtimer: Implement minimum alarm interval for allowing suspend
2012-10-12 22:17:48 +09:00
Linus Torvalds
8213a2f3ee Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Pull pile 2 of execve and kernel_thread unification work from Al Viro:
 "Stuff in there: kernel_thread/kernel_execve/sys_execve conversions for
  several more architectures plus assorted signal fixes and cleanups.

  There'll be more (in particular, real fixes for the alpha
  do_notify_resume() irq mess)..."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: (43 commits)
  alpha: don't open-code trace_report_syscall_{enter,exit}
  Uninclude linux/freezer.h
  m32r: trim masks
  avr32: trim masks
  tile: don't bother with SIGTRAP in setup_frame
  microblaze: don't bother with SIGTRAP in setup_rt_frame()
  mn10300: don't bother with SIGTRAP in setup_frame()
  frv: no need to raise SIGTRAP in setup_frame()
  x86: get rid of duplicate code in case of CONFIG_VM86
  unicore32: remove pointless test
  h8300: trim _TIF_WORK_MASK
  parisc: decide whether to go to slow path (tracesys) based on thread flags
  parisc: don't bother looping in do_signal()
  parisc: fix double restarts
  bury the rest of TIF_IRET
  sanitize tsk_is_polling()
  bury _TIF_RESTORE_SIGMASK
  unicore32: unobfuscate _TIF_WORK_MASK
  mips: NOTIFY_RESUME is not needed in TIF masks
  mips: merge the identical "return from syscall" per-ABI code
  ...

Conflicts:
	arch/arm/include/asm/thread_info.h
2012-10-12 10:49:08 +09:00
Marcelo Tosatti
03604b3114 Merge branch 'for-upstream' of http://github.com/agraf/linux-2.6 into queue
* 'for-upstream' of http://github.com/agraf/linux-2.6: (56 commits)
  arch/powerpc/kvm/e500_tlb.c: fix error return code
  KVM: PPC: Book3S HV: Provide a way for userspace to get/set per-vCPU areas
  KVM: PPC: Book3S: Get/set guest FP regs using the GET/SET_ONE_REG interface
  KVM: PPC: Book3S: Get/set guest SPRs using the GET/SET_ONE_REG interface
  KVM: PPC: set IN_GUEST_MODE before checking requests
  KVM: PPC: e500: MMU API: fix leak of shared_tlb_pages
  KVM: PPC: e500: fix allocation size error on g2h_tlb1_map
  KVM: PPC: Book3S HV: Fix calculation of guest phys address for MMIO emulation
  KVM: PPC: Book3S HV: Remove bogus update of physical thread IDs
  KVM: PPC: Book3S HV: Fix updates of vcpu->cpu
  KVM: Move some PPC ioctl definitions to the correct place
  KVM: PPC: Book3S HV: Handle memory slot deletion and modification correctly
  KVM: PPC: Move kvm->arch.slot_phys into memslot.arch
  KVM: PPC: Book3S HV: Take the SRCU read lock before looking up memslots
  KVM: PPC: bookehv: Allow duplicate calls of DO_KVM macro
  KVM: PPC: BookE: Support FPU on non-hv systems
  KVM: PPC: 440: Implement mfdcrx
  KVM: PPC: 440: Implement mtdcrx
  Document IACx/DACx registers access using ONE_REG API
  KVM: PPC: E500: Remove E500_TLB_DIRTY flag
  ...
2012-10-10 19:03:54 -03:00
Scott Wood
8043e494da powerpc/epapr: export epapr_hypercall_start
This fixes breakage introduced by the following commit:

  commit 6d2d82627f4f1e96a33664ace494fa363e0495cb
  Author: Liu Yu-B13201 <Yu.Liu@freescale.com>
  Date:   Tue Jul 3 05:48:56 2012 +0000

    PPC: Don't use hardcoded opcode for ePAPR hcall invocation

when a driver that uses ePAPR hypercalls is built as a module.

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05 23:38:40 +02:00
Liu Yu-B13201
2f979de8a7 KVM: PPC: ev_idle hcall support for e500 guests
Signed-off-by: Liu Yu <yu.liu@freescale.com>
[varun: 64-bit changes]
Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com>
Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05 23:38:37 +02:00
Stuart Yoder
fdcf8bd7e7 KVM: PPC: use definitions in epapr header for hcalls
Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05 23:38:36 +02:00
Linus Torvalds
5f3d2f2e1a Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull powerpc updates from Benjamin Herrenschmidt:
 "Some highlights in addition to the usual batch of fixes:

   - 64TB address space support for 64-bit processes by Aneesh Kumar

   - Gavin Shan did a major cleanup & re-organization of our EEH support
     code (IBM fancy PCI error handling & recovery infrastructure) which
     paves the way for supporting different platform backends, along
     with some rework of the PCIe code for the PowerNV platform in order
     to remove home made resource allocations and instead use the
     generic code (which is possible after some small improvements to it
     done by Gavin).

   - Uprobes support by Ananth N Mavinakayanahalli

   - A pile of embedded updates from Freescale folks, including new SoC
     and board supports, more KVM stuff including preparing for 64-bit
     BookE KVM support, ePAPR 1.1 updates, etc..."

Fixup trivial conflicts in drivers/scsi/ipr.c

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (146 commits)
  powerpc/iommu: Fix multiple issues with IOMMU pools code
  powerpc: Fix VMX fix for memcpy case
  driver/mtd:IFC NAND:Initialise internal SRAM before any write
  powerpc/fsl-pci: use 'Header Type' to identify PCIE mode
  powerpc/eeh: Don't release eeh_mutex in eeh_phb_pe_get
  powerpc: Remove tlb batching hack for nighthawk
  powerpc: Set paca->data_offset = 0 for boot cpu
  powerpc/perf: Sample only if SIAR-Valid bit is set in P7+
  powerpc/fsl-pci: fix warning when CONFIG_SWIOTLB is disabled
  powerpc/mpc85xx: Update interrupt handling for IFC controller
  powerpc/85xx: Enable USB support in p1023rds_defconfig
  powerpc/smp: Do not disable IPI interrupts during suspend
  powerpc/eeh: Fix crash on converting OF node to edev
  powerpc/eeh: Lock module while handling EEH event
  powerpc/kprobe: Don't emulate store when kprobe stwu r1
  powerpc/kprobe: Complete kprobe and migrate exception frame
  powerpc/kprobe: Introduce a new thread flag
  powerpc: Remove unused __get_user64() and __put_user64()
  powerpc/eeh: Global mutex to protect PE tree
  powerpc/eeh: Remove EEH PE for normal PCI hotplug
  ...
2012-10-06 03:16:12 +09:00
Denys Vlasenko
751f409db6 compat: move compat_siginfo_t definition to asm/compat.h
This is a preparatory patch for the introduction of NT_SIGINFO elf note.

Make the location of compat_siginfo_t uniform across eight architectures
which have it.  Now it can be pulled in by including asm/compat.h or
linux/compat.h.

Most of the copies are verbatim.  compat_uid[32]_t had to be replaced by
__compat_uid[32]_t.  compat_uptr_t had to be moved up before
compat_siginfo_t in asm/compat.h on a several architectures (tile already
had it moved up).  compat_sigval_t had to be relocated from linux/compat.h
to asm/compat.h.

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Amerigo Wang <amwang@redhat.com>
Cc: "Jonathan M. Foote" <jmfoote@cert.org>
Cc: Roland McGrath <roland@hack.frob.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-06 03:05:16 +09:00
Linus Torvalds
d66e6737d4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto update from Herbert Xu:
 - Optimised AES/SHA1 for ARM.
 - IPsec ESN support in talitos and caam.
 - x86_64/avx implementation of cast5/cast6.
 - Add/use multi-algorithm registration helpers where possible.
 - Added IBM Power7+ in-Nest support.
 - Misc fixes.

Fix up trivial conflicts in crypto/Kconfig due to the sparc64 crypto
config options being added next to the new ARM ones.

[ Side note: cut-and-paste duplicate help texts make those conflicts
  harder to read than necessary, thanks to git being smart about
  minimizing conflicts and maximizing the common parts... ]

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (71 commits)
  crypto: x86/glue_helper - fix storing of new IV in CBC encryption
  crypto: cast5/avx - fix storing of new IV in CBC encryption
  crypto: tcrypt - add missing tests for camellia and ghash
  crypto: testmgr - make test_aead also test 'dst != src' code paths
  crypto: testmgr - make test_skcipher also test 'dst != src' code paths
  crypto: testmgr - add test vectors for CTR mode IV increasement
  crypto: testmgr - add test vectors for partial ctr(cast5) and ctr(cast6)
  crypto: testmgr - allow non-multi page and multi page skcipher tests from same test template
  crypto: caam - increase TRNG clocks per sample
  crypto, tcrypt: remove local_bh_disable/enable() around local_irq_disable/enable()
  crypto: tegra-aes - fix error return code
  crypto: crypto4xx - fix error return code
  crypto: hifn_795x - fix error return code
  crypto: ux500 - fix error return code
  crypto: caam - fix error IDs for SEC v5.x RNG4
  hwrng: mxc-rnga - Access data via structure
  hwrng: mxc-rnga - Adapt clocks to new i.mx clock framework
  crypto: caam - add IPsec ESN support
  crypto: 842 - remove .cra_list initialization
  Revert "[CRYPTO] cast6: inline bloat--"
  ...
2012-10-04 09:06:34 -07:00
Anton Blanchard
d900bd7366 powerpc/iommu: Fix multiple issues with IOMMU pools code
There are a number of issues in the recent IOMMU pools code:

- On a preempt kernel we might switch CPUs in the middle of building
  a scatter gather list. When this happens the handle hint passed in
  no longer falls within the local CPU's pool. Check for this and
  fall back to the pool hint.

- We were missing a spin_unlock/spin_lock in one spot where we
  switch pools.

- We need to provide locking around dart_tlb_invalidate_all and
  dart_tlb_invalidate_one now that the global lock is gone.

Reported-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: <stable@kernel.org> [v3.6]
2012-10-04 18:03:20 +10:00
Linus Torvalds
88265322c1 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "Highlights:

   - Integrity: add local fs integrity verification to detect offline
     attacks
   - Integrity: add digital signature verification
   - Simple stacking of Yama with other LSMs (per LSS discussions)
   - IBM vTPM support on ppc64
   - Add new driver for Infineon I2C TIS TPM
   - Smack: add rule revocation for subject labels"

Fixed conflicts with the user namespace support in kernel/auditsc.c and
security/integrity/ima/ima_policy.c.

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (39 commits)
  Documentation: Update git repository URL for Smack userland tools
  ima: change flags container data type
  Smack: setprocattr memory leak fix
  Smack: implement revoking all rules for a subject label
  Smack: remove task_wait() hook.
  ima: audit log hashes
  ima: generic IMA action flag handling
  ima: rename ima_must_appraise_or_measure
  audit: export audit_log_task_info
  tpm: fix tpm_acpi sparse warning on different address spaces
  samples/seccomp: fix 31 bit build on s390
  ima: digital signature verification support
  ima: add support for different security.ima data types
  ima: add ima_inode_setxattr/removexattr function and calls
  ima: add inode_post_setattr call
  ima: replace iint spinblock with rwlock/read_lock
  ima: allocating iint improvements
  ima: add appraise action keywords and default rules
  ima: integrity appraisal extension
  vfs: move ima_file_free before releasing the file
  ...
2012-10-02 21:38:48 -07:00
Linus Torvalds
aab174f0df Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs update from Al Viro:

 - big one - consolidation of descriptor-related logics; almost all of
   that is moved to fs/file.c

   (BTW, I'm seriously tempted to rename the result to fd.c.  As it is,
   we have a situation when file_table.c is about handling of struct
   file and file.c is about handling of descriptor tables; the reasons
   are historical - file_table.c used to be about a static array of
   struct file we used to have way back).

   A lot of stray ends got cleaned up and converted to saner primitives,
   disgusting mess in android/binder.c is still disgusting, but at least
   doesn't poke so much in descriptor table guts anymore.  A bunch of
   relatively minor races got fixed in process, plus an ext4 struct file
   leak.

 - related thing - fget_light() partially unuglified; see fdget() in
   there (and yes, it generates the code as good as we used to have).

 - also related - bits of Cyrill's procfs stuff that got entangled into
   that work; _not_ all of it, just the initial move to fs/proc/fd.c and
   switch of fdinfo to seq_file.

 - Alex's fs/coredump.c spiltoff - the same story, had been easier to
   take that commit than mess with conflicts.  The rest is a separate
   pile, this was just a mechanical code movement.

 - a few misc patches all over the place.  Not all for this cycle,
   there'll be more (and quite a few currently sit in akpm's tree)."

Fix up trivial conflicts in the android binder driver, and some fairly
simple conflicts due to two different changes to the sock_alloc_file()
interface ("take descriptor handling from sock_alloc_file() to callers"
vs "net: Providing protocol type via system.sockprotoname xattr of
/proc/PID/fd entries" adding a dentry name to the socket)

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (72 commits)
  MAX_LFS_FILESIZE should be a loff_t
  compat: fs: Generic compat_sys_sendfile implementation
  fs: push rcu_barrier() from deactivate_locked_super() to filesystems
  btrfs: reada_extent doesn't need kref for refcount
  coredump: move core dump functionality into its own file
  coredump: prevent double-free on an error path in core dumper
  usb/gadget: fix misannotations
  fcntl: fix misannotations
  ceph: don't abuse d_delete() on failure exits
  hypfs: ->d_parent is never NULL or negative
  vfs: delete surplus inode NULL check
  switch simple cases of fget_light to fdget
  new helpers: fdget()/fdput()
  switch o2hb_region_dev_write() to fget_light()
  proc_map_files_readdir(): don't bother with grabbing files
  make get_file() return its argument
  vhost_set_vring(): turn pollstart/pollstop into bool
  switch prctl_set_mm_exe_file() to fget_light()
  switch xfs_find_handle() to fget_light()
  switch xfs_swapext() to fget_light()
  ...
2012-10-02 20:25:04 -07:00
Catalin Marinas
8f9c0119d7 compat: fs: Generic compat_sys_sendfile implementation
This function is used by sparc, powerpc and arm64 for compat support.
The patch adds a generic implementation which calls do_sendfile()
directly and avoids set_fs().

The sparc architecture has wrappers for the sign extensions while
powerpc relies on the compiler to do the this. The patch adds wrappers
for powerpc to handle the u32->int type conversion.

compat_sys_sendfile64() can be replaced by a sys_sendfile() call since
compat_loff_t has the same size as off_t on a 64-bit system.

On powerpc, the patch also changes the 64-bit sendfile call from
sys_sendile64 to sys_sendfile.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-10-02 21:35:55 -04:00
Linus Torvalds
fdb2f9c2eb PCI changes for the 3.7 merge window:
Host bridge hotplug
     - Protect acpi_pci_drivers and acpi_pci_roots (Taku Izumi)
     - Clear host bridge resource info to avoid issue when releasing (Yinghai Lu)
     - Notify acpi_pci_drivers when hot-plugging host bridges (Jiang Liu)
     - Use standard list ops for acpi_pci_drivers (Jiang Liu)
 
   Device hotplug
     - Use pci_get_domain_bus_and_slot() to close hotplug races (Jiang Liu)
     - Remove fakephp driver (Bjorn Helgaas)
     - Fix VGA ref count in hotplug remove path (Yinghai Lu)
     - Allow acpiphp to handle PCIe ports without native hotplug (Jiang Liu)
     - Implement resume regardless of pciehp_force param (Oliver Neukum)
     - Make pci_fixup_irqs() work after init (Thierry Reding)
 
   Miscellaneous
     - Add pci_pcie_type(dev) and remove pci_dev.pcie_type (Yijing Wang)
     - Factor out PCI Express Capability accessors (Jiang Liu)
     - Add pcibios_window_alignment() so powerpc EEH can use generic resource assignment (Gavin Shan)
     - Make pci_error_handlers const (Stephen Hemminger)
     - Cleanup drivers/pci/remove.c (Bjorn Helgaas)
     - Improve Vendor-Specific Extended Capability support (Bjorn Helgaas)
     - Use standard list ops for bus->devices (Bjorn Helgaas)
     - Avoid kmalloc in pci_get_subsys() and pci_get_class() (Feng Tang)
     - Reassign invalid bus number ranges (Intel DP43BF workaround) (Yinghai Lu)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.10 (GNU/Linux)
 
 iQIcBAABAgAGBQJQac4hAAoJEPGMOI97Hn6zjZYP/iaqU9kjmgTsBbSyzB4oApv/
 RRxo3I+ad9GF6XlMQfVAtyx1pgCD1gdGAtoDgGSCTqgdYD3CO10AxKU+yleAk1wo
 dbMxLifJNTrT3G1mZ/NL16yEGhCwvhfwzRtB1VoZmCT4lSApO/7cJkXl2DzHfA/i
 pmltOOiQCN8kbUcJbVPtUyTVPi2zl/8bsyCyTkS7YG0VXeGRM+ZUvPWZJ7MnWYYB
 5qoCdrw5ENCCiDQ9yw5SAfgL23b+0p6OI/x3Lkex0QQOWwSqGSiaHt4b7eitrC5b
 2eAJg32f/AzZke1YbKLMfdsL0VJP3GAswhDVHlgmo63rZkOZChm+97dgZ35Mcv5v
 kEXkWyBb1xJ3t8rZir6Qer9Iv2wOB+MkZ5qtU/Vf+l0wLQLYTrRVsKngrEDREONk
 dXbokp6iVSPeA1sTSdH9MmHlTUIj82ZLSGcxcjTsN8NWZjxx6g3rNx1uay+5MYOW
 4ET9zNu5snrAqN6N4Tb81gvtG8qYfxzdvVfrA9AaGKI6xxB7jkqgFJRp55JiEcFc
 x4cmWkhvdlhVsG2TQwFxYNfswOqD+7NCs6V4kSVZX6ezpDrH7I5VvcnnhstF7C8l
 KZul0EV7OW+kDK23pNe24lVP2xtOv6G8eK/3PmeKIXWl1V83nqre/oLufRzTfs+Z
 SxkILwY/MFpuCFteKE1t
 =haBu
 -----END PGP SIGNATURE-----

Merge tag 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI changes from Bjorn Helgaas:
 "Host bridge hotplug
    - Protect acpi_pci_drivers and acpi_pci_roots (Taku Izumi)
    - Clear host bridge resource info to avoid issue when releasing
      (Yinghai Lu)
    - Notify acpi_pci_drivers when hot-plugging host bridges (Jiang Liu)
    - Use standard list ops for acpi_pci_drivers (Jiang Liu)

  Device hotplug
    - Use pci_get_domain_bus_and_slot() to close hotplug races (Jiang
      Liu)
    - Remove fakephp driver (Bjorn Helgaas)
    - Fix VGA ref count in hotplug remove path (Yinghai Lu)
    - Allow acpiphp to handle PCIe ports without native hotplug (Jiang
      Liu)
    - Implement resume regardless of pciehp_force param (Oliver Neukum)
    - Make pci_fixup_irqs() work after init (Thierry Reding)

  Miscellaneous
    - Add pci_pcie_type(dev) and remove pci_dev.pcie_type (Yijing Wang)
    - Factor out PCI Express Capability accessors (Jiang Liu)
    - Add pcibios_window_alignment() so powerpc EEH can use generic
      resource assignment (Gavin Shan)
    - Make pci_error_handlers const (Stephen Hemminger)
    - Cleanup drivers/pci/remove.c (Bjorn Helgaas)
    - Improve Vendor-Specific Extended Capability support (Bjorn
      Helgaas)
    - Use standard list ops for bus->devices (Bjorn Helgaas)
    - Avoid kmalloc in pci_get_subsys() and pci_get_class() (Feng Tang)
    - Reassign invalid bus number ranges (Intel DP43BF workaround)
      (Yinghai Lu)"

* tag 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (102 commits)
  PCI: acpiphp: Handle PCIe ports without native hotplug capability
  PCI/ACPI: Use acpi_driver_data() rather than searching acpi_pci_roots
  PCI/ACPI: Protect acpi_pci_roots list with mutex
  PCI/ACPI: Use acpi_pci_root info rather than looking it up again
  PCI/ACPI: Pass acpi_pci_root to acpi_pci_drivers' add/remove interface
  PCI/ACPI: Protect acpi_pci_drivers list with mutex
  PCI/ACPI: Notify acpi_pci_drivers when hot-plugging PCI root bridges
  PCI/ACPI: Use normal list for struct acpi_pci_driver
  PCI/ACPI: Use DEVICE_ACPI_HANDLE rather than searching acpi_pci_roots
  PCI: Fix default vga ref_count
  ia64/PCI: Clear host bridge aperture struct resource
  x86/PCI: Clear host bridge aperture struct resource
  PCI: Stop all children first, before removing all children
  Revert "PCI: Use hotplug-safe pci_get_domain_bus_and_slot()"
  PCI: Provide a default pcibios_update_irq()
  PCI: Discard __init annotations for pci_fixup_irqs() and related functions
  PCI: Use correct type when freeing bus resource list
  PCI: Check P2P bridge for invalid secondary/subordinate range
  PCI: Convert "new_id"/"remove_id" into generic pci_bus driver attributes
  xen-pcifront: Use hotplug-safe pci_get_domain_bus_and_slot()
  ...
2012-10-01 12:05:36 -07:00
Linus Torvalds
0b981cb94b Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler changes from Ingo Molnar:
 "Continued quest to clean up and enhance the cputime code by Frederic
  Weisbecker, in preparation for future tickless kernel features.

  Other than that, smallish changes."

Fix up trivial conflicts due to additions next to each other in arch/{x86/}Kconfig

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
  cputime: Make finegrained irqtime accounting generally available
  cputime: Gather time/stats accounting config options into a single menu
  ia64: Reuse system and user vtime accounting functions on task switch
  ia64: Consolidate user vtime accounting
  vtime: Consolidate system/idle context detection
  cputime: Use a proper subsystem naming for vtime related APIs
  sched: cpu_power: enable ARCH_POWER
  sched/nohz: Clean up select_nohz_load_balancer()
  sched: Fix load avg vs. cpu-hotplug
  sched: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW
  sched: Fix nohz_idle_balance()
  sched: Remove useless code in yield_to()
  sched: Add time unit suffix to sched sysctl knobs
  sched/debug: Limit sd->*_idx range on sysctl
  sched: Remove AFFINE_WAKEUPS feature flag
  s390: Remove leftover account_tick_vtime() header
  cputime: Consolidate vtime handling on context switch
  sched: Move cputime code to its own file
  cputime: Generalize CONFIG_VIRT_CPU_ACCOUNTING
  tile: Remove SD_PREFER_LOCAL leftover
  ...
2012-10-01 10:43:39 -07:00
Richard Weinberger
3cffdc8c3a Uninclude linux/freezer.h
This include is no longer needed.
(seems to be a leftover from try_to_freeze())

Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-10-01 09:58:18 -04:00
Al Viro
be6abfa769 powerpc: switch to generic sys_execve()/kernel_execve()
the only non-obvious part is that current_pt_regs() is really needed
here - task_pt_regs() is NULL for kernel threads; it's OK for ptrace
uses (the thing task_pt_regs() is intended for), but not for us.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-30 23:35:51 -04:00
Al Viro
58254e1002 powerpc: split ret_from_fork
... and get rid of in-kernel syscalls in kernel_thread()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-30 23:31:19 -04:00
James Morris
bf53083445 Linux 3.6-rc7
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.18 (GNU/Linux)
 
 iQEcBAABAgAGBQJQX7MuAAoJEHm+PkMAQRiG0h0IAJURkrMCAQUxA+Ik66ReH89s
 LQcVd0U9uL4UUOi7f5WR64Vf9Cfu6VVGX9ZKSvjpNskvlQaUQPMIt4pMe6g4X4dI
 u0bApEy4XZz3nGabUAghIU8jJ8cDmhCG6kPpSiS7pi7KHc0yIa4WFtJRrIpGaIWT
 xuK38YOiOHcSDRlLyWZzainMncQp/ixJdxnqVMTonkVLk0q0b84XzOr4/qlLE5lU
 i+TsK3PRKdQXgvZ4CebL+srPBwWX1dmgP3VkeBloQbSSenSeELICbFWavn2ml+sF
 GXi4dO93oNquL/Oy5SwI666T4uNcrRPaS+5X+xSZgBW/y2aQVJVJuNZg6ZP/uWk=
 =0v2l
 -----END PGP SIGNATURE-----

Merge tag 'v3.6-rc7' into next

Linux 3.6-rc7

Requested by David Howells so he can merge his key susbsystem work into
my tree with requisite -linus changesets.
2012-09-28 13:37:32 +10:00
Michael Ellerman
466921c5a4 powerpc: Set paca->data_offset = 0 for boot cpu
In commit 407821a we assigned a poison value to the paca->data_offset.

Unfortunately with CONFIG_LOCK_STAT=y lockdep will read & write to percpu
data very early in boot, prior to us initialising the percpu areas,
leading to a crash.

We have been getting away with this because the data_offset was previously
set to zero. This causes lockdep to read & write to the initial copy of
the percpu variables, which are discarded later in boot.

Although that is "fishy", it does work, and for lock statistics it is no
big deal to discard the counts from early boot.

So set the paca->data_offset = 0 for the boot cpu paca only.

Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-27 12:51:06 +10:00
Frederic Weisbecker
a7e1a9e3af vtime: Consolidate system/idle context detection
Move the code that finds out to which context we account the
cputime into generic layer.

Archs that consider the whole time spent in the idle task as idle
time (ia64, powerpc) can rely on the generic vtime_account()
and implement vtime_account_system() and vtime_account_idle(),
letting the generic code to decide when to call which API.

Archs that have their own meaning of idle time, such as s390
that only considers the time spent in CPU low power mode as idle
time, can just override vtime_account().

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
2012-09-25 15:42:37 +02:00
Frederic Weisbecker
bf9fae9f5e cputime: Use a proper subsystem naming for vtime related APIs
Use a naming based on vtime as a prefix for virtual based
cputime accounting APIs:

- account_system_vtime() -> vtime_account()
- account_switch_vtime() -> vtime_task_switch()

It makes it easier to allow for further declension such
as vtime_account_system(), vtime_account_idle(), ... if we
want to find out the context we account to from generic code.

This also make it better to know on which subsystem these APIs
refer to.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
2012-09-25 15:31:31 +02:00
John Stultz
7063942116 time: Convert CONFIG_GENERIC_TIME_VSYSCALL to CONFIG_GENERIC_TIME_VSYSCALL_OLD
To help migrate archtectures over to the new update_vsyscall method,
redfine CONFIG_GENERIC_TIME_VSYSCALL as CONFIG_GENERIC_TIME_VSYSCALL_OLD

Cc: Tony Luck <tony.luck@intel.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Turner <pjt@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2012-09-24 12:38:07 -04:00
John Stultz
189374aed6 time: Move update_vsyscall definitions to timekeeper_internal.h
Since users will need to include timekeeper_internal.h, move
update_vsyscall definitions to timekeeper_internal.h.

Cc: Tony Luck <tony.luck@intel.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Turner <pjt@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2012-09-24 12:38:06 -04:00
Zhao Chenhui
e6651de9cc powerpc/smp: Do not disable IPI interrupts during suspend
During suspend, all interrupts including IPI will be disabled. In this case,
the suspend process will hang in SMP. To prevent this, pass the flag
IRQF_NO_SUSPEND when requesting IPI irq.

Signed-off-by: Zhao Chenhui <chenhui.zhao@freescale.com>
Signed-off-by: Li Yang <leoli@freescale.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2012-09-19 08:38:16 -05:00
Benjamin Herrenschmidt
caa1d631fc Merge remote-tracking branch 'kumar/next' into next 2012-09-18 16:04:33 +10:00
Tiejun Chen
a9c4e541ea powerpc/kprobe: Complete kprobe and migrate exception frame
We can't emulate stwu since that may corrupt current exception stack.
So we will have to do real store operation in the exception return code.

Firstly we'll allocate a trampoline exception frame below the kprobed
function stack and copy the current exception frame to the trampoline.
Then we can do this real store operation to implement 'stwu', and reroute
the trampoline frame to r1 to complete this exception migration.

Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-18 15:32:42 +10:00
Li Zhong
e72bbbab27 powerpc/trace: Fix interrupt tracepoints vs. RCU
There are a few tracepoints in the interrupt code path, which is before
irq_enter(), or after irq_exit(), like
trace_irq_entry()/trace_irq_exit() in do_IRQ(),
trace_timer_interrupt_entry()/trace_timer_interrupt_exit() in
timer_interrupt().

If the interrupt is from idle(), and because tracepoint contains RCU
read-side critical section, we could see following suspicious RCU usage
reported:

[  145.127743] ===============================
[  145.127747] [ INFO: suspicious RCU usage. ]
[  145.127752] 3.6.0-rc3+ #1 Not tainted
[  145.127755] -------------------------------
[  145.127759] /root/.workdir/linux/arch/powerpc/include/asm/trace.h:33
suspicious rcu_dereference_check() usage!
[  145.127765]
[  145.127765] other info that might help us debug this:
[  145.127765]
[  145.127771]
[  145.127771] RCU used illegally from idle CPU!
[  145.127771] rcu_scheduler_active = 1, debug_locks = 0
[  145.127777] RCU used illegally from extended quiescent state!
[  145.127781] no locks held by swapper/0/0.
[  145.127785]
[  145.127785] stack backtrace:
[  145.127789] Call Trace:
[  145.127796] [c00000000108b530] [c000000000013c40] .show_stack
+0x70/0x1c0 (unreliable)
[  145.127806] [c00000000108b5e0]
[c0000000000f59d8] .lockdep_rcu_suspicious+0x118/0x150
[  145.127813] [c00000000108b680] [c00000000000fc58] .do_IRQ+0x498/0x500
[  145.127820] [c00000000108b750] [c000000000003950]
hardware_interrupt_common+0x150/0x180
[  145.127828] --- Exception: 501 at .plpar_hcall_norets+0x84/0xd4
[  145.127828]     LR = .check_and_cede_processor+0x38/0x70
[  145.127836] [c00000000108bab0] [c0000000000665dc] .shared_cede_loop
+0x5c/0x100
[  145.127844] [c00000000108bb70] [c000000000588ab0] .cpuidle_enter
+0x30/0x50
[  145.127850] [c00000000108bbe0]
[c000000000588b0c] .cpuidle_enter_state+0x3c/0xb0
[  145.127857] [c00000000108bc60] [c000000000589730] .cpuidle_idle_call
+0x150/0x6c0
[  145.127863] [c00000000108bd30] [c000000000058440] .pSeries_idle
+0x10/0x40
[  145.127870] [c00000000108bda0] [c00000000001683c] .cpu_idle
+0x18c/0x2d0
[  145.127876] [c00000000108be60] [c00000000000b434] .rest_init
+0x124/0x1b0
[  145.127884] [c00000000108bef0] [c0000000009d0d28] .start_kernel
+0x568/0x588
[  145.127890] [c00000000108bf90] [c000000000009660] .start_here_common
+0x20/0x40

This is because the RCU usage in interrupt context should be used in
area marked by rcu_irq_enter()/rcu_irq_exit(), called in
irq_enter()/irq_exit() respectively.

Move them into the irq_enter()/irq_exit() area to avoid the reporting.

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-17 16:31:54 +10:00
Aneesh Kumar K.V
048ee0993e powerpc/mm: Add 64TB support
Increase max addressable range to 64TB. This is not tested on
real hardware yet.

Reviewed-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-17 16:31:51 +10:00
Michael Neuling
b92a66a65c powerpc: Add denormalisation exception handling for POWER6/7
On POWER6 and POWER7 if the input operand to an instruction is a
denormalised single precision binary floating point value we can take
a denormalisation exception where it's expected that the hypervisor
(HV=1) will fix up the inputs before the instruction is run.

This adds code to handle this denormalisation exception for POWER6 and
POWER7.

It also add a CONFIG_PPC_DENORMALISATION option and sets it in
pseries/ppc64_defconfig.

This is useful on bare metal systems only.  Based on patch from Milton
Miller.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-17 16:31:47 +10:00
Benjamin Herrenschmidt
eda485f06d Merge remote-tracking branch 'pci/pci/gavin-window-alignment' into next
Merge Gavin patches from the PCI tree as subsequent powerpc
patches are going to depend on them

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-17 16:07:43 +10:00
Bjorn Helgaas
9a5d5bd848 Merge commit 'v3.6-rc5' into pci/gavin-window-alignment
* commit 'v3.6-rc5': (1098 commits)
  Linux 3.6-rc5
  HID: tpkbd: work even if the new Lenovo Keyboard driver is not configured
  Remove user-triggerable BUG from mpol_to_str
  xen/pciback: Fix proper FLR steps.
  uml: fix compile error in deliver_alarm()
  dj: memory scribble in logi_dj
  Fix order of arguments to compat_put_time[spec|val]
  xen: Use correct masking in xen_swiotlb_alloc_coherent.
  xen: fix logical error in tlb flushing
  xen/p2m: Fix one-off error in checking the P2M tree directory.
  powerpc: Don't use __put_user() in patch_instruction
  powerpc: Make sure IPI handlers see data written by IPI senders
  powerpc: Restore correct DSCR in context switch
  powerpc: Fix DSCR inheritance in copy_thread()
  powerpc: Keep thread.dscr and thread.dscr_inherit in sync
  powerpc: Update DSCR on all CPUs when writing sysfs dscr_default
  powerpc/powernv: Always go into nap mode when CPU is offline
  powerpc: Give hypervisor decrementer interrupts their own handler
  powerpc/vphn: Fix arch_update_cpu_topology() return value
  ARM: gemini: fix the gemini build
  ...

Conflicts:
	drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
	drivers/rapidio/devices/tsi721.c
2012-09-13 15:54:57 -06:00
Bjorn Helgaas
78890b5989 Merge commit 'v3.6-rc5' into next
* commit 'v3.6-rc5': (1098 commits)
  Linux 3.6-rc5
  HID: tpkbd: work even if the new Lenovo Keyboard driver is not configured
  Remove user-triggerable BUG from mpol_to_str
  xen/pciback: Fix proper FLR steps.
  uml: fix compile error in deliver_alarm()
  dj: memory scribble in logi_dj
  Fix order of arguments to compat_put_time[spec|val]
  xen: Use correct masking in xen_swiotlb_alloc_coherent.
  xen: fix logical error in tlb flushing
  xen/p2m: Fix one-off error in checking the P2M tree directory.
  powerpc: Don't use __put_user() in patch_instruction
  powerpc: Make sure IPI handlers see data written by IPI senders
  powerpc: Restore correct DSCR in context switch
  powerpc: Fix DSCR inheritance in copy_thread()
  powerpc: Keep thread.dscr and thread.dscr_inherit in sync
  powerpc: Update DSCR on all CPUs when writing sysfs dscr_default
  powerpc/powernv: Always go into nap mode when CPU is offline
  powerpc: Give hypervisor decrementer interrupts their own handler
  powerpc/vphn: Fix arch_update_cpu_topology() return value
  ARM: gemini: fix the gemini build
  ...

Conflicts:
	drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
	drivers/rapidio/devices/tsi721.c
2012-09-13 08:41:01 -06:00
Jia Hongtao
688ba1dbee powerpc/swiotlb: Enable at early stage and disable if not necessary
Remove the dependency on PCI initialization for SWIOTLB initialization.
So that PCI can be initialized at proper time.

SWIOTLB is partly determined by PCI inbound/outbound map which is assigned
in PCI initialization. But swiotlb_init() should be done at the stage of
mem_init() which is much earlier than PCI initialization. So we reserve the
memory for SWIOTLB first and free it if not necessary.

All boards are converted to fit this change.

Signed-off-by: Jia Hongtao <B38951@freescale.com>
Signed-off-by: Li Yang <leoli@freescale.com>
Acked-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2012-09-12 14:57:09 -05:00
Varun Sethi
39be5b4a7f powerpc/booke: Add CPU_FTR_EMB_HV check for e5500.
Added CPU_FTR_EMB_HV feature check for e5500.

Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com>
Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2012-09-12 14:57:09 -05:00
Varun Sethi
0778407f83 powerpc/booke: Separate out restore_e5500/setup_e5500 routines.
For the 64 bit case separate out e5500 cpu_setup and cpu_restore functions.
The cpu_setup function (for the primary core) is passed the cpu_spec
pointer, which is not there in case of the cpu_restore function. Also, in
our case we will have to manipulate the CPU_FTR_EMB_HV flag on the primary
core.

Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2012-09-12 14:57:09 -05:00
Varun Sethi
2c71b0cc4a powerpc/booke: Merge the 32 bit e5500/e500mc cpu setup code.
Merge the 32 bit cpu setup code for e500mc/e5500 and define the
"cpu_restore" routine (for e5500/e6500) only for the 64 bit case. The
cpu_restore routine is used in the 64 bit case for setting up the secondary
cores.

Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2012-09-12 14:57:09 -05:00
Varun Sethi
7e0f4872a3 powepc/booke: Separate out E.HV check and ivor setup code.
Move the E.HV check and CPU_FTR_EMB_HV flag manipulation to the cpu setup
code.  Create a separate routine for E.HV ivors setup.

Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2012-09-12 14:57:08 -05:00
Zhao Chenhui
d0832a7507 powerpc/85xx: add HOTPLUG_CPU support
Add support to disable and re-enable individual cores at runtime on
MPC85xx/QorIQ SMP machines. Currently support e500v1/e500v2 core.

MPC85xx machines use ePAPR spin-table in boot page for CPU kick-off.  This
patch uses the boot page from bootloader to boot core at runtime.  It
supports 32-bit and 36-bit physical address.

Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: Jin Qing <b24347@freescale.com>
Signed-off-by: Zhao Chenhui <chenhui.zhao@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2012-09-12 14:57:08 -05:00
Zhao Chenhui
ae5cab4763 powerpc/smp: add generic_set_cpu_up() to set cpu_state as CPU_UP_PREPARE
In the case of cpu hotplug, the cpu_state should be set to CPU_UP_PREPARE
when kicking cpu.  Otherwise, the cpu_state is always CPU_DEAD after
calling generic_set_cpu_dead(), which makes the delay in generic_cpu_die()
not happen.

Signed-off-by: Zhao Chenhui <chenhui.zhao@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2012-09-12 14:57:08 -05:00
Gavin Shan
4c2245bb5c powerpc/PCI: Override pcibios_window_alignment()
This patch implements pcibios_window_alignment() so powerpc platforms can
force P2P bridge windows to be at larger alignments than the PCI spec
requires.

[bhelgaas: changelog]
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-09-11 16:59:46 -06:00
Michael Neuling
cd14457304 powerpc: Dynamically calculate the dabrx based on kernel/user/hypervisor
Currently we mark the DABRX to interrupt on all matches
(hypervisor/kernel/user and then filter in software.  We can be a lot
smarter now that we can set the DABRX dynamically.

This sets the DABRX based on the flags passed by the user.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-10 09:59:13 +10:00
Michael Neuling
4474ef055c powerpc: Rework set_dabr so it can take a DABRX value as well
Rework set_dabr to take a DABRX value as well.

Both the pseries and PS3 hypervisors do some checks on the DABRX
values that are passed in the hcall.  This patch stops bogus values
from being passed to hypervisor.  Also, in the case where we are
clearing the breakpoint, where DABR and DABRX are zero, we modify the
DABRX value to make it valid so that the hcall won't fail.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-10 09:59:10 +10:00
Gavin Shan
f8f7d63fd9 powerpc/eeh: Trace eeh device from I/O cache
The idea comes from Benjamin Herrenschmidt. The eeh cache helps
fetching the pci device according to the given I/O address. Since
the eeh cache is serving for eeh, it's reasonable for eeh cache
to trace eeh device except pci device.

The patch make eeh cache to trace eeh device. Also, the major
eeh entry function eeh_dn_check_failure has been renamed to
eeh_dev_check_failure since it will take eeh device as input
parameter.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-10 09:35:44 +10:00
Gavin Shan
35e5cfe27e powerpc/eeh: Move EEH initialization around
Currently, we have 3 phases for EEH initialization on pSeries platform.
All of them are done through builtin functions: platform initialization,
EEH device creation, and EEH subsystem enablement. All of them are done
no later than ppc_md.setup_arch. That means that the slab/slub isn't ready
yet, so we have to allocate memory chunks on basis of PAGE_SIZE for those
dynamically created EEH devices. That's pretty expensive.

In order to utilize slab/slub for memory allocation, we have to move the EEH
initialization functions around, but all of them should be called after slab
is ready.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-10 09:35:27 +10:00
Michael Ellerman
407821a34f powerpc: Initialise paca.data_offset with poison
It's possible for the cpu_possible_mask to change between the time we
initialise the pacas and the time we setup per_cpu areas.

Obviously impossible cpus shouldn't ever be running, but stranger things
have happened. So be paranoid and initialise data_offset with a poison
value in case we don't set it up later.

Based on a patch from Anton Blanchard.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-10 09:35:27 +10:00
Michael Neuling
3f4693eeea powerpc: Use consistent name info for arch_hw_breakpoint
Change bp_info to info to be consistent with the rest of this file.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-07 11:44:39 +10:00
Suzuki Poulose
4bc77a5ed2 powerpc: Export memory limit via device tree
The powerpc kernel doesn't export the memory limit enforced by 'mem='
kernel parameter. This is required for building the ELF header in
kexec-tools to limit the vmcore to capture only the used memory. On
powerpc the kexec-tools depends on the device-tree for memory related
information, unlike /proc/iomem on the x86.

Without this information, the kexec-tools assumes the entire System
RAM and vmcore creates an unnecessarily larger dump.

This patch exports the memory limit, if present, via
chosen/linux,memory-limit
property, so that the vmcore can be limited to the memory limit.

The prom_init seems to export this value in the same node. But doesn't
really
appear there.  Also the memory_limit gets adjusted with the processing of
crashkernel= parameter. This patch makes sure we get the actual limit.

The kexec-tools will use the value to limit the 'end' of the memory
regions.

Tested this patch on ppc64 and ppc32(ppc440) with a kexec-tools
patch by Mahesh.

Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com>
Tested-by: Mahesh J. Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-07 11:44:33 +10:00
Suzuki Poulose
a84fcd4687 powerpc: Change memory_limit from phys_addr_t to unsigned long long
There are some device-tree nodes, whose values are of type phys_addr_t.
The phys_addr_t is variable sized based on the CONFIG_PHSY_T_64BIT.

Change these to a fixed unsigned long long for consistency.

This patch does the change only for memory_limit.

The following is a list of such variables which need the change:

 1) kernel_end, crashk_size - in arch/powerpc/kernel/machine_kexec.c

 2) (struct resource *)crashk_res.start - We could export a local static
    variable from machine_kexec.c.

Changing the above values might break the kexec-tools. So, I will
fix kexec-tools first to handle the different sized values and then change
 the above.

Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-07 11:44:30 +10:00
Gavin Shan
cf1a4cf875 powerpc/pci: Save P2P bridge resource if possible
When PCI probe flag PCI_REASSIGN_ALL_RSRC has been passed into PCI
core, it's hoped that all resources to be reassigned by PCI core.
As to particular P2P (PCI-to-PCI) bridge, the size of the corresponding
BAR (I/O, MMIO, prefetchable MMIO) is calculated by the resources
required by the PCI devices behind the P2P bridge. That means that
the information like start/end address retrieved from the hardware
registers of the P2P bridge is meainingless in the case. However,
we still count that in and the BARs might have been configured by
firmware with non-zero size. That leads to space waste.

The patch explicitly sets the size of P2P bridge BARs to zero in
case that resource reassignment is expected with PCI probe flag
PCI_REASSIGN_ALL_RSRC. In the result, it will save overall resource
required by the system without waste.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-07 10:45:30 +10:00
Benjamin Herrenschmidt
fff34b3412 Merge branch 'merge' into next
Brings in various bug fixes from 3.6-rcX
2012-09-07 09:48:59 +10:00
Mihai Caraman
0127262c01 powerpc: Restore VDSO information on critical exception om BookE
Critical exception on 64-bit booke uses user-visible SPRG3 as scratch.
Restore VDSO information in SPRG3 on exception prolog.

Use a common sprg3 field in PACA for all powerpc64 architectures.

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-07 09:48:49 +10:00
Paul Mackerras
9fb1b36ca1 powerpc: Make sure IPI handlers see data written by IPI senders
We have been observing hangs, both of KVM guest vcpu tasks and more
generally, where a process that is woken doesn't properly wake up and
continue to run, but instead sticks in TASK_WAKING state.  This
happens because the update of rq->wake_list in ttwu_queue_remote()
is not ordered with the update of ipi_message in
smp_muxed_ipi_message_pass(), and the reading of rq->wake_list in
scheduler_ipi() is not ordered with the reading of ipi_message in
smp_ipi_demux().  Thus it is possible for the IPI receiver not to see
the updated rq->wake_list and therefore conclude that there is nothing
for it to do.

In order to make sure that anything done before smp_send_reschedule()
is ordered before anything done in the resulting call to scheduler_ipi(),
this adds barriers in smp_muxed_message_pass() and smp_ipi_demux().
The barrier in smp_muxed_message_pass() is a full barrier to ensure that
there is a full ordering between the smp_send_reschedule() caller and
scheduler_ipi().  In smp_ipi_demux(), we use xchg() rather than
xchg_local() because xchg() includes release and acquire barriers.
Using xchg() rather than xchg_local() makes sense given that
ipi_message is not just accessed locally.

This moves the barrier between setting the message and calling the
cause_ipi() function into the individual cause_ipi implementations.
Most of them -- those that used outb, out_8 or similar -- already had
a full barrier because out_8 etc. include a sync before the MMIO
store.  This adds an explicit barrier in the two remaining cases.

These changes made no measurable difference to the speed of IPIs as
measured using a simple ping-pong latency test across two CPUs on
different cores of a POWER7 machine.

The analysis of the reason why processes were not waking up properly
is due to Milton Miller.

Cc: stable@vger.kernel.org # v3.0+
Reported-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05 16:05:22 +10:00
Anton Blanchard
714332858b powerpc: Restore correct DSCR in context switch
During a context switch we always restore the per thread DSCR value.
If we aren't doing explicit DSCR management
(ie thread.dscr_inherit == 0) and the default DSCR changed while
the process has been sleeping we end up with the wrong value.

Check thread.dscr_inherit and select the default DSCR or per thread
DSCR as required.

This was found with the following test case, when running with
more threads than CPUs (ie forcing context switching):

http://ozlabs.org/~anton/junkcode/dscr_default_test.c

With the four patches applied I can run a combination of all
test cases successfully at the same time:

http://ozlabs.org/~anton/junkcode/dscr_default_test.c
http://ozlabs.org/~anton/junkcode/dscr_explicit_test.c
http://ozlabs.org/~anton/junkcode/dscr_inherit_test.c

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> # 3.0+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05 16:05:22 +10:00
Anton Blanchard
1021cb268b powerpc: Fix DSCR inheritance in copy_thread()
If the default DSCR is non zero we set thread.dscr_inherit in
copy_thread() meaning the new thread and all its children will ignore
future updates to the default DSCR. This is not intended and is
a change in behaviour that a number of our users have hit.

We just need to inherit thread.dscr and thread.dscr_inherit from
the parent which ends up being much simpler.

This was found with the following test case:

http://ozlabs.org/~anton/junkcode/dscr_default_test.c

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> # 3.0+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05 16:05:21 +10:00
Anton Blanchard
00ca0de02f powerpc: Keep thread.dscr and thread.dscr_inherit in sync
When we update the DSCR either via emulation of mtspr(DSCR) or via
a change to dscr_default in sysfs we don't update thread.dscr.
We will eventually update it at context switch time but there is
a period where thread.dscr is incorrect.

If we fork at this point we will copy the old value of thread.dscr
into the child. To avoid this, always keep thread.dscr in sync with
reality.

This issue was found with the following testcase:

http://ozlabs.org/~anton/junkcode/dscr_inherit_test.c

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> # 3.0+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05 16:05:21 +10:00
Anton Blanchard
1b6ca2a6fe powerpc: Update DSCR on all CPUs when writing sysfs dscr_default
Writing to dscr_default in sysfs doesn't actually change the DSCR -
we rely on a context switch on each CPU to do the work. There is no
guarantee we will get a context switch in a reasonable amount of time
so fire off an IPI to force an immediate change.

This issue was found with the following test case:

http://ozlabs.org/~anton/junkcode/dscr_explicit_test.c

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> # 3.0+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05 16:05:20 +10:00
Paul Mackerras
375f561a41 powerpc/powernv: Always go into nap mode when CPU is offline
The CPU hotplug code for the powernv platform currently only puts
offline CPUs into nap mode if the powersave_nap variable is set.
However, HV-style KVM on this platform requires secondary CPU threads
to be offline and in nap mode.  Since we know nap mode works just
fine on all POWER7 machines, and the only machines that support the
powernv platform are POWER7 machines, this changes the code to
always put offline CPUs into nap mode, regardless of powersave_nap.
Powersave_nap still controls whether or not CPUs go into nap mode
when idle, as before.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05 16:05:20 +10:00
Paul Mackerras
dabe859ec6 powerpc: Give hypervisor decrementer interrupts their own handler
At the moment the handler for hypervisor decrementer interrupts is
the same as for decrementer interrupts, i.e. timer_interrupt().
This is bogus; if we ever do get a hypervisor decrementer interrupt
it won't have anything to do with the next timer event.  In fact
the only time we get hypervisor decrementer interrupts is when one
is left pending on exit from a KVM guest.

When we get a hypervisor decrementer interrupt we don't need to do
anything special to clear it, since they are edge-triggered on the
transition of HDEC from 0 to -1.  Thus this adds an empty handler
function for them.  We don't need to have them masked when interrupts
are soft-disabled, so we use STD_EXCEPTION_HV instead of
MASKABLE_EXCEPTION_HV.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05 16:05:19 +10:00
Mihai Caraman
8b64a9dfb0 powerpc/booke64: Use SPRG0/3 scratch for bolted TLB miss & crit int
Embedded.Hypervisor category defines GSPRG0..3 physical registers for guests.
Avoid SPRG4-7 usage as scratch in host exception handlers, otherwise guest
SPRG4-7 registers will be clobbered.
For bolted TLB miss exception handlers, which is the version currently
supported by KVM, use SPRN_SPRG_GEN_SCRATCH aka SPRG0 instead of
SPRN_SPRG_TLB_SCRATCH aka SPRG6. Keep using TLB PACA slots to fit in one
64-byte cache line.
For critical exception handlers use SPRG3 instead of SPRG7. Provide a routine
to store and restore user-visible SPRGs. This will be subsequently used
to restore VDSO information in SPRG3. Add EX_R13 to paca slots to free up
SPRG3 and change the critical exception epilog to use it.

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05 15:35:52 +10:00
Mihai Caraman
79b5c8dbaa powerpc/booke64: Eemove mfspr srr1 duplicate in exception prolog
Refactor exception prolog to get rid of mfspr srr1 duplicate. This was
introduced by KVM integration, with DO_KVM macro logic expecting srr1 value
earlier in r11.
Reserve r11 to hold srr1's value also required at the end of the prolog and
free up r10 to serve as spare in addition macros.
For syscalls case this change does not add any performance penalty. For irq
soft-disabled case the change adds a store/load of conditional register value
to/from a paca slot. Paca slots fit in one 64-byte cache line so these
additional operations have little impact on performance.

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05 15:35:49 +10:00
Mihai Caraman
fecff0f724 powerpc/booke64: Add DO_KVM kernel hooks
Hook DO_KVM macro into 64-bit booke for KVM integration. Extend interrupt
handlers' parameter list with interrupt vector numbers to accomodate the macro.
Only the bolted version of tlb miss handers is addressed now.

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05 15:35:46 +10:00
Mihai Caraman
5473eb1c07 powerpc/booke64: Use GSRR registers in Guest Doorbell interrupts
Guest Doorbell interrupts use guest save and restore registers. Add a new
Guest Doorbell exception type to accommodate GSRR0/1 SPRs usage in exception
prolog and fix the exception handler.

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05 15:35:43 +10:00
Mihai Caraman
a131075721 powerpc/booke64: Fix machine check handler to use the right prolog
Machine check exception handler was using a wrong prolog. Hypervisors like
KVM which are called early from the exception handler rely on the interrupt
source.

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05 15:35:40 +10:00
Ananth N Mavinakayanahalli
8b7b80b9eb powerpc: Uprobes port to powerpc
This is the port of uprobes to powerpc. Usage is similar to x86.

[root@xxxx ~]# ./bin/perf probe -x /lib64/libc.so.6 malloc
Added new event:
  probe_libc:malloc    (on 0xb4860)

You can now use it in all perf tools, such as:

	perf record -e probe_libc:malloc -aR sleep 1

[root@xxxx ~]# ./bin/perf record -e probe_libc:malloc -aR sleep 20
[ perf record: Woken up 22 times to write data ]
[ perf record: Captured and wrote 5.843 MB perf.data (~255302 samples) ]
[root@xxxx ~]# ./bin/perf report --stdio
...

    69.05%           tar  libc-2.12.so   [.] malloc
    28.57%            rm  libc-2.12.so   [.] malloc
     1.32%  avahi-daemon  libc-2.12.so   [.] malloc
     0.58%          bash  libc-2.12.so   [.] malloc
     0.28%          sshd  libc-2.12.so   [.] malloc
     0.08%    irqbalance  libc-2.12.so   [.] malloc
     0.05%         bzip2  libc-2.12.so   [.] malloc
     0.04%         sleep  libc-2.12.so   [.] malloc
     0.03%    multipathd  libc-2.12.so   [.] malloc
     0.01%      sendmail  libc-2.12.so   [.] malloc
     0.01%     automount  libc-2.12.so   [.] malloc

The trap_nr addition patch is a prereq.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05 15:35:19 +10:00
Ananth N Mavinakayanahalli
41ab5266c3 powerpc: Add trap_nr to thread_struct
Add thread_struct.trap_nr and use it to store the last exception
the thread experienced. In this patch, we populate the field at
various places where we force_sig_info() to the process.

This is also used in uprobes to determine if the probed instruction
caused an exception.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05 15:19:36 +10:00
Michael Ellerman
d3dbeef657 powerpc: Rename 64-bit PVR constants to PVR_foo
We have an old FIXME in reg.h which points out that we should standardise
on PVR_foo for our PVR #defines. Currently we use PVR_ on 32-bit and PV_
on 64-bit.

So do that rename and remove the FIXME.

Seeing as we're touching all but one usage of __is_processor(), rename it
to something less ugly and more indicative of what it does, which is
simply to check the PVR version.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05 15:19:35 +10:00
Michael Ellerman
beacc6da86 powerpc: Remove all includes of <asm/abs_addr.h>
It's empty now, apart from other includes.

Fixup a few files that were getting things via this header.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05 15:19:33 +10:00
Michael Ellerman
3d2675238a powerpc/kernel: Remove uses of abs_to_virt() and virt_to_abs()
These days they are just __va() and __pa() respectively.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05 15:19:30 +10:00
Ingo Molnar
59f979455d Merge branch 'sched/urgent' into sched/core
Merge in the current fixes branch, we are going to apply dependent patches.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-09-04 14:31:00 +02:00
Jiri Kosina
7256a5d2da powerpc: Fix personality handling in ppc64_personality()
Directly comparing current->personality against PER_LINUX32 doesn't work
in cases when any of the personality flags stored in the top three bytes
are used.

Directly forcefully setting personality to PER_LINUX32 or PER_LINUX
discards any flags stored in the top three bytes

Use personality() macro to compare only PER_MASK bytes and make sure that
we are setting only the bits that should be set, instead of overwriting
the whole value.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-08-24 20:26:07 +10:00
Aaro Koskinen
4c374af5fd powerpc/dma-iommu: Fix IOMMU window check
Checking for device mask to cover the whole IOMMU table is too strict.
IOMMU allocators should handle mask constraint properly for each
allocation.

The patch enables to use old AirPort Extreme cards on PowerMacs with
more than 1GB of memory; without the patch the driver init fails with:

  b43-pci-bridge 0001:01:01.0: Warning: IOMMU window too big for device mask
  b43-pci-bridge 0001:01:01.0: mask: 0x3fffffff, table end: 0x80000000
  b43-phy0 ERROR: The machine/kernel does not support the required 30-bit DMA mask

Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-08-24 20:26:07 +10:00
Tiejun Chen
5f630401f9 powerpc/kgdb: Restore current_thread_info properly
For powerpc BooKE and e200, singlestep is handled on the critical/dbg
exception stack. This causes current_thread_info() to fail for kgdb
internal, so previously We work around this issue by copying
the thread_info from the kernel stack before calling kgdb_handle_exception,
and copying it back afterwards.

But actually we don't do this properly. We should backup current_thread_info
then restore that when exit.

Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-08-24 20:26:06 +10:00
Tiejun Chen
949616cf2d powerpc/kgdb: Bail out of KGDB when we've been triggered
We need to skip a breakpoint exception when it occurs after
a breakpoint has already been removed.

Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-08-24 20:26:05 +10:00
Tiejun Chen
572b411cb4 powerpc/kgdb: Do not set kgdb_single_step on ppc
The kgdb_single_step flag has the possibility to indefinitely
hang the system on an SMP system.

The x86 arch have the same problem, and that problem was fixed by
commit 8097551d9ab9b9e3630(kgdb,x86: do not set kgdb_single_step
on x86). This patch does the same behaviors as x86's patch.

Signed-off-by: Dongdong Deng <dongdong.deng@windriver.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-08-24 20:26:05 +10:00
Michael Neuling
6d9c00c67b powerpc: Fix null pointer deref in perf hardware breakpoints
Currently if you are doing a global perf recording with hardware
breakpoints (ie perf record -e mem:0xdeadbeef -a), you can oops with:

  Faulting instruction address: 0xc000000000738890
  cpu 0xc: Vector: 300 (Data Access) at [c0000003f76af8d0]
      pc: c000000000738890: .hw_breakpoint_handler+0xa0/0x1e0
      lr: c000000000738830: .hw_breakpoint_handler+0x40/0x1e0
      sp: c0000003f76afb50
     msr: 8000000000001032
     dar: 6f0
   dsisr: 42000000
    current = 0xc0000003f765ac00
    paca    = 0xc00000000f262a00   softe: 0        irq_happened: 0x01
    pid   = 6810, comm = loop-read
  enter ? for help
  [c0000003f76afbe0] c00000000073cd04 .notifier_call_chain.isra.0+0x84/0xe0
  [c0000003f76afc80] c00000000073cdbc .notify_die+0x3c/0x60
  [c0000003f76afd20] c0000000000139f0 .do_dabr+0x40/0xf0
  [c0000003f76afe30] c000000000005a9c handle_dabr_fault+0x14/0x48
  --- Exception: 300 (Data Access) at 0000000010000480
  SP (ff8679e0) is in userspace

This is because we don't check to see if the break point is associated
with task before we deference the task_struct pointer.

This changes the update to use current.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-08-24 20:26:04 +10:00
Ashley Lai
4a727429ab PPC64: Add support for instantiating SML from Open Firmware
This patch instantiate Stored Measurement Log (SML) and put the
log address and size in the device tree.

Signed-off-by: Ashley Lai <adlai@us.ibm.com>
Signed-off-by: Kent Yoder <key@linux.vnet.ibm.com>
2012-08-22 16:22:24 -05:00
Frederic Weisbecker
baa36046d0 cputime: Consolidate vtime handling on context switch
The archs that implement virtual cputime accounting all
flush the cputime of a task when it gets descheduled
and sometimes set up some ground initialization for the
next task to account its cputime.

These archs all put their own hooks in their context
switch callbacks and handle the off-case themselves.

Consolidate this by creating a new account_switch_vtime()
callback called in generic code right after a context switch
and that these archs must implement to flush the prev task
cputime and initialize the next task cputime related state.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
2012-08-20 13:05:28 +02:00
Seth Jennings
da29aa8f2a powerpc/crypto: add compression support to arch vec
This patch enables compression engine support in the
architecture vector.  This causes the Power hypervisor
to allow access to the nx comrpession accelerator.

Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-08-01 17:47:53 +08:00
Linus Torvalds
6f51f51582 Merge branch 'for-linus-for-3.6-rc1' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping
Pull DMA-mapping updates from Marek Szyprowski:
 "Those patches are continuation of my earlier work.

  They contains extensions to DMA-mapping framework to remove limitation
  of the current ARM implementation (like limited total size of DMA
  coherent/write combine buffers), improve performance of buffer sharing
  between devices (attributes to skip cpu cache operations or creation
  of additional kernel mapping for some specific use cases) as well as
  some unification of the common code for dma_mmap_attrs() and
  dma_mmap_coherent() functions.  All extensions have been implemented
  and tested for ARM architecture."

* 'for-linus-for-3.6-rc1' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
  ARM: dma-mapping: add support for DMA_ATTR_SKIP_CPU_SYNC attribute
  common: DMA-mapping: add DMA_ATTR_SKIP_CPU_SYNC attribute
  ARM: dma-mapping: add support for dma_get_sgtable()
  common: dma-mapping: introduce dma_get_sgtable() function
  ARM: dma-mapping: add support for DMA_ATTR_NO_KERNEL_MAPPING attribute
  common: DMA-mapping: add DMA_ATTR_NO_KERNEL_MAPPING attribute
  common: dma-mapping: add support for generic dma_mmap_* calls
  ARM: dma-mapping: fix error path for memory allocation failure
  ARM: dma-mapping: add more sanity checks in arm_dma_mmap()
  ARM: dma-mapping: remove custom consistent dma region
  mm: vmalloc: use const void * for caller argument
  scatterlist: add sg_alloc_table_from_pages function
2012-07-30 10:11:31 -07:00
Marek Szyprowski
64ccc9c033 common: dma-mapping: add support for generic dma_mmap_* calls
Commit 9adc5374 ('common: dma-mapping: introduce mmap method') added a
generic method for implementing mmap user call to dma_map_ops structure.

This patch converts ARM and PowerPC architectures (the only providers of
dma_mmap_coherent/dma_mmap_writecombine calls) to use this generic
dma_map_ops based call and adds a generic cross architecture
definition for dma_mmap_attrs, dma_mmap_coherent, dma_mmap_writecombine
functions.

The generic mmap virt_to_page-based fallback implementation is provided for
architectures which don't provide their own implementation for mmap method.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>
2012-07-30 12:25:46 +02:00
Steven Rostedt
bac821a6e3 powerpc/ftrace: Trace function graph entry before updating index
As Colin Cross ported my x86 change to ARM, he also pointed out that
powerpc is also behind in this fix.

The commit 722b3c7469 "ftrace/graph: Trace function entry before
updating index" fixes an issue with function graph tracing for x86,
where if the called entry function decides not to trace interrupts, it
can fail the check if an interrupt comes in just after the
curr_ret_stack is updated.

The solution is to call the entry function first, then update the
curr_ret_stack if the entry function wants to be traced.

Cc: Colin Cross <ccross@android.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-27 11:42:34 +10:00
Anton Blanchard
0e3849836b powerpc: Lack of firmware flash support is not an error
Reduce the severity of the warning given when firmware flash is
not supported. Not all platforms have it.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-27 11:42:33 +10:00
Stuart Yoder
1f8b0bc81a powerpc: Set stack limit properly in crit_transfer_to_handler
Commit 9778b696a0 incorrectly
changes the code setting the stack limit on entry to the
kernel to mark the thread_info at the bottom of the stack
out of bounds anymore. This fixes it.

Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-27 11:42:31 +10:00
Linus Torvalds
6dd53aa456 PCI changes for the 3.6 merge window:
Host bridge hotplug
     - Add MMCONFIG support for hot-added host bridges (Jiang Liu)
   Device hotplug
     - Move fixups from __init to __devinit (Sebastian Andrzej Siewior)
     - Call FINAL fixups for hot-added devices, too (Myron Stowe)
     - Factor out generic code for P2P bridge hot-add (Yinghai Lu)
     - Remove all functions in a slot, not just those with _EJx (Amos Kong)
   Dynamic resource management
     - Track bus number allocation (struct resource tree per domain) (Yinghai Lu)
     - Make P2P bridge 1K I/O windows work with resource reassignment (Bjorn Helgaas, Yinghai Lu)
     - Disable decoding while updating 64-bit BARs (Bjorn Helgaas)
   Power management
     - Add PCIe runtime D3cold support (Huang Ying)
   Virtualization
     - Add VFIO infrastructure (ACS, DMA source ID quirks) (Alex Williamson)
     - Add quirks for devices with broken INTx masking (Jan Kiszka)
   Miscellaneous
     - Fix some PCI Express capability version issues (Myron Stowe)
     - Factor out some arch code with a weak, generic, pcibios_setup() (Myron Stowe)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.10 (GNU/Linux)
 
 iQIcBAABAgAGBQJQBy+9AAoJEPGMOI97Hn6zOpQP+wVFvA7pcteFj6HPs5nTq2Hc
 55oeRqCO0wBHoFMCKB0AjeTATjqxi9OhcjaiVrZejxNyWKC9MnrXuunpQ0l/hCbR
 M/TK+BCelfX2FU4eXNf+TBCCcOhOVWqQft9Gm6nYKwX8Y0msRVCceI4WwhZgSwtI
 vdtmnqlwolscdnq+8ThsnvUMtwkN0gExmn2FJRl6EoEgG0DTqhMkZ83uA+NPBhvv
 I+g0XbA6haaZph2nnSYR0hIW4Q7JkT/LgA6uVAQxamctwxLol7xxsjCRnfqrulkf
 kaRr2fAgBXfmaOIltro4UkXrCM52ZSyggCDfExHp6mWGPKMjE5ZcyK1YbGfmmumk
 DS3t1S0eBdDJXrnf9l/Yb8e95dQxRCYKelKzr1rTD9QAXsInE8rC40hvhfFaTa4s
 nZYRTz0SKv6coQihqaOR7shx1DNomLFk7jndaWEElfl9/cT/nQnZ8XLfVMzkJNNB
 Y4SM6zkiIaCL0aiSEE16MqVjmODYRjbURLYzQIrqr2KJQg8X6XjIRojQLjL6xEgA
 22ry2ZRPhqO68g7aLqvixiSDaTp0Z0Vw+JmgjtBqvkokwZcGQtm4umkpAdOi+Es8
 3bJaMY7ZUpDX53FE8iyP6AnmR/1k19rC1gNnNq/syWyjtYOYJ9i3QCTafFgvE1VC
 5coQ1L5tByHvpzK5PHwf
 =oo/A
 -----END PGP SIGNATURE-----

Merge tag 'for-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI changes from Bjorn Helgaas:
 "Host bridge hotplug:
    - Add MMCONFIG support for hot-added host bridges (Jiang Liu)
  Device hotplug:
    - Move fixups from __init to __devinit (Sebastian Andrzej Siewior)
    - Call FINAL fixups for hot-added devices, too (Myron Stowe)
    - Factor out generic code for P2P bridge hot-add (Yinghai Lu)
    - Remove all functions in a slot, not just those with _EJx (Amos
      Kong)
  Dynamic resource management:
    - Track bus number allocation (struct resource tree per domain)
      (Yinghai Lu)
    - Make P2P bridge 1K I/O windows work with resource reassignment
      (Bjorn Helgaas, Yinghai Lu)
    - Disable decoding while updating 64-bit BARs (Bjorn Helgaas)
  Power management:
    - Add PCIe runtime D3cold support (Huang Ying)
  Virtualization:
    - Add VFIO infrastructure (ACS, DMA source ID quirks) (Alex
      Williamson)
    - Add quirks for devices with broken INTx masking (Jan Kiszka)
  Miscellaneous:
    - Fix some PCI Express capability version issues (Myron Stowe)
    - Factor out some arch code with a weak, generic, pcibios_setup()
      (Myron Stowe)"

* tag 'for-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (122 commits)
  PCI: hotplug: ensure a consistent return value in error case
  PCI: fix undefined reference to 'pci_fixup_final_inited'
  PCI: build resource code for M68K architecture
  PCI: pciehp: remove unused pciehp_get_max_lnk_width(), pciehp_get_cur_lnk_width()
  PCI: reorder __pci_assign_resource() (no change)
  PCI: fix truncation of resource size to 32 bits
  PCI: acpiphp: merge acpiphp_debug and debug
  PCI: acpiphp: remove unused res_lock
  sparc/PCI: replace pci_cfg_fake_ranges() with pci_read_bridge_bases()
  PCI: call final fixups hot-added devices
  PCI: move final fixups from __init to __devinit
  x86/PCI: move final fixups from __init to __devinit
  MIPS/PCI: move final fixups from __init to __devinit
  PCI: support sizing P2P bridge I/O windows with 1K granularity
  PCI: reimplement P2P bridge 1K I/O windows (Intel P64H2)
  PCI: disable MEM decoding while updating 64-bit MEM BARs
  PCI: leave MEM and IO decoding disabled during 64-bit BAR sizing, too
  PCI: never discard enable/suspend/resume_early/resume fixups
  PCI: release temporary reference in __nv_msi_ht_cap_quirk()
  PCI: restructure 'pci_do_fixups()'
  ...
2012-07-24 16:17:07 -07:00
Linus Torvalds
f14121ab35 Devicetree updates for 3.6
A small set of changes for devicetree:
 - Couple of Documentation fixes
 - Addition of new helper function of_node_full_name
 - Improve of_parse_phandle_with_args return values
 - Some NULL related sparse fixes
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQEcBAABAgAGBQJQDwsgAAoJEMhvYp4jgsXiuwUH/Ri6ZSnqHcz4Wa/X4FxvNc3I
 3Xelo/Vt3WLYue3s/+OYiM5FK9+KH8T6x+U79Q4p7vePcfUh6GJII0AUbMeRghkS
 m3FjNd5syzYNJlnDnqdngQYRDpaz8U/SyftjXyMPjJ1VWiyLx/EJQUkj1EEwDLe/
 ZVabppnco3Y6OJpFuETONNvXx5mE7xq86isW5+aYmviMkWSMMwJPf8qofLJ78Dh5
 OAhWuCPRDooz548+Wkabt90qHjF6FU43w5fU7zZW26NT39ptppcbZ2bAXcTYqIIq
 sATp5YSitvwFqO2c1mA/drZ9nrgxDPCaw3qCDyiMdcbWgXqDirz2x7q1iauVHF4=
 =5TZ/
 -----END PGP SIGNATURE-----

Merge tag 'dt-for-3.6' of git://sources.calxeda.com/kernel/linux

Pull devicetree updates from Rob Herring:
 "A small set of changes for devicetree:
   - Couple of Documentation fixes
   - Addition of new helper function of_node_full_name
   - Improve of_parse_phandle_with_args return values
   - Some NULL related sparse fixes"

Grant's busy packing.

* tag 'dt-for-3.6' of git://sources.calxeda.com/kernel/linux:
  of: mtd: nuke useless const qualifier
  devicetree: add helper inline for retrieving a node's full name
  of: return -ENOENT when no property
  usage-model.txt: fix typo machine_init->init_machine
  of: Fix null pointer related warnings in base.c file
  LED: Fix missing semicolon in OF documentation
  of: fix a few typos in the binding documentation
2012-07-24 14:07:22 -07:00
Linus Torvalds
5fecc9d8f5 KVM updates for the 3.6 merge window
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJQDRDNAAoJEI7yEDeUysxlkl8P/3C2AHx2webOU8sVzhfU6ONZ
 ZoGevwBjyZIeJEmiWVpFTTEew1l0PXtpyOocXGNUXIddVnhXTQOKr/Scj4uFbmx8
 ROqgK8NSX9+xOGrBPCoN7SlJkmp+m6uYtwYkl2SGnsEVLWMKkc7J7oqmszCcTQvN
 UXMf7G47/Ul2NUSBdv4Yvizhl4kpvWxluiweDw3E/hIQKN0uyP7CY58qcAztw8nG
 csZBAnnuPFwIAWxHXW3eBBv4UP138HbNDqJ/dujjocM6GnOxmXJmcZ6b57gh+Y64
 3+w9IR4qrRWnsErb/I8inKLJ1Jdcf7yV2FmxYqR4pIXay2Yzo1BsvFd6EB+JavUv
 pJpixrFiDDFoQyXlh4tGpsjpqdXNMLqyG4YpqzSZ46C8naVv9gKE7SXqlXnjyDlb
 Llx3hb9Fop8O5ykYEGHi+gIISAK5eETiQl4yw9RUBDpxydH4qJtqGIbLiDy8y9wi
 Xyi8PBlNl+biJFsK805lxURqTp/SJTC3+Zb7A7CzYEQm5xZw3W/CKZx1ZYBfpaa/
 pWaP6tB7JwgLIVXi4HQayLWqMVwH0soZIn9yazpOEFv6qO8d5QH5RAxAW2VXE3n5
 JDlrajar/lGIdiBVWfwTJLb86gv3QDZtIWoR9mZuLKeKWE/6PRLe7HQpG1pJovsm
 2AsN5bS0BWq+aqPpZHa5
 =pECD
 -----END PGP SIGNATURE-----

Merge tag 'kvm-3.6-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Avi Kivity:
 "Highlights include
  - full big real mode emulation on pre-Westmere Intel hosts (can be
    disabled with emulate_invalid_guest_state=0)
  - relatively small ppc and s390 updates
  - PCID/INVPCID support in guests
  - EOI avoidance; 3.6 guests should perform better on 3.6 hosts on
    interrupt intensive workloads)
  - Lockless write faults during live migration
  - EPT accessed/dirty bits support for new Intel processors"

Fix up conflicts in:
 - Documentation/virtual/kvm/api.txt:

   Stupid subchapter numbering, added next to each other.

 - arch/powerpc/kvm/booke_interrupts.S:

   PPC asm changes clashing with the KVM fixes

 - arch/s390/include/asm/sigp.h, arch/s390/kvm/sigp.c:

   Duplicated commits through the kvm tree and the s390 tree, with
   subsequent edits in the KVM tree.

* tag 'kvm-3.6-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (93 commits)
  KVM: fix race with level interrupts
  x86, hyper: fix build with !CONFIG_KVM_GUEST
  Revert "apic: fix kvm build on UP without IOAPIC"
  KVM guest: switch to apic_set_eoi_write, apic_write
  apic: add apic_set_eoi_write for PV use
  KVM: VMX: Implement PCID/INVPCID for guests with EPT
  KVM: Add x86_hyper_kvm to complete detect_hypervisor_platform check
  KVM: PPC: Critical interrupt emulation support
  KVM: PPC: e500mc: Fix tlbilx emulation for 64-bit guests
  KVM: PPC64: booke: Set interrupt computation mode for 64-bit host
  KVM: PPC: bookehv: Add ESR flag to Data Storage Interrupt
  KVM: PPC: bookehv64: Add support for std/ld emulation.
  booke: Added crit/mc exception handler for e500v2
  booke/bookehv: Add host crit-watchdog exception support
  KVM: MMU: document mmu-lock and fast page fault
  KVM: MMU: fix kvm_mmu_pagetable_walk tracepoint
  KVM: MMU: trace fast page fault
  KVM: MMU: fast path of handling guest page fault
  KVM: MMU: introduce SPTE_MMU_WRITEABLE bit
  KVM: MMU: fold tlb flush judgement into mmu_spte_update
  ...
2012-07-24 12:01:20 -07:00
Benjamin Herrenschmidt
668fcb6972 Remove stale .rej file
Commit 9778b696a0 accidentally added
a .rej file (probably my fault), remove it.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-23 09:38:53 +10:00
Benjamin Herrenschmidt
dcd261ba58 powerpc/iommu: Fix iommu pool initialization
The iommu pool patch has a bug where it would cause a crash when using
only one pool (based on the size of the DMA window).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-13 17:45:49 +10:00
Benjamin Herrenschmidt
af3bf7fbe0 Merge remote-tracking branch 'kumar/next' into next
Freescale updates for 3.6
2012-07-13 13:38:26 +10:00
Shaohui Xie
c5f02bb422 powerpc/watchdog: move booke watchdog param related code to setup-common.c
Currently, BOOKE watchdog code for checking "wdt" and "wdt_period" is
in setup_32.c, it cannot be used in 64-bit, so move it to a common place
setup-common.c, which will be shared by 32-bit and 64-bit.

Also, replace the simple_strtoul with kstrtol.

Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2012-07-11 07:44:03 -05:00
roger blofeld
fd5a42980e powerpc/ftrace: Fix assembly trampoline register usage
Just like the module loader, ftrace needs to be updated to use r12
instead of r11 with newer gcc's.

Signed-off-by: Roger Blofeld <blofeldus@yahoo.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: stable@vger.kernel.org
2012-07-11 14:23:32 +10:00
Naveen N. Rao
ac84aa2b3b powerpc/hw_breakpoints: Fix incorrect pointer access
If arch_validate_hwbkpt_settings() fails, bp->ctx won't be valid and the
kernel panics. Add a check to fix this.

Reported-by: Edjunior Barbosa Machado <emachado@linux.vnet.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-11 14:20:20 +10:00
Anton Blanchard
18ad51dd34 powerpc: Add VDSO version of getcpu
We have a request for a fast method of getting CPU and NUMA node IDs
from userspace. This patch implements a getcpu VDSO function,
similar to x86.

Ben suggested we use SPRG3 which is userspace readable. SPRG3 can be
modified by a KVM guest, so we save the SPRG3 value in the paca and
restore it when transitioning from the guest to the host.

I have a glibc patch that implements sched_getcpu on top of this.
Testing on a POWER7:

baseline: 538 cycles
vdso:      30 cycles

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-11 14:18:40 +10:00
Michael Ellerman
e6a74c6ea3 powerpc: Add a symbol for hypervisor trampolines
Purely for cosmetic purposes, otherwise it can appear that we are in
single_step_pSeries() which is slightly confusing.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-11 14:18:38 +10:00
Benjamin Herrenschmidt
8bf8385b9c powerpc: Fixup oddity in entry_32.S
When I "fixed" the CONFIG_TRACE_IRQFLAGS case on interrupt entry,
I screwed up a little bit with the test for user space vs. kernel.

The code is fine, there's just some dead code around it. I basically
removed the test and always create the added stack frame whether
coming from user or kernel since in any case we do need to save
a bunch of volatile registers or bad things would happen (we can
take page faults in the kernel for example).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-11 14:18:33 +10:00
Stuart Yoder
9778b696a0 powerpc: Use CURRENT_THREAD_INFO instead of open coded assembly
Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-11 14:18:22 +10:00
Liu Yu
2dc3d4cc68 powerpc/e500: make load_up_spe a normal fuction
So that we can call it when improving SPE switch like book3e did for fp
switch.

Signed-off-by: Liu Yu <yu.liu@freescale.com>
Signed-off-by: Olivia Yin <hong-hua.yin@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2012-07-10 07:07:22 -05:00
Anton Blanchard
d6b9a81b2a powerpc: IOMMU fault injection
Add the ability to inject IOMMU faults. We enable this per device
via a fail_iommu sysfs property, similar to fault injection on other
subsystems.

An example:

...
0003:01:00.1 Ethernet controller: Emulex Corporation OneConnect 10Gb NIC (be3) (rev 02)

To inject one error to this device:

echo 1 > /sys/bus/pci/devices/0003:01:00.1/fail_iommu
echo 1 > /sys/kernel/debug/fail_iommu/probability
echo 1 > /sys/kernel/debug/fail_iommu/times

As feared, the first failure injected on the be3 results in an
unrecoverable error, taking down both functions of the card
permanently:

be2net 0003:01:00.1: Unrecoverable error in the card

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:18:59 +10:00
Anton Blanchard
a980349725 powerpc: Call dma_debug_add_bus for PCI and VIO buses
The DMA API debug code has hooks to verify all DMA entries have been
freed at time of hot unplug. We need to call dma_debug_add_bus for
this to work.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:18:57 +10:00
Anton Blanchard
44b372d8a0 powerpc/vio: Separate vio bus probe and device probe
Similar to PCI, separate the bus probe from device probe. This allows
us to attach bus notifiers for DMA debug and IOMMU fault injection
before devices have been probed.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:18:54 +10:00
Anton Blanchard
62761d1f68 powerpc/vio: Remove dma not supported warnings
During boot we see a number of these warnings:

vio 30000000: Warning: IOMMU dma not supported: mask 0xffffffffffffffff, table unavailable

The reason for this is that we set IOMMU properties for all VIO
devices even if they are not DMA capable.

Only set DMA ops, table and mask for devices with a DMA window.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:18:51 +10:00
Michael Neuling
962cffbd8a powerpc: Enforce usage of RA 0-R31 where possible
Some macros use RA where when RA=R0 the values is 0, so make this
the enforced mnemonic in the macro.

Idea suggested by Andreas Schwab.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:18:35 +10:00
Michael Neuling
0b7673c35e powerpc: Enforce usage of R0-R31 where possible
Enforce the use of R0-R31 in macros where possible now we have all the
fixes in.

R0-R31 macros are removed here so that can't be used anymore.  They
should not be defined anywhere.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:18:30 +10:00
Michael Neuling
e55174e911 powerpc: Fixes for instructions not using correct register naming
These macros are using integers where they could be using logical
names since they take registers.

We are going to enforce this soon, so fix these up now.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:18:16 +10:00
Michael Neuling
4404a9f98f powerpc/pasemi: Move lbz/stbciz to ppc-opcode.h
move lbz/stbciz to ppc-opcode.h.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:18:00 +10:00
Michael Neuling
c75df6f96c powerpc: Fix usage of register macros getting ready for %r0 change
Anything that uses a constructed instruction (ie. from ppc-opcode.h),
need to use the new R0 macro, as %r0 is not going to work.

Also convert usages of macros where we are just determining an offset
(usually for a load/store), like:
	std	r14,STK_REG(r14)(r1)
Can't use STK_REG(r14) as %r14 doesn't work in the STK_REG macro since
it's just calculating an offset.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:17:55 +10:00
Benjamin Herrenschmidt
50bba07d6a Merge branch 'merge' into next
We want to bring in the latest IRQ fixes
2012-07-10 19:16:43 +10:00
Benjamin Herrenschmidt
21b2de3412 powerpc: Fix build of some debug irq code
There was a typo, checking for CONFIG_TRACE_IRQFLAG instead of
CONFIG_TRACE_IRQFLAGS causing some useful debug code to not be
built

This in turns causes a build error on BookE 64-bit due to incorrect
semicolons at the end of a couple of macros, so let's fix that too

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: stable@vger.kernel.org [v3.4]
2012-07-10 19:16:20 +10:00
Benjamin Herrenschmidt
be2cf20a5a powerpc: More fixes for lazy IRQ vs. idle
Looks like we still have issues with pSeries and Cell idle code
vs. the lazy irq state. In fact, the reset fixes that went upstream
are exposing the problem more by causing BUG_ON() to trigger (which
this patch turns into a WARN_ON instead).

We need to be careful when using a variant of low power state that
has the side effect of turning interrupts back on, to properly set
all the SW & lazy state to look as if everything is enabled before
we enter the low power state with MSR:EE off as we will return with
MSR:EE on. If not, we have a discrepancy of state which can cause
things to go very wrong later on.

This patch moves the logic into a helper and uses it from the
pseries and cell idle code. The power4/970 idle code already got
things right (in assembly even !) so I'm not touching it. The power7
"bare metal" idle code is subtly different and correct. Remains PA6T
and some hypervisor based Cell platforms which have questionable
code in there, but they are mostly dead platforms so I'll fix them
when I manage to get final answers from the respective maintainers
about how the low power state actually works on them.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: stable@vger.kernel.org [v3.4]
2012-07-10 19:16:07 +10:00
Grant Likely
74a7f08448 devicetree: add helper inline for retrieving a node's full name
The pattern (np ? np->full_name : "<none>") is rather common in the
kernel, but can also make for quite long lines.  This patch adds a new
inline function, of_node_full_name() so that the test for a valid node
pointer doesn't need to be open coded at all call sites.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
2012-07-06 07:16:34 -05:00
Bjorn Helgaas
85a00dd391 Merge branch 'pci/myron-pcibios_setup' into next
* pci/myron-pcibios_setup:
  xtensa/PCI: factor out pcibios_setup()
  x86/PCI: adjust section annotations for pcibios_setup()
  unicore32/PCI: adjust section annotations for pcibios_setup()
  tile/PCI: factor out pcibios_setup()
  sparc/PCI: factor out pcibios_setup()
  sh/PCI: adjust section annotations for pcibios_setup()
  sh/PCI: factor out pcibios_setup()
  powerpc/PCI: factor out pcibios_setup()
  parisc/PCI: factor out pcibios_setup()
  MIPS/PCI: adjust section annotations for pcibios_setup()
  MIPS/PCI: factor out pcibios_setup()
  microblaze/PCI: factor out pcibios_setup()
  ia64/PCI: factor out pcibios_setup()
  cris/PCI: factor out pcibios_setup()
  alpha/PCI: factor out pcibios_setup()
  PCI: pull pcibios_setup() up into core
2012-07-05 15:31:05 -06:00
Myron Stowe
67ea194ad3 powerpc/PCI: factor out pcibios_setup()
The PCI core provides a generic pcibios_setup() routine.  Drop this
architecture-specific version in favor of that.

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-07-05 15:09:13 -06:00
Wanpeng Li
f0a875fd3f powerpc: Fix kernel-doc warning
Warning(arch/powerpc/kernel/pci_of_scan.c:210): Excess function parameter 'node' description in 'of_scan_pci_bridge'
Warning(arch/powerpc/kernel/vio.c:636): No description found for parameter 'desired'
Warning(arch/powerpc/kernel/vio.c:636): Excess function parameter 'new_desired' description in 'vio_cmo_set_dev_desired'
Warning(arch/powerpc/kernel/vio.c:1270): No description found for parameter 'viodrv'
Warning(arch/powerpc/kernel/vio.c:1270): Excess function parameter 'drv' description in '__vio_register_driver'
Warning(arch/powerpc/kernel/vio.c:1289): No description found for parameter 'viodrv'
Warning(arch/powerpc/kernel/vio.c:1289): Excess function parameter 'driver' description in 'vio_unregister_driver'

Signed-off-by: Wanpeng Li <liwp@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:49 +10:00
Anton Blanchard
b4c3a8729a powerpc/iommu: Implement IOMMU pools to improve multiqueue adapter performance
At the moment all queues in a multiqueue adapter will serialise
against the IOMMU table lock. This is proving to be a big issue,
especially with 10Gbit ethernet.

This patch creates 4 pools and tries to spread the load across
them. If the table is under 1GB in size we revert back to the
original behaviour of 1 pool and 1 largealloc pool.

We create a hash to map CPUs to pools. Since we prefer interrupts to
be affinitised to primary CPUs, without some form of hashing we are
very likely to end up using the same pool. As an example, POWER7
has 4 way SMT and with 4 pools all primary threads will map to the
same pool.

The largealloc pool is reduced from 1/2 to 1/4 of the space to
partially offset the overhead of breaking the table up into pools.

Some performance numbers were obtained with a Chelsio T3 adapter on
two POWER7 boxes, running a 100 session TCP round robin test.

Performance improved 69% with this patch applied.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:48 +10:00
Anton Blanchard
d362213722 powerpc/iommu: Push spinlock into iommu_range_alloc and __iommu_free
In preparation for IOMMU pools, push the spinlock into
iommu_range_alloc and __iommu_free.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:47 +10:00
Anton Blanchard
67ca141567 powerpc/iommu: Reduce spinlock coverage in iommu_free
This patch moves tce_free outside of the lock in iommu_free.

Some performance numbers were obtained with a Chelsio T3 adapter on
two POWER7 boxes, running a 100 session TCP round robin test.

Performance improved 25% with this patch applied.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:47 +10:00
Anton Blanchard
0e4bc95d87 powerpc/iommu: Reduce spinlock coverage in iommu_alloc and iommu_free
We currently hold the IOMMU spinlock around tce_build and tce_flush.
This causes our spinlock hold times to be much higher than required
and can impact multiqueue adapters.

This patch moves tce_build and tce_flush outside of the lock in
iommu_alloc, and tce_flush outside of the lock in iommu_free.

Some performance numbers were obtained with a Chelsio T3 adapter on
two POWER7 boxes, running a 100 session TCP round robin test.

Performance improved 32% with this patch applied.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:47 +10:00
Gavin Shan
8127e723da powerpc/pci: cleanup on duplicate assignment
While creating the PCI root bus through function pci_create_root_bus()
of PCI core, it should have assigned the secondary bus number for the
newly created PCI root bus. Thus we needn't do the explicit assignment
for the secondary bus number again in pcibios_scan_phb().

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:45 +10:00
Anton Blanchard
ac1dc36558 powerpc: Clear RI and EE at the same time in system call exit
mtmsrd is an expensive instruction, we save a few cycles by
doing it once instead of twice.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:43 +10:00
Yong Zhang
e250d4bca6 powerpc/smp: remove call to ipi_call_lock()/ipi_call_unlock()
1) call_function.lock used in smp_call_function_many() is just to protect
   call_function.queue and &data->refs, cpu_online_mask is outside of the
   lock. And it's not necessary to protect cpu_online_mask,
   because data->cpumask is pre-calculate and even if a cpu is brougt up
   when calling arch_send_call_function_ipi_mask(), it's harmless because
   validation test in generic_smp_call_function_interrupt() will take care
   of it.

2) For cpu down issue, stop_machine() will guarantee that no concurrent
   smp_call_fuction() is processing.

Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:42 +10:00
Steven Rostedt
65b8c7226e powerpc/ftrace: Use patch_instruction instead of probe_kernel_write()
The patch_instruction() interface is made to modify kernel text. It is
safer to use that then the probe_kernel_write() when modifying kernel
code.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:39 +10:00
Steven Rostedt
ee456bb346 powerpc/ftrace: Have PPC skip updating with stop_machine()
PowerPC does not have the synchronization issues that x86 has with
modifying code on one CPU while another CPU is executing it.
The other CPU will either see the old or new code without any
issues, unlike x86 which may issue a GPF.

Instead of calling the heavy stop_machine, just update the code.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:38 +10:00
Steven Rostedt
2d773aa481 powerpc/ftrace: Do not trace restore_interrupts()
As I was adding code that affects all archs, I started testing function
tracer against PPC64 and found that it currently locks up with 3.4
kernel. I figured it was due to tracing a function that shouldn't be, so
I went through the following process to bisect to find the culprit:

 cat /debug/tracing/available_filter_functions > t
 num=`wc -l t`
 sed -ne "1,${num}p" t > t1
 let num=num+1
 sed -ne "${num},$p" t > t2
 cat t1 > /debug/tracing/set_ftrace_filter
 echo function /debug/tracing/current_tracer
 <failed? bisect t1, if not bisect t2>

It finally came down to this function: restore_interrupts()

I'm not sure why this locks up the system. It just seems to prevent
scheduling from occurring. Interrupts seem to still work, as I can ping
the box. But all user processes freeze.

When restore_interrupts() is not traced, function tracing works fine.

Cc: stable@kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-06-29 14:35:36 +10:00
Li Zhong
2cb387ae75 powerpc: Fix Section mismatch warnings in prom_init.c
This patches tries to fix a couple of Section mismatch warnings like
following one:

WARNING: arch/powerpc/kernel/built-in.o(.text+0x2923c): Section mismatch
in reference from the function .prom_query_opal() to the
function .init.text:.call_prom()
The function .prom_query_opal() references
the function __init .call_prom().
This is often because .prom_query_opal lacks a __init
annotation or the annotation of .call_prom is wrong.

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-06-29 14:35:36 +10:00
Tiejun Chen
c58ce2b1e3 ppc64: fix missing to check all bits of _TIF_USER_WORK_MASK in preempt
In entry_64.S version of ret_from_except_lite, you'll notice that
in the !preempt case, after we've checked MSR_PR we test for any
TIF flag in _TIF_USER_WORK_MASK to decide whether to go to do_work
or not. However, in the preempt case, we do a convoluted trick to
test SIGPENDING only if PR was set and always test NEED_RESCHED ...
but we forget to test any other bit of _TIF_USER_WORK_MASK !!! So
that means that with preempt, we completely fail to test for things
like single step, syscall tracing, etc...

This should be fixed as the following path:

 - Test PR. If not set, go to resume_kernel, else continue.

 - If go resume_kernel, to do that original do_work.

 - If else, then always test for _TIF_USER_WORK_MASK to decide to do
that original user_work, else restore directly.

Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-06-29 14:35:35 +10:00
Yinghai Lu
be8e60d8be powerpc/PCI: register busn_res for root buses
Add the host bridge bus number aperture to the resource list.
Like the MMIO and I/O port apertures, this is used when assigning
resources to hot-added devices or in the case of conflicts.

[bhelgaas: changelog]
CC: Paul Mackerras <paulus@samba.org>
CC: linuxppc-dev@lists.ozlabs.org
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-06-13 15:42:24 -06:00
Yinghai Lu
b918c62e08 PCI: replace struct pci_bus secondary/subordinate with busn_res
Replace the struct pci_bus secondary/subordinate members with the
struct resource busn_res.  Later we'll build a resource tree of these
bus numbers.

[bhelgaas: changelog]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-06-13 15:42:22 -06:00
Steffen Rumler
3c75296562 powerpc: Fix kernel panic during kernel module load
This fixes a problem which can causes kernel oopses while loading
a kernel module.

According to the PowerPC EABI specification, GPR r11 is assigned
the dedicated function to point to the previous stack frame.
In the powerpc-specific kernel module loader, do_plt_call()
(in arch/powerpc/kernel/module_32.c), GPR r11 is also used
to generate trampoline code.

This combination crashes the kernel, in the case where the compiler
chooses to use a helper function for saving GPRs on entry, and the
module loader has placed the .init.text section far away from the
.text section, meaning that it has to generate a trampoline for
functions in the .init.text section to call the GPR save helper.
Because the trampoline trashes r11, references to the stack frame
using r11 can cause an oops.

The fix just uses GPR r12 instead of GPR r11 for generating the
trampoline code.  According to the statements from Freescale, this is
safe from an EABI perspective.

I've tested the fix for kernel 2.6.33 on MPC8541.

Cc: stable@vger.kernel.org
Signed-off-by: Steffen Rumler <steffen.rumler.ext@nsn.com>
[paulus@samba.org: reworded the description]
Signed-off-by: Paul Mackerras <paulus@samba.org>
2012-06-08 19:59:08 +10:00
Paul Mackerras
860aed25a1 powerpc/time: Sanity check of decrementer expiration is necessary
This reverts 68568add2c ("powerpc/time: Remove unnecessary sanity check
of decrementer expiration").  We do need to check whether we have reached
the expiration time of the next event, because we sometimes get an early
decrementer interrupt, most notably when we set the decrementer to 1 in
arch_irq_work_raise().  The effect of not having the sanity check is that
if timer_interrupt() gets called early, we leave the decrementer set to
its maximum value, which means we then don't get any more decrementer
interrupts for about 4 seconds (or longer, depending on timebase
frequency).  I saw these pauses as a consequence of getting a stray
hypervisor decrementer interrupt left over from exiting a KVM guest.

This isn't quite a straight revert because of changes to the surrounding
code, but it restores the same algorithm as was previously used.

Cc: stable@vger.kernel.org
Acked-by: Anton Blanchard <anton@samba.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2012-06-08 14:07:35 +10:00
Avi Kivity
25e531a988 Merge branch 'for-upstream' of git://github.com/agraf/linux-2.6 into next
Alex says:

"Changes this time include:

  - Generalize KVM_GUEST support to overall ePAPR code
  - Fix reset for Book3S HV
  - Fix machine check deferral when CONFIG_KVM_GUEST=y
  - Add support for BookE register DECAR"

* 'for-upstream' of git://github.com/agraf/linux-2.6:
  KVM: PPC: Not optimizing MSR_CE and MSR_ME with paravirt.
  KVM: PPC: booke: Added DECAR support
  KVM: PPC: Book3S HV: Make the guest hash table size configurable
  KVM: PPC: Factor out guest epapr initialization

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-06-06 15:31:34 +03:00
Al Viro
efee984c27 new helper: signal_delivered()
Does block_sigmask() + tracehook_signal_handler();  called when
sigframe has been successfully built.  All architectures converted
to it; block_sigmask() itself is gone now (merged into this one).

I'm still not too happy with the signature, but that's a separate
story (IMO we need a structure that would contain signal number +
siginfo + k_sigaction, so that get_signal_to_deliver() would fill one,
signal_delivered(), handle_signal() and probably setup...frame() -
take one).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01 12:58:52 -04:00
Al Viro
17440f171e powerpc: get rid of restore_sigmask()
... it's just a call of set_current_blocked() now

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01 12:58:51 -04:00
Al Viro
77097ae503 most of set_current_blocked() callers want SIGKILL/SIGSTOP removed from set
Only 3 out of 63 do not.  Renamed the current variant to __set_current_blocked(),
added set_current_blocked() that will exclude unblockable signals, switched
open-coded instances to it.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01 12:58:51 -04:00
Al Viro
a610d6e672 pull clearing RESTORE_SIGMASK into block_sigmask()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01 12:58:49 -04:00
Al Viro
b7f9a11a6c new helper: sigmask_to_save()
replace boilerplate "should we use ->saved_sigmask or ->blocked?"
with calls of obvious inlined helper...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01 12:58:48 -04:00
Al Viro
51a7b448d4 new helper: restore_saved_sigmask()
first fruits of ..._restore_sigmask() helpers: now we can take
boilerplate "signal didn't have a handler, clear RESTORE_SIGMASK
and restore the blocked mask from ->saved_mask" into a common
helper.  Open-coded instances switched...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01 12:58:47 -04:00
Linus Torvalds
fb21affa49 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Pull second pile of signal handling patches from Al Viro:
 "This one is just task_work_add() series + remaining prereqs for it.

  There probably will be another pull request from that tree this
  cycle - at least for helpers, to get them out of the way for per-arch
  fixes remaining in the tree."

Fix trivial conflict in kernel/irq/manage.c: the merge of Andrew's pile
had brought in commit 97fd75b7b8 ("kernel/irq/manage.c: use the
pr_foo() infrastructure to prefix printks") which changed one of the
pr_err() calls that this merge moves around.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
  keys: kill task_struct->replacement_session_keyring
  keys: kill the dummy key_replace_session_keyring()
  keys: change keyctl_session_to_parent() to use task_work_add()
  genirq: reimplement exit_irq_thread() hook via task_work_add()
  task_work_add: generic process-context callbacks
  avr32: missed _TIF_NOTIFY_RESUME on one of do_notify_resume callers
  parisc: need to check NOTIFY_RESUME when exiting from syscall
  move key_repace_session_keyring() into tracehook_notify_resume()
  TIF_NOTIFY_RESUME is defined on all targets now
2012-05-31 18:47:30 -07:00
Bharat Bhushan
d35b1075af KVM: PPC: Not optimizing MSR_CE and MSR_ME with paravirt.
If there is pending critical or machine check interrupt then guest
would like to capture it when guest enable MSR.CE and MSR_ME respectively.
Also as mostly MSR_CE and MSR_ME are updated with rfi/rfci/rfmii
which anyway traps so removing the the paravirt optimization for MSR.CE
and MSR.ME.

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-05-30 11:43:11 +02:00
Liu Yu-B13201
2e1ae9c07d KVM: PPC: Factor out guest epapr initialization
epapr paravirtualization support is now a Kconfig
selectable option

Signed-off-by: Liu Yu <yu.liu@freescale.com>
[stuart.yoder@freescale.com: misc minor fixes, description update]
Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-05-30 11:43:10 +02:00
Paul Mackerras
1629372caa powerpc: Use the new generic strncpy_from_user() and strnlen_user()
This is much the same as for SPARC except that we can do the find_zero()
function more efficiently using the count-leading-zeroes instructions.
Tested on 32-bit and 64-bit PowerPC.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-27 21:00:07 -07:00
Linus Torvalds
07acfc2a93 Merge branch 'next' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM changes from Avi Kivity:
 "Changes include additional instruction emulation, page-crossing MMIO,
  faster dirty logging, preventing the watchdog from killing a stopped
  guest, module autoload, a new MSI ABI, and some minor optimizations
  and fixes.  Outside x86 we have a small s390 and a very large ppc
  update.

  Regarding the new (for kvm) rebaseless workflow, some of the patches
  that were merged before we switch trees had to be rebased, while
  others are true pulls.  In either case the signoffs should be correct
  now."

Fix up trivial conflicts in Documentation/feature-removal-schedule.txt
arch/powerpc/kvm/book3s_segment.S and arch/x86/include/asm/kvm_para.h.

I suspect the kvm_para.h resolution ends up doing the "do I have cpuid"
check effectively twice (it was done differently in two different
commits), but better safe than sorry ;)

* 'next' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (125 commits)
  KVM: make asm-generic/kvm_para.h have an ifdef __KERNEL__ block
  KVM: s390: onereg for timer related registers
  KVM: s390: epoch difference and TOD programmable field
  KVM: s390: KVM_GET/SET_ONEREG for s390
  KVM: s390: add capability indicating COW support
  KVM: Fix mmu_reload() clash with nested vmx event injection
  KVM: MMU: Don't use RCU for lockless shadow walking
  KVM: VMX: Optimize %ds, %es reload
  KVM: VMX: Fix %ds/%es clobber
  KVM: x86 emulator: convert bsf/bsr instructions to emulate_2op_SrcV_nobyte()
  KVM: VMX: unlike vmcs on fail path
  KVM: PPC: Emulator: clean up SPR reads and writes
  KVM: PPC: Emulator: clean up instruction parsing
  kvm/powerpc: Add new ioctl to retreive server MMU infos
  kvm/book3s: Make kernel emulated H_PUT_TCE available for "PR" KVM
  KVM: PPC: bookehv: Fix r8/r13 storing in level exception handler
  KVM: PPC: Book3S: Enable IRQs during exit handling
  KVM: PPC: Fix PR KVM on POWER7 bare metal
  KVM: PPC: Fix stbux emulation
  KVM: PPC: bookehv: Use lwz/stw instead of PPC_LL/PPC_STL for 32-bit fields
  ...
2012-05-24 16:17:30 -07:00
Al Viro
a42c6ded82 move key_repace_session_keyring() into tracehook_notify_resume()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-05-23 22:09:20 -04:00
Linus Torvalds
f9369910a6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Pull first series of signal handling cleanups from Al Viro:
 "This is just the first part of the queue (about a half of it);
  assorted fixes all over the place in signal handling.

  This one ends with all sigsuspend() implementations switched to
  generic one (->saved_sigmask-based).

  With this, a bunch of assorted old buglets are fixed and most of the
  missing bits of NOTIFY_RESUME hookup are in place.  Two more fixes sit
  in arm and um trees respectively, and there's a couple of broken ones
  that need obvious fixes - parisc and avr32 check TIF_NOTIFY_RESUME
  only on one of two codepaths; fixes for that will happen in the next
  series"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: (55 commits)
  unicore32: if there's no handler we need to restore sigmask, syscall or no syscall
  xtensa: add handling of TIF_NOTIFY_RESUME
  microblaze: drop 'oldset' argument of do_notify_resume()
  microblaze: handle TIF_NOTIFY_RESUME
  score: add handling of NOTIFY_RESUME to do_notify_resume()
  m68k: add TIF_NOTIFY_RESUME and handle it.
  sparc: kill ancient comment in sparc_sigaction()
  h8300: missing checks of __get_user()/__put_user() return values
  frv: missing checks of __get_user()/__put_user() return values
  cris: missing checks of __get_user()/__put_user() return values
  powerpc: missing checks of __get_user()/__put_user() return values
  sh: missing checks of __get_user()/__put_user() return values
  sparc: missing checks of __get_user()/__put_user() return values
  avr32: struct old_sigaction is never used
  m32r: struct old_sigaction is never used
  xtensa: xtensa_sigaction doesn't exist
  alpha: tidy signal delivery up
  score: don't open-code force_sigsegv()
  cris: don't open-code force_sigsegv()
  blackfin: don't open-code force_sigsegv()
  ...
2012-05-23 18:11:45 -07:00
Linus Torvalds
ec0d7f18ab Merge branch 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull fpu state cleanups from Ingo Molnar:
 "This tree streamlines further aspects of FPU handling by eliminating
  the prepare_to_copy() complication and moving that logic to
  arch_dup_task_struct().

  It also fixes the FPU dumps in threaded core dumps, removes and old
  (and now invalid) assumption plus micro-optimizes the exit path by
  avoiding an FPU save for dead tasks."

Fixed up trivial add-add conflict in arch/sh/kernel/process.c that came
in because we now do the FPU handling in arch_dup_task_struct() rather
than the legacy (and now gone) prepare_to_copy().

* 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, fpu: drop the fpu state during thread exit
  x86, xsave: remove thread_has_fpu() bug check in __sanitize_i387_state()
  coredump: ensure the fpu state is flushed for proper multi-threaded core dump
  fork: move the real prepare_to_copy() users to arch_dup_task_struct()
2012-05-23 10:59:07 -07:00
Linus Torvalds
6f73b3629f Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull powerpc updates from Benjamin Herrenschmidt:
 "Here are the powerpc goodies for 3.5.  Main highlights are:

   - Support for the NX crypto engine in Power7+
   - A bunch of Anton goodness, including some micro optimization of our
     syscall entry on Power7
   - I converted a pile of our thermal control drivers to the new i2c
     APIs (essentially turning the old therm_pm72 into a proper set of
     windfarm drivers).  That's one more step toward removing the
     deprecated i2c APIs, there's still a few drivers to fix, but we are
     getting close
   - kexec/kdump support for 47x embedded cores

  The big missing thing here is no updates from Freescale.  Not sure
  what's up here, but with Kumar not working for them anymore things are
  a bit in a state of flux in that area."

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (71 commits)
  powerpc: Fix irq distribution
  Revert "powerpc/hw-breakpoint: Use generic hw-breakpoint interfaces for new PPC ptrace flags"
  powerpc: Fixing a cputhread code documentation
  powerpc/crypto: Enable the PFO-based encryption device
  powerpc/crypto: Build files for the nx device driver
  powerpc/crypto: debugfs routines and docs for the nx device driver
  powerpc/crypto: SHA512 hash routines for nx encryption
  powerpc/crypto: SHA256 hash routines for nx encryption
  powerpc/crypto: AES-XCBC mode routines for nx encryption
  powerpc/crypto: AES-GCM mode routines for nx encryption
  powerpc/crypto: AES-ECB mode routines for nx encryption
  powerpc/crypto: AES-CTR mode routines for nx encryption
  powerpc/crypto: AES-CCM mode routines for nx encryption
  powerpc/crypto: AES-CBC mode routines for nx encryption
  powerpc/crypto: nx driver code supporting nx encryption
  powerpc/pseries: Enable the PFO-based RNG accelerator
  powerpc/pseries/hwrng: PFO-based hwrng driver
  powerpc/pseries: Add PFO support to the VIO bus
  powerpc/pseries: Add pseries update notifier for OFDT prop changes
  powerpc/pseries: Add new hvcall constants to support PFO
  ...
2012-05-23 09:02:42 -07:00
Kim Phillips
2074b1d9d5 powerpc: Fix irq distribution
setting CONFIG_IRQ_ALL_CPUS distributes IRQs to CPUs only when
the number of online CPUs equals NR_CPUS.  See commit
280ff97494 "sparc64: fix and
optimize irq distribution" for more details.

Using the online mask fixes IRQ-to-CPU distribution on systems
that boot with less than NR_CPUS.

Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-05-22 14:38:26 +10:00
Benjamin Herrenschmidt
6749ef0b8b Revert "powerpc/hw-breakpoint: Use generic hw-breakpoint interfaces for new PPC ptrace flags"
This reverts commit 1b788400bb.

It causes oopses when passed incorrect arguments and has a
design fault using IPIs with interrupts disabled.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
2012-05-22 14:37:24 +10:00
Al Viro
43f16819d5 powerpc: missing checks of __get_user()/__put_user() return values
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-05-21 23:59:22 -04:00
Al Viro
68f3f16d9a new helper: sigsuspend()
guts of saved_sigmask-based sigsuspend/rt_sigsuspend.  Takes
kernel sigset_t *.

Open-coded instances replaced with calling it.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-05-21 23:52:30 -04:00
Linus Torvalds
cb60e3e65c Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "New notable features:
   - The seccomp work from Will Drewry
   - PR_{GET,SET}_NO_NEW_PRIVS from Andy Lutomirski
   - Longer security labels for Smack from Casey Schaufler
   - Additional ptrace restriction modes for Yama by Kees Cook"

Fix up trivial context conflicts in arch/x86/Kconfig and include/linux/filter.h

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (65 commits)
  apparmor: fix long path failure due to disconnected path
  apparmor: fix profile lookup for unconfined
  ima: fix filename hint to reflect script interpreter name
  KEYS: Don't check for NULL key pointer in key_validate()
  Smack: allow for significantly longer Smack labels v4
  gfp flags for security_inode_alloc()?
  Smack: recursive tramsmute
  Yama: replace capable() with ns_capable()
  TOMOYO: Accept manager programs which do not start with / .
  KEYS: Add invalidation support
  KEYS: Do LRU discard in full keyrings
  KEYS: Permit in-place link replacement in keyring list
  KEYS: Perform RCU synchronisation on keys prior to key destruction
  KEYS: Announce key type (un)registration
  KEYS: Reorganise keys Makefile
  KEYS: Move the key config into security/keys/Kconfig
  KEYS: Use the compat keyctl() syscall wrapper on Sparc64 for Sparc32 compat
  Yama: remove an unused variable
  samples/seccomp: fix dependencies on arch macros
  Yama: add additional ptrace scopes
  ...
2012-05-21 20:27:36 -07:00
Linus Torvalds
bf67f3a5c4 Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull smp hotplug cleanups from Thomas Gleixner:
 "This series is merily a cleanup of code copied around in arch/* and
  not changing any of the real cpu hotplug horrors yet.  I wish I'd had
  something more substantial for 3.5, but I underestimated the lurking
  horror..."

Fix up trivial conflicts in arch/{arm,sparc,x86}/Kconfig and
arch/sparc/include/asm/thread_info_32.h

* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (79 commits)
  um: Remove leftover declaration of alloc_task_struct_node()
  task_allocator: Use config switches instead of magic defines
  sparc: Use common threadinfo allocator
  score: Use common threadinfo allocator
  sh-use-common-threadinfo-allocator
  mn10300: Use common threadinfo allocator
  powerpc: Use common threadinfo allocator
  mips: Use common threadinfo allocator
  hexagon: Use common threadinfo allocator
  m32r: Use common threadinfo allocator
  frv: Use common threadinfo allocator
  cris: Use common threadinfo allocator
  x86: Use common threadinfo allocator
  c6x: Use common threadinfo allocator
  fork: Provide kmemcache based thread_info allocator
  tile: Use common threadinfo allocator
  fork: Provide weak arch_release_[task_struct|thread_info] functions
  fork: Move thread info gfp flags to header
  fork: Remove the weak insanity
  sh: Remove cpu_idle_wait()
  ...
2012-05-21 19:43:57 -07:00
Suresh Siddha
55ccf3fe3f fork: move the real prepare_to_copy() users to arch_dup_task_struct()
Historical prepare_to_copy() is mostly a no-op, duplicated for majority of
the architectures and the rest following the x86 model of flushing the extended
register state like fpu there.

Remove it and use the arch_dup_task_struct() instead.

Suggested-by: Oleg Nesterov <oleg@redhat.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/1336692811-30576-1-git-send-email-suresh.b.siddha@intel.com
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Chris Zankel <chris@zankel.net>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Mark Salter <msalter@redhat.com>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-05-16 15:16:26 -07:00
Kent Yoder
7e3a4fa160 powerpc/crypto: Enable the PFO-based encryption device
This patch adds the cas bits to advertise support for the Platform
Facilities Option (PFO) based encryption accelerator device. The nx
device driver provides support for this hardware feature.

Signed-off-by: Kent Yoder <key@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-05-16 15:05:46 +10:00
Kent Yoder
828d2b5971 powerpc/pseries: Enable the PFO-based RNG accelerator
This patch adds the cas bits to advertise support for the Platform
Facilities Option (PFO) based random number generator accerator.
The pseries-rng driver provides support for this hardware feature.

Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Signed-off-by: Kent Yoder <key@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-05-14 10:49:14 +10:00
Kent Yoder
f2ab621996 powerpc/pseries: Add PFO support to the VIO bus
Add support for the Platform Facilities Option (PFO) to the VIO bus.
These devices have a separate root node in OpenFirmware which
requires additional parsing to map into the existing VIO device
structure fields. This adds the interface for PFO device drivers to
make synchronous hypervisor calls.

Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Signed-off-by: Kent Yoder <key@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-05-14 10:49:09 +10:00
K.Prasad
1b788400bb powerpc/hw-breakpoint: Use generic hw-breakpoint interfaces for new PPC ptrace flags
PPC_PTRACE_GETHWDBGINFO, PPC_PTRACE_SETHWDEBUG and PPC_PTRACE_DELHWDEBUG are
PowerPC specific ptrace flags that use the watchpoint register. While they are
targeted primarily towards BookE users, user-space applications such as GDB
have started using them for BookS too. This patch enables the use of generic
hardware breakpoint interfaces for these new flags.

Apart from the usual benefits of using generic hw-breakpoint interfaces, these
changes allow debuggers (such as GDB) to use a common set of ptrace flags for
their watchpoint needs and allow more precise breakpoint specification (length
of the variable can be specified).

Signed-off-by: K.Prasad <prasad@linux.vnet.ibm.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-05-14 10:48:55 +10:00
Robert Jennings
404e32e4a8 powerpc/pseries: Support lower minimum entitlement for virtual processors
This patch changes the architecture vector to advertise support for a
lower minimum virtual processor entitled capacity.  The default
minimum without this patch is 10%, this patch specifies 1%.

Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-05-14 10:30:13 +10:00
Benjamin Herrenschmidt
8b6ee04067 Merge branch 'merge' into next
We want the irq fixes from the "merge" branch.
2012-05-14 10:19:22 +10:00
Benjamin Herrenschmidt
7c0482e3d0 powerpc/irq: Fix another case of lazy IRQ state getting out of sync
So we have another case of paca->irq_happened getting out of
sync with the HW irq state. This can happen when a perfmon
interrupt occurs while soft disabled, as it will return to a
soft disabled but hard enabled context while leaving a stale
PACA_IRQ_HARD_DIS flag set.

This patch fixes it, and also adds a test for the condition
of those flags being out of sync in arch_local_irq_restore()
when CONFIG_TRACE_IRQFLAGS is enabled.

This helps catching those gremlins faster (and so far I
can't seem see any anymore, so that's good news).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-05-12 09:40:41 +10:00
Benjamin Herrenschmidt
b48d441a8a Merge remote-tracking branch 'jwb/next' into next
Josh writes:
<<
A few patches from Suzie for 47x kexec/kdump support, and some MSI patches
from Mai La.
>>
2012-05-10 12:58:24 +10:00
Benjamin Herrenschmidt
ea4e89afed Merge branch 'merge' into next 2012-05-09 10:57:57 +10:00
Benjamin Herrenschmidt
a3512b2dd5 powerpc/irq: Make alignment & program interrupt behave the same
Alignment was the last user of the ENABLE_INTS macro, which we can
now remove. All non-syscall exceptions now disable interrupts on
entry, they get re-enabled conditionally from C code. Don't
unconditionally re-enable in program check either, check the
original context.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-05-09 09:42:33 +10:00
Benjamin Herrenschmidt
56dfa7fa19 powerpc/irq: Fix bug with new lazy IRQ handling code
We had a case where we could turn on hard interrupts while
leaving the PACA_IRQ_HARD_DIS bit set in the PACA. This can
in turn cause a BUG_ON() to hit in __check_irq_replay() due
to interrupt state getting out of sync.

The assembly code was also way too convoluted. Instead, we
now leave it to the C code to do the right thing which ends
up being smaller and more readable.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-05-09 09:42:21 +10:00
Thomas Gleixner
96c9511797 powerpc: Use common threadinfo allocator
The core now has a threadinfo allocator which uses a kmemcache when
THREAD_SIZE < PAGE_SIZE.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Link: http://lkml.kernel.org/r/20120505150142.059161130@linutronix.de
2012-05-08 14:08:45 +02:00
Thomas Gleixner
67ba5293f7 Merge branch 'smp/threadalloc' into smp/hotplug
Reason: Pull in the separate branch which was created so arch/tile can
base further work on it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-05-08 14:07:48 +02:00
Thomas Gleixner
c9b92b8407 powerpc: Remove unused cpu_idle_wait()
cpuidle uses a generic function now. Remove the cruft.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Link: http://lkml.kernel.org/r/20120507175652.330322737@linutronix.de
2012-05-08 12:35:07 +02:00
Thomas Gleixner
9cd75e13de powerpc: Fix broken cpu_idle_wait() implementation
commit 771dae818 (powerpc/cpuidle: Add cpu_idle_wait() to allow
switching of idle routines) implemented cpu_idle_wait() for powerpc.

The changelog says:
 "The equivalent routine for x86 is in arch/x86/kernel/process.c
  but the powerpc implementation is different.":

Unfortunately the changelog is completely useless as it does not tell
_WHY_ it is different.

Aside of being different the implementation is patently wrong.

The rescheduling IPI is async. That means that there is no guarantee,
that the other cores have executed the IPI when cpu_idle_wait()
returns. But that's the whole purpose of this function: to guarantee
that no CPU uses the old idle handler anymore.

Use the smp_functional_call() based implementation, which fulfils the
requirements.

[ This code is going to replaced by a core version to remove all the
  pointless copies in arch/*, but this one should go to stable ]

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Cc: Trinabh Gupta <g.trinabh@gmail.com>
Cc: Arun R Bharadwaj <arun.r.bharadwaj@gmail.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Link: http://lkml.kernel.org/r/20120507175651.980164748@linutronix.de
Cc: stable@vger.kernel.org
2012-05-08 12:35:05 +02:00
Benjamin Herrenschmidt
5b74716eba kvm/powerpc: Add new ioctl to retreive server MMU infos
This is necessary for qemu to be able to pass the right information
to the guest, such as the supported page sizes and corresponding
encodings in the SLB and hash table, which can vary depending
on the processor type, the type of KVM used (PR vs HV) and the
version of KVM

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[agraf: fix compilation on hv, adjust for newer ioctl numbers]
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-05-06 16:19:12 +02:00
Bharat Bhushan
6e35994d1f KVM: PPC: Use clockevent multiplier and shifter for decrementer
Time for which the hrtimer is started for decrementer emulation is calculated
using tb_ticks_per_usec. While hrtimer uses the clockevent for DEC
reprogramming (if needed) and which calculate timebase ticks using the
multiplier and shifter mechanism implemented within clockevent layer.

It was observed that this conversion (timebase->time->timebase) are not
correct because the mechanism are not consistent.
In our setup it adds 2% jitter.

With this patch clockevent multiplier and shifter mechanism are used when
starting hrtimer for decrementer emulation. Now the jitter is < 0.5%.

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-05-06 16:19:07 +02:00
Thomas Gleixner
b0ce50aa89 powerpc: Use generic init_task
Same code. Use the generic version.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Link: http://lkml.kernel.org/r/20120503085035.211123184@linutronix.de
2012-05-05 13:00:25 +02:00
James Morris
898bfc1d46 Linux 3.4-rc5
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.18 (GNU/Linux)
 
 iQEcBAABAgAGBQJPnb50AAoJEHm+PkMAQRiGAE0H/A4zFZIUGmF3miKPDYmejmrZ
 oVDYxVAu6JHjHWhu8E3VsinvyVscowjV8dr15eSaQzmDmRkSHAnUQ+dB7Di7jLC2
 MNopxsWjwyZ8zvvr3rFR76kjbWKk/1GYytnf7GPZLbJQzd51om2V/TY/6qkwiDSX
 U8Tt7ihSgHAezefqEmWp2X/1pxDCEt+VFyn9vWpkhgdfM1iuzF39MbxSZAgqDQ/9
 JJrBHFXhArqJguhENwL7OdDzkYqkdzlGtS0xgeY7qio2CzSXxZXK4svT6FFGA8Za
 xlAaIvzslDniv3vR2ZKd6wzUwFHuynX222hNim3QMaYdXm012M+Nn1ufKYGFxI0=
 =4d4w
 -----END PGP SIGNATURE-----

Merge tag 'v3.4-rc5' into next

Linux 3.4-rc5

Merge to pull in prerequisite change for Smack:
86812bb0de

Requested by Casey.
2012-05-04 12:46:40 +10:00
Suzuki Poulose
6834302003 powerpc/47x: Kernel support for KEXEC
This patch adds support for creating 1:1 mapping for the PPC_47x during
a KEXEC. The implementation is similar to that of the PPC440x which is
described here :

	http://patchwork.ozlabs.org/patch/104323/

PPC_47x MMU :

The 47x uses Unified TLB 1024 entries, with 4-way associative mapping
(4 x 256 entries). The index to be used is calculated by the MMU by
hashing the PID, EPN and TS. The software can choose to specify the way
by setting bit 0(enable way select) and the way in bits 1-2 in the TLB
Word 0.

Implementation:

The patch erases all the UTLB entries which includes the tlb covering
the mapping for our code. The shadow TLB caches the mapping for the
running code which helps us to continue the execution until we do
isync/rfi. We then create a tmp mapping for the current code in the
other address space (TS) and switch to it.

Then we create a 1:1 mapping(EPN=RPN) for 0-2GiB in the original
address space and switch to the new mapping.

TODO: Add SMP support.

Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@gmail.com>
2012-05-03 08:40:23 -04:00
Suzuki Poulose
f13bfcc696 powerpc/44x: Fix/Initialize PID to kernel PID before the TLB search
Initialize the PID register with kernel pid (0) before we start
setting the TLB mapping for KEXEC. Also set the MMUCR[TID] to kernel
PID.

This was spotted while testing the kexec on ISS for 47x. ISS  doesn't
return a successful tlbsx for a kernel address with PID set to a user PID.
Though the hardware/qemu/simics work fine.

This patch is harmless and initializes the PID to 0 (kernel PID) which
is usually the case during a normal kernel boot. This would fix the kexec
on ISS for 440. I have tested this patch on sequoia board.

Signed-off-by: Suzuki K Poulose <suzuki@in.ibm.com>
Cc: Josh Boyer <jwboyer@gmail.com>
Signed-off-by: Josh Boyer <jwboyer@gmail.com>
2012-05-03 08:37:36 -04:00
Anton Blanchard
ec34a68149 powerpc: Remove old powerpc specific ptrace getregs/setregs calls
PowerPC has non standard getregs calls that only dump the GPRs or
FPRs and have their arguments reversed. commit e17666ba48 (ptrace
updates & new, better requests) in 2.6.3 deprecated them and introduced
more standard versions.

It's been about 5 years and I know of no users of the old calls so
lets remove them.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-04-30 15:37:28 +10:00
Anton Blanchard
694caf0255 powerpc: Remove CONFIG_POWER4_ONLY
Remove CONFIG_POWER4_ONLY, the option is badly named and only does two
things:

- It wraps the MMU segment table code. With feature fixups there is
  little downside to compiling this in.

- It uses the newer mtocrf instruction in various assembly functions.
  Instead of making this a compile option just do it at runtime via
  a feature fixup.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-04-30 15:37:26 +10:00
Anton Blanchard
35000870fc powerpc: Optimise enable_kernel_altivec
Add two optimisations to enable_kernel_altivec:

- enable_kernel_altivec has already determined if we need to
save the previous task's state but we call giveup_altivec
in both cases, requiring an extra branch in giveup_altivec. Create
giveup_altivec_notask which only turns on the VMX bit in the
MSR.

- We write the VMX MSR bit each time we call enable_kernel_altivec
even it was already set. Check the bit and branch out if we have
already set it. The classic case for this is vectored IO
where we have to copy multiple buffers to or from userspace.

The following testcase was used to confirm this patch improves
performance:

http://ozlabs.org/~anton/junkcode/copy_to_user.c

Since the current breakpoint for using VMX in copy_tofrom_user is
4096 bytes, I'm using buffers of 4096 + 1 cacheline (4224) bytes.
A benchmark of 16 entry readvs (-s 16):

time copy_to_user -l 4224 -s 16 -i 1000000

completes 5.2% faster on a POWER7 PS700.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-04-30 15:37:17 +10:00
Anton Blanchard
8cd3c23df7 powerpc: Remove empty giveup_altivec function on book3e CPUs
Use an empty inline instead of an empty function to implement
giveup_altivec on book3e CPUs, similar to flush_altivec_to_thread.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-04-30 15:37:16 +10:00
Anton Blanchard
448054a650 powerpc: Remove iseries specific fields in lppaca
Remove all the iseries specific fields in the lppaca.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-04-30 15:37:16 +10:00
Anton Blanchard
fd6c40f3b0 powerpc: Better scheduling of CR save code in system call path
At the moment system call entry looks like:

crclr	so
...
mfcr	r9
...
std	r9,_CCR(r1)

commit bd19c8994a ([POWERPC] system call micro optimisation) put
some space between the crclr and mfcr in order to avoid a stall.

There is still a stall seen between the mfcr and std. We can avoid
the crclr by doing it in a GPR with rlwinm which gives us more room
to better schedule the sequence.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-04-30 15:37:14 +10:00
Anton Blanchard
82087414c6 powerpc: No need to preserve count register across system call
The count register is volatile so we don't need to preserve it.
Store zero to the entry in the exception frame.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-04-30 15:35:10 +10:00
Anton Blanchard
823df43552 powerpc: No need to save XER in a system call
The XER is a volatile register so there is no need to save and restore
it over a system call - zero it out in the exception stack frame
instead.

This should fix a 5 cycle stall of the mfxer/std seen on POWER7.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-04-30 15:34:44 +10:00
Anton Blanchard
d14299dec7 powerpc: Hide some system call labels from profile tools
syscall_dotrace_cont and syscall_error_cont tend to complicate perf
output so make them local.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-04-30 15:34:43 +10:00
Grant Likely
4013369f37 powerpc/irqdomain: Fix broken NR_IRQ references
The switch from using irq_map to irq_alloc_desc*() for managing irq
number allocations introduced new bugs in some of the powerpc
interrupt code.  Several functions rely on the value of NR_IRQS to
determine the maximum irq number that could get allocated.  However,
with sparse_irq and using irq_alloc_desc*() the maximum possible irq
number is now specified with 'nr_irqs' which may be a number larger
than NR_IRQS.  This has caused breakage on powermac when
CONFIG_NR_IRQS is set to 32.

This patch removes most of the direct references to NR_IRQS in the
powerpc code and replaces them with either a nr_irqs reference or by
using the common for_each_irq_desc() macro.  The powerpc-specific
for_each_irq() macro is removed at the same time.

Also, the Cell axon_msi driver is refactored to remove the global
build assumption on the size of NR_IRQS and instead add a limit to the
maximum irq number when calling irq_domain_add_nomap().

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-04-30 10:45:26 +10:00
Thomas Gleixner
17e32eacc3 powerpc: Use generic idle thread allocation
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Link: http://lkml.kernel.org/r/20120420124557.311212868@linutronix.de
2012-04-26 12:06:10 +02:00
Thomas Gleixner
8239c25f47 smp: Add task_struct argument to __cpu_up()
Preparatory patch to make the idle thread allocation for secondary
cpus generic.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Howells <dhowells@redhat.com>
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/20120420124556.964170564@linutronix.de
2012-04-26 12:06:09 +02:00
Benjamin Herrenschmidt
aec49c7c0e Merge remote-tracking branch 'kumar/merge' into merge 2012-04-23 10:55:20 +10:00
Marcelo Tosatti
eac0556750 Merge branch 'linus' into queue
Merge reason: development work has dependency on kvm patches merged
upstream.

Conflicts:
	Documentation/feature-removal-schedule.txt

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-04-19 17:06:26 -03:00
Baruch Siach
eda713e219 powerpc: fix build when CONFIG_BOOKE_WDT is enabled
Commit ae3a197e (Disintegrate asm/system.h for PowerPC) broke build of
assembly files when CONFIG_BOOKE_WDT is enabled as follows:

  AS      arch/powerpc/lib/string.o
/home/baruch/git/stable/arch/powerpc/include/asm/reg_booke.h: Assembler messages:
/home/baruch/git/stable/arch/powerpc/include/asm/reg_booke.h:19: Error: Unrecognized opcode: `extern'
/home/baruch/git/stable/arch/powerpc/include/asm/reg_booke.h:20: Error: Unrecognized opcode: `extern'

Since setup_32.c is the only user of the booke_wdt configuration variables, move
the declarations there.

Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2012-04-19 14:55:43 -05:00
Will Drewry
e4da89d02f seccomp: ignore secure_computing return values
This change is inspired by
  https://lkml.org/lkml/2012/4/16/14
which fixes the build warnings for arches that don't support
CONFIG_HAVE_ARCH_SECCOMP_FILTER.

In particular, there is no requirement for the return value of
secure_computing() to be checked unless the architecture supports
seccomp filter.  Instead of silencing the warnings with (void)
a new static inline is added to encode the expected behavior
in a compiler and human friendly way.

v2: - cleans things up with a static inline
    - removes sfr's signed-off-by since it is a different approach
v1: - matches sfr's original change

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Will Drewry <wad@chromium.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-04-18 12:24:50 +10:00
Linus Torvalds
7e06648972 irqdomain bug fixes for v3.4-rc3
This branch fixes a bug in irq_create_mapping() where an error return
 from irq_alloc_desc_from() gets ignored.  It also removes irq_virq_count
 to fix a bug on powerpc where the irqdomain code does not find irqs
 allocated above the CONFIG_NR_IRQS boundary.  The remaining patches get
 rid of an completely pointless export and fix some minor bugs in the
 irqdomain debug output.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJPhni4AAoJEEFnBt12D9kBA/cP/jv3ENYDy2/g1/eE6W1aSkUf
 /7FlfpXsufS0Bl+wfk7sN8D1NLoB/36bLVU0TStup90vL03WT9A+BHl9tjogpZVz
 oDuLFYHSuVVOK40SSrcnOUc6rncKAni9tGjVjFCxVAx3FlqebTHWDu/Cl4BAaWBo
 +j2u4HHelHgr8oXCY5avWS0cOn3L7rIoJ54/Jqpn10OooqH2cgz9xYMb+1/ORfz1
 xjpJ4OiXKnSvuG7WD0S1EKPMbaiyak+jBoHYYNpEOriTMtcOTNg5hjz7b3jDfOrm
 gkNReffdDXCnsCPj/1gEhJlB4i+iTES0lTBVfOZ8M2luhF6wuGUYeRaiy+/m00DZ
 qYFXD5TaVM0+2USCeo71DPfag8now6YrJNIv93CGEY0fLGDJJg2yJI3oUN728p9a
 E88JLPs8f//8rxQaBatGtHmReD4wKwCevciVekSWZSROnPxnIP8PvBPq8e4Bf04r
 q+VBmr+gJh+oaDAZrIaRPsRCidHhwzIrexa4cv7rt84vnx2Hltq75ijaPNlR3JU7
 FFhZj1l8185HxXEsTJHEmiKN0J/drVIu/beGgHD7NbWWIdt8tqgtNOEUudVTisfM
 VgBdgjjbKFwQDuOxgaYgERwCkb1YXFT/kDKpgKaYnxl0yGaALjxO+ISd2fIJOuKO
 fzeVN4LDvVCysAQ/SeOG
 =6Ejq
 -----END PGP SIGNATURE-----

Merge tag 'irqdomain-for-linus' of git://git.secretlab.ca/git/linux-2.6

Pull irqdomain bug fixes from Grant Likely:
 "This branch fixes a bug in irq_create_mapping() where an error return
  from irq_alloc_desc_from() gets ignored.

  It also removes irq_virq_count to fix a bug on powerpc where the
  irqdomain code does not find irqs allocated above the CONFIG_NR_IRQS
  boundary.

  The remaining patches get rid of an completely pointless export and
  fix some minor bugs in the irqdomain debug output."

* tag 'irqdomain-for-linus' of git://git.secretlab.ca/git/linux-2.6:
  irq_domain: Move irq_virq_count into NOMAP revmap
  irqdomain: Fix debugfs formatting
  irq_domain: correct the debugfs file name
  irq: Kill pointless irqd_to_hw export
  irq/irq_domain: Quit ignoring error returns from irq_alloc_desc_from().
2012-04-12 12:49:56 -07:00
Linus Torvalds
45852766a0 Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull powerpc fixes from Benjamin Herrenschmidt:
 "Fixes for two nasty regression affecting powerpc in 3.4."

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc: Fix typo in runlatch code
  powerpc: Fix page fault with lockdep regression
2012-04-11 20:56:28 -07:00
Grant Likely
a699e4e49e irq: Kill pointless irqd_to_hw export
It makes no sense to export this trivial function.  Make it a static inline
instead.

This patch also drops virq_to_hw from arch/c6x since it is unused by that
architecture.

v2: Move irq_hw_number_t into types.h to fix ARM build failure

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-04-10 22:39:17 -06:00
Benjamin Herrenschmidt
fae2e0fb24 powerpc: Fix typo in runlatch code
Commit fe1952fc0a
"powerpc: Rework runlatch code" has a nasty typo
where it uses "TLF_RUNLATCH" instead of "_TLF_RUNLATCH"
(bit number instead of bit mask), causing some flags to
be potentially lost such as _TLF_RESTORE_SIGMASK

(Brown paper bag for me ! We should be able to make
that break at compile time with a bit of magic, any
volunteer ?)

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-04-11 10:42:15 +10:00
Benjamin Herrenschmidt
08f1ec8a59 powerpc: Fix page fault with lockdep regression
commit a546498f3b
introduced a regression on 32-bit when irq tracing
is enabled by exposing an old bug in our irq tracing
code for exception entry.

The code would save and restore some GPRs around the
calls to the C lockdep code, however, it tries to be
too smart for its own good and restores some of the
GPRs from the exception frame (as saved there on
exception entry).

However, for page faults, we do replace those GPRs with
arguments to do_page_fault before we call transfer_to_handler
and so restoring from the exception frame is plain wrong in
this case.

This was fine as long as we didn't touch the interrupt state
when taking page fault, but when I started doing it, it would
trigger the lockdep calls and the bug.

This fixes it by cleaning up that code a bit. It did create
a small stack frame for the sake of backtraces, so let's
make it a bit bigger and use it to save and restore the
stuff we care about.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-04-10 17:21:35 +10:00
Paul Mackerras
7657f4089b KVM: PPC: Book 3S: Fix compilation for !HV configs
Commits 2f5cdd5487 ("KVM: PPC: Book3S HV: Make secondary threads more
robust against stray IPIs") and 1c2066b0f7 ("KVM: PPC: Book3S HV: Make
virtual processor area registration more robust") added fields to
struct kvm_vcpu_arch inside #ifdef CONFIG_KVM_BOOK3S_64_HV regions,
and added lines to arch/powerpc/kernel/asm-offsets.c to generate
assembler constants for their offsets.  Unfortunately this led to
compile errors on Book 3S machines for configs that had KVM enabled
but not CONFIG_KVM_BOOK3S_64_HV.  This fixes the problem by moving
the offending lines inside #ifdef CONFIG_KVM_BOOK3S_64_HV regions.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-04-08 14:01:34 +03:00
Paul Mackerras
2e25aa5f64 KVM: PPC: Book3S HV: Make virtual processor area registration more robust
The PAPR API allows three sorts of per-virtual-processor areas to be
registered (VPA, SLB shadow buffer, and dispatch trace log), and
furthermore, these can be registered and unregistered for another
virtual CPU.  Currently we just update the vcpu fields pointing to
these areas at the time of registration or unregistration.  If this
is done on another vcpu, there is the possibility that the target vcpu
is using those fields at the time and could end up using a bogus
pointer and corrupting memory.

This fixes the race by making the target cpu itself do the update, so
we can be sure that the update happens at a time when the fields
aren't being used.  Each area now has a struct kvmppc_vpa which is
used to manage these updates.  There is also a spinlock which protects
access to all of the kvmppc_vpa structs, other than to the pinned_addr
fields.  (We could have just taken the spinlock when using the vpa,
slb_shadow or dtl fields, but that would mean taking the spinlock on
every guest entry and exit.)

This also changes 'struct dtl' (which was undefined) to 'struct dtl_entry',
which is what the rest of the kernel uses.

Thanks to Michael Ellerman <michael@ellerman.id.au> for pointing out
the need to initialize vcpu->arch.vpa_update_lock.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-04-08 14:01:27 +03:00
Paul Mackerras
f0888f7015 KVM: PPC: Book3S HV: Make secondary threads more robust against stray IPIs
Currently on POWER7, if we are running the guest on a core and we don't
need all the hardware threads, we do nothing to ensure that the unused
threads aren't executing in the kernel (other than checking that they
are offline).  We just assume they're napping and we don't do anything
to stop them trying to enter the kernel while the guest is running.
This means that a stray IPI can wake up the hardware thread and it will
then try to enter the kernel, but since the core is in guest context,
it will execute code from the guest in hypervisor mode once it turns the
MMU on, which tends to lead to crashes or hangs in the host.

This fixes the problem by adding two new one-byte flags in the
kvmppc_host_state structure in the PACA which are used to interlock
between the primary thread and the unused secondary threads when entering
the guest.  With these flags, the primary thread can ensure that the
unused secondaries are not already in kernel mode (i.e. handling a stray
IPI) and then indicate that they should not try to enter the kernel
if they do get woken for any reason.  Instead they will go into KVM code,
find that there is no vcpu to run, acknowledge and clear the IPI and go
back to nap mode.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-04-08 14:01:20 +03:00
Scott Wood
73196cd364 KVM: PPC: e500mc support
Add processor support for e500mc, using hardware virtualization support
(GS-mode).

Current issues include:
 - No support for external proxy (coreint) interrupt mode in the guest.

Includes work by Ashish Kalra <Ashish.Kalra@freescale.com>,
Varun Sethi <Varun.Sethi@freescale.com>, and
Liu Yu <yu.liu@freescale.com>.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-04-08 12:54:33 +03:00
Scott Wood
d30f6e4800 KVM: PPC: booke: category E.HV (GS-mode) support
Chips such as e500mc that implement category E.HV in Power ISA 2.06
provide hardware virtualization features, including a new MSR mode for
guest state.  The guest OS can perform many operations without trapping
into the hypervisor, including transitions to and from guest userspace.

Since we can use SRR1[GS] to reliably tell whether an exception came from
guest state, instead of messing around with IVPR, we use DO_KVM similarly
to book3s.

Current issues include:
 - Machine checks from guest state are not routed to the host handler.
 - The guest can cause a host oops by executing an emulated instruction
   in a page that lacks read permission.  Existing e500/4xx support has
   the same problem.

Includes work by Ashish Kalra <Ashish.Kalra@freescale.com>,
Varun Sethi <Varun.Sethi@freescale.com>, and
Liu Yu <yu.liu@freescale.com>.

Signed-off-by: Scott Wood <scottwood@freescale.com>
[agraf: remove pt_regs usage]
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-04-08 12:51:19 +03:00
Scott Wood
cfac57847a powerpc/booke: Provide exception macros with interrupt name
DO_KVM will need to identify the particular exception type.

There is an existing set of arbitrary numbers that Linux passes,
but it's an undocumented mess that sort of corresponds to server/classic
exception vectors but not really.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-04-08 12:51:17 +03:00