android_kernel_motorola_sm6225/arch/powerpc
Michael Ellerman 1ad332936f powerpc/stacktrace: Fix spurious "stale" traces in raise_backtrace_ipi()
commit 7c6986ade69e3c81bac831645bc72109cd798a80 upstream.

In raise_backtrace_ipi() we iterate through the cpumask of CPUs, sending
each an IPI asking them to do a backtrace, but we don't wait for the
backtrace to happen.

We then iterate through the CPU mask again, and if any CPU hasn't done
the backtrace and cleared itself from the mask, we print a trace on its
behalf, noting that the trace may be "stale".

This works well enough when a CPU is not responding, because in that
case it doesn't receive the IPI and the sending CPU is left to print the
trace. But when all CPUs are responding we are left with a race between
the sending and receiving CPUs: if the sending CPU wins the race, it
erroneously prints a trace on the receiver's behalf.

This leads to spurious "stale" traces from the sending CPU, which can
then be interleaved messily with the output of the receiving CPU. Note
the CPU numbers (C7 is the sender, C1 the receiver), e.g.:

  [ 1658.929157][    C7] rcu: Stack dump where RCU GP kthread last ran:
  [ 1658.929223][    C7] Sending NMI from CPU 7 to CPUs 1:
  [ 1658.929303][    C1] NMI backtrace for cpu 1
  [ 1658.929303][    C7] CPU 1 didn't respond to backtrace IPI, inspecting paca.
  [ 1658.929362][    C1] CPU: 1 PID: 325 Comm: kworker/1:1H Tainted: G        W   E     5.13.0-rc2+ #46
  [ 1658.929405][    C7] irq_soft_mask: 0x01 in_mce: 0 in_nmi: 0 current: 325 (kworker/1:1H)
  [ 1658.929465][    C1] Workqueue: events_highpri test_work_fn [test_lockup]
  [ 1658.929549][    C7] Back trace of paca->saved_r1 (0xc0000000057fb400) (possibly stale):
  [ 1658.929592][    C1] NIP:  c00000000002cf50 LR: c008000000820178 CTR: c00000000002cfa0

To fix it, change the logic so that the sending CPU waits up to 5s for the
receiving CPU to print its trace. If the receiving CPU prints its trace
successfully then the sending CPU just continues, avoiding any spurious
"stale" trace.

This has the added benefit of letting all CPUs print their traces in
order, which avoids any interleaving of their output.
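
The send-then-wait logic described above can be modelled in userspace
roughly as follows. This is only a sketch and not the code from the patch:
backtrace_mask, receiver_poll(), raise_backtrace_sketch(), TIMEOUT_POLLS
and the latency[] array are hypothetical names invented for illustration,
and a bounded poll count stands in for the 5s timeout.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define TIMEOUT_POLLS 5000   /* assumption: stands in for the 5s wait */
#define NEVER (-1)           /* a receiver that never handles the IPI */

/* Hypothetical stand-in for the cpumask: bit n set = CPU n still owes a trace. */
static uint64_t backtrace_mask;

/* Model of the receiver: it handles the IPI once `now` reaches its latency. */
static void receiver_poll(int cpu, int latency, int now)
{
    uint64_t bit = UINT64_C(1) << cpu;

    if (latency != NEVER && now >= latency && (backtrace_mask & bit)) {
        printf("NMI backtrace for cpu %d\n", cpu);
        backtrace_mask &= ~bit;          /* receiver clears itself from the mask */
    }
}

/*
 * Model of the sender: IPI one CPU at a time and wait for it before moving
 * on. Returns the number of "possibly stale" traces printed on behalf of
 * CPUs that did not respond within the timeout.
 */
static int raise_backtrace_sketch(const int latency[], int ncpus)
{
    int stale = 0;

    assert(ncpus > 0 && ncpus <= 64);
    for (int cpu = 0; cpu < ncpus; cpu++) {
        uint64_t bit = UINT64_C(1) << cpu;

        backtrace_mask |= bit;           /* "send" the IPI */

        /* Wait until the receiver clears its bit or the timeout expires. */
        for (int now = 0; now < TIMEOUT_POLLS && (backtrace_mask & bit); now++)
            receiver_poll(cpu, latency[cpu], now);

        if (!(backtrace_mask & bit))
            continue;                    /* receiver printed its own trace */

        backtrace_mask &= ~bit;          /* timed out: print on its behalf */
        printf("CPU %d didn't respond to backtrace IPI (possibly stale)\n", cpu);
        stale++;
    }
    return stale;
}
```

Because the sender only falls back to the "stale" path after the wait
expires, a responsive CPU never gets a second trace printed for it, and
each CPU's output completes before the next IPI is sent.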

Fixes: 5cc05910f2 ("powerpc/64s: Wire up arch_trigger_cpumask_backtrace()")
Cc: stable@vger.kernel.org # v4.18+
Reported-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210625140408.3351173-1-mpe@ellerman.id.au
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-20 16:15:42 +02:00
boot               powerpc/fsl: set fsl,i2c-erratum-a004447 flag for P1010 i2c controllers  2021-06-16 11:54:58 +02:00
configs            vgacon: remove software scrollback support  2020-09-17 13:45:29 +02:00
crypto             powerpc updates for 4.19  2018-08-17 11:32:50 -07:00
include            powerpc/64s: Fix pte update for kernel memory on radix  2021-05-22 10:59:35 +02:00
kernel             powerpc/stacktrace: Fix spurious "stale" traces in raise_backtrace_ipi()  2021-07-20 16:15:42 +02:00
kvm                KVM: PPC: Make the VMX instruction emulation routines static  2021-03-04 09:39:45 +01:00
lib                powerpc/64s: Fix crashes when toggling entry flush barrier  2021-05-22 10:59:45 +02:00
math-emu
mm                 powerpc/64s: Fix pte update for kernel memory on radix  2021-05-22 10:59:35 +02:00
net                powerpc/bpf: Fix tail call implementation  2019-12-05 09:19:39 +01:00
oprofile
perf               powerpc/perf: Fix PMU constraint check for EBB events  2021-05-22 10:59:35 +02:00
platforms          powerpc/pseries: Stop calling printk in rtas_stop_self()  2021-05-22 10:59:41 +02:00
purgatory          powerpc updates for 4.19  2018-08-17 11:32:50 -07:00
sysdev             powerpc: sysdev: add missing iounmap() on error in mpic_msgr_probe()  2021-01-06 14:45:01 +01:00
tools              powerpc/tools: Don't quote $objdump in scripts  2020-01-04 19:12:42 +01:00
xmon               powerpc/xmon: Change printk() to pr_cont()  2020-12-30 11:26:12 +01:00
Kconfig            powerpc: Fix HAVE_HARDLOCKUP_DETECTOR_ARCH build configuration  2021-05-22 10:59:35 +02:00
Kconfig.debug      powerpc: iommu: fix build when neither PCI or IBMVIO is set  2021-05-22 10:59:35 +02:00
Makefile           powerpc: Drop -me200 addition to build flags  2020-12-30 11:25:38 +01:00
Makefile.postlink