0888f06ac9
Fernando Lopez-Lezcano reported frequent scheduling latencies and audio
xruns starting at the 2.6.18-rt kernel, and those problems persisted all
until current -rt kernels. The latencies were serious and unjustified by
system load, often in the milliseconds range.
After a patient and heroic multi-month effort of Fernando, where he
tested dozens of kernels, tried various configs, boot options,
test-patches of mine and provided latency traces of those incidents, the
following 'smoking gun' trace was captured by him:
_------=> CPU#
/ _-----=> irqs-off
| / _----=> need-resched
|| / _---=> hardirq/softirq
||| / _--=> preempt-depth
|||| /
||||| delay
cmd pid ||||| time | caller
\ / ||||| \ | /
IRQ_19-1479 1D..1 0us : __trace_start_sched_wakeup (try_to_wake_up)
IRQ_19-1479 1D..1 0us : __trace_start_sched_wakeup <<...>-5856> (37 0)
IRQ_19-1479 1D..1 0us : __trace_start_sched_wakeup (c01262ba 0 0)
IRQ_19-1479 1D..1 0us : resched_task (try_to_wake_up)
IRQ_19-1479 1D..1 0us : __spin_unlock_irqrestore (try_to_wake_up)
...
<idle>-0 1...1 11us!: default_idle (cpu_idle)
...
<idle>-0 0Dn.1 602us : smp_apic_timer_interrupt (c0103baf 1 0)
...
<...>-5856 0D..2 618us : __switch_to (__schedule)
<...>-5856 0D..2 618us : __schedule <<idle>-0> (20 162)
<...>-5856 0D..2 619us : __spin_unlock_irq (__schedule)
<...>-5856 0...1 619us : trace_stop_sched_switched (__schedule)
<...>-5856 0D..1 619us : trace_stop_sched_switched <<...>-5856> (37 0)
what is visible in this trace is that CPU#1 ran try_to_wake_up() for
PID:5856, it placed PID:5856 on CPU#0's runqueue and ran resched_task()
for CPU#0. But it decided to not send an IPI that no CPU - due to
TS_POLLING. But CPU#0 never woke up after its NEED_RESCHED bit was set,
and only rescheduled to PID:5856 upon the next lapic timer IRQ. The
result was a 600+ usecs latency and a missed wakeup!
the bug turned out to be an idle-wakeup bug introduced into the mainline
kernel this summer via an optimization in the x86_64 tree:
commit
|
||
---|---|---|
.. | ||
cpufreq | ||
acpi-ext.c | ||
acpi-processor.c | ||
acpi.c | ||
asm-offsets.c | ||
audit.c | ||
brl_emu.c | ||
crash.c | ||
crash_dump.c | ||
cyclone.c | ||
efi.c | ||
efi_stub.S | ||
entry.h | ||
entry.S | ||
esi.c | ||
esi_stub.S | ||
fsys.S | ||
gate-data.S | ||
gate.lds.S | ||
gate.S | ||
head.S | ||
ia64_ksyms.c | ||
init_task.c | ||
iosapic.c | ||
irq.c | ||
irq_ia64.c | ||
irq_lsapic.c | ||
ivt.S | ||
jprobes.S | ||
kprobes.c | ||
machine_kexec.c | ||
machvec.c | ||
Makefile | ||
mca.c | ||
mca_asm.S | ||
mca_drv.c | ||
mca_drv.h | ||
mca_drv_asm.S | ||
minstate.h | ||
module.c | ||
msi_ia64.c | ||
numa.c | ||
pal.S | ||
palinfo.c | ||
patch.c | ||
perfmon.c | ||
perfmon_default_smpl.c | ||
perfmon_generic.h | ||
perfmon_itanium.h | ||
perfmon_mckinley.h | ||
perfmon_montecito.h | ||
process.c | ||
ptrace.c | ||
relocate_kernel.S | ||
sal.c | ||
salinfo.c | ||
semaphore.c | ||
setup.c | ||
sigframe.h | ||
signal.c | ||
smp.c | ||
smpboot.c | ||
sys_ia64.c | ||
time.c | ||
topology.c | ||
traps.c | ||
unaligned.c | ||
uncached.c | ||
unwind.c | ||
unwind_decoder.c | ||
unwind_i.h | ||
vmlinux.lds.S |