Every user of the network device notifiers is either a protocol
stack or a pseudo device. If a protocol stack that does not have
support for multiple network namespaces receives an event for a
device that is not in the initial network namespace it quite possibly
can get confused and do the wrong thing.
To avoid problems until all of the protocol stacks are converted
this patch modifies all netdev event handlers to ignore events on
devices that are not in the initial network namespace.
As the rest of the code is made network namespace aware these
checks can be removed.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When PTRACE_SYSCALL was used and then PTRACE_DETACH is used, the
TIF_SYSCALL_TRACE flag is left set on the formerly-traced task. This
means that when a new tracer comes along and does PTRACE_ATTACH, it's
possible he gets a syscall tracing stop even though he's never used
PTRACE_SYSCALL. This happens if the task was in the middle of a system
call when the second PTRACE_ATTACH was done. The symptom is an
unexpected SIGTRAP when the tracer thinks that only SIGSTOP should have
been provoked by his ptrace calls so far.
A few machines already fixed this in ptrace_disable (i386, ia64, m68k).
But all other machines do not, and still have this bug. On x86_64, this
constitutes a regression in IA32 compatibility support.
Since all machines now use TIF_SYSCALL_TRACE for this, I put the
clearing of TIF_SYSCALL_TRACE in the generic ptrace_detach code rather
than adding it to every other machine's ptrace_disable.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After my last patch we have a new header file for HP simulator use.
Here's code to use it for stuff that used to have `extern' statements
inline in the code. Functionality should not change with this patch.
Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch cleans up the `enable early console for SKI' patch
(471e7a4484), and
1. potentially allows the gensparse_defconfig to work again.
(there are other problems running a generic kernel on Ski)
2. fixes the `console registered twice' problem.
3. Cleans up the code by moving the `extern hpsim_cons' declaration to
a new asm/hpsim.h file.
Thanks to Jes for comments.
Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
When dumping memory via sysrq-m it is possible to take a bogus NMI watchdog
or softlockup watchdog because the dump can take a long time on big memory
systems.
Occasionally tickle the watchdog when doing the dump.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Add additional support for CPU disable on SN platforms.
Correctly setup the smp_affinity mask for I/O error IRQs.
Restrict the use of the feature to Altix 4000 and 450 systems
running with a CPU disable capable PROM, and do not allow disabling
of CPU 0.
Signed-off-by: John Keller <jpk@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
vmalloc() returns a void pointer - no need to cast it.
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
For hugepage mappings, the file offset, like the address and size, needs to
be aligned to the size of a hugepage.
In commit 68589bc353, the check for this was
moved into prepare_hugepage_range() along with the address and size checks.
But since BenH's rework of the get_unmapped_area() paths leading up to
commit 4b1d89290b, prepare_hugepage_range()
is only called for MAP_FIXED mappings, not for other mappings. This means
we're no longer ever checking for an aligned offset - I've confirmed that
mmap() will (apparently) succeed with a misaligned offset on both powerpc
and i386 at least.
This patch restores the check, removing it from prepare_hugepage_range()
and putting it back into hugetlbfs_file_mmap(). I'm putting it there,
rather than in the get_unmapped_area() path so it only needs to go in one
place, than separately in the half-dozen or so arch-specific
implementations of hugetlb_get_unmapped_area().
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The pending interrupts can be remaining at boot up time on some
platform. This will cause spurious interrupts when interrupt is
enabled for the first time. This patch clears IVR at the CPU
initialization to eliminate such spurious interrupts.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Fix handling for spurious interrupts not being mapped to any IRQs.
Currently, spurious interrupts that are not mapped to any IRQs are
handled as IRQ 15 (== IA64_SPURIOUS_VECTOR). But it is not proper
because vector != irq. We need special handlings for such spurious
interrupts not being mapped to any IRQs.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
When using Ski to debug early startup, it's a bit of a pain not to
have printk.
This patch enables the simulated console very early.
It may be worth conditionalising on the command line... but this is
enough for now.
Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The "ri" field in the processor status register only has defined
values of 0, 1, 2. Do not let ptrace set this to 3. As with
other reserved fields in registers we silently discard the value.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
There is a bug in the ia64_do_page_fault code that can cause a failure
to grow the register backing store, or any mapping that is marked as
VM_GROWSUP if the mapping is the highest mapped area of memory.
When the address accessed is below the first mapping the previous mapping
is returned as NULL, and this case is handled. However, when the address
accessed is above the highest mapping the vma returned is NULL, this
case is not handled correctly, and it fails to spot that this access
might require an existing mapping to grow upwards.
Signed-off-by: Andrew Burgess <andrew@transitive.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The core cpufreq code doesn't appear to understand returning -EAGAIN
for the get() function of the cpufreq_driver. If PAL_GET_PSTATE returns
-1, such as when running on Xen, scaling_cur_freq is happy to return
4294967285 kHz (ie. (unsigned)-11). The other drivers appear to return
0 for a failure, and doing so gives me the max frequency from
scaling_cur_frequency and "<unknown>" from cpuinfo_cur_frequency. I
believe that's the desired behavior.
Signed-off-by: Alex Williamson <alex.williamson@hp.com>
Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
If the interrupt has been disabled, don't call the force_interrupt provider.
Doing so can result in an infinite runaway interrupt loop.
Signed-off-by: Mike Habeck <habeck@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The slab allocator was changed in 2.6.23 to default to SLUB. However,
the config files in arch/ia64/configs still use SLAB. Switch them to SLUB.
Added same change to arch/ia64/defconfig ... Tony
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Explicitly put the unwind section into its own program-header. This
used to be unnecessary (probably because binutils did it for us), but
with current binutils (e.g., v2.17.50.20070804) we won't get
the PT_IA_64_UNWIND header without this patch which will break
unwinding in a debugger and simulators such as Ski.
Signed-off-by: David Mosberger-Tang <dmosberger@gmail.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Add NOTES to linker script such that the kernel can be built with
recent versions of binutils. Without this patch, final link fails
with this error:
ld: .tmp_vmlinux1: section `.text' can't be allocated in segment 0
ld: final link failed: Bad value
This error is due to the fact that the --build-id option is used
with newer linkers to include a .notes section on the kernel, but
without the NOTES macro, that section won't be included in the kernel
which then leads to the above error message.
Signed-off-by: David Mosberger-Tang <dmosberger@gmail.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Add a dummy nop at the end of _start() to maintain the invariant that
the return-pointer (rp) always point to the calling function. This
makes unwinding stop at the last frame, as it should.
Signed-off-by: David Mosberger-Tang <dmosberger@gmail.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Use local_vector_to_irq() instead of looping through all NR_IRQS.
This avoids registering the CPE handler on multiple irqs. Only
register if the irq is valid. If no valid irq is found, print an
error message and set up polling.
Signed-off-by: Russ Anderson <rja@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
arch/ia64/Kconfig failed to include kernel/Kconfig.preempt that meant it
did not support PREEMPT_VOLUNTARY and PREEMPT_BKL (inadvertently).
This was recently noticed when the newly-added PREEMPT_NOTIFIERS in
Kconfig.preempt that was "select"ed from drivers/kvm/Kconfig (therefore)
started giving bogus warnings ('select' used by config symbol 'KVM' refers
to undefined symbol 'PREEMPT_NOTIFIERS') on ia64 builds.
So let's remove the open-coded definition of CONFIG_PREEMPT in
arch/ia64/Kconfig and replace it with just including Kconfig.preempt
instead, like the other archs do.
Signed-off-by: Satyam Sharma <satyam@infradead.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Add base support for implementing platform_irq_to_vector(), and
then use it on SN2.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Acked-by: John Keller <jpk@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
While sending interrupts to a cpu to repeatedly wake a thread, on occasion
that thread will take a full timer tick cycle (4002 usec in my case)
to wakeup.
The problem concerns a race condition in the code around the safe_halt()
call in the default_idle() routine. Setting 'nohalt' on the kernel
command line causes the long wakeups to disappear.
void
default_idle (void)
{
local_irq_enable();
while (!need_resched()) {
--> if (can_do_pal_halt)
--> safe_halt();
else
A timer tick could arrive between the check for !need_resched and the
actual call to safe_halt() (which does a pal call to PAL_HALT_LIGHT).
By the time the timer tick completes, a thread that might now need to run
could get held up for as long as a timer tick waiting for the halted cpu.
I'm proposing that we disable irq's and check need_resched again before
calling safe_halt(). Does anyone see any problem with this approach?
Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] ITC: Reduce rating for ITC clock if ITCs are drifty
[IA64] SN2: Fix up sn2_rtc clock
[IA64] Fix wrong access to irq_desc[] in iosapic_register_intr().
[IA64] Fix possible race in destroy_and_reserve_irq()
[IA64] Fix registered interrupt check
[IA64] Remove a few duplicate includes
[IA64] Allow smp_call_function_single() to current cpu
[IA64] fix a few section mismatch warnings
Make sure to reduce the rating of the ITC clock if ITCs are drifty. If they
are drifting then we have not synchronized the ITC values, nor are we doing
the jitter compensation (useless since drift may increase the differentials
arbitrarily).
Without this patch it is possible that the ITC clock becomes selected as
the system clock on systems with drifty ITCs which will result in
nanosleep hanging.
One can still select the itc clock manually on such systems via
clocksource=itc
(Produces nice hangs on SGI Altix.)
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
If the sn2_rtc clock is present then it is a must have since sn2_rtc
provides a synchronized time source on Altix systems. So elevate
the priority to 450. Otherwise the ITC would take precendence. Altix
systems currently do not boot because the ITC clocksource is broken. It
seems to assume that ITCs are synchronized and as a result nanosleep
hangs (may be fixed in a different patch).
While we are at it: Remove the sn2_mc definition. The sn2_rtc has a fixed
address. No point in reading the address from memory. Removing it avoids
touching one cacheline.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
In error path we must unlock irq_desc[irq].lock before we change
'irq'.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Remove unused TIF_NOTIFY_RESUME flag for all processor architectures. The
flag was not used excecpt on IA-64 where the patch replaces it with
TIF_PERFMON_WORK.
Signed-off-by: stephane eranian <eranian@hpl.hp.com>
Cc: <linux-arch@vger.kernel.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently, destroy_and_reserve_irq() sets irq_status[irq] UNUSED using
clear_irq_vector() and sets irq_status[irq] RSVD using reserve_irq().
But there is a race window because vector_lock is once released between
them. This patch fixes this race window.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Fix the problem that interrupts are not initialized correctly at PCI
hotplug or driver reloading time.
By vector domain change, the iosapic_rte_info structure was changed to
be on the iosapic_intr_info[irq].rtes list even after the interrupts
are unregistered. So iosapic_intr_info[irq].rtes list must not be
checked to see if there are registered interrupts (RTEs) on the
irq. We must check iosapic_intr_info[irq].count counter instead.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch removes a few duplicate includes from arch/ia64/
Acked-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This removes the requirement for callers to get_cpu() to check in simple
cases. i386 and x86_64 already received a similar treatment.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Fix the following section mismatch warnings:
WARNING: vmlinux.o(.text+0x41902): Section mismatch: reference to .init.text:__alloc_bootmem (between 'ia64_mca_cpu_init' and 'ia64_do_tlb_purge')
WARNING: vmlinux.o(.text+0x49222): Section mismatch: reference to .init.text:__alloc_bootmem (between 'register_intr' and 'iosapic_register_intr')
WARNING: vmlinux.o(.text+0x62beb2): Section mismatch: reference to .init.text:__alloc_bootmem_node (between 'hubdev_init_node' and 'cnodeid_get_geoid')
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
* master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: (28 commits)
[SCSI] mpt fusion: Changes in mptctl.c for logging support
[SCSI] mpt fusion: Changes in mptfc.c mptlan.c mptsas.c and mptspi.c for logging support
[SCSI] mpt fusion: Changes in mptscsih.c for logging support
[SCSI] mpt fusion: Changes in mptbase.c for logging support
[SCSI] mpt fusion: logging support in Kconfig, Makefile, mptbase.h and addition of mptdebug.h
[SCSI] libsas: Fix potential NULL dereference in sas_smp_get_phy_events()
[SCSI] bsg: Fix build for CONFIG_BLOCK=n
[SCSI] aacraid: fix Sunrise Lake reset handling
[SCSI] aacraid: add SCSI SYNCHONIZE_CACHE range checking
[SCSI] add easyRAID to the no report luns blacklist
[SCSI] advansys: lindent and other large, uninteresting changes
[SCSI] aic79xx, aic7xxx: Fix incorrect width setting
[SCSI] qla2xxx: fix to honor ignored parameters in sysfs attributes
[SCSI] aacraid: draw line in sand, sundry cleanup and version update
[SCSI] iscsi_tcp: Turn off bounce buffers
[SCSI] libiscsi: fix cmd seqeunce number checking
[SCSI] iscsi_tcp, ib_iser Enable module refcounting for iscsi host template
[SCSI] libiscsi: make sure session is not blocked when removing host
[SCSI] libsas: Remove PCI dependencies
[SCSI] simscsi: convert to use the data buffer accessors
...
When comparing a pointer, it's clearer to compare it to NULL than to 0.
Signed-off-by: Yoann Padioleau <padator@wanadoo.fr>
Signed-off-by: Tony Luck <tony.luck@intel.com>
b716395e2b added code to handle
a compatability issue with 32bit quota tools, but the new compat
routines are only needed when CONFIG_COMPAT=y (and with this set
to 'n' there are compilation problems since some new typedefs are
not visible).
Reported by Doug Chapman. Fix tuned by a cast of thousands (Andi,
Andreas, Arthur, HPA, Willy)
Signed-off-by: Tony Luck <tony.luck@intel.com>
Forgot to adjust this one with the acpi autoloading patches
in commit 8c8eb78f67
Acked-by: Myron Stowe <myron.stowe@hp.com>
Acked-by: Len Brown <len.brown@intel.com>
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Fix wrong return value in parse_vector_domain().
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The ia64's acpi_gsi_to_irq() function assumes irq == vector. But in
fact irq can be different from vector. This patch fix this wrong
assumption.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Add some sanity checks into __bind_irq_vector().
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
u32* volatile cyclone_timer means volatile auto pointer to u32,
which is clearly not what had been intended (we never even take
the address of that variable, let alone pass it to something that
could change it behind our back). u32 volatile * is what the
authors apparently wanted to say, but in reality we don't need that
qualifier there at all - it's (properly) only passed to iomem helpers
which takes care of that stuff just fine.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] Nail two more simple section mismatch errors
[IA64] fix section mismatch warnings
[IA64] rename partial_page
[IA64] Ensure that machvec is set up takes place before serial console
[IA64] vector-domain - fix vector_table
[IA64] vector-domain - handle assign_irq_vector(AUTO_ASSIGN)
- remove the unnecessary map_single path.
- convert to use the new accessors for the sg lists and the
parameters.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
pcibios_setup (between 'pci_setup' and 'quirk_mellanox_tavor')
setup_profiling_timer (between 'write_profile' and 'delayed_put_task_struct')
Signed-off-by: Tony Luck <tony.luck@intel.com>
In 741f98fe29 Sam added full
checking across the entire vmlinux image. This flushed out
a dozen new section mismatch warnings. Start the whack-a-mole
game again to stomp them out.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Jens has added a partial_page thing in splice whcih conflicts with the ia64
one. Rename ia64 out of the way. (ia64 chose poorly).
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>