1. exit_notify() always calls kill_orphaned_pgrp(). This is wrong, we
should do this only when the whole process exits.
2. exit_notify() uses "current" as "ignored_task", obviously wrong.
Use ->group_leader instead.
Test case:
void hup(int sig)
{
printf("HUP received\n");
}
void *tfunc(void *arg)
{
sleep(2);
printf("sub-thread exited\n");
return NULL;
}
int main(int argc, char *argv[])
{
if (!fork()) {
signal(SIGHUP, hup);
kill(getpid(), SIGSTOP);
exit(0);
}
pthread_t thr;
pthread_create(&thr, NULL, tfunc, NULL);
sleep(1);
printf("main thread exited\n");
syscall(__NR_exit, 0);
return 0;
}
output:
main thread exited
HUP received
Hangup
With this patch the output is:
main thread exited
sub-thread exited
HUP received
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
p->exit_state != 0 doesn't mean this process is dead, it may have
sub-threads. Change the code to use "p->exit_state && thread_group_empty(p)"
instead.
Without this patch, ^Z doesn't deliver SIGTSTP to the foreground process
if the main thread has exited.
However, the new check is not perfect either. There is a window when
exit_notify() drops tasklist and before release_task(). Suppose that
the last (non-leader) thread exits. This means that entire group exits,
but thread_group_empty() is not true yet.
As Eric pointed out, is_global_init() is wrong as well, but I did not
dare to do other changes.
Just for the record, has_stopped_jobs() is absolutely wrong too. But we
can't fix it now, we should first fix SIGNAL_STOP_STOPPED issues.
Even with this patch ^Z doesn't play well with the dead main thread.
The task is stopped correctly but do_wait(WSTOPPED) won't see it. This
is another unrelated issue, will be (hopefully) fixed separately.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Factor out the common code in reparent_thread() and exit_notify().
No functional changes.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Oddly enough, unsigned int c = '\300'; puts a "negative" value in c, not
0300... This fixes the default unicode compose table by using integers
instead of character constants.
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix docbook problems in fusion source files.
These cause the generated docbook to be incorrect.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix docbook problems in kernel-api.tmpl.
These cause the generated docbook to be incorrect.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix docbook problems in USB source files.
These cause the generated docbook to be incorrect.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix docbook problem in SCSI source files.
These cause the generated docbook to be incorrect.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix docbook problems in rapidio source files.
These cause the generated docbook to be incorrect.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix docbook problems in filesystems.tmpl.
These cause the generated docbook to be incorrect.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86:
x86: revert "x86: fix pmd_bad and pud_bad to support huge pages"
x86: revert "x86: CPA: avoid split of alias mappings"
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (24 commits)
[POWERPC] Convert the cell IOMMU fixed mapping to 16M IOMMU pages
[POWERPC] Allow for different IOMMU page sizes in cell IOMMU code
[POWERPC] Cell IOMMU: n_pte_pages is in 4K page units, not IOMMU_PAGE_SIZE
[POWERPC] Split setup of IOMMU stab and ptab, allocate dynamic/fixed ptabs separately
[POWERPC] Move allocation of cell IOMMU pad page
[POWERPC] Remove unused pte_offset variable
[POWERPC] Use it_offset not pte_offset in cell IOMMU code
[POWERPC] Clearup cell IOMMU fixed mapping terminology
[POWERPC] enable hardware watchpoints on cell blades
[POWERPC] move celleb DABRX definitions
[POWERPC] OProfile: enable callgraph support for Cell
[POWERPC] spufs: fix use time accounting on SPE-overcommit
[POWERPC] spufs: serialize SLB invalidation against SLB loading
[POWERPC] spufs: invalidate SLB translation before adding a new entry
[POWERPC] spufs: synchronize IRQ when disabling
[POWERPC] spufs: fix order of sputrace thread IDs
[POWERPC] Xilinx: hwicap cleanup
[POWERPC] 4xx: Use correct board info structure in cuboot wrappers
[POWERPC] spufs: fix invalid scheduling of forgotten contexts
[POWERPC] 44x: add missing define TARGET_4xx and TARGET_440GX to cuboot-taishan
...
The new code that removed the limitation on the execve string size
(which was historically 32 pages) replaced it with a much softer limit
based on RLIMIT_STACK which is usually much larger than the traditional
limit. See commit b6a2fea393 ("mm:
variable length argument support") for details.
However, if you have a small stack limit (perhaps because you need lots
of stacks in a threaded environment), the new heuristic of allowing up
to 1/4th of RLIMIT_STACK to be used for argument and environment strings
could actually be smaller than the old limit.
So just say that it's ok to have up to ARG_MAX strings regardless of the
value of RLIMIT_STACK, and check the rlimit only when going over that
traditional limit.
(Of course, if you actually have a *really* small stack limit, the whole
stack itself will be limited before you hit ARG_MAX, but that has always
been true and is clearly the right behaviour anyway).
Acked-by: Carlos O'Donell <carlos@codesourcery.com>
Cc: Michael Kerrisk <michael.kerrisk@googlemail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ollie Wild <aaw@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This reverts commit cded932b75.
Arjan bisected down a boot-time hang to this, saying:
".. it prevents the kernel to finish booting on my (Penryn based)
laptop. The boot stops right after freeing the init memory."
and while it's not clear exactly what triggers it, at this stage we're
better off just reverting it while Ingo tries to figure out what went
wrong.
Requested-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Hans Rosenfeld <hans.rosenfeld@amd.com>
Cc: Nish Aravamudan <nish.aravamudan@gmail.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
revert commit cded932b75,
"x86: fix pmd_bad and pud_bad to support huge pages", it causes
a bootup hang, as reported and bisected by Arjan van de Ven.
Bisected-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Revert:
commit 8be8f54bae
Author: Thomas Gleixner <tglx@linutronix.de>
Date: Sat Feb 23 20:43:21 2008 +0100
x86: CPA: avoid split of alias mappings
because it clearly mishandles the case when __change_page_attr(), called
from __change_page_attr_set_clr(), changes cpa->processed to 1 and
cpa_process_alias(cpa) is executed right after that.
This crashes my x86-64 test box early in the boot process
(ref. http://bugzilla.kernel.org/show_bug.cgi?id=10140#c4).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The only tricky part is we need to adjust the PTE insertion loop to
cater for holes in the page table. The PTEs for each segment start on
a 4K boundary, so with 16M pages we have 16 PTEs per segment and then
a gap to the next 4K page boundary.
It might be possible to allocate the PTEs for each segment separately,
saving the memory currently filling the gaps. However we'd need to
check that's OK with the hardware, and that it actually saves memory.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Make some preliminary changes to cell_iommu_alloc_ptab() to allow it to
take the page size as a parameter rather than assuming IOMMU_PAGE_SIZE.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
We use n_pte_pages to calculate the stride through the page tables, but
we also use it to set the NPPT value in the segment table entry. That is
defined as the number of 4K pages per segment, so we should calculate
it as such regardless of the IOMMU page size.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Currently the cell IOMMU code allocates the entire IOMMU page table in a
contiguous chunk. This is nice and tidy, but for machines with larger
amounts of RAM the page table allocation can fail due to it simply being
too large.
So split the segment table and page table setup routine, and arrange to
have the dynamic and fixed page tables allocated separately.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
There's no need to allocate the pad page unless we're going to actually
use it - so move the allocation to where we know we're going to use it.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The cell IOMMU code no longer needs to save the pte_offset variable
separately, it is incorporated into tbl->it_offset.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The cell IOMMU tce build and free routines use pte_offset to convert
the index passed from the generic IOMMU code into a page table offset.
This takes into account the SPIDER_DMA_OFFSET which sets the top bit
of every DMA address.
However it doesn't cater for the IOMMU window starting at a non-zero
address, as the base of the window is not incorporated into pte_offset
at all.
As it turns out tbl->it_offset already contains the value we need, it
takes into account the base of the window and also pte_offset. So use
it instead!
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
It's called the fixed mapping, not the static mapping.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Ulrich Weigand has found that the hardware watchpoints on cell were not
working back in November :
http://ozlabs.org/pipermail/linuxppc-dev/2007-November/046135.html
This patch sets them during initialization.
Signed-off-by: Jens Osterkamp <jens@de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
This moves the private DABRX definitions for celleb from beat.h to
reg.h to make them usable for all.
Signed-off-by: Jens Osterkamp <jens@de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
This patch enables OProfile callgraph support for the Cell processor. The
original code was just calling a function to add the PC value, now it will
call a function that first checks the callgraph depth. Callgraph is already
enabled on the other Power platforms.
Signed-off-by: Bob Nelson <rrnelson@us.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
firewire: fix crash in automatic module unloading
firewire: potentially invalid pointers used in fw_card_bm_work
firewire: fw-sbp2: better fix for NULL pointer dereference in scsi_remove_device
The bus management workqueue job was in danger to dereference NULL
pointers. Also, after having temporarily lifted card->lock, a few node
pointers and a device pointer may have become invalid.
Add NULL pointer checks and get the necessary references. Also, move
card->local_node out of fw_card_bm_work's sight during shutdown of the
card.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
Patch "firewire: fw-sbp2: fix NULL pointer deref. in scsi_remove_device"
had the unintended effect that firewire-sbp2 could not be unloaded
anymore until all SBP-2 devices were unplugged.
We now fix the NULL pointer bug by reacquiring a reference to the sdev
instead of holding a reference to the sdev (and to the module) all the
time.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Tested-by: Jarod Wilson <jwilson@redhat.com>
Since 2f569af (CONFIG_HIGHPTE vs. sub-page page tables.) pte_free() calls
pte_lock_deinit() and dec_zone_page_state(). So free_pgd_slow must not call
the latter two when calling the first.
Signed-off-by: Uwe Kleine-König <Uwe.Kleine-Koenig@digi.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Hi,
While we are looking at the printk issue, I see that its printk'ing the EOE
(end of event) records which is really not something that we need in syslog.
Its really intended for the realtime audit event stream handled by the audit
daemon. So, lets avoid printk'ing that record type.
Signed-off-by: Steve Grubb <sgrubb@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
On the latest kernels if one was to load about 15 rules, set the failure
state to panic, and then run service auditd stop the kernel will panic.
This is because auditd stops, then the script deletes all of the rules.
These deletions are sent as audit messages out of the printk kernel
interface which is already known to be lossy. These will overun the
default kernel rate limiting (10 really fast messages) and will call
audit_panic(). The same effect can happen if a slew of avc's come
through while auditd is stopped.
This can be fixed a number of ways but this patch fixes the problem by
just not panicing if auditd is not running. We know printk is lossy and
if the user chooses to set the failure mode to panic and tries to use
printk we can't make any promises no matter how hard we try, so why try?
At least in this way we continue to get lost message accounting and will
eventually know that things went bad.
The other change is to add a new call to audit_log_lost() if auditd
disappears. We already pulled the skb off the queue and couldn't send
it so that message is lost. At least this way we will account for the
last message and panic if the machine is configured to panic. This code
path should only be run if auditd dies for unforeseen reasons. If
auditd closes correctly audit_pid will get set to 0 and we won't walk
this code path.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Fix the following compiler warning by using "%zu" as defined in C99.
CC kernel/auditsc.o
kernel/auditsc.c: In function 'audit_log_single_execve_arg':
kernel/auditsc.c:1074: warning: format '%ld' expects type 'long int', but
argument 4 has type 'size_t'
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
[libata] wrap kmap_atomic(KM_IRQ0) with local_irq_save/restore()
sata_svw: Add support for HT1100 SATA controller
Interrupts must be disabled if using kmap_atomic(KM_IRQ0), but that was
not the case in a few code paths coming directly from ATA driver
interrupt handlers (which use spin_lock rather than spin_lock_irqsave).
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
The PXA3xx AC97 controller has an additional control bit GCR_CLKBPB
which must be used during cold reset.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: eric miao <eric.miao@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This is unnecessary since it is already protected by
spin_lock_irq{save, restore} in clock.c.
Signed-off-by: eric miao <eric.miao@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This typo causes the incorrect calculation of the IRQ numbers
in the ICIP2 registers.
Signed-off-by: eric miao <eric.miao@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
"cat /dev/mem" may cause kernel Oops for boards with PHYS_OFFSET != 0
because character device is mapped to addresses starting from zero
and there is no protection against such situation.
Patch just add this.
Signed-off-by: Alexandre Rusev <arusev@ru.mvista.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Convert debug-only (and removed) MODULE_PARM() to module_param().
Compiles cleanly (with DEBUG=1).
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch sets KEXEC_CONTROL_MEMORY_LIMIT to (-1)UL. As the value is
compared with physical addresses TASK_SIZE makes no sense. Machines
where the RAM addresses start above TASK_SIZE kexecs eats all memory
and crashes the kernel without this patch.
Signed-off-by: Thomas Kunze <thommycheck@gmx.de>
Acked-by: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Eric Sandeen tracked an XFS on ARM corruption bug down to a function
under fs/xfs/ involving some get_unaligned() calls on u64 pointers.
As it turns out, calling ARM's get_unaligned() on a u64 pointer
pointing to the following byte sequence:
80 81 82 83 84 85 86 87
would return ffffffff83828180 (LE mode.) This turns out to be
because of implicit u8 -> int promotion in ARM's implementation of
various helpers for get_unaligned(), causing them to accidentally
return signed instead of unsigned values, which in turn caused the
subsequent casts to unsigned long long in __get_unaligned_8_[bl]e()
to sign-extend the lower words.
Fix by casting the return values of __get_unaligned_[24]_[bl]e()
to unsigned int.
Cc: Eric Sandeen <sandeen@sandeen.net>
Cc: Rabeeh Khoury <rabeeh@marvell.com>
Cc: Nicolas Pitre <nico@marvell.com>
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>