neigh_changeaddr attempts to delete neighbour timers without setting
nud_state. This doesn't work because the timer may have already fired
when we acquire the write lock in neigh_changeaddr. The result is that
the timer may keep firing for quite a while until the entry reaches
NEIGH_FAILED.
It should be setting the nud_state straight away so that if the timer
has already fired it can simply exit once we relinquish the lock.
In fact, this whole function is simply duplicating the logic in
neigh_ifdown which in turn is already doing the right thing when
it comes to deleting timers and setting nud_state.
So all we have to do is take that code out and put it into a common
function and make both neigh_changeaddr and neigh_ifdown call it.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
neigh_add_timer cannot use add_timer unconditionally. The reason is that
by the time it has obtained the write lock someone else (e.g., neigh_update)
could have already added a new timer.
So it should only use mod_timer and deal with its return value accordingly.
This bug would have led to rare neighbour cache entry leaks.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Stack traces are very helpful in determining the exact nature of a bug.
So let's print a stack trace when the timer is added twice.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
As stated in Documentation/atomic_ops.txt, atomic functions
returning values must have the memory barriers both before and after
the operation.
Thanks to DaveM for pointing that out.
Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
On architectures where the char type defaults to unsigned some of the
arithmetic in the AX.25 stack to fail, resulting in some packets being dropped
on receive.
Credits for tracking this down and the original patch to
Bob Brose N0QBJ <linuxhams@n0qbj-11.ampr.org>.
Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
IPVS used flag NFC_IPVS_PROPERTY in nfcache but as now nfcache was removed the
new flag 'ipvs_property' still needs to be copied. This patch should be
included in 2.6.14.
Further comments from Harald Welte:
Sorry, seems like the bug was introduced by me.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Not sure how it slipped by, but here's a trivial typo fix for powernow.
Signed-off-by: Chris Wright <chrisw@osdl.org>
[ It's "nurter" backwards.. Maybe we have a hillbilly The Shining fan? ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When I originally moved exit_itimers into __exit_signal, that was the only
place where we could reliably know it was the last thread in the group
dying, without races. Since then we've gotten the signal_struct.live
counter, and do_exit can reliably do group-wide cleanup work.
This patch moves the call to do_exit, where it's made without locks. This
avoids the deadlock issues that the old __exit_signal code's comment talks
about, and the one that Oleg found recently with process CPU timers.
[ This replaces e03d13e985, which is why
it was just reverted. ]
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
AMD recently discovered that on some hardware, there is a race condition
possible when a C-state change request goes onto the bus at the same
time as a P-state change request.
Both requests happen, but the southbridge hardware only acknowledges the
C-state change. The PowerNow! driver is then stuck in a loop, waiting
for the P-state change acknowledgement. The driver eventually times
out, but can no longer perform P-state changes.
It turns out the solution is to resend the P-state change, which the
southbridge will acknowledge normally.
Thanks to Johannes Winkelmann for reporting this and testing the fix.
Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This fixes a stupid typo bug in the iSeries hash table code.
When we place a hash PTE in the secondary bucket, instead of setting the
SECONDARY flag bit, as we should, we (redundantly) set the VALID flag.
This was introduced with the patch abolishing bitfields from the hash
table code. Mea culpa, oops. It hasn't been noticed until now because
in practice we don't hit the secondary bucket terribly often.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The wrong state emission routines were being called for G550, and
consistent maps weren't correctly mapped...
Signed-off-by: Dave Airlie <airlied@linux.ie>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
While working on 64K pages, I found this little buglet in our
update_mmu_cache() implementation.
The code calls __hash_page() passing it an "access" parameter (the type
of access that triggers the hash) containing the bits _PAGE_RW and
_PAGE_USER of the linux PTE. The latter is useless in this case and the
former is wrong. In fact, if we have a writeable PTE and we pass
_PAGE_RW to hash_page(), it will set _PAGE_DIRTY (since we track dirty
that way, by hash faulting !dirty) which is not what we want.
In fact, the correct fix is to always pass 0. That means that only
read-only or already dirty read write PTEs will be preloaded. The
(hopefully rare) case of a non dirty read write PTE can't be preloaded
this way, it will have to fault in hash_page on the actual access.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This fixes a typo in the div128_by_32 function used in the timekeeping
calculations on ppc64. If you look at the code it's quite obvious
that we need (rb + c) rather than (rb + b). The "b" is clearly just a
typo.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This fixes handling of the phy identifiers in mptsas.
Signed-off-by: Eric Moore <Eric.Moore@lsil.com>
[ split it a pre-2.6.14 portion from Eric's bigger patch ]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patch from Ben Dooks
From: Guillaume Gourat <guillaume.gourat@nexvision.fr>
Add MASK definitions for DCLK0 and DCLK1
Signed-off-by: Guillaume Gourat <guillaume.gourat@nexvision.fr>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Ben Dooks
The current Simtec BAST nand area timings are a little
too slow to be obtained by a 2410 running at 266MHz,
so reduce the timings slightly to bring them into the
acceptable range.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Ben Dooks
Avoid the possiblity that if the board is using
a 16.9334 or higher crystal with a high PLL
multiplier, then the pll value could overflow
the capability of an int.
Also fix the value types of the intermediate
variables to unsigned int.
Rewrite of patch from Guillaume Gourat
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Matt Reimer
Adds an I2S platform_device for PXA. I2S is used to interface
with sound chips on systems like iPAQ h1910/h2200/hx4700 and
Asus 716.
Signed-off-by: mreimer@vpop.net
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
It is legitimate to call tcp_fragment with len == skb->len since
that is done for FIN packets and the FIN flag counts as one byte.
So we should only check for the len > skb->len case.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Turns out the problem has nothing to do with use-after-free or double-free.
It's just that we're not clearing the CB area and DCCP unlike TCP uses a CB
format that's incompatible with IP.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Ian McDonald <imcdnzl@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
icmp_send doesn't use skb->sk at all so even if skb->sk has already
been freed it can't cause crash there (it would've crashed somewhere
else first, e.g., ip_queue_xmit).
I found a double-free on an skb that could explain this though.
dccp_sendmsg and dccp_write_xmit are a little confused as to what
should free the packet when something goes wrong. Sometimes they
both go for the ball and end up in each other's way.
This patch makes dccp_write_xmit always free the packet no matter
what. This makes sense since dccp_transmit_skb which in turn comes
from the fact that ip_queue_xmit always frees the packet.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
David S. Miller <davem@davemloft.net> wrote:
> One thing you can probably do for this bug is to mark data packets
> explicitly somehow, perhaps in the SKB control block DCCP already
> uses for other data. Put some boolean in there, set it true for
> data packets. Then change the test in dccp_transmit_skb() as
> appropriate to test the boolean flag instead of "skb_cloned(skb)".
I agree. In fact we already have that flag, it's called skb->sk.
So here is patch to test that instead of skb_cloned().
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Ian McDonald <imcdnzl@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This reverts commit 3359b54c8c and
replaces it with a cleaner version that is purely based on page table
operations, so that the synchronization between inode size and hugetlb
mappings becomes moot.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Missing half of the [PATCH] uml: Fix sysrq-r support for skas mode
We need to remove these (UPT_[DEFG]S) from the read side as well as the
write one - otherwise it simply won't build.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Acked-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Found in the -rt patch set. The scsi_error thread likely will be in the
TASK_INTERRUPTIBLE state upon exit. This patch fixes this bug.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This introduces a limit parameter to the core bootmem allocator; The new
parameter indicates that physical memory allocated by the bootmem
allocator should be within the requested limit.
We also introduce alloc_bootmem_low_pages_limit, alloc_bootmem_node_limit,
alloc_bootmem_low_pages_node_limit apis, but alloc_bootmem_low_pages_limit
is the only api used for swiotlb.
The existing alloc_bootmem_low_pages() api could instead have been
changed and made to pass right limit to the core allocator. But that
would make the patch more intrusive for 2.6.14, as other arches use
alloc_bootmem_low_pages(). We may be done that post 2.6.14 as a
cleanup.
With this, swiotlb gets memory within 4G for both x86_64 and ia64
arches.
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Ravikiran G Thirumalai <kiran@scalex86.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In drivers/acpi/glue.c the address of an integer is cast to the address of
an unsigned long. This breaks on systems where a long is larger than an
int --- for a start the int can be misaligned; for a second the assignment
through the pointer will overwrite part of the next variable.
Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au>
Acked-by: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I've gotten a report on lkml, of a possible regression in the MGA DRM in
2.6.14-rc4 (since -rc1), I haven't been able to reproduce it here, but I've
figured out some possible issues in the mga code that were definitely
wrong, some of these are from DRM CVS, the main fix is the agp enable bit
on the old code path still used by everyone.....
Signed-off-by: Dave Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The PF_NOFREEZE process flag should not be inherited when a thread is
forked. This patch (as585) removes the flag from the child.
This problem is starting to show up more and more as drivers turn to the
kthread API instead of using kernel_thread(). As a result, their kernel
threads are now children of the kthread worker instead of modprobe, and
they inherit the PF_NOFREEZE flag. This can cause problems during system
suspend; the kernel threads are not getting frozen as they ought to be.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The variable RCS_TAR_IGNORE is used in scripts/packaging/Makefile, but not
exported from the main Makefile, so it's never used.
This results in the rpm targets being very unhappy in quilted trees.
Signed-off-by: Tom Rini <trini@kernel.crashing.org>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The implementation of __kernel_gettimeofday() in the 32 bits vDSO has a
small bug (a typo actually) that will cause it to lose 1 bit of precision.
Not terribly bad but worth fixing.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The main problem fixes is that in certain situations stopping md arrays may
take longer than you expect, or may require multiple attempts. This would
only happen when resync/recovery is happening.
This patch fixes three vaguely related bugs.
1/ The recent change to use kthreads got the setting of the
process name wrong. This fixes it.
2/ The recent change to use kthreads lost the ability for
md threads to be signalled with SIG_KILL. This restores that.
3/ There is a long standing bug in that if:
- An array needs recovery (onto a hot-spare) and
- The recovery is being blocked because some other array being
recovered shares a physical device and
- The recovery thread is killed with SIG_KILL
Then the recovery will appear to have completed with no IO being
done, which can cause data corruption.
This patch makes sure that incomplete recovery will be treated as
incomplete.
Note that any kernel affected by bug 2 will not suffer the problem of bug
3, as the signal can never be delivered. Thus the current 2.6.14-rc
kernels are not susceptible to data corruption. Note also that if arrays
are shutdown (with "mdadm -S" or "raidstop") then the problem doesn't
occur. It only happens if a SIGKILL is independently delivered as done by
'init' when shutting down.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Changes all spinlocks that can be held during an irq handler to disable
interrupts while the lock is held. Changes spin_[un]lock_irq to use the
irqsave/irqrestore variants for robustness and readability.
In raw1394.c:handle_iso_listen(), don't grab host_info_lock at all -- we're
not accessing host_info_list or host_count, and holding this lock while
trying to tasklet_kill the iso tasklet this can cause an ABBA deadlock if
ohci:dma_rcv_tasklet is running and tries to grab host_info_lock in
raw1394.c:receive_iso. Test program attached reliably deadlocks all SMP
machines I have been able to test without this patch.
Signed-off-by: Andy Wingo <wingo@pobox.com>
Acked-by: Ben Collins <bcollins@ubuntu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Brice Goglin <Brice.Goglin@ens-lyon.org> reports a printk storm from this
driver. Fix.
Acked-by: David Gibson <hermes@gibson.dropbear.id.au>
Cc: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
hugetlbfs allows truncation of its files (should it?), but hugetlb.c often
forgets that: crashes and misaccounting ensue.
copy_hugetlb_page_range better grab the src page_table_lock since we don't
want to guess what happens if concurrently truncated. unmap_hugepage_range
rss accounting must not assume the full range was mapped. follow_hugetlb_page
must guard with page_table_lock and be prepared to exit early.
Restyle copy_hugetlb_page_range with a for loop like the others there.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov reported an SMP deadlock. If there is a running timer
tracking a different process's CPU time clock when the process owning
the timer exits, we deadlock on tasklist_lock in posix_cpu_timer_del via
exit_itimers.
That code was using tasklist_lock to check for a race with __exit_signal
being called on the timer-target task and clearing its ->signal.
However, there is actually no such race. __exit_signal will have called
posix_cpu_timers_exit and posix_cpu_timers_exit_group before it does
that. Those will clear those k_itimer's association with the dying
task, so posix_cpu_timer_del will return early and never reach the code
in question.
In addition, posix_cpu_timer_del called from exit_itimers during execve
or directly from timer_delete in the process owning the timer can race
with an exiting timer-target task to cause a double put on timer-target
task struct. Make sure we always access cpu_timers lists with sighand
lock held.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Chris Wright <chrisw@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patch from Tony Lindgren
Machine restart calls cpu_proc_fin() to clean and disable
cache, and turn off interrupts. This patch adds proper
cpu_v6_proc_fin.
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The hugetlb pages are currently pre-faulted. At the time of mmap of
hugepages, we populate the new PTEs. It is possible that HW has already
cached some of the unused PTEs internally. These stale entries never
get a chance to be purged in existing control flow.
This patch extends the check in page fault code for hugepages. Check if
a faulted address falls with in size for the hugetlb file backing it.
We return VM_FAULT_MINOR for these cases (assuming that the arch
specific page-faulting code purges the stale entry for the archs that
need it).
Signed-off-by: Rohit Seth <rohit.seth@intel.com>
[ This is apparently arguably an ia64 port bug. But the code won't
hurt, and for now it fixes a real problem on some ia64 machines ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patch from Paul Schulz
The following trivial patch is to fix what looks like a typo in the PXA register
definitions. The correction comes directly from the definition in the
Intel Documentation.
http://www.intel.com/design/pca/applicationsprocessors/manuals/278693.htm
Intel(R) PXA 255 Processor - Developers Manual - Jan 2004 - Page 12-33
Neither 'UDCCS_IO_ROF' or 'UDCCS_IO_DME' are currently used elseware
in the main code (from grep of tree)... The current definitions have been
in the code since at lease 2.4.7.
Signed-off-by: Paul Schulz <paul@mawsonlakes.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Reported by: Bob Tracy <rct@gherkin.frus.com>
"...I've got a Toshiba notebook (730XCDT -- Pentium 150MMX) for which
I'm using the Vesa FB driver. When the machine has been idle for some
time and the driver attempts to powerdown the display, rather than the
display going blank, it goes gray with several strange lines. When I
hit the "shift" key or other-wise wake up the display, the old video
state is not fully restored..."
vesafb recently added a blank method which has only 2 states, powerup and
powerdown. The powerdown state is used for all blanking levels, but in his
case, powerdown does not work correctly for higher levels of display
powersaving. Thus, for intermediate power levels, use software blanking,
and use only hardware blanking for an explicit powerdown.
Signed-off-by: Antonino Daplas <adaplas@pol.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This still leaves driver and architecture-specific subdirectories alone,
but gets rid of the bulk of the "generic" generated files that we should
ignore.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>