Commit graph

98292 commits

Author SHA1 Message Date
Andrew Morton
27df66a406 arch/x86/mm/init_64.c: early_memtest(): fix types
fix this warning:

arch/x86/mm/init_64.c: In function 'early_memtest':
arch/x86/mm/init_64.c:524: warning: passing argument 2 of 'find_e820_area_size' from incompatible pointer type

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-03 10:15:51 +02:00
Hugh Dickins
216705d272 x86: fix Intel Mac booting with EFI
Fedora reports that mem_init()'s zap_low_mappings(), extended to SMP in
61165d7a03 x86: fix app crashes after SMP
resume causes 32-bit Intel Mac machines to reboot very early when
booting with EFI.

The EFI code appears to manage low mappings for itself when needed; but
like many before it, confuses PSE with PAE.  So it has only been mapping
half the space it needed when PSE but not PAE.  This remained unnoticed
until we moved the SMP zap_low_mappings() before
efi_enter_virtual_mode().  Presumably could have been noticed years ago
if anyone ran a UP kernel on such machines?

Reported-by: Peter Jones <pjones@redhat.com>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: Peter Jones <pjones@redhat.com>
Cc: Glauber Costa <gcosta@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Tested-by: Peter Jones <pjones@redhat.com>
2008-07-03 08:19:18 +02:00
Thomas Gleixner
efac41894d x86: fix NODES_SHIFT Kconfig range
commit 4323838215
       x86: change size of node ids from u8 to s16

set the range for NODES_SHIFT to 1..15.

The possible range is 1..9

Fixes Bugzilla #10726

Reported-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-07-01 08:56:32 +02:00
TAKADA Yoshihito
11dbc963a8 ptrace GET/SET FPXREGS broken
When I update kernel 2.6.25 from 2.6.24, gdb does not work.
On 2.6.25, ptrace(PTRACE_GETFPXREGS, ...) returns ENODEV.

But 2.6.24 kernel's ptrace() returns EIO.
It is issue of compatibility.

I attached test program as pt.c and patch for fix it.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <errno.h>
#include <sys/ptrace.h>
#include <sys/types.h>

struct user_fxsr_struct {
	unsigned short	cwd;
	unsigned short	swd;
	unsigned short	twd;
	unsigned short	fop;
	long	fip;
	long	fcs;
	long	foo;
	long	fos;
	long	mxcsr;
	long	reserved;
	long	st_space[32];	/* 8*16 bytes for each FP-reg = 128 bytes */
	long	xmm_space[32];	/* 8*16 bytes for each XMM-reg = 128 bytes */
	long	padding[56];
};

int main(void)
{
  pid_t pid;

  pid = fork();

  switch(pid){
  case -1:/*  error */
    break;
  case 0:/*  child */
    child();
    break;
  default:
    parent(pid);
    break;
  }
  return 0;
}

int child(void)
{
  ptrace(PTRACE_TRACEME);
  kill(getpid(), SIGSTOP);
  sleep(10);
  return 0;
}
int parent(pid_t pid)
{
  int ret;
  struct user_fxsr_struct fpxregs;

  ret = ptrace(PTRACE_GETFPXREGS, pid, 0, &fpxregs);
  if(ret < 0){
    printf("%d: %s.\n", errno, strerror(errno));
  }
  kill(pid, SIGCONT);
  wait(pid);
  return 0;
}

/* in the kerel, at kernel/i387.c get_fpxregs() */

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-30 14:35:18 +02:00
Zhang, Yanmin
fcb43042ef x86: fix cpu hotplug crash
Vegard Nossum reported crashes during cpu hotplug tests:

  http://marc.info/?l=linux-kernel&m=121413950227884&w=4

In function _cpu_up, the panic happens when calling
__raw_notifier_call_chain at the second time. Kernel doesn't panic when
calling it at the first time. If just say because of nr_cpu_ids, that's
not right.

By checking the source code, I found that function do_boot_cpu is the culprit.
Consider below call chain:
 _cpu_up=>__cpu_up=>smp_ops.cpu_up=>native_cpu_up=>do_boot_cpu.

So do_boot_cpu is called in the end. In do_boot_cpu, if
boot_error==true, cpu_clear(cpu, cpu_possible_map) is executed. So later
on, when _cpu_up calls __raw_notifier_call_chain at the second time to
report CPU_UP_CANCELED, because this cpu is already cleared from
cpu_possible_map, get_cpu_sysdev returns NULL.

Many resources are related to cpu_possible_map, so it's better not to
change it.

Below patch against 2.6.26-rc7 fixes it by removing the bit clearing in
cpu_possible_map.

Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
Tested-by: Vegard Nossum <vegard.nossum@gmail.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-30 13:15:43 +02:00
Daniel J Blueman
0b1faeef5f x86: section/warning fixes
WARNING: arch/x86/mm/built-in.o(.text+0x3a1): Section mismatch in
reference from the function set_pte_phys() to the function
.init.text:spp_getpage()
The function set_pte_phys() references
the function __init spp_getpage().
This is often because set_pte_phys lacks a __init
annotation or the annotation of spp_getpage is wrong.

arch/x86/mm/init_64.c: In function 'early_memtest':
arch/x86/mm/init_64.c:520: warning: passing argument 2 of
'find_e820_area_size' from incompatible pointer type

Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>
Cc: "Linus Torvalds" <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-26 15:33:09 +02:00
Max Asbock
41aefdcc98 x86: shift bits the right way in native_read_tscp
native_read_tscp shifts the bits in the high order value in the
wrong direction, the attached patch fixes that.

Signed-off-by: Max Asbock <masbock@linux.vnet.ibm.com>
Acked-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-26 14:49:17 +02:00
Jeremy Fitzhardinge
2849914393 xen: remove support for non-PAE 32-bit
Non-PAE operation has been deprecated in Xen for a while, and is
rarely tested or used.  xen-unstable has now officially dropped
non-PAE support.  Since Xen/pvops' non-PAE support has also been
broken for a while, we may as well completely drop it altogether.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-24 17:00:55 +02:00
Jeremy Fitzhardinge
ebb9cfe20f xen: don't drop NX bit
Because NX is now enforced properly, we must put the hypercall page
into the .text segment so that it is executable.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
Cc: the arch/x86 maintainers <x86@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-20 14:56:41 +02:00
Jeremy Fitzhardinge
05345b0f00 xen: mask unwanted pte bits in __supported_pte_mask
[ Stable: this isn't a bugfix in itself, but it's a pre-requiste
  for "xen: don't drop NX bit" ]

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
Cc: the arch/x86 maintainers <x86@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-20 14:56:36 +02:00
Isaku Yamahata
4653938379 xen: Use wmb instead of rmb in xen_evtchn_do_upcall().
This patch is ported one from 534:77db69c38249 of linux-2.6.18-xen.hg.
Use wmb instead of rmb to enforce ordering between
evtchn_upcall_pending and evtchn_pending_sel stores
in xen_evtchn_do_upcall().

Cc: Samuel Thibault <samuel.thibault@eu.citrix.com>
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: the arch/x86 maintainers <x86@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-20 14:56:30 +02:00
Suresh Siddha
54481cf88b x86: fix NULL pointer deref in __switch_to
I am able to reproduce the oops reported by Simon in __switch_to() with
lguest.

My debug showed that there is at least one lguest specific
issue (which should be present in 2.6.25 and before aswell) and it got
exposed with a kernel oops with the recent fpu dynamic allocation patches.

In addition to the previous possible scenario (with fpu_counter), in the
presence of lguest, it is possible that the cpu's TS bit it still set and the
lguest launcher task's thread_info has TS_USEDFPU still set.

This is because of the way the lguest launcher handling the guest's TS bit.
(look at lguest_set_ts() in lguest_arch_run_guest()). This can result
in a DNA fault while doing unlazy_fpu() in __switch_to(). This will
end up causing a DNA fault in the context of new process thats
getting context switched in (as opossed to handling DNA fault in the context
of lguest launcher/helper process).

This is wrong in both pre and post 2.6.25 kernels. In the recent
2.6.26-rc series, this is showing up as NULL pointer dereferences or
sleeping function called from atomic context(__switch_to()), as
we free and dynamically allocate the FPU context for the newly
created threads. Older kernels might show some FPU corruption for processes
running inside of lguest.

With the appended patch, my test system is running for more than 50 mins
now. So atleast some of your oops (hopefully all!) should get fixed.
Please give it a try. I will spend more time with this fix tomorrow.

Reported-by: Simon Holm Thøgersen <odie@cs.aau.dk>
Reported-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-20 13:26:18 +02:00
Jordan Crouse
ffe6e1da86 x86, geode: add a VSA2 ID for General Software
General Software writes their own VSA2 module for their version
of the Geode BIOS, which returns a different ID then the standard
VSA2.  This was causing the framebuffer driver to break for most
GSW boards.

Signed-off-by: Jordan Crouse <jordan.crouse@amd.com>
Cc: tglx@linutronix.de
Cc: linux-geode@lists.infradead.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-19 14:19:03 +02:00
Bernhard Walle
d3942cff62 x86: use BOOTMEM_EXCLUSIVE on 32-bit
This patch uses the BOOTMEM_EXCLUSIVE for crashkernel reservation also for
i386 and prints a error message on failure.

The patch is still for 2.6.26 since it is only bug fixing. The unification
of reserve_crashkernel() between i386 and x86_64 should be done for 2.6.27.

Signed-off-by: Bernhard Walle <bwalle@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org>
2008-06-19 10:08:48 +02:00
Mikael Pettersson
df17b1d990 x86, 32-bit: fix boot failure on TSC-less processors
Booting 2.6.26-rc6 on my 486 DX/4 fails with a "BUG: Int 6"
(invalid opcode) and a kernel halt immediately after the
kernel has been uncompressed. The BUG shows EIP pointing
to an rdtsc instruction in native_read_tsc(), invoked from
native_sched_clock().

(This error occurs so early that not even the serial console
can capture it.)

A bisection showed that this bug first occurs in 2.6.26-rc3-git7,
via commit 9ccc906c97:

>x86: distangle user disabled TSC from unstable
>
>tsc_enabled is set to 0 from the command line switch "notsc" and from
>the mark_tsc_unstable code. Seperate those functionalities and replace
>tsc_enable with tsc_disable. This makes also the native_sched_clock()
>decision when to use TSC understandable.
>
>Preparatory patch to solve the sched_clock() issue on 32 bit.
>
>Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

The core reason for this bug is that native_sched_clock() gets
called before tsc_init().

Before the commit above, tsc_32.c used a "tsc_enabled" variable
which defaulted to 0 == disabled, and which only got enabled late
in tsc_init(). Thus early calls to native_sched_clock() would skip
the TSC and use jiffies instead.

After the commit above, tsc_32.c uses a "tsc_disabled" variable
which defaults to 0, meaning that the TSC is Ok to use. Early calls
to native_sched_clock() now erroneously try to use the TSC on
!cpu_has_tsc processors, leading to invalid opcode exceptions.

My proposed fix is to initialise tsc_disabled to a "soft disabled"
state distinct from the hard disabled state set up by the "notsc"
kernel option. This fixes the native_sched_clock() problem. It also
allows tsc_init() to be simplified: instead of setting tsc_disabled = 1
on every error return, we just set tsc_disabled = 0 once when all
checks have succeeded.

I've verified that this lets my 486 boot again. I've also verified
that a Core2 machine still uses the TSC as clocksource after the patch.

Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-19 10:08:47 +02:00
Suresh Siddha
75118a82e2 x86: fix NULL pointer deref in __switch_to
Patrick McHardy reported a crash:

> > I get this oops once a day, its apparently triggered by something
> > run by cron, but the process is a different one each time.
> >
> > Kernel is -git from yesterday shortly before the -rc6 release
> > (last commit is the usb-2.6 merge, the x86 patches are missing),
> > .config is attached.
> >
> > I'll retry with current -git, but the patches that have gone in
> > since I last updated don't look related.
> >
> > [62060.043009] BUG: unable to handle kernel NULL pointer dereference at
> > 000001ff
> > [62060.043009] IP: [<c0102a9b>] __switch_to+0x2f/0x118
> > [62060.043009] *pde = 00000000
> > [62060.043009] Oops: 0002 [#1] PREEMPT

Vegard Nossum analyzed it:

> This decodes to
>
>    0:   0f ae 00                fxsave (%eax)
>
> so it's related to the floating-point context. This is the exact
> location of the crash:
>
> $ addr2line -e arch/x86/kernel/process_32.o -i ab0
> include/asm/i387.h:232
> include/asm/i387.h:262
> arch/x86/kernel/process_32.c:595
>
> ...so it looks like prev_task->thread.xstate->fxsave has become NULL.
> Or maybe it never had any other value.

Somehow (as described below) TS_USEDFPU is set but the fpu is not
allocated or freed.

Another possible FPU pre-emption issue with the sleazy FPU optimization
which was benign before but not so anymore, with the dynamic FPU allocation
patch.

New task is getting exec'd and it is prempted at the below point.

flush_thread() {
	...
	/*
	* Forget coprocessor state..
	*/
	clear_fpu(tsk);
		<----- Preemption point
	clear_used_math();
	...
}

Now when it context switches in again, as the used_math() is still set
and fpu_counter can be > 5, we will do a math_state_restore() which sets
the task's TS_USEDFPU. After it continues from the above preemption point
it does clear_used_math() and much later free_thread_xstate().

Now, at the next context switch, it is quite possible that xstate is
null, used_math() is not set and TS_USEDFPU is still set. This will
trigger unlazy_fpu() causing kernel oops.

Fix this  by clearing tsk's fpu_counter before clearing task's fpu.

Reported-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-19 10:08:45 +02:00
Jeremy Fitzhardinge
ad524d46f3 x86: set PAE PHYSICAL_MASK_SHIFT to 44 bits.
When a 64-bit x86 processor runs in 32-bit PAE mode, a pte can
potentially have the same number of physical address bits as the
64-bit host ("Enhanced Legacy PAE Paging").  This means, in theory,
we could have up to 52 bits of physical address in a pte.

The 32-bit kernel uses a 32-bit unsigned long to represent a pfn.
This means that it can only represent physical addresses up to 32+12=44
bits wide.  Rather than widening pfns everywhere, just set 2^44 as the
Linux x86_32-PAE architectural limit for physical address size.

This is a bugfix for two cases:
1. running a 32-bit PAE kernel on a machine with
  more than 64GB RAM.
2. running a 32-bit PAE Xen guest on a host machine with
  more than 64GB RAM

In both cases, a pte could need to have more than 36 bits of physical,
and masking it to 36-bits will cause fairly severe havoc.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Jan Beulich <jbeulich@novell.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-19 10:08:44 +02:00
Dave Airlie
9bedbcb207 agp: brown paper bag patch - put back two lines that got lost
Commit 62c96b9d09 ("agp/intel: cleanup
some serious whitespace badness") didn't just fix whitespace.  It also
lost two lines.

Noticed by Linus. No more whitespace diffs for me.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-06-18 22:12:50 -07:00
Linus Torvalds
3506ba7b08 Merge branch 'agp-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6
* 'agp-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6:
  agp/intel: cleanup some serious whitespace badness
  [AGP] intel_agp: Add support for Intel 4 series chipsets
  [AGP] intel_agp: extra stolen mem size available for IGD_GM chipset
  agp: more boolean conversions.
  drivers/char/agp - use bool
  agp: two-stage page destruction issue
  agp/via: fixup pci ids
2008-06-18 21:52:35 -07:00
Dave Airlie
62c96b9d09 agp/intel: cleanup some serious whitespace badness
Signed-off-by: Dave Airlie <airlied@redhat.com>
2008-06-19 14:27:53 +10:00
Zhenyu Wang
25ce77abf8 [AGP] intel_agp: Add support for Intel 4 series chipsets
Signed-off-by: Zhenyu Wang <zhenyu.z.wang@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2008-06-19 14:17:58 +10:00
Zhenyu Wang
598d144823 [AGP] intel_agp: extra stolen mem size available for IGD_GM chipset
This adds missing stolen memory size detect for IGD_GM, be sure to
detect right size as current X intel driver (2.3.2) which has already
worked out.

Signed-off-by: Zhenyu Wang <zhenyu.z.wang@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2008-06-19 14:00:37 +10:00
Dave Airlie
9516b030b4 agp: more boolean conversions.
Signed-off-by: Dave Airlie <airlied@redhat.com>
2008-06-19 10:42:17 +10:00
Joe Perches
c725801292 drivers/char/agp - use bool
Use boolean in AGP instead of having own TRUE/FALSE

--
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2008-06-19 10:04:20 +10:00
Jan Beulich
da503fa60b agp: two-stage page destruction issue
besides it apparently being useful only in 2.6.24 (the changes in 2.6.25
really mean that it could be converted back to a single-stage mechanism),
I'm seeing an issue in Xen Dom0 kernels, which is caused by the calling
of gart_to_virt() in the second stage invocations of the destroy function.
I think that besides this being a real issue with Xen (where
unmap_page_from_agp() is not just a page table attribute change), this
also is invalid from a theoretical perspective: One should not assume that
gart_to_virt() is still valid after unmapping a page. So minimally (keeping
the 2-stage mechanism) a patch like the one below would be needed.

Jan

Signed-off-by: Dave Airlie <airlied@redhat.com>
2008-06-19 09:56:16 +10:00
Greg KH
dcd981a77b agp/via: fixup pci ids
add a new PCI ID and remove an old dodgy one, include the explaination
in the commented code so nobody readds later.

(davej also sent the pci id addition).

Signed-off-by: Dave Airlie <airlied@redhat.com>
2008-06-19 09:52:26 +10:00
Linus Torvalds
f9d1c6ca2b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  IB/uverbs: Fix check of is_closed flag check in ib_uverbs_async_handler()
  RDMA/nes: Fix off-by-one in nes_reg_user_mr() error path
2008-06-18 16:08:59 -07:00
Jack Morgenstein
fb77bcef9f IB/uverbs: Fix check of is_closed flag check in ib_uverbs_async_handler()
Commit 1ae5c187 ("IB/uverbs: Don't store struct file * for event
files") changed the way that closed files are handled in the uverbs
code.  However, after the conversion, is_closed flag is checked
incorrectly in ib_uverbs_async_handler().  As a result, no async
events are ever passed to applications.

Found by: Ronni Zimmerman <ronniz@mellanox.co.il>

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-06-18 15:36:38 -07:00
Linus Torvalds
a8051fde6b Merge git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog
* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
  Revert "[WATCHDOG] hpwdt: Fix NMI handling."
  [WATCHDOG] hpwdt: Add CFLAGS to get driver working
  Revert "[WATCHDOG] make watchdog/hpwdt.c:asminline_call() static"
2008-06-18 12:55:00 -07:00
Linus Torvalds
5dfd06215b Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
  [SCSI] dpt_i2o: Add PROC_IA64 define
  [SCSI] scsi_host regression: fix scsi host leak
  [SCSI] sr: fix corrupt CD data after media change and delay
2008-06-18 11:55:43 -07:00
Linus Torvalds
f32c23f59a Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
  [POWERPC] Clear sub-page HPTE present bits when demoting page size
  [POWERPC] 4xx: Clear new TLB cache attribute bits in Data Storage vector
2008-06-18 11:55:19 -07:00
Linus Torvalds
e899536470 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6:
  udf: restore UDFFS_DEBUG to being undefined by default
2008-06-18 11:55:03 -07:00
Linus Torvalds
d83b14c0db Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (43 commits)
  netlink: genl: fix circular locking
  Revert "mac80211: Use skb_header_cloned() on TX path."
  af_unix: fix 'poll for write'/ connected DGRAM sockets
  tun: Proper handling of IPv6 header in tun driver when TUN_NO_PI is set
  atl1: relax eeprom mac address error check
  net/enc28j60: low power mode
  net/enc28j60: section fix
  sky2: 88E8040T pci device id
  netxen: download firmware in pci probe
  netxen: cleanup debug messages
  netxen: remove global physical_port array
  netxen: fix portnum for hp mezz cards
  ibm_newemac: select CRC32 in Kconfig
  xfrm: fix fragmentation for ipv4 xfrm tunnel
  netfilter: nf_conntrack_h323: fix module unload crash
  netfilter: nf_conntrack_h323: fix memory leak in module initialization error path
  netfilter: nf_nat: fix RCU races
  atm: [he] send idle cells instead of unassigned when in SDH mode
  atm: [he] limit queries to the device's register space
  atm: [br2864] fix routed vcmux support
  ...
2008-06-18 11:48:40 -07:00
Wim Van Sebroeck
fdf7be6f13 Revert "[WATCHDOG] hpwdt: Fix NMI handling."
The old setup works better.

Signed-off-by: Thomas Mingarelli <Thomas.Mingarelli@hp.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2008-06-18 16:22:48 +00:00
Paul Mackerras
65ba6cdc83 [POWERPC] Clear sub-page HPTE present bits when demoting page size
When we demote a slice from 64k to 4k, and we are about to insert an
HPTE for a 4k subpage and we notice that there is an existing 64k
HPTE, we first invalidate that HPTE before inserting the new 4k
subpage HPTE.  Since the bits that encode which hash bucket the old
HPTE was in overlap with the bits that encode which of the 16 subpages
have HPTEs, we need to clear out the subpage HPTE-present bits before
starting to insert HPTEs for the 4k subpages.  If we don't do that, we
can erroneously think that a subpage already has an HPTE when it
doesn't.

That in itself wouldn't be such a problem except that when we go to
update the HPTE that we think is present on machines with a
hypervisor, the hypervisor can tell us that the HPTE we think is there
is actually there even though it isn't, which can lead to a process
getting stuck in a loop, continually faulting.  The reason for the
confusion is that the AVPN (abbreviated virtual page number) we are
looking for in the HPTE for a 4k subpage can actually match the AVPN
in a stale HPTE for another 64k page.  For example, the HPTE for
the 4k subpage at 0x84000f000 will be in the same hash bucket and have
the same AVPN as the HPTE for the 64k page at 0x8400f0000.

This fixes the code to clear out the subpage HPTE-present bits.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-06-18 21:40:43 +10:00
Josh Boyer
b17879f71c [POWERPC] 4xx: Clear new TLB cache attribute bits in Data Storage vector
A recent commit added support for the new 440x6 and 464 cores that have the
added WL1, IL1I, IL1D, IL2I, and ILD2 bits for the caching attributes in the
TLBs.  The new bits were cleared in the finish_tlb_load function, however a
similar bit of code was missed in the DataStorage interrupt vector.

Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-06-18 21:40:43 +10:00
Patrick McHardy
6d1a3fb567 netlink: genl: fix circular locking
genetlink has a circular locking dependency when dumping the registered
families:

- dump start:
genl_rcv()            : take genl_mutex
genl_rcv_msg()        : call netlink_dump_start() while holding genl_mutex
netlink_dump_start(),
netlink_dump()        : take nlk->cb_mutex
ctrl_dumpfamily()     : try to detect this case and not take genl_mutex a
                        second time

- dump continuance:
netlink_rcv()         : call netlink_dump
netlink_dump          : take nlk->cb_mutex
ctrl_dumpfamily()     : take genl_mutex

Register genl_lock as callback mutex with netlink to fix this. This slightly
widens an already existing module unload race, the genl ops used during the
dump might go away when the module is unloaded. Thomas Graf is working on a
seperate fix for this.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-18 02:07:07 -07:00
David S. Miller
3a5be7d4b0 Revert "mac80211: Use skb_header_cloned() on TX path."
This reverts commit 608961a5ec.

The problem is that the mac80211 stack not only needs to be able to
muck with the link-level headers, it also might need to mangle all of
the packet data if doing sw wireless encryption.

This fixes kernel bugzilla #10903.  Thanks to Didier Raboud (for the
bugzilla report), Andrew Prince (for bisecting), Johannes Berg (for
bringing this bisection analysis to my attention), and Ilpo (for
trying to analyze this purely from the TCP side).

In 2.6.27 we can take another stab at this, by using something like
skb_cow_data() when the TX path of mac80211 ends up with a non-NULL
tx->key.  The ESP protocol code in the IPSEC stack can be used as a
model for implementation.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-18 01:19:51 -07:00
Rainer Weikusat
3c73419c09 af_unix: fix 'poll for write'/ connected DGRAM sockets
The unix_dgram_sendmsg routine implements a (somewhat crude)
form of receiver-imposed flow control by comparing the length of the
receive queue of the 'peer socket' with the max_ack_backlog value
stored in the corresponding sock structure, either blocking
the thread which caused the send-routine to be called or returning
EAGAIN. This routine is used by both SOCK_DGRAM and SOCK_SEQPACKET
sockets. The poll-implementation for these socket types is
datagram_poll from core/datagram.c. A socket is deemed to be writeable
by this routine when the memory presently consumed by datagrams
owned by it is less than the configured socket send buffer size. This
is always wrong for connected PF_UNIX non-stream sockets when the
abovementioned receive queue is currently considered to be full.
'poll' will then return, indicating that the socket is writeable, but
a subsequent write result in EAGAIN, effectively causing an
(usual) application to 'poll for writeability by repeated send request
with O_NONBLOCK set' until it has consumed its time quantum.

The change below uses a suitably modified variant of the datagram_poll
routines for both type of PF_UNIX sockets, which tests if the
recv-queue of the peer a socket is connected to is presently
considered to be 'full' as part of the 'is this socket
writeable'-checking code. The socket being polled is additionally
put onto the peer_wait wait queue associated with its peer, because the
unix_dgram_sendmsg routine does a wake up on this queue after a
datagram was received and the 'other wakeup call' is done implicitly
as part of skb destruction, meaning, a process blocked in poll
because of a full peer receive queue could otherwise sleep forever
if no datagram owned by its socket was already sitting on this queue.
Among this change is a small (inline) helper routine named
'unix_recvq_full', which consolidates the actual testing code (in three
different places) into a single location.

Signed-off-by: Rainer Weikusat <rweikusat@mssgmbh.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 22:28:05 -07:00
David S. Miller
4552e1198a Merge branch 'davem-fixes' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 2008-06-17 21:32:08 -07:00
Ang Way Chuang
f09f7ee20c tun: Proper handling of IPv6 header in tun driver when TUN_NO_PI is set
By default, tun.c running in TUN_TUN_DEV mode will set the protocol of
packet to IPv4 if TUN_NO_PI is set. My program failed to work when I
assumed that the driver will check the first nibble of packet,
determine IP version and set the appropriate protocol.

Signed-off-by: Ang Way Chuang <wcang@nav6.org>
Acked-by: Max Krasnyansky <maxk@qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 21:10:33 -07:00
Radu Cristescu
58c7821c42 atl1: relax eeprom mac address error check
The atl1 driver tries to determine the MAC address thusly:

	- If an EEPROM exists, read the MAC address from EEPROM and
	  validate it.
	- If an EEPROM doesn't exist, try to read a MAC address from
	  SPI flash.
	- If that fails, try to read a MAC address directly from the
	  MAC Station Address register.
	- If that fails, assign a random MAC address provided by the
	  kernel.

We now have a report of a system fitted with an EEPROM containing all
zeros where we expect the MAC address to be, and we currently handle
this as an error condition.  Turns out, on this system the BIOS writes
a valid MAC address to the NIC's MAC Station Address register, but we
never try to read it because we return an error when we find the all-
zeros address in EEPROM.

This patch relaxes the error check and continues looking for a MAC
address even if it finds an illegal one in EEPROM.

Signed-off-by: Radu Cristescu <advantis@gmx.net>
Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-06-17 23:09:21 -04:00
David Brownell
7dac6f8df6 net/enc28j60: low power mode
Keep enc28j60 chips in low-power mode when they're not in use.
At typically 120 mA, these chips run hot even when idle; this
low power mode cuts that power usage by a factor of around 100.

This version provides a generic routine to poll a register until
its masked value equals some value ... e.g. bit set or cleared.
It's basically what the previous wait_phy_ready() did, but this
version is generalized to support the handshaking needed to
enter and exit low power mode.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Claudio Lanconelli <lanconelli.claudio@eptar.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-06-17 23:07:29 -04:00
David Brownell
6fd65882f5 net/enc28j60: section fix
Minor bugfixes to the enc28j60 driver ... wrong section marking,
indentation, and bogus use of spi_bus_type.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Acked-by: Claudio Lanconelli <lanconelli.claudio@eptar.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-06-17 23:07:05 -04:00
Stephen Hemminger
a3b4fcedee sky2: 88E8040T pci device id
Missed one pci id for 88E8040T.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-06-17 23:07:03 -04:00
Dhananjay Phadke
439b454edf netxen: download firmware in pci probe
Downloading firmware in pci probe allows recovery in case of
firmware failure by reloading the driver.

Also reduced delays in firmware load.

Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-06-17 23:07:01 -04:00
Dhananjay Phadke
dcd56fdbae netxen: cleanup debug messages
o Remove unnecessary debug prints and functions.
o Explicitly specify pci class (0x020000) to avoid enabling
  management function.

Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-06-17 23:07:00 -04:00
Dhananjay Phadke
3276fbad83 netxen: remove global physical_port array
Store physical port number in netxen_adapter structure.

Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-06-17 23:06:59 -04:00
Dhananjay Phadke
dc515f2e0b netxen: fix portnum for hp mezz cards
This fixes a the issue where logical port number is set incorrectly
for HP blade mezz cards.

Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-06-17 23:06:58 -04:00
Josh Boyer
8b8091fbf4 ibm_newemac: select CRC32 in Kconfig
The ibm_newemac driver requires ether_crc to be defined.  Apparently it is
possible to generate a .config without CONFIG_CRC32 set which causes the
following link errors if IBM_NEW_EMAC is selected:

  LD      .tmp_vmlinux1
drivers/built-in.o: In function `emac_hash_mc':
core.c:(.text+0x2f524): undefined reference to `crc32_le'
core.c:(.text+0x2f528): undefined reference to `bitrev32'
make: *** [.tmp_vmlinux1] Error 1

This patch has IBM_NEW_EMAC select CRC32 so we don't hit this error.

Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-06-17 23:06:56 -04:00