Commit graph

190450 commits

Author SHA1 Message Date
Mauro Carvalho Chehab
d1fd4fb69e i7core_edac: Add a code to probe Xeon 55xx bus
This code changes the detection procedure of i7core_edac. Instead of
directly probing for MC registers, it probes for another register found
on Nehalem. If found, it tries to pick the first MC PCI BUS. This should
work fine with Xeon 35xx, but, on Xeon 55xx, this is at bus 254 and 255
that are not properly detected by the non-legacy PCI methods.

The new detection code scans specifically at buses 254 and 255 for the
Xeon 55xx devices.

This code has not tested yet. After working, a change at the code will
be needed, since the i7core is not yet ready for working with 2 sets of
MC.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10 11:44:51 -03:00
Aristeu Rozanski
5707b24a50 pci: Add a probing code that seeks for an specific bus
This patch adds a probing code that seeks for an specific pci bus. It
still needs testing, but it is hoped that this will help to identify the
memory controller with Xeon 55xx series.

Signed-off-by: Aristeu Sergio <arozansk@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10 11:44:51 -03:00
Mauro Carvalho Chehab
e9bd2e7379 i7core_edac: Adds write unlock to MC registers
The public Intel Xeon 5500 volume 2 datasheet describes, on page 53,
session 2.6.7 a register that can lock/unlock Memory Controller the
configuration register, called MC_CFG_CONTROL.

Adds support for it in the hope that software error injection would
work. With my tests with Xeon 35xx, there's still something missing.
With a program that does sequencial bit writes at dev 0.0, sometimes, it
produces error injection, after unblocking the MC_CFG_CONTROL (and,
sometimes, it just locks my testing machine).

I'll try later to discover by trial and error what's the register that
solves this issue on Xeon 35xx.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10 11:44:50 -03:00
Mauro Carvalho Chehab
d5381642ab i7core_edac: Add edac_mce glue
Adds a glue code to allow i7core to work with mcelog. With the glue,
i7core registers itself on edac_mce. At mce, when an error is detected,
it calls all registered drivers (in this case, i7core), for EDAC error
handling.

TODO: It currently just prints the MCE error log using about the same
      format as mce panic messages. The error message should be enhanced
      with mcelog userspace info and converted into the proper EDAC format,
      to feed the EDAC error counts.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10 11:44:50 -03:00
Mauro Carvalho Chehab
963c5ba359 edac/Kconfig: edac_mce can't be module
Since mcelog is bool, edac_mce glue should also be bool, or otherwise
will not work.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10 11:44:50 -03:00
Mauro Carvalho Chehab
696e409dbd edac_mce: Add an interface driver to report mce errors via edac
edac_mce module is an interface module that gets mcelog data and
forwards to any registered edac module that expects to receive data via
mce.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10 11:44:49 -03:00
Mauro Carvalho Chehab
41fcb7feed i7core_edac: CodingStyle fixes
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10 11:44:48 -03:00
Mauro Carvalho Chehab
eb94fc402f i7core_edac: fill csrows edac sysfs info
csrows is still fake, since we can't identify its representation with
Nehalem registers.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10 11:44:48 -03:00
Mauro Carvalho Chehab
5566cb7c91 i7core_edac: Memory info fixes and preparation for properly filling cswrow data
Now, memory size is properly displayed:

    EDAC i7core: DOD Max limits: DIMMS: 2, 1-ranked, 8-banked
    EDAC i7core: DOD Max rows x colums = 0x4000 x 0x400
    EDAC i7core: Memory channel configuration:
    EDAC i7core: Ch0 phy rd0, wr0 (0x063f7c31): 2 ranks, UDIMMs
    EDAC i7core:    dimm 0 (0x00000288) 1024 Mb offset: 0, numbank: 8,
                    numrank: 1, numrow: 0x4000, numcol: 0x400
    EDAC i7core:    dimm 1 (0x00001288) 1024 Mb offset: 4, numbank: 8,
                    numrank: 1, numrow: 0x4000, numcol: 0x400
    EDAC i7core: Ch1 phy rd1, wr1 (0x063f7c31): 2 ranks, UDIMMs
    EDAC i7core:    dimm 0 (0x00000288) 1024 Mb offset: 0, numbank: 8,
                    numrank: 1, numrow: 0x4000, numcol: 0x400
    EDAC i7core: Ch2 phy rd3, wr3 (0x063f7c31): 2 ranks, UDIMMs
    EDAC i7core:    dimm 0 (0x00000288) 1024 Mb offset: 0, numbank: 8,
                    numrank: 1, numrow: 0x4000, numcol: 0x400

Still, as the way to retrieve csrows info is not known, it does a
mapping of what's available to csrows basic unit at edac core.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10 11:44:48 -03:00
Mauro Carvalho Chehab
854d334997 i7core_edac: Get more info about the memory DIMMs
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10 11:44:48 -03:00
Mauro Carvalho Chehab
7dd6953c5f i7core_edac: Add more information about each active dimm
Thanks-to: Aristeu Rozanski <aris@redhat.com> for part of the code

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10 11:44:47 -03:00
Mauro Carvalho Chehab
b7c761512c i7core_edac: Improve error handling
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10 11:44:47 -03:00
Mauro Carvalho Chehab
1c6fed808f i7core_edac: Properly fill struct csrow_info
Thanks-to: Aristeu Rozanski <aris@redhat.com> for part of the code

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10 11:44:47 -03:00
Mauro Carvalho Chehab
ef708b53b9 i7core_edac: Add additional tests for error detection
Properly check the number of channels and improve probing error detection

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10 11:44:47 -03:00
Mauro Carvalho Chehab
442305b152 i7core_edac: Add a memory check routine, based on device 3 function 4
This function appears only on Xeon 5500 datasheet. Yet, testing with a
Xeon 3503 showed that this is also implemented on other Nehalem
processors.

At the first read, MC_TEST_ERR_RCV1 and MC_TEST_ERR_RCV0 can contain any
value. Modify CE error logic to update the error count only after the
second read.

An alternative approach would be to do a write at rcv0 and rcv1
registers, but it seemed better to keep they untouched, since BIOS might
eventually assume that they are exclusive for their usage.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10 11:44:46 -03:00
Mauro Carvalho Chehab
87d1d272ba i7core_edac: need mci->edac_check, otherwise module removal doesn't work
There are some locking troubles with edac_core: if you don't declare an
edac_check, module may suffer from soft lock.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10 11:44:46 -03:00
Mauro Carvalho Chehab
7b029d03c3 i7core_edac: A few fixes at error injection code
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10 11:44:46 -03:00
Mauro Carvalho Chehab
f122a89222 i7core_edac: Show read/write virtual/physical channel association
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10 11:44:46 -03:00
Mauro Carvalho Chehab
8f33190757 i7core_edac: Registers all supported MC functions
Now, it will try to register on all supported Memory Controller
functions.

It should be noticed that dev3, function 2 is present only on chips with
Registered DIMM's, according to the datasheet. So, the driver doesn't
return -ENODEV is all functions but this one were successfully
registered and enabled:

    EDAC i7core: Registered device 8086:2c18 fn=3 0
    EDAC i7core: Registered device 8086:2c19 fn=3 1
    EDAC i7core: Device not found: PCI ID 8086:2c1a (dev 3, func 2)
    EDAC i7core: Registered device 8086:2c1c fn=3 4
    EDAC i7core: Registered device 8086:2c20 fn=4 0
    EDAC i7core: Registered device 8086:2c21 fn=4 1
    EDAC i7core: Registered device 8086:2c22 fn=4 2
    EDAC i7core: Registered device 8086:2c23 fn=4 3
    EDAC i7core: Registered device 8086:2c28 fn=5 0
    EDAC i7core: Registered device 8086:2c29 fn=5 1
    EDAC i7core: Registered device 8086:2c2a fn=5 2
    EDAC i7core: Registered device 8086:2c2b fn=5 3
    EDAC i7core: Registered device 8086:2c30 fn=6 0
    EDAC i7core: Registered device 8086:2c31 fn=6 1
    EDAC i7core: Registered device 8086:2c32 fn=6 2
    EDAC i7core: Registered device 8086:2c33 fn=6 3
    EDAC i7core: Driver loaded.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10 11:44:45 -03:00
Mauro Carvalho Chehab
0b2b7b7ec0 i7core_edac: Add more status functions to EDAC driver
This patch were co-authored with Aristeu Rozanski.

Signed-off-by: Aristeu Sergio <arozansk@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10 11:44:45 -03:00
Mauro Carvalho Chehab
194a40feab i7core_edac: Add error insertion code for Nehalem
Implements set_inject_error() with the low-level code needed to inject
memory errors at Nehalem, and adds some sysfs nodes to allow error injection

The next patch will add an API for error injection.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10 11:44:45 -03:00
Mauro Carvalho Chehab
a0c36a1f0f i7core_edac: Add an EDAC memory controller driver for Nehalem chipsets
This driver is meant to support i7 core/i7core extreme desktop
processors and Xeon 35xx/55xx series with integrated memory controller.
It is likely that it can be expanded in the future to work with other
processor series based at the same Memory Controller design.

For now, it has just a few MCH status reads.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10 11:44:45 -03:00
Linus Torvalds
66f41d4c5c Linux 2.6.34-rc6 2010-04-29 20:02:05 -07:00
Linus Torvalds
b18262eda3 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb:
  kgdb: don't needlessly skip PAGE_USER test for Fsl booke
2010-04-29 20:01:42 -07:00
Linus Torvalds
e97e7120eb Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
  xfs: add a shrinker to background inode reclaim
2010-04-29 19:49:34 -07:00
Wufei
56151e7534 kgdb: don't needlessly skip PAGE_USER test for Fsl booke
The bypassing of this test is a leftover from 2.4 vintage
kernels, and is no longer appropriate, or even used by KGDB.
Currently KGDB uses probe_kernel_write() for all access to
memory via the KGDB core, so it can simply be deleted.

This fixes CVE-2010-1446.

CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Paul Mackerras <paulus@samba.org>
CC: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Wufei <fei.wu@windriver.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-04-29 21:41:44 -05:00
Linus Torvalds
fed0a9c644 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  exofs: Fix "add bdi backing to mount session" fall out
  fs: fs/super.c needs to include backing-dev.h for !CONFIG_BLOCK
2010-04-29 17:18:07 -07:00
Linus Torvalds
6bec11921a Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm:
  ARM: 6061/1: PL061 GPIO: Bug fix - setting gpio for HIGH_LEVEL interrupt is not working.
  ARM: 5957/1: ARM: RealView SD/MMC Card detection and write-protect using GPIOLIB
  ARM: 6030/1: KS8695: enable console
  ARM: 6060/1: PL061 GPIO: Setting gpio val after changing direction to OUT.
  ARM: 6059/1: PL061 GPIO: Changing *_irq_chip_data with *_irq_data for real irqs.
  ARM: 6023/1: update bcmring_defconfig to latest version and fix build error
  ARM: fix build error in arch/arm/kernel/process.c
2010-04-29 17:17:35 -07:00
Linus Torvalds
553cbf0a8f Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/ps3: Update ps3_defconfig
  powerpc/ps3: Update platform maintainer
  powerpc/pseries: Flush lazy kernel mappings after unplug operations
  powerpc/numa: Add form 1 NUMA affinity
  powerpc/fsl-booke: Fix CONFIG_RELOCATABLE support on FSL Book-E ppc32
  powerpc: 2.6.34 update of defconfigs for embedded 6xx/7xxx, 8xx, 8xxx
  powerpc/mpc8xxx defconfigs - turn off SYSFS_DEPRECATED
  powerpc/83xx: configure SIL SATA driver in 83xx-wide defconfig
  powerpc/83xx: enable EPOLL syscall in defconfig
  powerpc/83xx: add RTC drivers in 83xx defconfig
  powerpc/fsl-cpm: Configure clock correctly for SCC
  powerpc/fsl_booke: Correct test for MMU_FTR_BIG_PHYS
  powerpc/85xx/86xx: Fix build w/ CONFIG_PCI=n
2010-04-29 17:16:36 -07:00
viresh kumar
db7e1bc479 ARM: 6061/1: PL061 GPIO: Bug fix - setting gpio for HIGH_LEVEL interrupt is not working.
In current implementation of PL061, setting type of irq to HIGH_LEVEL is not
working. This patch fixes this bug.

Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Acked-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-04-29 23:23:49 +01:00
Dave Chinner
9bf729c0af xfs: add a shrinker to background inode reclaim
On low memory boxes or those with highmem, kernel can OOM before the
background reclaims inodes via xfssyncd. Add a shrinker to run inode
reclaim so that it inode reclaim is expedited when memory is low.

This is more complex than it needs to be because the VM folk don't
want a context added to the shrinker infrastructure. Hence we need
to add a global list of XFS mount structures so the shrinker can
traverse them.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-04-29 16:22:13 -05:00
Boaz Harrosh
3c2023dd8e exofs: Fix "add bdi backing to mount session" fall out
The patch: add bdi backing to mount session
	(b3d0ab7e60)

Has a bug in the placement of the bdi member at
struct exofs_sb_info. The layout member must be kept
last.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-04-29 20:35:29 +02:00
Jens Axboe
5477d0face fs: fs/super.c needs to include backing-dev.h for !CONFIG_BLOCK
When CONFIG_BLOCK is set, it ends up getting backing-dev.h included.
But for !CONFIG_BLOCK, it isn't so lucky. The proper thing to do is
include <linux/backing-dev.h> directly from the file it's used from,
so do that.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-04-29 20:33:35 +02:00
Linus Torvalds
27fb8d7b1f Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  nfs: fix memory leak in nfs_get_sb with CONFIG_NFS_V4
  nfs: fix some issues in nfs41_proc_reclaim_complete()
  NFS: Ensure that nfs_wb_page() waits for Pg_writeback to clear
  NFS: Fix an unstable write data integrity race
  nfs: testing for null instead of ERR_PTR()
  NFS: rsize and wsize settings ignored on v4 mounts
  NFSv4: Don't attempt an atomic open if the file is a mountpoint
  SUNRPC: Fix a bug in rpcauth_prune_expired
2010-04-29 10:23:44 -07:00
Arnd Bergmann
f80a0ca6ad pktcdvd: improve BKL and compat_ioctl.c usage
The pktcdvd driver uses proper locking and does not need the BKL in the
ioctl and llseek functions of the character device, so kill both.

Moving the compat_ioctl handling from common code into the driver itself
fixes build problems when CONFIG_BLOCK is disabled.

Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-29 08:44:37 -07:00
Boaz Harrosh
a36fed12a4 exofs: Fix "add bdi backing to mount session" fall out
Commit b3d0ab7e60 ("exofs: add bdi backing
to mount session") has a bug in the placement of the bdi member at
struct exofs_sb_info.  The layout member must be kept last.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-29 07:59:16 -07:00
Linus Torvalds
dfad53d48e Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip:
  x86: Disable large pages on CPUs with Atom erratum AAE44
  x86-64: Clear a 64-bit FS/GS base on fork if selector is nonzero
  x86, mrst: Conditionally register cpu hotplug notifier for apbt
2010-04-28 20:41:55 -07:00
Linus Torvalds
79dba2eaa7 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  x86/PCI: compute Address Space length rather than using _LEN
  x86/PCI: never allocate PCI MMIO resources below BIOS_END
2010-04-28 20:40:17 -07:00
Al Viro
d9e80b7de9 nfs d_revalidate() is too trigger-happy with d_drop()
If dentry found stale happens to be a root of disconnected tree, we
can't d_drop() it; its d_hash is actually part of s_anon and d_drop()
would simply hide it from shrink_dcache_for_umount(), leading to
all sorts of fun, including busy inodes on umount and oopsen after
that.

Bug had been there since at least 2006 (commit c636eb already has it),
so it's definitely -stable fodder.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-28 20:40:03 -07:00
Colin Tuckley
b56ba8aa6f ARM: 5957/1: ARM: RealView SD/MMC Card detection and write-protect using GPIOLIB
The switch to using GPIOLIB broke the sd/mmc card detection on the
RealView development boards if GPIO_PL061 was not selected.
This patch selects GPIO_PL061 if GPIOLIB is selected.
The sense of the return value from mmc_status has also changed
and is corrected.

Signed-off-by: Colin Tuckley <colin.tuckley@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-04-28 22:21:52 +01:00
Linus Torvalds
1d16b0f2f3 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6:
  regulator: fix enabling regulator issue on max8925
2010-04-28 13:37:31 -07:00
Linus Torvalds
032b734d29 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (27 commits)
  sfc: Change falcon_probe_board() to fail for unsupported boards
  sfc: Always close net device at the end of a disabling reset
  sfc: Wait at most 10ms for the MC to finish reading out MAC statistics
  sctp: Fix oops when sending queued ASCONF chunks
  sctp: fix to calc the INIT/INIT-ACK chunk length correctly is set
  sctp: per_cpu variables should be in bh_disabled section
  sctp: fix potential reference of a freed pointer
  sctp: avoid irq lock inversion while call sk->sk_data_ready()
  Revert "tcp: bind() fix when many ports are bound"
  net/usb: add sierra_net.c driver
  cdc_ether: fix autosuspend for mbm devices
  bluetooth: handle l2cap_create_connless_pdu() errors
  gianfar: Wait for both RX and TX to stop
  ipheth: potential null dereferences on error path
  smc91c92_cs: spin_unlock_irqrestore before calling smc_interrupt()
  drivers/usb/net/kaweth.c: add device "Allied Telesyn AT-USB10 USB Ethernet Adapter"
  bnx2: Update version to 2.0.9.
  bnx2: Prevent "scheduling while atomic" warning with cnic, bonding and vlan.
  bnx2: Fix lost MSI-X problem on 5709 NICs.
  cxgb3: Wait longer for control packets on initialization
  ...
2010-04-28 13:37:06 -07:00
Ben Hutchings
e41c11ee0c sfc: Change falcon_probe_board() to fail for unsupported boards
The driver needs specific PHY and board support code for each SFC4000
board; there is no point trying to continue if it is missing.
Currently unsupported boards can trigger an 'oops'.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Cc: stable@kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-28 12:18:27 -07:00
Ben Hutchings
f49a4589e9 sfc: Always close net device at the end of a disabling reset
This fixes a regression introduced by commit
eb9f6744cb "sfc: Implement ethtool
reset operation".

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Cc: stable@kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-28 12:18:26 -07:00
Ben Hutchings
aabc564907 sfc: Wait at most 10ms for the MC to finish reading out MAC statistics
The original code would wait indefinitely if MAC stats DMA failed.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Cc: stable@kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-28 12:18:26 -07:00
Vlad Yasevich
c078669340 sctp: Fix oops when sending queued ASCONF chunks
When we finish processing ASCONF_ACK chunk, we try to send
the next queued ASCONF.  This action runs the sctp state
machine recursively and it's not prepared to do so.

kernel BUG at kernel/timer.c:790!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/module/ipv6/initstate
Modules linked in: sha256_generic sctp libcrc32c ipv6 dm_multipath
uinput 8139too i2c_piix4 8139cp mii i2c_core pcspkr virtio_net joydev
floppy virtio_blk virtio_pci [last unloaded: scsi_wait_scan]

Pid: 0, comm: swapper Not tainted 2.6.34-rc4 #15 /Bochs
EIP: 0060:[<c044a2ef>] EFLAGS: 00010286 CPU: 0
EIP is at add_timer+0xd/0x1b
EAX: cecbab14 EBX: 000000f0 ECX: c0957b1c EDX: 03595cf4
ESI: cecba800 EDI: cf276f00 EBP: c0957aa0 ESP: c0957aa0
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process swapper (pid: 0, ti=c0956000 task=c0988ba0 task.ti=c0956000)
Stack:
 c0957ae0 d1851214 c0ab62e4 c0ab5f26 0500ffff 00000004 00000005 00000004
<0> 00000000 d18694fd 00000004 1666b892 cecba800 cecba800 c0957b14
00000004
<0> c0957b94 d1851b11 ceda8b00 cecba800 cf276f00 00000001 c0957b14
000000d0
Call Trace:
 [<d1851214>] ? sctp_side_effects+0x607/0xdfc [sctp]
 [<d1851b11>] ? sctp_do_sm+0x108/0x159 [sctp]
 [<d1863386>] ? sctp_pname+0x0/0x1d [sctp]
 [<d1861a56>] ? sctp_primitive_ASCONF+0x36/0x3b [sctp]
 [<d185657c>] ? sctp_process_asconf_ack+0x2a4/0x2d3 [sctp]
 [<d184e35c>] ? sctp_sf_do_asconf_ack+0x1dd/0x2b4 [sctp]
 [<d1851ac1>] ? sctp_do_sm+0xb8/0x159 [sctp]
 [<d1863334>] ? sctp_cname+0x0/0x52 [sctp]
 [<d1854377>] ? sctp_assoc_bh_rcv+0xac/0xe1 [sctp]
 [<d1858f0f>] ? sctp_inq_push+0x2d/0x30 [sctp]
 [<d186329d>] ? sctp_rcv+0x797/0x82e [sctp]

Tested-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Yuansong Qiao <ysqiao@research.ait.ie>
Signed-off-by: Shuaijun Zhang <szhang@research.ait.ie>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-28 12:16:34 -07:00
Wei Yongjun
a8170c35e7 sctp: fix to calc the INIT/INIT-ACK chunk length correctly is set
When calculating the INIT/INIT-ACK chunk length, we should not
only account the length of parameters, but also the parameters
zero padding length, such as AUTH HMACS parameter and CHUNKS
parameter. Without the parameters zero padding length we may get
following oops.

skb_over_panic: text:ce2068d2 len:130 put:6 head:cac3fe00 data:cac3fe00 tail:0xcac3fe82 end:0xcac3fe80 dev:<NULL>
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:127!
invalid opcode: 0000 [#2] SMP
last sysfs file: /sys/module/aes_generic/initstate
Modules linked in: authenc ......

Pid: 4102, comm: sctp_darn Tainted: G      D    2.6.34-rc2 #6
EIP: 0060:[<c0607630>] EFLAGS: 00010282 CPU: 0
EIP is at skb_over_panic+0x37/0x3e
EAX: 00000078 EBX: c07c024b ECX: c07c02b9 EDX: cb607b78
ESI: 00000000 EDI: cac3fe7a EBP: 00000002 ESP: cb607b74
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process sctp_darn (pid: 4102, ti=cb607000 task=cabdc990 task.ti=cb607000)
Stack:
 c07c02b9 ce2068d2 00000082 00000006 cac3fe00 cac3fe00 cac3fe82 cac3fe80
<0> c07c024b cac3fe7c cac3fe7a c0608dec ca986e80 ce2068d2 00000006 0000007a
<0> cb8120ca ca986e80 cb812000 00000003 cb8120c4 ce208a25 cb8120ca cadd9400
Call Trace:
 [<ce2068d2>] ? sctp_addto_chunk+0x45/0x85 [sctp]
 [<c0608dec>] ? skb_put+0x2e/0x32
 [<ce2068d2>] ? sctp_addto_chunk+0x45/0x85 [sctp]
 [<ce208a25>] ? sctp_make_init+0x279/0x28c [sctp]
 [<c0686a92>] ? apic_timer_interrupt+0x2a/0x30
 [<ce1fdc0b>] ? sctp_sf_do_prm_asoc+0x2b/0x7b [sctp]
 [<ce202823>] ? sctp_do_sm+0xa0/0x14a [sctp]
 [<ce2133b9>] ? sctp_pname+0x0/0x14 [sctp]
 [<ce211d72>] ? sctp_primitive_ASSOCIATE+0x2b/0x31 [sctp]
 [<ce20f3cf>] ? sctp_sendmsg+0x7a0/0x9eb [sctp]
 [<c064eb1e>] ? inet_sendmsg+0x3b/0x43
 [<c04244b7>] ? task_tick_fair+0x2d/0xd9
 [<c06031e1>] ? sock_sendmsg+0xa7/0xc1
 [<c0416afe>] ? smp_apic_timer_interrupt+0x6b/0x75
 [<c0425123>] ? dequeue_task_fair+0x34/0x19b
 [<c0446abb>] ? sched_clock_local+0x17/0x11e
 [<c052ea87>] ? _copy_from_user+0x2b/0x10c
 [<c060ab3a>] ? verify_iovec+0x3c/0x6a
 [<c06035ca>] ? sys_sendmsg+0x186/0x1e2
 [<c042176b>] ? __wake_up_common+0x34/0x5b
 [<c04240c2>] ? __wake_up+0x2c/0x3b
 [<c057e35c>] ? tty_wakeup+0x43/0x47
 [<c04430f2>] ? remove_wait_queue+0x16/0x24
 [<c0580c94>] ? n_tty_read+0x5b8/0x65e
 [<c042be02>] ? default_wake_function+0x0/0x8
 [<c0604e0e>] ? sys_socketcall+0x17f/0x1cd
 [<c040264c>] ? sysenter_do_call+0x12/0x22
Code: 0f 45 de 53 ff b0 98 00 00 00 ff b0 94 ......
EIP: [<c0607630>] skb_over_panic+0x37/0x3e SS:ESP 0068:cb607b74

To reproduce:

# modprobe sctp
# echo 1 > /proc/sys/net/sctp/addip_enable
# echo 1 > /proc/sys/net/sctp/auth_enable
# sctp_test -H 3ffe:501:ffff💯20c:29ff:fe4d:f37e -P 800 -l
# sctp_darn -H 3ffe:501:ffff💯20c:29ff:fe4d:f37e -P 900 -h 192.168.0.21 -p 800 -I -s -t
sctp_darn ready to send...
3ffe:501:ffff💯20c:29ff:fe4d:f37e:900-192.168.0.21:800 Interactive mode> bindx-add=192.168.0.21
3ffe:501:ffff💯20c:29ff:fe4d:f37e:900-192.168.0.21:800 Interactive mode> bindx-add=192.168.1.21
3ffe:501:ffff💯20c:29ff:fe4d:f37e:900-192.168.0.21:800 Interactive mode> snd=10

------------------------------------------------------------------
eth0 has addresses: 3ffe:501:ffff💯20c:29ff:fe4d:f37e and 192.168.0.21
eth1 has addresses: 192.168.1.21
------------------------------------------------------------------

Reported-by: George Cheimonidis <gchimon@gmail.com>
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-28 12:16:33 -07:00
Vlad Yasevich
81419d862d sctp: per_cpu variables should be in bh_disabled section
Since the change of the atomics to percpu variables, we now
have to disable BH in process context when touching percpu variables.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-28 12:16:33 -07:00
Vlad Yasevich
0c42749cff sctp: fix potential reference of a freed pointer
When sctp attempts to update an assocition, it removes any
addresses that were not in the updated INITs.  However, the loop
may attempt to refrence a transport with address after removing it.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-28 12:16:32 -07:00
Wei Yongjun
561b1733a4 sctp: avoid irq lock inversion while call sk->sk_data_ready()
sk->sk_data_ready() of sctp socket can be called from both BH and non-BH
contexts, but the default sk->sk_data_ready(), sock_def_readable(), can
not be used in this case. Therefore, we have to make a new function
sctp_data_ready() to grab sk->sk_data_ready() with BH disabling.

=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.33-rc6 #129
---------------------------------------------------------
sctp_darn/1517 just changed the state of lock:
 (clock-AF_INET){++.?..}, at: [<c06aab60>] sock_def_readable+0x20/0x80
but this lock took another, SOFTIRQ-unsafe lock in the past:
 (slock-AF_INET){+.-...}

and interrupts could create inverse lock ordering between them.

other info that might help us debug this:
1 lock held by sctp_darn/1517:
 #0:  (sk_lock-AF_INET){+.+.+.}, at: [<cdfe363d>] sctp_sendmsg+0x23d/0xc00 [sctp]

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-28 12:16:31 -07:00