Written by Adrian Sun (asun@darksunrising.com).
Ported to 2.6.x by Tom 'spot' Callaway <tcallawa@redhat.com>.
Further cleaned up and integrated by David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
I have been experimenting with loadable protocol modules, and ran into
several issues with module reference counting.
The first issue was that __module_get failed at the BUG_ON check at
the top of the routine (checking that my module reference count was
not zero) when I created the first socket. When sk_alloc() is called,
my module reference count was still 0. When I looked at why sctp
didn't have this problem, I discovered that sctp creates a control
socket during module init (when the module ref count is not 0), which
keeps the reference count non-zero. This section has been updated to
address the point Stephen raised about checking the return value of
try_module_get().
The next problem arose when my socket init routine returned an error.
This resulted in my module reference count being decremented below 0.
My socket ops->release routine was also being called. The issue here
is that sock_release() calls the ops->release routine and decrements
the ref count if sock->ops is not NULL. Since the socket probably
didn't get correctly initialized, this should not be done, so we will
set sock->ops to NULL because we will not call try_module_get().
While searching for another bug, I also noticed that sys_accept() has
a possibility of doing a module_put() when it did not do an
__module_get so I re-ordered the call to security_socket_accept().
Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Place them on separate cache lines in SMP to lower memory bouncing
between multiple CPU accessing the device.
- One part is mostly used on receive path (including
eth_type_trans()) (poll_list, poll, quota, weight, last_rx,
dev_addr, broadcast)
- One part is mostly used on queue transmit path (qdisc)
(queue_lock, qdisc, qdisc_sleeping, qdisc_list, tx_queue_len)
- One part is mostly used on xmit path (device)
(xmit_lock, xmit_lock_owner, priv, hard_start_xmit, trans_start)
'features' is placed outside of these hot points, in a location that
may be shared by all cpus (because mostly read)
name_hlist is moved close to name[IFNAMSIZ] to speedup __dev_get_by_name()
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use iteration instead of recursion. Fraglists within fraglists
should never occur, so we BUG check this.
Signed-off-by: Daniel Phillips <phillips@istop.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The IB spec defines the field to be 32 bits, not 16 bits.
Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
We ignored umask when creating new queues via mq_open (when creating
with open() on mqueue fs it is ok of course). According to the
specification this a bug. This trivial patch fixes this.
Signed-off-by: Krzysztof Benedyczak <golbi@mat.uni.torun.pl>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix interrupt test handler by adding check for IRQ assertion in
PCI_STATE register in addition to the status block updated bit.
Add test for valid ethernet address in tg3_set_mac_addr().
Add tg3_bus_string() to setup the PCI bus speed/width string for all
PCI/PCIX/PCI Express devices. This is used to print the bus type
during init_one().
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix 5780 PHY related problems:
1. MAC_RX_MODE reset must be done before setting up the MAC_MODE
register on 5705_PLUS chips or the chip will stop receiving after
a while. The MAC_RX_MODE reset is needed to prevent intermittently
losing the first receive packet on serdes chips.
2. Skip MAC loopback test on 5780 because of hardware errata. Normal
traffic including PHY loopback is not affected by the errata.
3. PHY loopback fails intermittently on 5708S and this is fixed by
putting the PHY in loopback mode first before programming the MAC
mode register. A MAC_RX_MODE reset is also added.
4. Return -EINVAL in tg3_nway_reset() if device is in TBI mode. Allow
nway_reset if 5780S is in parallel detect mode.
5. Add missing PHY IDs in KNOWN_PHY_ID() macro.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If we double-add a neighbour entry timer, which should be
impossible but has been reported, dump the current state of
the entry so that we can debug this.
Signed-off-by: David S. Miller <davem@davemloft.net>
When allocating a table for mem-free HCA context, don't assume that
obj_size * nobj is an even multiple of MTHCA_TABLE_CHUNK_SIZE. In
particular, make sure we allocate at least one slot even if the table
is smaller than MTHCA_TABLE_CHUNK_SIZE.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
arm maketools needs include/asm-arm in place in the build tree.
On normal builds it's always there, of course, but on O= it's created
(by generic code) too late - when we get to asm-offset.h.
We used to get away with that by accident - creation of
include/asm-arm/arch symlink creates include/asm-arm and it happened
to go before maketools. However, we did not have such dependency,
so that luck didn't last - now maketools is picked first and we are screwed.
Both the symlink and maketools are prerequisites of the same
target (archprepare). This fix is obvious - make the latter explicitly
depend on the former and be done with that.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We do _not_ need "sparse" in sparse arguments ;-)
What we do need is __BIG_ENDIAN__; right now unconditional, when m32r
starts using CPU_LITTLE_ENDIAN, we'll need to adjust.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Most of these guys are simply not needed (pulled by other stuff
via asm-i386/hardirq.h). One that is not entirely useless is hilarious -
arch/i386/oprofile/nmi_timer_int.c includes linux/irq.h... as a way to
get linux/errno.h
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In order to do it correctly on UltraSPARC-III+ and later we'd
need to add some complicated code to set the TAG access extension
register before loading the TLB.
Since this optimization gives questionable gains, it's best to
just remove it for now instead of adding the fix for Ultra-III+
Signed-off-by: David S. Miller <davem@davemloft.net>
It tries to batch up the tag loads and comparisons, and
then the stores. And this is just complicated instead
of efficient.
Also, make the symbol of the Cheetah version more grepable.
Signed-off-by: David S. Miller <davem@davemloft.net>
Received from Mark Salyzyn from Adaptec.
High Priority Queues have *never* been used in the entire history of the
aac based adapters. Associated with this, aac_insert_entry can be
removed, SavedIrql can be removed & padding variable can be removed.
With the movement of SavedIrql out & replaced with an automatic variable
qflags, the locking can be refined somewhat. The sparse warnings did not
catch the need for byte swapping in the 'dprintk' debugging print
macros, so fixed this up when this code was moved outside of the now
refined locking.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn from Adaptec.
The size of the command packet's scatter gather list maximum size was
miscalculated in the low range leading to the driver initialization
limiting the maximum i/o size that could go to the Adapter. There were
no negative operational side effects resulting from this bad math, only
a subtle limit in performance of the Adapter at the top end of the
range.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn from Adaptec.
In the rare instances where the adapter, or the motherboard, is
misbehaving; driver initialization or shutdown becomes problematic. By
introducing a 3 minute timeout on the first interrupt driven command
during initialization, or the issuance of the adapter shutdown command
during driver unload, we can resolve the lockup problems induced by
common (but rare) hardware misbehaviors.
The timeout during initialization, should it occur, is accompanied by a
message presented to the console and the logs indicating that the user
should inspect and resolve problems with interrupt routing.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch adds some additional error return checking and error return
value propagation during initialization. Also, the deprecation of
pci_module_init with pci_register_driver along with the change in return
values.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn from Adaptec.
Hotplug sniffs the AIFs (events) from the adapter and if a container
change resulting in the device going offline (container zero), online
(container zero completed) or changing capacity (morph) it will take
actions by calling the appropriate API.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Recevied from Mark Salyzyn from Adaptec.
Aif pre-allocation is used to pull the kmalloc outside of the locks.
Applies to the scsi-misc-2.6 git tree.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn from Adaptec.
There are a few adapters that are capable of creating devices with this large
of a capacity, but now that we have the large fib support in, the management
applications will be capable of generating them. The problem is, once they are
created, the driver will not be able to access the devices correctly without
this patch.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
When you've enabled conntrack and NAT as a module (standard case in all
distributions), and you've also enabled the new conntrack netlink
interface, loading ip_conntrack_netlink.ko will auto-load iptable_nat.ko.
This causes a huge performance penalty, since for every packet you iterate
the nat code, even if you don't want it.
This patch splits iptable_nat.ko into the NAT core (ip_nat.ko) and the
iptables frontend (iptable_nat.ko). Threfore, ip_conntrack_netlink.ko will
only pull ip_nat.ko, but not the frontend. ip_nat.ko will "only" allocate
some resources, but not affect runtime performance.
This separation is also a nice step in anticipation of new packet filters
(nf-hipac, ipset, pkttables) being able to use the NAT core.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
These broke existing apps, and the checks are superfluous
as the values being verified aren't even used.
Signed-off-by: David S. Miller <davem@davemloft.net>
> Steps to reproduce:
> 1. Boot Linux, do NOT setup any IPv6 routes
> 2. ip route get 2001::1 (or any unroutable address)
Well caught. We never set rt6i_idev on ip6_null_entry.
This patch should make the problem go away.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
If input message rate from userspace is too high, do not drop them,
but try to deliver using work queue allocation.
Failing there is some kind of congestion control.
It also removes warn_on on this condition, which scares people.
Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
It's on the stack and declared as "unsigned char[]", but pointers
and similar can be in here thus we need to give it an explicit
alignment attribute.
Signed-off-by: Alex Williamson <alex.williamson@hp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Al Viro pointed out that the current IB userspace verbs interface
allows userspace to cause mischief by closing file descriptors before
we're ready, or issuing the same command twice at the same time. This
patch closes those races, and fixes other obvious problems such as a
module reference leak.
Some other interface bogosities will require an ABI change to fix
properly, so I'm deferring those fixes until 2.6.15.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Patch from Vincent Sanders
A recent patch which made IXP4xx mach_desc's depend on config options
had the effect of not building the kernel for several machines it
possibly could be, this patch updates the default config to ensure all
possible machines are built for by default.
Signed-off-by: Vincent Sanders <vince@arm.linux.org.uk>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The loop in mthca_map_cmd() would fill one entry past the end of the
mailbox buffer before calling the firmware command.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
We should use the first word of the clear interrupt register if
the bit we're after is < 32, not < 31.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If we allocate a bunch of doorbell records and then free them, we'll
end up with completely empty pages, which we then free. However, when
we come back to allocate more doorbell pages, we have to reallocate
those empty pages rather than always trying to take a slot that we've
never used. If we don't, we eventually use up every slot and fail to
allocate a doorbell record, even though we have plenty of free space.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
- Added a missing TO_NATIVE call to scripts/mod/file2alias.c:do_pcmcia_entry()
- Add an alignment attribute to struct pcmcia_device_no to solve an alignment
issue seen when cross-compiling on x86 for m68k.
Signed-off-by: Kars de Jong <jongk@linux-m68k.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Christian Zoz reported there are multiple NinjaATA devices all sharing the
second product ID string, but not the first one.
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>