5858 commits

Author SHA1 Message Date
Linus Torvalds
42a2d923cc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) The addition of nftables.  No longer will we need protocol aware
    firewall filtering modules, it can all live in userspace.

    At the core of nftables is a, for lack of a better term, virtual
    machine that executes byte codes to inspect packet or metadata
    (arriving interface index, etc.) and make verdict decisions.

    Besides support for loading packet contents and comparing them, the
    interpreter supports lookups in various datastructures as
    fundamental operations.  For example, sets are supported, and
    therefore one could create a set of whitelist IP address entries
    which have ACCEPT verdicts attached to them, and use the appropriate
    byte codes to do such lookups.

    Since the interpreted code is composed in userspace, userspace can
    optimize it before giving it to the kernel.

    Another major improvement is the capability of atomically updating
    portions of the ruleset.  In the existing netfilter implementation,
    one has to update the entire rule set in order to make a change and
    this is very expensive.

    Userspace tools exist to create nftables rules using existing
    netfilter rule sets, but both kernel implementations will need to
    co-exist for quite some time as we transition from the old to the
    new stuff.

    Kudos to Patrick McHardy, Pablo Neira Ayuso, and others who have
    worked so hard on this.

 2) Daniel Borkmann and Hannes Frederic Sowa made several improvements
    to our pseudo-random number generator, mostly used for things like
    UDP port randomization and netfilter, amongst other things.

    In particular the taus88 generator is updated to taus113, and test
    cases are added.

 3) Support 64-bit rates in HTB and TBF schedulers, from Eric Dumazet
    and Yang Yingliang.

 4) Add support for new 577xx tigon3 chips to tg3 driver, from Nithin
    Sujir.

 5) Fix two fatal flaws in TCP dynamic right sizing, from Eric Dumazet,
    Neal Cardwell, and Yuchung Cheng.

 6) Allow IP_TOS and IP_TTL to be specified in sendmsg() ancillary
    control message data, much like other socket option attributes.
    From Francesco Fusco.

 7) Allow applications to specify a cap on the rate computed
    automatically by the kernel for pacing flows, via a new
    SO_MAX_PACING_RATE socket option.  From Eric Dumazet.

 8) Make the initial autotuned send buffer sizing in TCP more closely
    reflect actual needs, from Eric Dumazet.

 9) Currently early socket demux only happens for TCP sockets, but we
    can do it for connected UDP sockets too.  Implementation from Shawn
    Bohrer.

10) Refactor inet socket demux with the goal of improving hash demux
    performance for listening sockets.  The main goals are being able
    to use RCU lookups on even request sockets, and eliminating the
    listening lock contention.  From Eric Dumazet.

11) The bonding layer has many demuxes in its fast path, and an RCU
    conversion was started back in 3.11; several changes here extend the
    RCU usage to even more locations.  From Ding Tianhong and Wang
    Yufen, based upon suggestions by Nikolay Aleksandrov and Veaceslav
    Falico.

12) Allow stackability of segmentation offloads to, in particular, allow
    segmentation offloading over tunnels.  From Eric Dumazet.

13) Significantly improve the handling of secret keys we input into the
    various hash functions in the inet hashtables, TCP fast open, as
    well as syncookies.  From Hannes Frederic Sowa.  The key fundamental
    operation is "net_get_random_once()" which uses static keys.

    Hannes even extended this to ipv4/ipv6 fragmentation handling and
    our generic flow dissector.

14) The generic driver layer takes care now to set the driver data to
    NULL on device removal, so it's no longer necessary for drivers to
    explicitly set it to NULL any more.  Many drivers have been cleaned
    up in this way, from Jingoo Han.

15) Add a BPF based packet scheduler classifier, from Daniel Borkmann.

16) Improve CRC32 interfaces and generic SKB checksum iterators so that
    SCTP's checksumming can more cleanly be handled.  Also from Daniel
    Borkmann.

17) Add a new PMTU discovery mode, IP_PMTUDISC_INTERFACE, which forces
    using the interface MTU value.  This helps avoid PMTU attacks,
    particularly on DNS servers.  From Hannes Frederic Sowa.

18) Use generic XPS for transmit queue steering rather than internal
    (re-)implementation in virtio-net.  From Jason Wang.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1622 commits)
  random32: add test cases for taus113 implementation
  random32: upgrade taus88 generator to taus113 from errata paper
  random32: move rnd_state to linux/random.h
  random32: add prandom_reseed_late() and call when nonblocking pool becomes initialized
  random32: add periodic reseeding
  random32: fix off-by-one in seeding requirement
  PHY: Add RTL8201CP phy_driver to realtek
  xtsonic: add missing platform_set_drvdata() in xtsonic_probe()
  macmace: add missing platform_set_drvdata() in mace_probe()
  ethernet/arc/arc_emac: add missing platform_set_drvdata() in arc_emac_probe()
  ipv6: protect for_each_sk_fl_rcu in mem_check with rcu_read_lock_bh
  vlan: Implement vlan_dev_get_egress_qos_mask as an inline.
  ixgbe: add warning when max_vfs is out of range.
  igb: Update link modes display in ethtool
  netfilter: push reasm skb through instead of original frag skbs
  ip6_output: fragment outgoing reassembled skb properly
  MAINTAINERS: mv643xx_eth: take over maintainership from Lennart
  net_sched: tbf: support of 64bit rates
  ixgbe: deleting dfwd stations out of order can cause null ptr deref
  ixgbe: fix build err, num_rx_queues is only available with CONFIG_RPS
  ...
2013-11-13 17:40:34 +09:00
Oleg Nesterov
008208c6b2 list: introduce list_next_entry() and list_prev_entry()
Add two trivial helpers list_next_entry() and list_prev_entry(), they
can have a lot of users including list.h itself.  In fact the 1st one is
already defined in events/core.c and bnx2x_sp.c, so the patch simply
moves the definition to list.h.
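
As a rough illustration of what the two helpers do, here is a userspace sketch modeled on the kernel's list macros. The `struct item` type and the `link_three()`, `next_value()` and `prev_value()` functions are hypothetical names invented for this example, and `__typeof__` assumes a GNU-compatible compiler.

```c
#include <stddef.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Recover the enclosing structure from a pointer to its embedded member. */
#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* The two helpers: step to the neighbouring entry of the same type. */
#define list_next_entry(pos, member) \
	list_entry((pos)->member.next, __typeof__(*(pos)), member)
#define list_prev_entry(pos, member) \
	list_entry((pos)->member.prev, __typeof__(*(pos)), member)

/* Hypothetical element type, used only for illustration. */
struct item {
	int value;
	struct list_head node;
};

/* Link three items into a ring a -> b -> c -> a (no separate list head). */
static void link_three(struct item *a, struct item *b, struct item *c)
{
	a->node.next = &b->node; b->node.prev = &a->node;
	b->node.next = &c->node; c->node.prev = &b->node;
	c->node.next = &a->node; a->node.prev = &c->node;
}

/* Value stored in the entry that follows pos. */
static int next_value(struct item *pos)
{
	return list_next_entry(pos, node)->value;
}

/* Value stored in the entry that precedes pos. */
static int prev_value(struct item *pos)
{
	return list_prev_entry(pos, node)->value;
}
```

Real kernel lists usually hang off a separate head node, so callers must still check for the end of the list themselves; the ring here skips that only to keep the sketch short.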

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Eilon Greenstein <eilong@broadcom.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-13 12:09:23 +09:00
Linus Torvalds
10d0c9705e Merge tag 'devicetree-for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull devicetree updates from Rob Herring:
 "DeviceTree updates for 3.13.  This is a bit larger pull request than
  usual for this cycle with lots of clean-up.

   - Cross arch clean-up and consolidation of early DT scanning code.
   - Clean-up and removal of arch prom.h headers.  Makes arch specific
     prom.h optional on all but Sparc.
   - Addition of interrupts-extended property for devices connected to
     multiple interrupt controllers.
   - Refactoring of DT interrupt parsing code in preparation for
     deferred probe of interrupts.
   - ARM cpu and cpu topology bindings documentation.
   - Various DT vendor binding documentation updates"

* tag 'devicetree-for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (82 commits)
  powerpc: add missing explicit OF includes for ppc
  dt/irq: add empty of_irq_count for !OF_IRQ
  dt: disable self-tests for !OF_IRQ
  of: irq: Fix interrupt-map entry matching
  MIPS: Netlogic: replace early_init_devtree() call
  of: Add Panasonic Corporation vendor prefix
  of: Add Chunghwa Picture Tubes Ltd. vendor prefix
  of: Add AU Optronics Corporation vendor prefix
  of/irq: Fix potential buffer overflow
  of/irq: Fix bug in interrupt parsing refactor.
  of: set dma_mask to point to coherent_dma_mask
  of: add vendor prefix for PHYTEC Messtechnik GmbH
  DT: sort vendor-prefixes.txt
  of: Add vendor prefix for Cadence
  of: Add empty for_each_available_child_of_node() macro definition
  arm/versatile: Fix versatile irq specifications.
  of/irq: create interrupts-extended property
  microblaze/pci: Drop PowerPC-ism from irq parsing
  of/irq: Create of_irq_parse_and_map_pci() to consolidate arch code.
  of/irq: Use irq_of_parse_and_map()
  ...
2013-11-12 16:52:17 +09:00
Linus Torvalds
4b4d2b4634 Merge tag 'h8300-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull h8300 platform removal from Guenter Roeck:
 "The patch series has been in -next for more than one release cycle.  I
  did get a number of Acks, and no objections.

  H8/300 has been dead for several years, the kernel for it has not
  compiled for ages, and recent versions of gcc for it are broken.
  Remove support for it"

* tag 'h8300-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  CREDITS: Add Yoshinori Sato for h8300
  fs/minix: Drop dependency on H8300
  Drop remaining references to H8/300 architecture
  Drop MAINTAINERS entry for H8/300
  watchdog: Drop references to H8300 architecture
  net/ethernet: Drop H8/300 Ethernet driver
  net/ethernet: smsc9194: Drop conditional code for H8/300
  ide: Drop H8/300 driver
  Drop support for Renesas H8/300 (h8300) architecture
2013-11-12 14:13:14 +09:00
Wei Yongjun
8724be0e4a xtsonic: add missing platform_set_drvdata() in xtsonic_probe()
Add missing platform_set_drvdata() in xtsonic_probe(), otherwise
calling platform_get_drvdata() in xtsonic_device_remove() may
return NULL.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-11 14:02:08 -05:00
Wei Yongjun
06a2feb9e3 macmace: add missing platform_set_drvdata() in mace_probe()
Add missing platform_set_drvdata() in mace_probe(), otherwise
calling platform_get_drvdata() in mac_mace_device_remove() may
return NULL.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-11 14:02:08 -05:00
Wei Yongjun
45f1b02728 ethernet/arc/arc_emac: add missing platform_set_drvdata() in arc_emac_probe()
Add missing platform_set_drvdata() in arc_emac_probe(), otherwise
calling platform_get_drvdata() in arc_emac_remove() may return NULL.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-11 14:02:07 -05:00
Jacob Keller
170e85430b ixgbe: add warning when max_vfs is out of range.
The max_vfs parameter has a limit of 63 and silently fails (adding 0 vfs) when
it is out of range. This patch adds a warning so that the user knows something
went wrong. Also, this patch moves the warning in ixgbe_enable_sriov() to where
max_vfs is checked, so that even an out of range value will show the deprecated
warning. Previously, an out of range parameter didn't even warn the user to use
the new sysfs interface instead.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-11 00:19:36 -05:00
Carolyn Wyborny
0123713957 igb: Update link modes display in ethtool
This patch fixes multiple problems in the link modes display in ethtool.
Newer parts have more complicated methods to determine actual link
capabilities.  Older parts cannot communicate with their SFP modules.
Finally, all the available defines are not displayed by ethtool.  This
updates the link modes to be as accurate as possible depending on what data
is available to the driver at any given time.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-11 00:19:35 -05:00
John Fastabend
51f3773bde ixgbe: deleting dfwd stations out of order can cause null ptr deref
The number of stations in use is kept in the num_rx_pools counter
in the ixgbe_adapter structure. This is in turn used by the queue
allocation scheme to determine how many queues are needed to support
the number of pools in use with the current feature set.

This works as long as the pools are added and destroyed in order
because (num_rx_pools * queues_per_pool) is equal to the last
queue in use by a pool. But as soon as you delete a pool out of
order this is no longer the case. So the above multiplication
allocates too few queues and a pool may reference a ring that has
not been allocated/initialized.

To resolve this, use the bit mask of in-use pools to determine the final
pool being used and allocate enough queues so that we don't
inadvertently remove its queues.

# ip link add link eth2 \
	numtxqueues 4 numrxqueues 4 txqueuelen 50 type macvlan
# ip link set dev macvlan0 up
# ip link add link eth2 \
	numtxqueues 4 numrxqueues 4 txqueuelen 50 type macvlan
# ip link set dev macvlan1 up
# for i in {0..100}; do
  ip link set dev macvlan0 down; ip link set dev macvlan0 up;
  done;
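
The difference between sizing by pool count and sizing by the highest in-use pool can be sketched in plain C. This is only an illustration of the arithmetic described above, not the driver's code; `queues_by_count()` and `queues_by_mask()` are hypothetical names, and the pool bitmap is assumed to fit in one `unsigned long`.

```c
#include <limits.h>

/* Old sizing: assumes the pools in use are densely packed from index 0. */
static unsigned int queues_by_count(unsigned int num_pools,
				    unsigned int queues_per_pool)
{
	return num_pools * queues_per_pool;
}

/* New sizing: cover every queue up to the highest pool still in use. */
static unsigned int queues_by_mask(unsigned long in_use,
				   unsigned int queues_per_pool)
{
	unsigned int highest = 0; /* 1-based index of the last set bit, 0 if none */
	unsigned int bit;

	for (bit = 0; bit < sizeof(in_use) * CHAR_BIT; bit++)
		if (in_use & (1UL << bit))
			highest = bit + 1;

	return highest * queues_per_pool;
}
```

With three pools of four queues each, deleting the middle pool leaves a count of two (8 queues) while the last pool still refers to queues 8 to 11; sizing from the mask 0x5 keeps all 12.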

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-08 15:21:08 -05:00
John Fastabend
219354d489 ixgbe: fix build err, num_rx_queues is only available with CONFIG_RPS
In the recent support for layer 2 hardware acceleration, I added a
few references to real_num_rx_queues and num_rx_queues which are
only available with CONFIG_RPS.

The fix is first to remove unnecessary references to num_rx_queues.
Because the hardware offload case is limited to cases where RX queues
and TX queues are equal we only need a single check. Then wrap the
single case in an ifdef.

The patch that introduced this is:

commit a6cc0cfa72
Author: John Fastabend <john.r.fastabend@intel.com>
Date:   Wed Nov 6 09:54:46 2013 -0800

    net: Add layer 2 hardware acceleration operations for macvlan devices

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-08 15:21:08 -05:00
Baruch Siach
cdc4ead09d netdev: smc91x: enable for xtensa
Tested in VLAB Works Xtensa simulation.

Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-08 13:27:55 -05:00
Andreas Herrmann
b5ad795e52 net: calxedaxgmac: Fix panic caused by MTU change of active interface
Changing MTU size of an xgmac network interface while it is active can
cause a panic like

  skbuff: skb_over_panic: text:c03bc62c len:1090 put:1090 head:edfb6900 data:edfb6942 tail:0xedfb6d84 end:0xedfb6bc0 dev:eth0
  ------------[ cut here ]------------
  kernel BUG at net/core/skbuff.c:126!
  Internal error: Oops - BUG: 0 [#1] SMP ARM
  Modules linked in:
  CPU: 0 PID: 762 Comm: python Tainted: G        W    3.10.0-00015-g3e33cd7 #309
  task: edcfe000 ti: ed67e000 task.ti: ed67e000
  PC is at skb_panic+0x64/0x70
  LR is at wake_up_klogd+0x5c/0x68

This happens because xgmac_change_mtu modifies dev->mtu before the
network interface is quiesced. And thus there still might be buffers
in use which have a buffer size based on the old MTU.

To fix this I moved the change of dev->mtu after the call to
xgmac_stop.

Another modification is required (in xgmac_stop) to ensure that
xgmac_xmit is really not called anymore (xgmac_tx_complete might wake
up the queue again).

I've tested the fix by switching MTU size every second between 600 and
1500 while network traffic was going on. The test box survived a test
of several hours (until I stopped it), whereas without this fix the above
panic occurs after several minutes (at most).

Change since v1:
- remove call to netif_stop_queue at beginning of xgmac_stop
- use netif_tx_disable instead of locking+netif_stop_queue

Signed-off-by: Andreas Herrmann <andreas.herrmann@calxeda.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:25:53 -05:00
Eugenia Emantayev
163561a4e2 net/mlx4_en: Datapath structures are allocated per NUMA node
For each RX/TX ring and its CQ, allocation is done on a NUMA node that
corresponds to the core that the data structure should operate on.
The assumption is that the core number is reflected by the ring index.
The affected allocations are the ring/CQ data structures,
the TX/RX info and the shared HW/SW buffer.
For TX rings, each core has rings of all UPs.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:22:48 -05:00
Eugenia Emantayev
6e7136ed77 net/mlx4_core: ICM pages are allocated on device NUMA node
This is done to optimize FW/HW access to host memory.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:22:48 -05:00
Eugenia Emantayev
41d942d56c net/mlx4_en: Datapath resources allocated dynamically
Currently all TX/RX rings and completion queues are part of the
netdev priv structure and are allocated statically. This patch
will change the priv to hold only arrays of pointers and therefore
all TX/RX rings and completion queues will be allocated
dynamically. This is in preparation for NUMA aware allocations.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:22:48 -05:00
Rony Efraim
f0f829bf42 net/mlx4_core: Add immediate activate for VGT->VST->VGT
Allow immediate activate of VGT->VST and VST->VGT transitions, without
the need of rebinding in mlx4_master_immediate_activate_vlan_qos().

Also in struct res_qp: add qp parameters (vlan_index,fvl,vlan_cntrol..)
to the saved set, in order to restore them when moving to VGT.
 - Clear at mlx4_RST2INIT_QP_wrapper()
 - Save at mlx4_INIT2RTR_QP_wrapper()
 - Restore at mlx4_vf_immed_vlan_work_handler()

Update mlx4_vf_immed_vlan_work_handler() to support VGT.

Signed-off-by: Rony Efraim <ronye@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:22:47 -05:00
Jack Morgenstein
571b8b92c7 net/mlx4_core: Initialize all mailbox buffers to zero before use
To guarantee that all unused fields in all FW commands for both inboxes
and outboxes are zeroed out, initialize the mailbox buffer to all zeroes.

This is especially important for SRIOV comm-channel virtual commands
(such as QUERY_FUNC_CAP), where if new fields are added to support new
features, the driver can depend on older kernels passing zeroes in these
fields.

In addition to zeroing out the mailbox buffer at allocation time, all
(now unnecessary) calls to memset by the callers of
mlx4_alloc_cmd_mailbox() are removed.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:22:47 -05:00
Eyal Perry
75a353d476 net/mlx4_en: Add RFS support in UDP
Modify RFS code to support applying filters for incoming UDP streams.

Signed-off-by: Eyal Perry <eyalpe@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:22:47 -05:00
John Fastabend
2a47fa45d4 ixgbe: enable l2 forwarding acceleration for macvlans
Now that l2 acceleration ops are in place from the prior patch,
enable ixgbe to take advantage of these operations.  Allow it to
allocate queues for a macvlan so that when we transmit a frame,
we can do the switching in hardware inside the ixgbe card, rather
than in software.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:11:41 -05:00
Amir Vadai
1ec4864b10 net/mlx4_en: Fixed crash when port type is changed
timecounter_init() was called only after first potential
timecounter_read().
Moved mlx4_en_init_timestamp() before mlx4_en_init_netdev()

Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:11:13 -05:00
Ivan Vecera
85aec73d59 tg3: avoid double-freeing of rx data memory
If build_skb fails the memory associated with the ring buffer is freed but
the ri->data member is not zeroed in this case. This causes a double-free
of this memory in tg3_free_rings->... path. The patch moves this block after
setting ri->data to NULL.
It would be nice to fix this bug also in stable >= v3.4 trees.
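
The ordering issue can be shown with a small userspace model. This is a sketch of the general pattern (clear the slot before any error path can free the buffer), not the tg3 code itself; `struct ring_info`, `consume()`, `teardown()` and the `free_calls` counter are stand-ins made up for the example.

```c
#include <stdlib.h>

/* Hypothetical stand-in for one receive ring slot. */
struct ring_info {
	void *data;
};

static int free_calls; /* counts releases so a caller can observe them */

static void release_buf(void *p)
{
	free(p);
	free_calls++;
}

/*
 * Consume one slot. The slot is cleared before the failure check, so the
 * error path and a later teardown can never both free the same buffer.
 */
static int consume(struct ring_info *ri, int build_ok)
{
	void *data = ri->data;

	ri->data = NULL; /* ownership leaves the ring here */
	if (!build_ok) {
		release_buf(data);
		return -1;
	}
	release_buf(data); /* stand-in for handing the buffer up the stack */
	return 0;
}

/* Teardown frees only what the ring still owns. */
static void teardown(struct ring_info *ri)
{
	if (ri->data) {
		release_buf(ri->data);
		ri->data = NULL;
	}
}
```

If the slot were cleared only on the success path, a failed `consume()` followed by `teardown()` would release the same pointer twice.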

Cc: Nithin Nayak Sujir <nsujir@broadcom.com>
Cc: Michael Chan <mchan@broadcom.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:09:44 -05:00
Linus Torvalds
3ae423fe47 Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS updates from Ralf Baechle:
 - Some minor work bringing the Cobalt MIPS platforms in line with other
   MIPS platforms
 - Make vmlinux.32 and vmlinux.64 build messages less verbose
 - Always register the R4k clocksource when selected, the clock source's
   rating will decide if this or another clock source is actually going
   to be used
 - Drop support for the Cisco (formerly Scientific Atlanta) PowerTV
   platform.  There appears to be nobody left who cares and the USB
   driver went stale while waiting for years to be merged
 - Some cleanup of Loongson 2 related #ifdefery
 - Various minor cleanups
 - Major rework on all things related to tracing / ptrace on MIPS,
   including switching the MIPS ELF core dumper to regsets, enabling the
   entries for SIGSYS in struct siginfo for MIPS, enabling ftrace
   syscall trace points
 - Some more work to bring DECstation support code in line with other
   more modern code
 - Report the name of the detected CPU, not just its CP0 PrID value
 - Some more BCM 47xx and atheros ath79xx work
 - Support for compressed kernels using the XZ compression scheme

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (53 commits)
  MIPS: remove duplicate define
  MIPS: Random whitespace clean-ups
  MIPS: traps: Reformat notify_die invocations to 80 columns.
  MIPS: Print correct PC in trace dump after NMI exception
  MIPS: kernel: cpu-probe: Report CPU id during probe
  MIPS: Remove unused defines in piix4.h
  MIPS: Get rid of hard-coded values for Malta PIIX4 fixups
  MIPS: Always register R4K clock when selected
  MIPS: Loongson: Get rid of Loongson 2 #ifdefery all over arch/mips.
  MIPS: cacheops.h: Increase indentation by one tab.
  MIPS: Remove bogus BUG_ON()
  MIPS: PowerTV: Remove support code.
  MIPS: ftrace: Add support for syscall tracepoints.
  MIPS: ptrace: Switch syscall reporting to tracehook_report_syscall_entry().
  MIPS: Move audit_arch() helper function to __syscall_get_arch().
  MIPS: Enable HAVE_ARCH_TRACEHOOK.
  MIPS: Switch ELF core dumper to use regsets.
  MIPS: Implement task_user_regset_view.
  MIPS: ptrace: Use tracehook helpers.
  MIPS: O32 / 32-bit: Always copy 4 stack arguments.
  ...
2013-11-08 08:32:58 +09:00
Rob Herring
b5480950c6 Merge remote-tracking branch 'grant/devicetree/next' into for-next
2013-11-07 10:34:46 -06:00
Duan Jiong
17102f8be4 net:drivers/net: replace IS_ERR and PTR_ERR with PTR_ERR_OR_ZERO
This patch fixes coccinelle error regarding usage of IS_ERR and
PTR_ERR instead of PTR_ERR_OR_ZERO.
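
For readers unfamiliar with the idiom, the following userspace sketch mimics the kernel's error-pointer helpers and shows the pattern being replaced. The helper bodies are simplified re-creations written for this example (assuming `long` is as wide as a pointer), and `check_old()` and `check_new()` are hypothetical callers.

```c
#include <errno.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

/* Decode the errno value carried by an error pointer. */
static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

/* Error pointers occupy the top MAX_ERRNO values of the address space. */
static inline int IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/* Error code if ptr is an error pointer, otherwise 0. */
static inline int PTR_ERR_OR_ZERO(const void *ptr)
{
	return IS_ERR(ptr) ? (int)PTR_ERR(ptr) : 0;
}

/* The open-coded pattern that coccinelle flags. */
static int check_old(const void *res)
{
	if (IS_ERR(res))
		return (int)PTR_ERR(res);
	return 0;
}

/* The equivalent single call. */
static int check_new(const void *res)
{
	return PTR_ERR_OR_ZERO(res);
}
```

Both callers return the same value for every input; the change is purely a simplification.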

Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 03:01:59 -05:00
Duan Jiong
c1fcbaa57a smsc: replace IS_ERR and PTR_ERR with PTR_ERR_OR_ZERO
This patch fixes coccinelle error regarding usage of IS_ERR and
PTR_ERR instead of PTR_ERR_OR_ZERO.

Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 03:01:59 -05:00
Joe Perches
acec6d75ac smsc9420: Use netif_<level>
Use a more standard logging style.

Convert smsc_<level> macros to use netif_<level>.
Remove unused #define PFX
Add pr_fmt and neaten pr_<level> uses.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 02:14:32 -05:00
Joe Perches
f5ba0b0eda jme: Remove unused #define PFX
It's unused, remove it.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 02:14:32 -05:00
Jason Gunthorpe
1cce16d37d net: mv643xx_eth: Add missing phy_addr_set in DT mode
Commit cc9d4598 'net: mv643xx_eth: use of_phy_connect if phy_node
present' made the call to phy_scan optional, if the DT has a link to
the phy node.

However phy_scan has the side effect of calling phy_addr_set, which
writes the phy MDIO address to the ethernet controller. If phy_addr_set
is not called, and the bootloader has not set the correct address then
the driver will fail to function.

Tested on Kirkwood.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Acked-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Tested-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-05 22:07:03 -05:00
Jack Morgenstein
146f3ef4a1 net/mlx4_core: Implement resource quota enforcement
Implements resource quota grant decision when resources are requested,
for the following resources:  QPs, CQs, SRQs, MPTs, MTTs, vlans, MACs,
and Counters.

When granting a resource, the quota system increases the allocated-count
for that slave.

When the slave later frees the resource, its allocated-count is reduced.

A spinlock is used to protect the integrity of each resource's free-pool counter.
(One slave may be in the process of being granted a resource while another
slave has crashed, initiating cleanup of that slave's resource quotas).
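
One plausible reading of the grant and free bookkeeping is sketched below. It is not the mlx4 implementation: `struct slave_quota`, `grant_resource()` and `free_resource()` are hypothetical, the guaranteed-minimum handling follows the description in the earlier patch of this series, and the spinlock is omitted because the sketch is single-threaded.

```c
/* Per-slave bookkeeping for one resource type (hypothetical layout). */
struct slave_quota {
	int quota;      /* most this slave may ever hold */
	int guaranteed; /* units reserved for this slave */
	int allocated;  /* units this slave currently holds */
};

/* Units held beyond the guaranteed minimum for a given allocation level. */
static int beyond_guarantee(const struct slave_quota *q, int allocated)
{
	return allocated > q->guaranteed ? allocated - q->guaranteed : 0;
}

/*
 * Decide whether to grant count units. Returns 1 and updates the counters
 * on success, 0 on refusal. The driver would hold a spinlock around the
 * equivalent update of the shared free-pool counter.
 */
static int grant_resource(struct slave_quota *q, int *free_pool, int count)
{
	int from_free;

	if (count <= 0 || q->allocated + count > q->quota)
		return 0;

	/* Only the part above the guarantee is taken from the free pool. */
	from_free = beyond_guarantee(q, q->allocated + count) -
		    beyond_guarantee(q, q->allocated);
	if (from_free > *free_pool)
		return 0;

	*free_pool -= from_free;
	q->allocated += count;
	return 1;
}

/* Return count units; the caller must not free more than it holds. */
static void free_resource(struct slave_quota *q, int *free_pool, int count)
{
	int before = beyond_guarantee(q, q->allocated);

	q->allocated -= count;
	*free_pool += before - beyond_guarantee(q, q->allocated);
}
```

Cleanup after a crashed slave would amount to calling `free_resource()` for everything that slave still holds, which is why the shared counter needs protection in the real driver.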

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 16:19:08 -05:00
Jack Morgenstein
eb456a68c6 net/mlx4_core: Fix quota handling in the QUERY_FUNC_CAP wrapper
In current kernels, the mlx4 driver running on a VM does not
differentiate between max resource numbers for the HCA and
max quotas -- it simply takes the quota values passed to it
as max-resource values.

However, the driver actually requires the VFs to be aware of
the actual number of resources that the HCA was initialized with,
for QPs, CQs, SRQs and MPTs.

For QPs, CQs and SRQs, the reason is that in completion handling
the driver must know which of the 24 bits are the actual resource
number, and which are "padding" bits.

For MPTs, also, the driver assumes knowledge of the number of MPTs
in the system.

The previous commit fixes the quota logic on the VM for the quota values
passed to it by QUERY_FUNC_CAPS.

For QPs, CQs, SRQs, and MPTs, it takes the max resource numbers
from QUERY_HCA (and not QUERY_FUNC_CAPS).  The quotas passed
in QUERY_FUNC_CAPS are used to report max resource number values
in the response to ib_query_device.

However, the Hypervisor driver must consider that VMs
may be running previous kernels, and compatibility must be preserved.

To resolve the incompatibility with previous kernels running on VMs,
we deprecated the quota fields in mlx4_QUERY_FUNC_CAP.  In the
deprecated fields, we pass the max-resource values from INIT_HCA.

The quota fields are moved to a new location, and the current kernel
driver takes the proper values from that location. There is
also a new flag in dword 0, bit 28 of the mlx4_QUERY_FUNC_CAP mailbox;
if this flag is set, the (VM) driver takes the quota values from the
new location.

VMs running previous kernels will work properly, except that the max resource
numbers reported in ib_query_device for these resources will be
too high.  The Hypervisor driver will, however, enforce the quotas
for these VMs.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 16:19:08 -05:00
Jack Morgenstein
5a0d0a6161 mlx4: Structures and init/teardown for VF resource quotas
This is step #1 for implementing SRIOV resource quotas for VFs.

Quotas are implemented per resource type for VFs and the PF, to prevent
any entity from simply grabbing all the resources for itself and leaving
the other entities unable to obtain such resources.

Resources which are allocated using quotas:  QPs, CQs, SRQs, MPTs, MTTs, MAC,
                                             VLAN, and Counters.

The quota system works as follows:
Each entity (VF or PF) is given a max number of a given resource (its quota),
and a guaranteed minimum number for each resource (starvation prevention).

For QPs, CQs, SRQs, MPTs and MTTs:
50% of the available quantity for the resource is divided equally among
the PF and all the active VFs (i.e., the number of VFs in the mlx4_core module
parameter "num_vfs"). This 50% represents the "guaranteed minimum" pool.
The other 50% is the "free pool", allocated on a first-come-first-served basis.
For each VF/PF, resources are first allocated from its "guaranteed-minimum"
pool. When that pool is exhausted, the driver attempts to allocate from
the resource "free-pool".

The quota (i.e., max) for the VFs and the PF is:
  The free-pool amount (50% of the real max) + the guaranteed minimum
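
The arithmetic above can be sketched as follows. This is an illustration
under stated assumptions, not the resource-tracker code from the patch: the
struct, its field names, and both functions are hypothetical, and locking and
release paths are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping for one resource type (e.g. QPs). */
struct quota_pool {
    uint32_t free_pool;   /* shared, first-come-first-served part */
    uint32_t guaranteed;  /* guaranteed minimum per function (PF or VF) */
    uint32_t quota;       /* per-function max: free pool + guaranteed */
};

/* Split the resource: half guaranteed (divided equally), half free. */
static void quota_pool_init(struct quota_pool *p, uint32_t total,
                            uint32_t num_vfs)
{
    uint32_t functions = num_vfs + 1;        /* the VFs plus the PF */
    uint32_t guaranteed_pool = total / 2;

    p->free_pool = total - guaranteed_pool;
    p->guaranteed = guaranteed_pool / functions;
    p->quota = p->free_pool + p->guaranteed;
}

/*
 * Try to give `count` more units to a function that already holds
 * `*allocated`. The guaranteed share is consumed first; only the part
 * above it is charged to the shared free pool.
 */
static bool quota_grant(struct quota_pool *p, uint32_t *allocated,
                        uint32_t count)
{
    uint32_t new_total, covered, from_free;

    if (*allocated > p->quota || count > p->quota - *allocated)
        return false;                        /* would exceed the quota */

    new_total = *allocated + count;
    covered = *allocated > p->guaranteed ? *allocated : p->guaranteed;
    from_free = new_total > covered ? new_total - covered : 0;
    if (from_free > p->free_pool)
        return false;                        /* free pool exhausted */

    p->free_pool -= from_free;
    *allocated = new_total;
    return true;
}
```

With a total of 1024 and num_vfs = 3, each function gets a guaranteed
minimum of 128 and a quota of 640 under this sketch.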

For MACs:
  Guarantee 2 MACs per VF/PF per port. As a result, since we have only
  128 MACs per port, reduce the allowable number of VFs from 64 to 63.
  Any remaining MACs are put into a free pool.
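
  The MAC numbers can be checked with a short sketch. The constants are
  taken from the text above; the function names are hypothetical and the
  code is only meant to show the arithmetic.

```c
#include <stdint.h>

enum {
    SKETCH_MACS_PER_PORT   = 128, /* from the text above */
    SKETCH_GUARANTEED_MACS = 2    /* per VF/PF per port */
};

/* Largest VF count for which every function (VFs + PF) gets its guarantee. */
static uint32_t sketch_max_vfs(void)
{
    return SKETCH_MACS_PER_PORT / SKETCH_GUARANTEED_MACS - 1;
}

/* MACs left for the per-port free pool, or -1 if the guarantee cannot be met. */
static int32_t sketch_mac_free_pool(uint32_t num_vfs)
{
    uint32_t reserved = (num_vfs + 1) * SKETCH_GUARANTEED_MACS;

    if (reserved > SKETCH_MACS_PER_PORT)
        return -1;
    return (int32_t)(SKETCH_MACS_PER_PORT - reserved);
}
```

  With 64 VFs plus the PF, 130 MACs would be needed per port, which is why
  the limit drops to 63.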

For VLANs:
  For the PF, the per-port quota is 128 and guarantee is 64
     (to allow the PF to register at least a VLAN per VF in VST mode).
  For the VFs, the per-port quota is 64 and the guarantee is 0.
      We assume that VGT VFs are trusted not to abuse the VLAN resource.

For Counters:
  For all functions (PF and VFs), the quota is 128 and the guarantee is 0.

In this patch, we define the needed structures, which are added to the
resource-tracker struct.  In addition, we do initialization
for the resource quota, and adjust the query_device response to use quotas
rather than resource maxima.

As part of the implementation, we introduce a new field in
mlx4_dev: quotas.  This field holds the resource quotas used
to report maxima to the upper layers (ib_core, via query_device).

The HCA maxima of these values are passed to the VFs (via
QUERY_HCA) so that they may continue to use these in handling
QPs, CQs, SRQs and MPTs.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 16:19:07 -05:00
Jack Morgenstein
a30f1bc5c0 net/mlx4_core: Fix checking order in MR table init
In procedure mlx4_init_mr_table(), slaves should do no processing,
but should return success. This initialization is hypervisor-only.

However, the check for num_mpts being a power-of-2 was performed
before the check to return immediately if the driver is for a slave.
This resulted in spurious failures.

This patch reverses the order of the checks, so that if the
driver is for a slave, no processing is done and success is returned.
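
A minimal sketch of the corrected ordering. The function below is a
hypothetical stand-in for mlx4_init_mr_table(), with the slave status and
num_mpts passed as plain arguments; it is not the driver's code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

static bool sketch_is_power_of_two(uint32_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Hypothetical stand-in for the MR table init routine. */
static int sketch_init_mr_table(bool is_slave, uint32_t num_mpts)
{
    /* Slaves do no processing: return success before validating anything. */
    if (is_slave)
        return 0;

    /* Hypervisor only: num_mpts must be a power of 2. */
    if (!sketch_is_power_of_two(num_mpts))
        return -EINVAL;

    /* The hypervisor-only table setup would follow here. */
    return 0;
}
```

With the checks in the old order, a slave whose num_mpts is not a power of 2
would have hit the error return even though the value is irrelevant to it.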

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 16:19:07 -05:00
Jack Morgenstein
2c957ff27d net/mlx4_core: Don't fail reg/unreg vlan for older guests
In upstream kernels under SRIOV, the vlan register/unregister calls
were NOPs (doing nothing and returning OK). We detect these old
calls from guests (via the comm channel), since previously the
port number in mlx4_register_vlan was passed (improperly) in the
out_param. This has been corrected so that the port number is now
passed in bits 8..15 of the in_modifier field.

For old calls, these bits will be zero, so if the passed port
number is zero, we can still look at the out_param field to see
if it contains a valid port number. If yes, the VM is running
an old driver.
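
The detection can be sketched as below. The struct, the function names, and
the port limit of 2 are assumptions made for illustration; only the bit
positions (8..15 of in_modifier) and the out_param fallback come from the
description above.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_PORTS 2  /* assumption: valid ports are 1..2 */

/* Hypothetical result of classifying a reg/unreg vlan call from a guest. */
struct sketch_vlan_call {
    uint8_t port;       /* decoded port number, 0 if none was found */
    bool    old_guest;  /* true if the port came from out_param */
    bool    valid;      /* true if a usable port number was found */
};

static struct sketch_vlan_call sketch_classify(uint32_t in_modifier,
                                               uint64_t out_param)
{
    struct sketch_vlan_call c = { 0, false, false };
    uint8_t port = (in_modifier >> 8) & 0xff;  /* bits 8..15 */

    if (port != 0) {
        /* New-style call: port is carried in the in_modifier field. */
        c.port = port;
        c.valid = port <= SKETCH_MAX_PORTS;
        return c;
    }

    /* Bits 8..15 are zero: check whether out_param holds a valid port. */
    if (out_param >= 1 && out_param <= SKETCH_MAX_PORTS) {
        c.port = (uint8_t)out_param;
        c.old_guest = true;
        c.valid = true;
    }
    return c;
}
```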

Since for old drivers, the register/unregister_vlan wrappers were
NOPs, we continue this policy. The reason is that upstream
had an additional bug in the eth driver running on guests, where
procedure mlx4_en_vlan_rx_kill_vid() had the following code:

if (!mlx4_find_cached_vlan(mdev->dev, priv->port, vid, &idx))
        mlx4_unregister_vlan(mdev->dev, priv->port, idx);
else
        en_err(priv, "could not find vid %d in cache\n", vid);

On a VM, mlx4_find_cached_vlan() will always fail, since the
vlan cache is located on the Hypervisor; on guests it is empty.

Therefore, if we allow upstream guests to register vlans, we will
have vlan leakage since the unregister will never be performed.
Leaving vlan reg/unreg for old guest drivers as a NOP is not a
feature regression, since in upstream the register/unregister
vlan wrapper is a NOP.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 16:19:07 -05:00
Jack Morgenstein
4874080dee net/mlx4_core: Resource tracker for reg/unreg vlans
Add resource tracker support for reg/unreg vlans calls done by VFs.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 16:19:07 -05:00
Jack Morgenstein
2009d0059c net/mlx4_en: Use vlan id instead of vlan index for unregistration
Use of vlan_index created problems unregistering vlans on guests.

In addition, tools delete vlans by tag, not by index, so let's follow that.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 16:19:07 -05:00
Jack Morgenstein
acddd5dd44 net/mlx4_core: Fix reg/unreg vlan/mac to conform to the firmware spec
The functions mlx4_register_vlan, mlx4_unregister_vlan, mlx4_register_mac,
mlx4_unregister_mac all made illegal use of the out_param in multifunc mode
to pass the port number. The firmware spec specifies that the port number
should be passed in bits 8..15 of the input-modifier field for ALLOC_RES and
FREE_RES (sections 20.15.1 and 20.15.2).

For MAC register/unregister, this patch contains workarounds so that guests
running previous kernels continue to work on a new Hypervisor, and guests
running the new kernel will continue to work on old hypervisors.

Vlan registration capability is still not operational in multifunction mode,
since the vlan wrapper functions are not implemented in this patch.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 16:19:06 -05:00
Jack Morgenstein
162226a1dc net/mlx4_core: Fix register/unreg vlan flow
The reg/unreg vlan code was broken:

1. a wrapped function called another wrapped function, causing a deadlock.

2. unregister_vlan called cmd_box instead of cmd_box_imm, leading to
   incorrectly passed parameters.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 16:19:06 -05:00
Sergei Shtylyov
3b4c5cbf42 sh_eth: check platform data pointer
Check the platform data pointer before dereferencing it and error out of the
probe() method if it's NULL.

This has the additional effect of preventing a kernel oops with outdated platform
data containing a zero PHY address instead (such as on SolutionEngine7710).

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 15:49:28 -05:00
Himanshu Madhani
db62d7d96a qlcnic: update version to 5.3.52
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 15:33:19 -05:00
Himanshu Madhani
18afc102fd qlcnic: Enable multiple Tx queue support for 83xx/84xx Series adapters.
o 83xx and 84xx firmware is capable of multiple Tx queues.
  This patch will enable multiple Tx queues for 83xx/84xx
  series adapters. Max number of Tx queues supported will be 8.

Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 15:33:19 -05:00
Himanshu Madhani
34e8c406fd qlcnic: refactor Tx/SDS ring calculation and validation in driver.
o Current driver has duplicate code for validating user input
  for changing Tx/SDS rings using set_channel ethtool interface.
  This patch removes the duplicate code and refactors Tx/SDS ring
  validation for 82xx/83xx/84xx series adapters.
o The refactored code now calculates the maximum Tx/Rx rings the driver can
  support based on the Default, NPAR and SRIOV PF/VF mode of the driver.

Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 15:33:19 -05:00
Himanshu Madhani
f27c75b390 qlcnic: Enhance ethtool Statistics for Multiple Tx queue.
o Enhance ethtool statistics to display multiple Tx queue stats for
  all supported adapters.

Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 15:33:19 -05:00
Sucheta Chakraborty
78ea2d977a qlcnic: Register netdev in FAILED state for 83xx/84xx
o Without failing probe, register netdev when device is in FAILED state.
o Device will come up with minimum functionality and allow diagnostics and
  repair of the adapter.

Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 15:33:19 -05:00
David S. Miller
394efd19d5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/emulex/benet/be.h
	drivers/net/netconsole.c
	net/bridge/br_private.h

Three mostly trivial conflicts.

The net/bridge/br_private.h conflict was a function signature (argument
addition) change overlapping with the extern removals from Joe Perches.

In drivers/net/netconsole.c we had one change adjusting a printk message
whilst another changed "printk(KERN_INFO" into "pr_info(".

Lastly, the emulex change was a new inline function addition overlapping
with Joe Perches's extern removals.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 13:48:30 -05:00
Jack Morgenstein
c32b7dfbb1 net/mlx4_core: Fix call to __mlx4_unregister_mac
In function mlx4_master_deactivate_admin_state() __mlx4_unregister_mac was
called using the MAC index. It should be called with the value of the MAC itself.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 00:51:10 -05:00
Ben Boeckel
4800599397 smsc9420: replace printk with netdev_ calls
Signed-off-by: Ben Boeckel <mathstuf@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-02 01:19:25 -04:00
Ben Boeckel
c501b1f57b smc91c92_cs: replace printk with netdev_ calls
Also snipes some trailing whitespace.

Signed-off-by: Ben Boeckel <mathstuf@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-02 01:19:25 -04:00
Ben Boeckel
2ad02bdc88 smc9194: replace printk with netdev_ calls
Also snipes some whitespace errors.

Signed-off-by: Ben Boeckel <mathstuf@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-02 01:19:25 -04:00
Ben Boeckel
b1a04a62f3 smsc911x: replace printk with netdev_ calls
Signed-off-by: Ben Boeckel <mathstuf@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-02 01:19:24 -04:00