Commit graph

440868 commits

Author SHA1 Message Date
Cornelia Huck
2f32d4ea28 KVM: s390: reinject io interrupt on tpi failure
The tpi instruction should be suppressed on addressing and protection
exceptions, so we need to re-inject the dequeued io interrupt in that
case.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 13:24:45 +02:00
Heiko Carstens
4799b557c9 KVM: s390: convert handle_tpi()
Convert handle_tpi() to new guest access functions.

The code now sets up a structure which is copied with a single call to
guest space instead of issuing several separate guest access calls.
This is necessary since the to be copied data may cross a page boundary.
If a protection exception happens while accessing any of the pages, the
instruction is suppressed and may not have modified any memory contents.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 13:24:44 +02:00
Heiko Carstens
ef23e7790e KVM: s390: convert handle_test_block()
Convert handle_test_block() to new guest access functions.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 13:24:44 +02:00
Heiko Carstens
8b96de0e03 KVM: s390: convert handle_store_cpu_address()
Convert handle_store_cpu_address() to new guest access functions.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 13:24:43 +02:00
Heiko Carstens
f748f4a7ec KVM: s390: convert handle_store_prefix()
Convert handle_store_prefix() to new guest access functions.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 13:24:43 +02:00
Heiko Carstens
0e7a3f9405 KVM: s390: convert handle_set_clock()
Convert handle_set_clock() to new guest access functions.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 13:24:42 +02:00
Heiko Carstens
665170cb47 KVM: s390: convert __sigp_set_prefix()/handle_set_prefix()
Convert __sigp_set_prefix() and handle_set_prefix() to new guest
access functions.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 13:24:42 +02:00
Heiko Carstens
d0bce6054a KVM: s390: convert kvm_s390_store_status_unloaded()
Convert kvm_s390_store_status_unloaded() to new guest access functions.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 13:24:41 +02:00
Heiko Carstens
0040e7d20f KVM: s390: convert handle_prog()
Convert handle_prog() to new guest access functions.
Also make the code a bit more readable and look at the return code
of write_guest_lc() which was missing before.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 13:24:41 +02:00
Heiko Carstens
81480cc19c KVM: s390: convert pfault code
Convert pfault code to new guest access functions.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 13:24:40 +02:00
Heiko Carstens
0f9701c6c2 KVM: s390: convert handle_stfl()
Convert handle_stfl() to new guest access functions.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 13:24:40 +02:00
Jens Freimann
1a03b76422 KVM: s390: convert local irqs in __do_deliver_interrupt()
Convert local irqs in __do_deliver_interrupt() to new guest
access functions.

Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 13:24:40 +02:00
Heiko Carstens
7988276df7 KVM: s390: convert __do_deliver_interrupt()
Convert __do_deliver_interrupt() to new guest access functions.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 13:24:39 +02:00
Heiko Carstens
8a242234b4 KVM: s390: make use of ipte lock
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 13:24:39 +02:00
Heiko Carstens
217a440683 KVM: s390/sclp: correctly set eca siif bit
Check if siif is available before setting.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 13:24:38 +02:00
Heiko Carstens
2293897805 KVM: s390: add architecture compliant guest access functions
The new guest memory access function write_guest() and read_guest() can be
used to access guest memory in an architecture compliant way.
These functions will look at the vcpu's PSW and select the correct address
space for memory access and also perform correct address wrap around.
In case DAT is turned on, page tables will be walked otherwise access will
happen to real or absolute memory.

Any access exception will be recognized and exception data will be stored
in the vcpu's kvm_vcpu_arch.pgm member. Subsequently an exception can be
injected if necessary.

Missing are:
- key protection checks
- access register mode support
- program event recording support

This patch also adds write_guest_real(), read_guest_real(),
write_guest_absolute() and read_guest_absolute() guest functions which can
be used to access real and absolute storage. These functions currently do
not perform any access checks, since there is no use case (yet?).

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 13:24:38 +02:00
Heiko Carstens
d95fb12ff4 KVM: s390: add lowcore access functions
put_guest_lc, read_guest_lc and write_guest_lc are guest access
functions which shall only be used to access the lowcore of a vcpu.
These functions should be used for e.g. interrupt handlers where no
guest memory access protection facilities, like key or low address
protection, are applicable.

At a later point guest vcpu lowcore access should happen via pinned
prefix pages, so that these pages can be accessed directly via the
kernel mapping. All of these *_lc functions can be removed then.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 13:24:37 +02:00
Heiko Carstens
1b0462e574 KVM: s390: add 'pgm' member to kvm_vcpu_arch and helper function
Add a 'struct kvm_s390_pgm_info pgm' member to kvm_vcpu_arch. This
structure will be used if during instruction emulation in the context
of a vcpu exception data needs to be stored somewhere.

Also add a helper function kvm_s390_inject_prog_cond() which can inject
vcpu's last exception if needed.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 13:24:37 +02:00
Heiko Carstens
072c9878ee KVM: s390: add kvm_s390_logical_to_effective() helper
Add kvm_s390_logical_to_effective() helper which converts a guest vcpu's
logical storage address to a guest vcpu effective address by applying the
rules of the vcpu's addressing mode defined by PSW bits 31 and 32
(extendended and basic addressing mode).
Depending on the vcpu's addressing mode the upper 40 bits (24 bit addressing
mode), 33 bits (31 bit addressing mode) or no bits (64 bit addressing mode)
will be zeroed and the remaining bits will be returned.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 13:24:36 +02:00
Heiko Carstens
5f4e87a227 s390/ctl_reg: add union type for control register 0
Add 'union ctlreg0_bits' to easily allow setting and testing bits of
control register 0 bits.
This patch only adds the bits needed for the new guest access functions.
Other bits and control registers can be added when needed.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 13:24:36 +02:00
Heiko Carstens
1365632bde s390/ptrace: add struct psw and accessor function
Introduce a 'struct psw' which makes it easier to decode and test if
certain bits in a psw are set or are not set.
In addition also add a 'psw_bits()' helper define which allows to
directly modify and test a psw_t structure. E.g.

psw_t psw;
psw_bits(psw).t = 1; /* set dat bit */

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 13:24:36 +02:00
Heiko Carstens
280ef0f1f9 KVM: s390: export test_vfacility()
Make test_vfacility() available for other files. This is needed for the
new guest access functions, which test if certain facilities are available
for a guest.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 13:24:35 +02:00
Heiko Carstens
dfeec843fb KVM: add kvm_is_error_gpa() helper
It's quite common (in the s390 guest access code) to test if a guest
physical address points to a valid guest memory area or not.
So add a simple helper function in common code, since this might be
of interest for other architectures as well.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 13:24:35 +02:00
Jens Freimann
bcd846837c KVM: s390: allow injecting every kind of interrupt
Add a new data structure and function that allows to inject
all kinds of interrupt as defined in the PoP

Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 13:24:34 +02:00
Dominik Dingel
4f718eab26 KVM: s390: Exploiting generic userspace interface for cmma
To enable CMMA and to reset its state we use the vm kvm_device ioctls,
encapsulating attributes within the KVM_S390_VM_MEM_CTRL group.

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 13:24:32 +02:00
Dominik Dingel
b31605c12f KVM: s390: make cmma usage conditionally
When userspace reset the guest without notifying kvm, the CMMA state
of the pages might be unused, resulting in guest data corruption.
To avoid this, CMMA must be enabled only if userspace understands
the implications.

CMMA must be enabled before vCPU creation. It can't be switched off
once enabled.  All subsequently created vCPUs will be enabled for
CMMA according to the CMMA state of the VM.

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
[remove now unnecessary calls to page_table_reset_pgste]
2014-04-22 13:24:13 +02:00
Dominik Dingel
f206165620 KVM: s390: Per-vm kvm device controls
We sometimes need to get/set attributes specific to a virtual machine
and so need something else than ONE_REG.

Let's copy the KVM_DEVICE approach, and define the respective ioctls
for the vm file descriptor.

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 13:24:12 +02:00
Jason J. Herne
15f36ebd34 KVM: s390: Add proper dirty bitmap support to S390 kvm.
Replace the kvm_s390_sync_dirty_log() stub with code to construct the KVM
dirty_bitmap from S390 memory change bits.  Also add code to properly clear
the dirty_bitmap size when clearing the bitmap.

Signed-off-by: Jason J. Herne <jjherne@us.ibm.com>
CC: Dominik Dingel <dingel@linux.vnet.ibm.com>
[Dominik Dingel: use gmap_test_and_clear_dirty, locking fixes]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 09:36:28 +02:00
Dominik Dingel
a0bf4f149b KVM: s390/mm: new gmap_test_and_clear_dirty function
For live migration kvm needs to test and clear the dirty bit of guest pages.

That for is ptep_test_and_clear_user_dirty, to be sure we are not racing with
other code, we protect the pte. This needs to be done within
the architecture memory management code.

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 09:36:27 +02:00
Martin Schwidefsky
0a61b222df KVM: s390/mm: use software dirty bit detection for user dirty tracking
Switch the user dirty bit detection used for migration from the hardware
provided host change-bit in the pgste to a fault based detection method.
This reduced the dependency of the host from the storage key to a point
where it becomes possible to enable the RCP bypass for KVM guests.

The fault based dirty detection will only indicate changes caused
by accesses via the guest address space. The hardware based method
can detect all changes, even those caused by I/O or accesses via the
kernel page table. The KVM/qemu code needs to take this into account.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 09:36:26 +02:00
Dominik Dingel
693ffc0802 KVM: s390: Don't enable skeys by default
The first invocation of storage key operations on a given cpu will be intercepted.

On these intercepts we will enable storage keys for the guest and remove the
previously added intercepts.

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 09:36:26 +02:00
Dominik Dingel
934bc131ef KVM: s390: Allow skeys to be enabled for the current process
Introduce a new function s390_enable_skey(), which enables storage key
handling via setting the use_skey flag in the mmu context.

This function is only useful within the context of kvm.

Note that enabling storage keys will cause a one-time hickup when
walking the page table; however, it saves us special effort for cases
like clear reset while making it possible for us to be architecture
conform.

s390_enable_skey() takes the page table lock to prevent reseting
storage keys triggered from multiple vcpus.

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 09:36:25 +02:00
Dominik Dingel
d4cb11340b KVM: s390: Clear storage keys
page_table_reset_pgste() already does a complete page table walk to
reset the pgste. Enhance it to initialize the storage keys to
PAGE_DEFAULT_KEY if requested by the caller. This will be used
for lazy storage key handling. Also provide an empty stub for
!CONFIG_PGSTE

Lets adopt the current code (diag 308) to not clear the keys.

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 09:36:24 +02:00
Dominik Dingel
65eef33550 KVM: s390: Adding skey bit to mmu context
For lazy storage key handling, we need a mechanism to track if the
process ever issued a storage key operation.

This patch adds the basic infrastructure for making the storage
key handling optional, but still leaves it enabled for now by default.

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-04-22 09:36:23 +02:00
Linus Torvalds
0f689a33ad Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 patches from Martin Schwidefsky:
 "An update to the oops output with additional information about the
  crash.  The renameat2 system call is enabled.  Two patches in regard
  to the PTR_ERR_OR_ZERO cleanup.  And a bunch of bug fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/sclp_cmd: replace PTR_RET with PTR_ERR_OR_ZERO
  s390/sclp: replace PTR_RET with PTR_ERR_OR_ZERO
  s390/sclp_vt220: Fix kernel panic due to early terminal input
  s390/compat: fix typo
  s390/uaccess: fix possible register corruption in strnlen_user_srst()
  s390: add 31 bit warning message
  s390: wire up sys_renameat2
  s390: show_registers() should not map user space addresses to kernel symbols
  s390/mm: print control registers and page table walk on crash
  s390/smp: fix smp_stop_cpu() for !CONFIG_SMP
  s390: fix control register update
2014-04-16 11:28:25 -07:00
Linus Torvalds
7d38cc0290 Small workaround for a rare, but annoying, erratum
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJTTsl0AAoJEKurIx+X31iBlZgP/1B01Zsqbq/Oh2JFRwjnLJbW
 FSMDMoAYHbIoB+4tIvQdQf6vGxPz++oBtJq95CeDeez2wh/BPvGFj/LrdBfYYgHo
 uutrB4c6P0WBOQJnfRcXS/MC4frihzV5IaA+VXAXTIBTVKlEi3PGUBM3/WgUA8QS
 Aga73Q0ze3XCjbtGeqjwE3YYzuiOzELeb4MJbNjH3a40ZkXCCmZcerZT8Zp0O5Qi
 hYZr6OTPmmoXrgX4MeBX7kFcwhHe6zb86TvdRAoPIY8fwPJt/zrhQlqrTfoUzRH5
 xyX3y4ByU7HU0CEk0g8AMiiR6EZNmUBo8onei0xVwRP8Z2u884RKQ1cC6D059ars
 isyMO/7lH6Rn+48OOTiXb0e2NOmejr/TRZ3asrvtfvb9uyMGFkG+6g2/pYxbnTb2
 QC8jknY07PhCB9J738RTllAFX6OFRvhk8b3DtrtXC5GTfIiA7HkdkTQ3HISNTgif
 C5WalPhyjh6iCJbtFtdFg0uwEpXgQ5QX5sN08NQOkuDXiI9obftD7pw5EIPiZCTt
 GIIxXh1KDWiktoOHrFAiIm4bh+CAxNRG10WvbZQj3CY+m9GYcI30h/XFmkoil4ds
 IoqVMan42yNK5AwqSa8L9Eq6t5Za1Uic1OgOofX2iYEEqp84aklUI0m5b8woN+/u
 t0BK8ThF4msbaWlJgKRK
 =lloU
 -----END PGP SIGNATURE-----

Merge tag 'please-pull-ia64-erratum' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux

Pull itanium erratum fix from Tony Luck:
 "Small workaround for a rare, but annoying, erratum #237"

* tag 'please-pull-ia64-erratum' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
  [IA64] Change default PSR.ac from '1' to '0' (Fix erratum #237)
2014-04-16 11:22:45 -07:00
Tony Luck
c0b5a64d93 [IA64] Change default PSR.ac from '1' to '0' (Fix erratum #237)
April 2014 Itanium processor specification update:

http://www.intel.com/content/www/us/en/processors/itanium/itanium-specification-update.html

describes this erratum:

=========================================================================
237. Under a complex set of conditions, store to load forwarding for a
sub 8-byte load may complete incorrectly

Problem: A load instruction may complete incorrectly when a code sequence
using 4-byte or smaller load and store operations to the same address
is executed in combination with specific timing of all the following
concurrent conditions: store to load forwarding, alignment checking
enabled, a mis-predicted branch, and complex cache utilization activity.

Implication: The affected sub 8-byte instruction may complete
incorrectly resulting in unpredictable system behavior. There is an
extremely low probability of exposure due to the significant number of
complex microarchitectural concurrent conditions required to encounter
the erratum.

Workaround: Set PSR.ac = 0 to completely avoid the erratum. Disabling
Hyper-Threading will significantly reduce exposure to the conditions
that contribute to encountering the erratum.

Status: See the Summary Table of Changes for the affected steppings.
=========================================================================

[Table of changes essentially lists all models from McKinley to Tukwila]

The PSR.ac bit controls whether the processor will always generate
an unaligned reference trap (0x5a00) for a misaligned data access
(when PSR.ac=1) or if it will let the access succeed when running
on a cpu that implements logic to handle some unaligned accesses.

Way back in 2008 in commit b704882e70
  [IA64] Rationalize kernel mode alignment checking
we made the decision to always enable strict checking. We were
already doing so in trap/interrupt context because the common
preamble code set this bit - but the rest of supervisor code
(and by inheritance user code) ran with PSR.ac=0.

We now reverse that decision and set PSR.ac=0 everywhere in the
kernel (also inherited by user processes). This will avoid the
erratum using the method described in the Itanium specification
update.  Net effect for users is that the processor will handle
unaligned access when it can (typically with a tiny performance
bubble in the pipeline ... but much less invasive than taking a
trap and having the OS perform the access).

Signed-off-by: Tony Luck <tony.luck@intel.com>
2014-04-16 10:20:34 -07:00
Linus Torvalds
10ec34fcb1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix BPF filter validation of netlink attribute accesses, from
    Mathias Kruase.

 2) Netfilter conntrack generation seqcount not initialized properly,
    from Andrey Vagin.

 3) Fix comparison mask computation on big-endian in nft_cmp_fast(),
    from Patrick McHardy.

 4) Properly limit MTU over ipv6, from Eric Dumazet.

 5) Fix seccomp system call argument population on 32-bit, from Daniel
    Borkmann.

 6) skb_network_protocol() should not use hard-coded ETH_HLEN, instead
    skb->mac_len needs to be used.  From Vlad Yasevich.

 7) We have several cases of using socket based communications to
    implement a tunnel.  For example, some tunnels are encapsulations
    over UDP so we use an internal kernel UDP socket to do the
    transmits.

    These tunnels should behave just like other software devices and
    pass the packets on down to the next layer.

    Most importantly we want the top-level socket (eg TCP) that created
    the traffic to be charged for the SKB memory.

    However, once you get into the IP output path, we have code that
    assumed that whatever was attached to skb->sk is an IP socket.

    To keep the top-level socket being charged for the SKB memory,
    whilst satisfying the needs of the IP output path, we now pass in an
    explicit 'sk' argument.

    From Eric Dumazet.

 8) ping_init_sock() leaks group info, from Xiaoming Wang.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (33 commits)
  cxgb4: use the correct max size for firmware flash
  qlcnic: Fix MSI-X initialization code
  ip6_gre: don't allow to remove the fb_tunnel_dev
  ipv4: add a sock pointer to dst->output() path.
  ipv4: add a sock pointer to ip_queue_xmit()
  driver/net: cosa driver uses udelay incorrectly
  at86rf230: fix __at86rf230_read_subreg function
  at86rf230: remove check if AVDD settled
  net: cadence: Add architecture dependencies
  net: Start with correct mac_len in skb_network_protocol
  Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer"
  cxgb4: Save the correct mac addr for hw-loopback connections in the L2T
  net: filter: seccomp: fix wrong decoding of BPF_S_ANC_SECCOMP_LD_W
  seccomp: fix populating a0-a5 syscall args in 32-bit x86 BPF
  qlcnic: Do not disable SR-IOV when VFs are assigned to VMs
  qlcnic: Fix QLogic application/driver interface for virtual NIC configuration
  qlcnic: Fix PVID configuration on eSwitch port.
  qlcnic: Fix max ring count calculation
  qlcnic: Fix to send INIT_NIC_FUNC as first mailbox.
  qlcnic: Fix panic due to uninitialzed delayed_work struct in use.
  ...
2014-04-15 20:30:30 -07:00
Steve Wise
6f1d721037 cxgb4: use the correct max size for firmware flash
The wrong max fw size was being used and causing false
"too big" errors running ethtool -f.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-15 15:50:02 -04:00
Alexander Gordeev
8564ae09e0 qlcnic: Fix MSI-X initialization code
Function qlcnic_setup_tss_rss_intr() might enter endless
loop in case pci_enable_msix() contiguously returns a
positive number of MSI-Xs that could have been allocated.
Besides, the function contains 'err = -EIO;' assignment
that never could be reached. This update fixes the
aforementioned issues.

Cc: Shahed Shaikh <shahed.shaikh@qlogic.com>
Cc: Dept-HSGLinuxNICDev@qlogic.com
Cc: netdev@vger.kernel.org
Cc: linux-pci@vger.kernel.org

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Acked-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-15 15:14:19 -04:00
Nicolas Dichtel
54d63f787b ip6_gre: don't allow to remove the fb_tunnel_dev
It's possible to remove the FB tunnel with the command 'ip link del ip6gre0' but
this is unsafe, the module always supposes that this device exists. For example,
ip6gre_tunnel_lookup() may use it unconditionally.

Let's add a rtnl handler for dellink, which will never remove the FB tunnel (we
let ip6gre_destroy_tunnels() do the job).

Introduced by commit c12b395a46 ("gre: Support GRE over IPv6").

CC: Dmitry Kozlov <xeb@mail.ru>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-15 14:56:19 -04:00
Eric Dumazet
aad88724c9 ipv4: add a sock pointer to dst->output() path.
In the dst->output() path for ipv4, the code assumes the skb it has to
transmit is attached to an inet socket, specifically via
ip_mc_output() : The sk_mc_loop() test triggers a WARN_ON() when the
provider of the packet is an AF_PACKET socket.

The dst->output() method gets an additional 'struct sock *sk'
parameter. This needs a cascade of changes so that this parameter can
be propagated from vxlan to final consumer.

Fixes: 8f646c922d ("vxlan: keep original skb ownership")
Reported-by: lucien xin <lucien.xin@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-15 13:47:15 -04:00
Eric Dumazet
b0270e9101 ipv4: add a sock pointer to ip_queue_xmit()
ip_queue_xmit() assumes the skb it has to transmit is attached to an
inet socket. Commit 31c70d5956 ("l2tp: keep original skb ownership")
changed l2tp to not change skb ownership and thus broke this assumption.

One fix is to add a new 'struct sock *sk' parameter to ip_queue_xmit(),
so that we do not assume skb->sk points to the socket used by l2tp
tunnel.

Fixes: 31c70d5956 ("l2tp: keep original skb ownership")
Reported-by: Zhan Jianyu <nasa4836@gmail.com>
Tested-by: Zhan Jianyu <nasa4836@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-15 12:58:34 -04:00
Li, Zhen-Hua
1dd333f470 driver/net: cosa driver uses udelay incorrectly
In cosa driver, udelay with more than 20000 may cause __bad_udelay.
Use msleep for instead.

Signed-off-by: Li, Zhen-Hua <zhen-hual@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-15 00:08:22 -04:00
Alexander Aring
2168746cfc at86rf230: fix __at86rf230_read_subreg function
The __at86rf230_read_subreg function don't mask and shift register
contents which it should do. This patch adds the necessary masks and
shift operations in this function.

Since we have csma support this can make some trouble on state changes.
Since CSMA support turned on some bits in the TRX_STATUS register that
used to be zero, not masking broke checking of the TRX_STATUS field
after commanding a state change.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Reviewed-by: Werner Almesberger <werner@almesberger.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-15 00:08:22 -04:00
Alexander Aring
bb78864a0c at86rf230: remove check if AVDD settled
The AVDD regulator is only enabled when the RF section is active TX_ON
(PLL_ON) state. Since commit 7dcbd22a97
("ieee802154: ensure that first RF212 state comes from TRX_OFF").
We are in TRX_OFF state at the time at86rf230_hw_init is run.

Note that this test would only fail in case of a severe hardware
malfunction (faulty/shorted power supply, etc.) so it wasn't all that
useful in the first place.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Reviewed-by: Werner Almesberger <werner@almesberger.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-15 00:08:22 -04:00
Jean Delvare
ea05df4e8f net: cadence: Add architecture dependencies
The Cadence ethernet chipsets are only used on specific ARM
architectures. Add Kconfig dependencies so that drivers for these
chipsets are only buildable on the relevant architectures.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-15 00:08:22 -04:00
Linus Torvalds
55101e2d6c Merge git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Marcelo Tosatti:
 - Fix for guest triggerable BUG_ON (CVE-2014-0155)
 - CR4.SMAP support
 - Spurious WARN_ON() fix

* git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: remove WARN_ON from get_kernel_ns()
  KVM: Rename variable smep to cr4_smep
  KVM: expose SMAP feature to guest
  KVM: Disable SMAP for guests in EPT realmode and EPT unpaging mode
  KVM: Add SMAP support when setting CR4
  KVM: Remove SMAP bit from CR4_RESERVED_BITS
  KVM: ioapic: try to recover if pending_eoi goes out of range
  KVM: ioapic: fix assignment of ioapic->rtc_status.pending_eoi (CVE-2014-0155)
2014-04-14 16:21:28 -07:00
Linus Torvalds
dafe344d22 Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull bmc2835 crypto fix from Herbert Xu:
 "This fixes a potential boot crash on bcm2835 due to the recent change
  that now causes hardware RNGs to be accessed on registration"

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  hwrng: bcm2835 - fix oops when rng h/w is accessed during registration
2014-04-14 16:04:14 -07:00
Mikulas Patocka
e79323bd87 user namespace: fix incorrect memory barriers
smp_read_barrier_depends() can be used if there is data dependency between
the readers - i.e. if the read operation after the barrier uses address
that was obtained from the read operation before the barrier.

In this file, there is only control dependency, no data dependecy, so the
use of smp_read_barrier_depends() is incorrect. The code could fail in the
following way:
* the cpu predicts that idx < entries is true and starts executing the
  body of the for loop
* the cpu fetches map->extent[0].first and map->extent[0].count
* the cpu fetches map->nr_extents
* the cpu verifies that idx < extents is true, so it commits the
  instructions in the body of the for loop

The problem is that in this scenario, the cpu read map->extent[0].first
and map->nr_extents in the wrong order. We need a full read memory barrier
to prevent it.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-14 16:03:02 -07:00