Commit graph

506227 commits

Author SHA1 Message Date
Ben Hutchings
dacc73e0cf sh_eth: Really fix padding of short frames on TX
My previous fix to clear padding of short frames used skb->len as the
DMA length, assuming that skb_padto() extended skb->len to include the
padding.  That isn't the case; we need to use skb_put_padto() instead.

(This wasn't immediately obvious because software padding isn't
actually needed on the R-Car H2.  We could make it conditional on
which chip is being driven, but it's probably not worth the effort.)

Reported-by: "Violeta Menéndez González" <violeta.menendez@codethink.co.uk>
Fixes: 612a17a54b50 ("sh_eth: Fix padding of short frames on TX")
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-02 21:30:56 -05:00
Ben Hutchings
9b4a6364a6 Revert "sh_eth: Enable Rx descriptor word 0 shift for r8a7790"
This reverts commit fd9af07c34.

The hardware manual states that the frame error and multicast bits are
copied to bits 9:0 of RD0, not bits 25:16.  I've tested that this is
true for RFS1 (CRC error), RFS3 (frame too short), RFS4 (frame too
long) and RFS8 (multicast).

Also adjust a comment to agree with this.

Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-02 21:30:56 -05:00
Ben Hutchings
6ded286555 sh_eth: Fix RX recovery on R-Car in case of RX ring underrun
In case of RX ring underrun (RDE), we attempt to reset the software
descriptor pointers (dirty_rx and cur_rx) to match where the hardware
will read the next descriptor from, as that might not be the first
dirty descriptor.  This relies on reading RDFAR, but that register
doesn't exist on all supported chips - specifically, not on the R-Car
chips.  This will result in unpredictable behaviour on those chips
after an RDE.

Make this pointer reset conditional and assume that it isn't needed on
the R-Car chips.  This fix also assumes that RDFAR is never exposed at
offset 0 in the memory map - this is currently true, and a subsequent
commit will fix the ambiguity between offset 0 and no-offset in the
register offset maps.

Fixes: 79fba9f517 ("net: sh_eth: fix the rxdesc pointer when rx ...")
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-02 21:30:56 -05:00
Ben Hutchings
7d7355f58b sh_eth: Ensure proper ordering of descriptor active bit write/read
When submitting a DMA descriptor, the active bit must be written last.
When reading a completed DMA descriptor, the active bit must be read
first.

Add memory barriers to ensure that this ordering is maintained.

Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-02 21:30:56 -05:00
Eduardo Valentin
78045bfe85 Merge branch 'fixes' of github.com:lmajewski/linux-samsung-thermal into work-fixes
Pull samsung thermal fixes from Lukasz Majewski:
"Changes:
- Exynos7 power down detection mode fix
- Fix for cpufreq cooling device regression
- Updating MAINTAINER's entry for Samsung Exynos Thermal"

Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-03-02 22:18:31 -04:00
Peter Zijlstra
c064a0de1b livepatch: fix RCU usage in klp_find_external_symbol()
While one must hold RCU-sched (aka. preempt_disable) for find_symbol()
one must equally hold it over the use of the object returned.

The moment you release the RCU-sched read lock, the object can be dead
and gone.

[jkosina@suse.cz: change subject line to be aligned with other patches]
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Petr Mladek <pmladek@suse.cz>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2015-03-03 00:22:55 +01:00
Trond Myklebust
ec3ca4e57e NFSv4: Ensure we skip delegations that are already being returned
In nfs_client_return_marked_delegations() and nfs_delegation_reap_unclaimed()
we want to optimise the loop traversal by skipping delegations that are
already in the process of being returned.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-03-02 18:09:15 -05:00
Trond Myklebust
9f0f8e12c4 NFSv4: Pin the superblock while we're returning the delegation
This patch ensures that the superblock doesn't go ahead and disappear
underneath us while the state manager thread is returning delegations.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-03-02 18:09:14 -05:00
Trond Myklebust
ade04647dd NFSv4: Ensure we honour NFS_DELEGATION_RETURNING in nfs_inode_set_delegation()
Ensure that nfs_inode_set_delegation() doesn't inadvertently detach a
delegation that is already in the process of being returned.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-03-02 18:09:14 -05:00
Trond Myklebust
b04b22f4ca NFSv4: Ensure that we don't reap a delegation that is being returned
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-03-02 18:09:13 -05:00
Anna Schumaker
369d6b7f00 NFS: Fix stateid used for NFS v4 closes
After 566fcec60 the client uses the "current stateid" from the
nfs4_state structure to close a file.  This could potentially contain a
delegation stateid, which is disallowed by the protocol and causes
servers to return NFS4ERR_BAD_STATEID.  This patch restores the
(correct) behavior of sending the open stateid to close a file.

Reported-by: Olga Kornievskaia <kolga@netapp.com>
Fixes: 566fcec60 (NFSv4: Fix an atomicity problem in CLOSE)
Signed-off-by: Anna Schumaker <Anna.Schumaker@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-03-02 18:06:42 -05:00
Tapasweni Pathak
cfec0e75f5 KVM: MIPS: Enable after disabling interrupt
Enable disabled interrupt, on unsuccessful operation.

Found by Coccinelle.

Signed-off-by: Tapasweni Pathak <tapaswenipathak@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2015-03-02 19:18:12 -03:00
James Hogan
b3cffac04e KVM: MIPS: Fix trace event to save PC directly
Currently the guest exit trace event saves the VCPU pointer to the
structure, and the guest PC is retrieved by dereferencing it when the
event is printed rather than directly from the trace record. This isn't
safe as the printing may occur long afterwards, after the PC has changed
and potentially after the VCPU has been freed. Usually this results in
the same (wrong) PC being printed for multiple trace events. It also
isn't portable as userland has no way to access the VCPU data structure
when interpreting the trace record itself.

Lets save the actual PC in the structure so that the correct value is
accessible later.

Fixes: 669e846e6c ("KVM/MIPS32: MIPS arch specific APIs for KVM")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: <stable@vger.kernel.org> # v3.10+
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2015-03-02 19:17:52 -03:00
Linus Torvalds
023a6007a0 Two GPIO fixes for the v4.0 kernel series:
- Fix a translation problem in of_get_named_gpiod_flags()
 - Fix a long standing container_of() mistake in the TPS65912
   driver.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJU9EyeAAoJEEEQszewGV1z16YP/1/sPyqZpj6f6Z9Q3shAffGY
 chDyxuaf8X7weiRd7vap93BPnnYJeJQkLQQCOEbsGmGsXOxLCIpqv6ShINYsRcnD
 aUnhVt6c9PxxkllfDaBJfKgXOa+M647Uj0Bzfkl2W9zuIJaeyGqUVOu7rvsFmf8f
 44ofuNdHYKHgkFtcdhPthIHC3zhGpDUwKR4OUElgZd89sHLcIEYVT0KQddRY0qE/
 RVb3KaP4FrlEL9vFrXABDsh9UufvN29gybAJSuCe/fgqdLAxTsOIoKktA8xNSXZR
 wWj47pjopRE1/GIJ03ug0boiv0eKwumvUwAn5xlrdJurcIGh0NrHSSF9JPCgMdSK
 48+45k+MmYQPJVQG/n4NRgAUv10KbN+0u/4MViNLYzTQuGkoCriei7/FL5/04TOi
 52xpdJ3Nf0R/ItzpPrmoNRx8vWzt7vg3SLiQi3kzeej9ej1DW+a9OvDeGiImAtKO
 MEx0Q3Nm5VNQ5kjiZaRan8/HK/Yys1fESqYdlbOxAEPRaCh3tl78x1jIN+ulivIn
 myyMyCn3H5y6DEYqORRyw97egqvCjLz6/BqIIuApKNVOy+gpkdmYtpL1GMEOWOJK
 J+w1fx7cnHXBhGAQHKgmqFvHF9L1Bqadd3RlvXk17XDhxM9mRWka4S4E+08/BEtb
 qL7OgdAzI0EPn0WxWBKM
 =5nhV
 -----END PGP SIGNATURE-----

Merge tag 'gpio-v4.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO fixes from Linus Walleij:
 "Two GPIO fixes:

   - Fix a translation problem in of_get_named_gpiod_flags()

   - Fix a long standing container_of() mistake in the TPS65912 driver"

* tag 'gpio-v4.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpio: tps65912: fix wrong container_of arguments
  gpiolib: of: allow of_gpiochip_find_and_xlate to find more than one chip per node
2015-03-02 14:13:39 -08:00
Linus Torvalds
10d6dfc197 Merge branch 'fixes-for-4.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal
Pull thermal management fixes from Eduardo Valentin:
 "Specifics:

   - Several fixes in tmon tool.

   - Fixes in intel int340x for _ART and _TRT tables.

   - Add id for Avoton SoC into powerclamp driver.

   - Fixes in RCAR thermal driver to remove race conditions and fix fail
     path

   - Fixes in TI thermal driver: removal of unnecessary code and build
     fix if !CONFIG_PM_SLEEP

   - Cleanups in exynos thermal driver

   - Add stubs for include/linux/thermal.h.  Now drivers using thermal
     calls but that also work without CONFIG_THERMAL will be able to
     compile for systems that don't care about thermal.

  Note: I am sending this pull on Rui's behalf while he fixes issues in
  his Linux box"

* 'fixes-for-4.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal:
  thermal: int340x_thermal: Ignore missing _ART, _TRT tables
  thermal/intel_powerclamp: add id for Avoton SoC
  tools/thermal: tmon: silence 'set but not used' warnings
  tools/thermal: tmon: use pkg-config to determine library dependencies
  tools/thermal: tmon: support cross-compiling
  tools/thermal: tmon: add .gitignore
  tools/thermal: tmon: fixup tui windowing calculations
  tools/thermal: tmon: tui: don't hard-code dialog window size assumptions
  tools/thermal: tmon: add min/max macros
  tools/thermal: tmon: add --target-temp parameter
  thermal: exynos: Clean-up code to use oneline entry for exynos compatible table
  thermal: rcar: Make error and remove paths symmetrical with init
  thermal: rcar: Fix race condition between init and interrupt
  thermal: Introduce dummy functions when thermal is not defined
  ti-soc-thermal: Delete an unnecessary check before the function call "cpufreq_cooling_unregister"
  thermal: ti-soc-thermal: bandgap: Fix build warning if !CONFIG_PM_SLEEP
2015-03-02 14:08:10 -08:00
Filipe Manana
84471e2429 Btrfs: incremental send, don't rename a directory too soon
There's one more case where we can't issue a rename operation for a
directory as soon as we process it. We used to delay directory renames
only if they have some ancestor directory with a higher inode number
that got renamed too, but there's another case where we need to delay
the rename too - when a directory A is renamed to the old name of a
directory B but that directory B has its rename delayed because it
has now (in the send root) an ancestor with a higher inode number that
was renamed. If we don't delay the directory rename in this case, the
receiving end of the send stream will attempt to rename A to the old
name of B before B got renamed to its new name, which results in a
"directory not empty" error. So fix this by delaying directory renames
for this case too.

Steps to reproduce:

  $ mkfs.btrfs -f /dev/sdb
  $ mount /dev/sdb /mnt

  $ mkdir /mnt/a
  $ mkdir /mnt/b
  $ mkdir /mnt/c
  $ touch /mnt/a/file

  $ btrfs subvolume snapshot -r /mnt /mnt/snap1

  $ mv /mnt/c /mnt/x
  $ mv /mnt/a /mnt/x/y
  $ mv /mnt/b /mnt/a

  $ btrfs subvolume snapshot -r /mnt /mnt/snap2

  $ btrfs send /mnt/snap1 -f /tmp/1.send
  $ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/2.send

  $ mkfs.btrfs -f /dev/sdc
  $ mount /dev/sdc /mnt2
  $ btrfs receive /mnt2 -f /tmp/1.send
  $ btrfs receive /mnt2 -f /tmp/2.send
  ERROR: rename b -> a failed. Directory not empty

A test case for xfstests follows soon.

Reported-by: Ames Cornish <ames@cornishes.net>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
2015-03-02 14:04:45 -08:00
David Sterba
1932b7be97 btrfs: fix lost return value due to variable shadowing
A block-local variable stores error code but btrfs_get_blocks_direct may
not return it in the end as there's a ret defined in the function scope.

CC: <stable@vger.kernel.org>	# 3.6+
Fixes: d187663ef2 ("Btrfs: lock extents as we map them in DIO")
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
2015-03-02 14:04:45 -08:00
Filipe Manana
5cdf83edb8 Btrfs: do not ignore errors from btrfs_lookup_xattr in do_setxattr
The return value from btrfs_lookup_xattr() can be a pointer encoding an
error, therefore deal with it. This fixes commit 5f5bc6b1e2
("Btrfs: make xattr replace operations atomic").

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
2015-03-02 14:04:45 -08:00
Filipe Manana
5dfe2be7ea Btrfs: fix off-by-one logic error in btrfs_realloc_node
The end_slot variable actually matches the number of pointers in the
node and not the last slot (which is 'nritems - 1'). Therefore in order
to check that the current slot in the for loop doesn't match the last
one, the correct logic is to check if 'i' is less than 'end_slot - 1'
and not 'end_slot - 2'.

Fix this and set end_slot to be 'nritems - 1', as it's less confusing
since the variable name implies it's inclusive rather then exclusive.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
2015-03-02 14:04:45 -08:00
Filipe Manana
e8c1c76e80 Btrfs: add missing inode update when punching hole
When punching a file hole if we endup only zeroing parts of a page,
because the start offset isn't a multiple of the sector size or the
start offset and length fall within the same page, we were not updating
the inode item. This prevented an fsync from doing anything, if no other
file changes happened in the current transaction, because the fields
in btrfs_inode used to check if the inode needs to be fsync'ed weren't
updated.

This issue is easy to reproduce and the following excerpt from the
xfstest case I made shows how to trigger it:

  _scratch_mkfs >> $seqres.full 2>&1
  _init_flakey
  _mount_flakey

  # Create our test file.
  $XFS_IO_PROG -f -c "pwrite -S 0x22 -b 16K 0 16K" \
      $SCRATCH_MNT/foo | _filter_xfs_io

  # Fsync the file, this makes btrfs update some btrfs inode specific fields
  # that are used to track if the inode needs to be written/updated to the fsync
  # log or not. After this fsync, the new values for those fields indicate that
  # a subsequent fsync does not need to touch the fsync log.
  $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foo

  # Force a commit of the current transaction. After this point, any operation
  # that modifies the data or metadata of our file, should update those fields in
  # the btrfs inode with values that make the next fsync operation write to the
  # fsync log.
  sync

  # Punch a hole in our file. This small range affects only 1 page.
  # This made the btrfs hole punching implementation write only some zeroes in
  # one page, but it did not update the btrfs inode fields used to determine if
  # the next fsync needs to write to the fsync log.
  $XFS_IO_PROG -c "fpunch 8000 4K" $SCRATCH_MNT/foo

  # Another variation of the previously mentioned case.
  $XFS_IO_PROG -c "fpunch 15000 100" $SCRATCH_MNT/foo

  # Now fsync the file. This was a no-operation because the previous hole punch
  # operation didn't update the inode's fields mentioned before, so they remained
  # with the values they had after the first fsync - that is, they indicate that
  # it is not needed to write to fsync log.
  $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foo

  echo "File content before:"
  od -t x1 $SCRATCH_MNT/foo

  # Simulate a crash/power loss.
  _load_flakey_table $FLAKEY_DROP_WRITES
  _unmount_flakey

  # Enable writes and mount the fs. This makes the fsync log replay code run.
  _load_flakey_table $FLAKEY_ALLOW_WRITES
  _mount_flakey

  # Because the last fsync didn't do anything, here the file content matched what
  # it was after the first fsync, before the holes were punched, and not what it
  # was after the holes were punched.
  echo "File content after:"
  od -t x1 $SCRATCH_MNT/foo

This issue has been around since 2012, when the punch hole implementation
was added, commit 2aaa665581 ("Btrfs: add hole punching").

A test case for xfstests follows soon.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
2015-03-02 14:04:44 -08:00
Josef Bacik
0c0ef4bc84 Btrfs: abort the transaction if we fail to update the free space cache inode
Our gluster boxes were hitting a problem where they'd run out of space when
updating the block group cache and therefore wouldn't be able to update the free
space inode.  This is a problem because this is how we invalidate the cache and
protect ourselves from errors further down the stack, so if this fails we have
to abort the transaction so we make sure we don't end up with stale free space
cache.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
2015-03-02 14:04:44 -08:00
Filipe Manana
4d884fceaa Btrfs: fix fsync race leading to ordered extent memory leaks
We can have multiple fsync operations against the same file during the
same transaction and they can collect the same ordered extents while they
don't complete (still accessible from the inode's ordered tree). If this
happens, those ordered extents will never get their reference counts
decremented to 0, leading to memory leaks and inode leaks (an iput for an
ordered extent's inode is scheduled only when the ordered extent's refcount
drops to 0). The following sequence diagram explains this race:

         CPU 1                                         CPU 2

btrfs_sync_file()

                                                 btrfs_sync_file()

  mutex_lock(inode->i_mutex)
  btrfs_log_inode()
    btrfs_get_logged_extents()
      --> collects ordered extent X
      --> increments ordered
          extent X's refcount
    btrfs_submit_logged_extents()
  mutex_unlock(inode->i_mutex)

                                                   mutex_lock(inode->i_mutex)
  btrfs_sync_log()
     btrfs_wait_logged_extents()
       --> list_del_init(&ordered->log_list)
                                                     btrfs_log_inode()
                                                       btrfs_get_logged_extents()
                                                         --> Adds ordered extent X
                                                             to logged_list because
                                                             at this point:
                                                             list_empty(&ordered->log_list)
                                                             && test_bit(BTRFS_ORDERED_LOGGED,
                                                                         &ordered->flags) == 0
                                                         --> Increments ordered extent
                                                             X's refcount
       --> check if ordered extent's io is
           finished or not, start it if
           necessary and wait for it to finish
       --> sets bit BTRFS_ORDERED_LOGGED
           on ordered extent X's flags
           and adds it to trans->ordered
  btrfs_sync_log() finishes

                                                       btrfs_submit_logged_extents()
                                                     btrfs_log_inode() finishes
                                                   mutex_unlock(inode->i_mutex)

btrfs_sync_file() finishes

                                                   btrfs_sync_log()
                                                      btrfs_wait_logged_extents()
                                                        --> Sees ordered extent X has the
                                                            bit BTRFS_ORDERED_LOGGED set in
                                                            its flags
                                                        --> X's refcount is untouched
                                                   btrfs_sync_log() finishes

                                                 btrfs_sync_file() finishes

btrfs_commit_transaction()
  --> called by transaction kthread for e.g.
  btrfs_wait_pending_ordered()
    --> waits for ordered extent X to
        complete
    --> decrements ordered extent X's
        refcount by 1 only, corresponding
        to the increment done by the fsync
        task ran by CPU 1

In the scenario of the above diagram, after the transaction commit,
the ordered extent will remain with a refcount of 1 forever, leaking
the ordered extent structure and preventing the i_count of its inode
from ever decreasing to 0, since the delayed iput is scheduled only
when the ordered extent's refcount drops to 0, preventing the inode
from ever being evicted by the VFS.

Fix this by using the flag BTRFS_ORDERED_LOGGED differently. Use it to
mean that an ordered extent is already being processed by an fsync call,
which will attach it to the current transaction, preventing it from being
collected by subsequent fsync operations against the same inode.

This race was introduced with the following change (added in 3.19 and
backported to stable 3.18 and 3.17):

  Btrfs: make sure logged extents complete in the current transaction V3
  commit 50d9aa99bd

I ran into this issue while running xfstests/generic/113 in a loop, which
failed about 1 out of 10 runs with the following warning in dmesg:

[ 2612.440038] WARNING: CPU: 4 PID: 22057 at fs/btrfs/disk-io.c:3558 free_fs_root+0x36/0x133 [btrfs]()
[ 2612.442810] Modules linked in: btrfs crc32c_generic xor raid6_pq nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc loop processor parport_pc parport psmouse therma
l_sys i2c_piix4 serio_raw pcspkr evdev microcode button i2c_core ext4 crc16 jbd2 mbcache sd_mod sg sr_mod cdrom virtio_scsi ata_generic virtio_pci ata_piix virtio_ring libata virtio flo
ppy e1000 scsi_mod [last unloaded: btrfs]
[ 2612.452711] CPU: 4 PID: 22057 Comm: umount Tainted: G        W      3.19.0-rc5-btrfs-next-4+ #1
[ 2612.454921] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
[ 2612.457709]  0000000000000009 ffff8801342c3c78 ffffffff8142425e ffff88023ec8f2d8
[ 2612.459829]  0000000000000000 ffff8801342c3cb8 ffffffff81045308 ffff880046460000
[ 2612.461564]  ffffffffa036da56 ffff88003d07b000 ffff880046460000 ffff880046460068
[ 2612.463163] Call Trace:
[ 2612.463719]  [<ffffffff8142425e>] dump_stack+0x4c/0x65
[ 2612.464789]  [<ffffffff81045308>] warn_slowpath_common+0xa1/0xbb
[ 2612.466026]  [<ffffffffa036da56>] ? free_fs_root+0x36/0x133 [btrfs]
[ 2612.467247]  [<ffffffff810453c5>] warn_slowpath_null+0x1a/0x1c
[ 2612.468416]  [<ffffffffa036da56>] free_fs_root+0x36/0x133 [btrfs]
[ 2612.469625]  [<ffffffffa036f2a7>] btrfs_drop_and_free_fs_root+0x93/0x9b [btrfs]
[ 2612.471251]  [<ffffffffa036f353>] btrfs_free_fs_roots+0xa4/0xd6 [btrfs]
[ 2612.472536]  [<ffffffff8142612e>] ? wait_for_completion+0x24/0x26
[ 2612.473742]  [<ffffffffa0370bbc>] close_ctree+0x1f3/0x33c [btrfs]
[ 2612.475477]  [<ffffffff81059d1d>] ? destroy_workqueue+0x148/0x1ba
[ 2612.476695]  [<ffffffffa034e3da>] btrfs_put_super+0x19/0x1b [btrfs]
[ 2612.477911]  [<ffffffff81153e53>] generic_shutdown_super+0x73/0xef
[ 2612.479106]  [<ffffffff811540e2>] kill_anon_super+0x13/0x1e
[ 2612.480226]  [<ffffffffa034e1e3>] btrfs_kill_super+0x17/0x23 [btrfs]
[ 2612.481471]  [<ffffffff81154307>] deactivate_locked_super+0x3b/0x50
[ 2612.482686]  [<ffffffff811547a7>] deactivate_super+0x3f/0x43
[ 2612.483791]  [<ffffffff8116b3ed>] cleanup_mnt+0x59/0x78
[ 2612.484842]  [<ffffffff8116b44c>] __cleanup_mnt+0x12/0x14
[ 2612.485900]  [<ffffffff8105d019>] task_work_run+0x8f/0xbc
[ 2612.486960]  [<ffffffff810028d8>] do_notify_resume+0x5a/0x6b
[ 2612.488083]  [<ffffffff81236e5b>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 2612.489333]  [<ffffffff8142a17f>] int_signal+0x12/0x17
[ 2612.490353] ---[ end trace 54a960a6bdcb8d93 ]---
[ 2612.557253] VFS: Busy inodes after unmount of sdb. Self-destruct in 5 seconds.  Have a nice day...

Kmemleak confirmed the ordered extent leak (and btrfs inode specific
structures such as delayed nodes):

$ cat /sys/kernel/debug/kmemleak
unreferenced object 0xffff880154290db0 (size 576):
  comm "btrfsck", pid 21980, jiffies 4295542503 (age 1273.412s)
  hex dump (first 32 bytes):
    01 40 00 00 01 00 00 00 b0 1d f1 4e 01 88 ff ff  .@.........N....
    00 00 00 00 00 00 00 00 c8 0d 29 54 01 88 ff ff  ..........)T....
  backtrace:
    [<ffffffff8141d74d>] kmemleak_update_trace+0x4c/0x6a
    [<ffffffff8122f2c0>] radix_tree_node_alloc+0x6d/0x83
    [<ffffffff8122fb26>] __radix_tree_create+0x109/0x190
    [<ffffffff8122fbdd>] radix_tree_insert+0x30/0xac
    [<ffffffffa03b9bde>] btrfs_get_or_create_delayed_node+0x130/0x187 [btrfs]
    [<ffffffffa03bb82d>] btrfs_delayed_delete_inode_ref+0x32/0xac [btrfs]
    [<ffffffffa0379dae>] __btrfs_unlink_inode+0xee/0x288 [btrfs]
    [<ffffffffa037c715>] btrfs_unlink_inode+0x1e/0x40 [btrfs]
    [<ffffffffa037c797>] btrfs_unlink+0x60/0x9b [btrfs]
    [<ffffffff8115d7f0>] vfs_unlink+0x9c/0xed
    [<ffffffff8115f5de>] do_unlinkat+0x12c/0x1fa
    [<ffffffff811601a7>] SyS_unlinkat+0x29/0x2b
    [<ffffffff81429e92>] system_call_fastpath+0x12/0x17
    [<ffffffffffffffff>] 0xffffffffffffffff
unreferenced object 0xffff88014ef11db0 (size 576):
  comm "rm", pid 22009, jiffies 4295542593 (age 1273.052s)
  hex dump (first 32 bytes):
    02 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 c8 1d f1 4e 01 88 ff ff  ...........N....
  backtrace:
    [<ffffffff8141d74d>] kmemleak_update_trace+0x4c/0x6a
    [<ffffffff8122f2c0>] radix_tree_node_alloc+0x6d/0x83
    [<ffffffff8122fb26>] __radix_tree_create+0x109/0x190
    [<ffffffff8122fbdd>] radix_tree_insert+0x30/0xac
    [<ffffffffa03b9bde>] btrfs_get_or_create_delayed_node+0x130/0x187 [btrfs]
    [<ffffffffa03bb82d>] btrfs_delayed_delete_inode_ref+0x32/0xac [btrfs]
    [<ffffffffa0379dae>] __btrfs_unlink_inode+0xee/0x288 [btrfs]
    [<ffffffffa037c715>] btrfs_unlink_inode+0x1e/0x40 [btrfs]
    [<ffffffffa037c797>] btrfs_unlink+0x60/0x9b [btrfs]
    [<ffffffff8115d7f0>] vfs_unlink+0x9c/0xed
    [<ffffffff8115f5de>] do_unlinkat+0x12c/0x1fa
    [<ffffffff811601a7>] SyS_unlinkat+0x29/0x2b
    [<ffffffff81429e92>] system_call_fastpath+0x12/0x17
    [<ffffffffffffffff>] 0xffffffffffffffff
unreferenced object 0xffff8800336feda8 (size 584):
  comm "aio-stress", pid 22031, jiffies 4295543006 (age 1271.400s)
  hex dump (first 32 bytes):
    00 40 3e 00 00 00 00 00 00 00 8f 42 00 00 00 00  .@>........B....
    00 00 01 00 00 00 00 00 00 00 01 00 00 00 00 00  ................
  backtrace:
    [<ffffffff8114eb34>] create_object+0x172/0x29a
    [<ffffffff8141d790>] kmemleak_alloc+0x25/0x41
    [<ffffffff81141ae6>] kmemleak_alloc_recursive.constprop.52+0x16/0x18
    [<ffffffff81145288>] kmem_cache_alloc+0xf7/0x198
    [<ffffffffa0389243>] __btrfs_add_ordered_extent+0x43/0x309 [btrfs]
    [<ffffffffa038968b>] btrfs_add_ordered_extent_dio+0x12/0x14 [btrfs]
    [<ffffffffa03810e2>] btrfs_get_blocks_direct+0x3ef/0x571 [btrfs]
    [<ffffffff81181349>] do_blockdev_direct_IO+0x62a/0xb47
    [<ffffffff8118189a>] __blockdev_direct_IO+0x34/0x36
    [<ffffffffa03776e5>] btrfs_direct_IO+0x16a/0x1e8 [btrfs]
    [<ffffffff81100373>] generic_file_direct_write+0xb8/0x12d
    [<ffffffffa038615c>] btrfs_file_write_iter+0x24b/0x42f [btrfs]
    [<ffffffff8118bb0d>] aio_run_iocb+0x2b7/0x32e
    [<ffffffff8118c99a>] do_io_submit+0x26e/0x2ff
    [<ffffffff8118ca3b>] SyS_io_submit+0x10/0x12
    [<ffffffff81429e92>] system_call_fastpath+0x12/0x17

CC: <stable@vger.kernel.org> # 3.19, 3.18 and 3.17
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
2015-03-02 14:04:44 -08:00
Radim Krčmář
f563db4bdb KVM: SVM: fix interrupt injection (apic->isr_count always 0)
In commit b4eef9b36d, we started to use hwapic_isr_update() != NULL
instead of kvm_apic_vid_enabled(vcpu->kvm).  This didn't work because
SVM had it defined and "apicv" path in apic_{set,clear}_isr() does not
change apic->isr_count, because it should always be 1.  The initial
value of apic->isr_count was based on kvm_apic_vid_enabled(vcpu->kvm),
which is always 0 for SVM, so KVM could have injected interrupts when it
shouldn't.

Fix it by implicitly setting SVM's hwapic_isr_update to NULL and make the
initial isr_count depend on hwapic_isr_update() for good measure.

Fixes: b4eef9b36d ("kvm: x86: vmx: NULL out hwapic_isr_update() in case of !enable_apicv")
Reported-and-tested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2015-03-02 19:04:40 -03:00
Linus Torvalds
1a6f77ab08 3 md fixes for 4.0
- fix a read-balance problem that was reported 2 years ago, but
   that I never noticed the report :-(
 - fix for rare RAID6 problem causing incorrect bitmap updates when
   two devices fail.
 - add __ATTR_PREALLOC annotation now that it is possible.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIVAwUAVPOlKznsnt1WYoG5AQJKlw//TXHI4MFB/3Zy0ncbHMEpKwgyuTYD0kCM
 lpsQGowAaqKdUfmXxhtLjSgQXmxpUf/q200EKUr81nV/v+HQTraC91ZmyHNvUPaB
 4+blSoEDqF/spo6rlbEXw6ByWAcaO6w3SVDLDci4rXoQoqzmGPzzjD4zqr485j61
 xRk4cV0zDVpdzp7OX+bR/fCt3A0ELbAXi22E+8U6NXnYwQPb3vIYNydcjQEPEpKk
 nLpQRoinz+XpnidUneuFO2/3Lgax5bsgK3ruxxgTUWrlF2weCD5+3g1S2FQrqZFp
 d+FyEgyv5hGgpg6mqGRvERIrzlwkqdaZAhP0haC82ZhOR5VnZFR2KS+1sACDR3jQ
 0QSR7IX8opTgvZaepNdjRAp2W4/zYnhIceMwgi9TPHWiTTT3xW7KW99kj5DdxiCg
 21i/SHXuTnw//rlNfE663wwtuBnyCEDeTCmjUNBJ0Nset+Cnc4wq6pdvt8Wzxh/a
 rGuTkD9eTQ3oR33hfJD2iUAQKYfvdr2u9zun8TzBwe50zTS+MTd3+k1xYNwcUC8z
 LfUarTLlv59L8anBhNoBzGMhZa62jqqz1Tvj3EI5u/sXbDqtzZhixhoafpsWmBnA
 8h2YyvVU4q3Oxalaqk2gEufscAtD8bAHzbbHKdd9HYLWnyoiWCYydN1QAUWKvfWP
 ycs7YftfNDM=
 =CaGN
 -----END PGP SIGNATURE-----

Merge tag 'md/4.0-fixes' of git://neil.brown.name/md

Pull md fixes from Neil Brown:
 "Three md fixes:

   - fix a read-balance problem that was reported 2 years ago, but that
     I never noticed the report :-(

   - fix for rare RAID6 problem causing incorrect bitmap updates when
     two devices fail.

   - add __ATTR_PREALLOC annotation now that it is possible"

* tag 'md/4.0-fixes' of git://neil.brown.name/md:
  md: mark some attributes as pre-alloc
  raid5: check faulty flag for array status during recovery.
  md/raid1: fix read balance when a drive is write-mostly.
2015-03-02 14:03:27 -08:00
Linus Torvalds
49db1f0ef2 arch/metag fixes for v4.0
This is just a single patch to fix the KSTK_EIP() and KSTK_ESP() macros
 for metag which have always been erronously returning the PC and stack
 pointer of the task's kernel context rather than from its user context
 saved at entry from userland into the kernel, which affects the contents
 of /proc/<pid>/maps and /proc/<pid>/stat.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJU9EnnAAoJEGwLaZPeOHZ67uYQALvIb6R/Webca0hEPuHa95ic
 4mfKzj7iD4DZTBLsuYyq9+NIt2g24mCd5vbDLfH63PpEnRwjdt7y+n4xhRVoQS87
 ZC2aLx3Ry6NC7ByGhEq0n4aTOaKYffe9y4XE8FZddn/9rZUIE2sEdXGtyrg6rsYf
 Eb/e/senPBRPNT8LpurSmYcPsCB2q2yo+0503aj41VjCjPbYe92/QrIDU8Ag3R5y
 c5C0btD9NOcB4xt/vIGU7H0OH85Q+OvLHBzu/5aVFyPelPtIE4xpYP1fRyd/P002
 Jmm6KH52ILMArgqB3KavKMvCebQBwwf92LLUtQ5ZhdeX9TMYzgG22P3CmZcS49Ha
 xwkIgDbeI1BQeMoVgTgVRMDnAOXmF/HdzxlbILHonaptiHDEOj3izdWpfubrIGi7
 9/69L/hF3DY5udt8qBQ4fWDJrvBYQpoqyUEiv/eFfyhFpVaCxKQ0YQgWto3UujWG
 7ESNkNp3kTTlo4NeUh47x1TE0CBiNHAGU+r72Uysb/u9N3Aya8b/jy8x4wCBemLs
 vHL3bfgg7Pee067/O+w9GTQoe7ldzifcSrTGV3s7wpUqKPBGUdq4MtPaXDvJqk/W
 uqnjoH1+/juvBpjwNwoavCXAO5CI6j19kKQH9iCc3v3YizSRtCG4VwfnVd2HgSc4
 LtUrSZkfkwQvBn1oc4mg
 =1+sf
 -----END PGP SIGNATURE-----

Merge tag 'metag-fixes-v4.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag

Pull arch/metag fix from James Hogan:
 "This is just a single patch to fix the KSTK_EIP() and KSTK_ESP()
  macros for metag which have always been erronously returning the PC
  and stack pointer of the task's kernel context rather than from its
  user context saved at entry from userland into the kernel, which
  affects the contents of /proc/<pid>/maps and /proc/<pid>/stat"

* tag 'metag-fixes-v4.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag:
  metag: Fix KSTK_EIP() and KSTK_ESP() macros
2015-03-02 14:02:17 -08:00
Rafael J. Wysocki
dfcacc154f cpuidle: Clean up fallback handling in cpuidle_idle_call()
Move the fallback code path in cpuidle_idle_call() to the end of the
function to avoid jumping to a label in an if () branch.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-03-02 22:25:37 +01:00
David S. Miller
eee617a1c3 Merge branch 'mlx4'
Or Gerlitz says:

====================
Mellanox driver fixes

Two small fixes, please apply to net.

Both patches should go to 3.19.y too.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-02 15:27:23 -05:00
Ido Shamay
1037ebbbd2 net/mlx4_en: Disbale GRO for incoming loopback/selftest packets
Packets which are sent from the selftest (ethtool) flow,
should not be passed to GRO stack but rather dropped by
the driver after validation. To achieve that, we disable
GRO for the duration of the selftest.

Fixes: dd65beac48 ("net/mlx4_en: Extend usage of napi_gro_frags")
Reported-by: Carol Soto <clsoto@linux.vnet.ibm.com>
Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-02 15:27:19 -05:00
Or Gerlitz
f5956fafb0 net/mlx4_core: Fix wrong mask and error flow for the update-qp command
The bit mask for currently supported driver features (MLX4_UPDATE_QP_SUPPORTED_ATTRS)
of the update-qp command was defined twice (using enum value and pre-processor
define directive) and wrong.

The return value of the call to mlx4_update_qp() from within the SRIOV
resource-tracker was wrongly voided down.

Fix both issues.

issue: none
Fixes: 09e05c3f78 ('net/mlx4: Set vlan stripping policy by the right command')
Fixes: ce8d9e0d67 ('net/mlx4_core: Add UPDATE_QP SRIOV wrapper support')
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-02 15:27:19 -05:00
Geert Uytterhoeven
b6d1778bc5 dmaengine: shdma: Move DMA stop to (runtime) suspend callbacks
During system reboot, the sh-dma-engine device may be runtime-suspended,
causing a crash:

    Unhandled fault: imprecise external abort (0x1406) at 0x0002c02c
    Internal error: : 1406 [#1] SMP ARM
    ...
    PC is at sh_dmae_ctl_stop+0x28/0x64
    LR is at sh_dmae_ctl_stop+0x24/0x64

If the sh-dma-engine is runtime-suspended, its module clock is turned
off, and its registers cannot be accessed.

To fix this, move the call to sh_dmae_ctl_stop(), which touches the
DMAOR register, to the sh_dmae_suspend() and sh_dmae_runtime_suspend()
callbacks.  This makes PM operations more symmetric, as both
sh_dmae_resume() and sh_dmae_runtime_resume() already call sh_dmae_rst()
to re-initialize the DMAOR register.

Remove sh_dmae_shutdown(), as it became empty.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2015-03-02 22:10:44 +05:30
Alexandre Belloni
d7a6fe015b ASoC: sam9g20_wm8731: drop machine_is_xxx
Atmel based boards can now only be used with device tree. Drop non DT
initialization.

Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
2015-03-02 16:19:26 +00:00
Ingo Molnar
be482d624c * Fix regression in DMI sysfs code for handling "End of Table" entry
and a type bug that could lead to integer overflow - Ivan Khoronzhuk
 
  * Fix boundary checking in efi_high_alloc() which can lead to memory
    corruption in the EFI boot stubs - Yinghai Lu
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJU9FtlAAoJEC84WcCNIz1VjfsP/jnZPtkSapSsFP9c7AfV/vpg
 i4PLGk+18QhXpNrCVC1U4sdx3y+zefqImrDNEv72BLX6YDb10RvtydxEy4Kg2aaE
 XzCRinHWu3+IEwv4fKAmNj2HORTl+jn79JDZ97jm1PN5sOxVcRG9e3QBg6aTVhHr
 MdTXRMAKHYD+ZX5hrCMrbFXi1dboxVsUb1zwMTbJcmPSVPWToqNKCruSwp29LNfP
 /2ZsJJSHgFP3tobk37JHDTHxjXaN/GUIwQC9cIWUQMPiwU3+WeOvROBPeKUTFNv7
 kS4CXY5Q6eKz+pWYqG+FhbfHM71GTWPyFEJNeLtALg2DSKbgL6lJbtkrPpBVXrcU
 TeHlHnYTlqEpcMqHW3JtrVb0Of0/8X/9YfWjpmdxNcNbbp7KvzTtoBcP8MjGdbIq
 CztyB4clFsiyy1bEoGHFTVArzch5nn7sRCL3mYhTNQaeyN6TZc0wMXOFF/JU7N5a
 GCn9VO6T396L/7WdzG0B/Uo01xw11OS/R0jZVoDvtGfAregO+NU+yLunTEYaRtkC
 prxQ62Bu21EjLKJcdr/toFkEG8sT08XJnGTixRJnJlw+hmsK8WaigBrdpirXT5SV
 TDJJNyo6A/drfjcPoTI4lCR1CpPV3QXjCTmhh+K6tbvX5/npuWN/i4KJh54WuwT4
 BKouS5gjrgYcHH/XJjsQ
 =GJnM
 -----END PGP SIGNATURE-----

Merge tag 'efi-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into x86/urgent

Pull EFI fixes from Matt Fleming:

" - Fix regression in DMI sysfs code for handling "End of Table" entry
    and a type bug that could lead to integer overflow. (Ivan Khoronzhuk)

  - Fix boundary checking in efi_high_alloc() which can lead to memory
    corruption in the EFI boot stubs. (Yinghai Lu)"

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-02 14:18:57 +01:00
Lukasz Majewski
93c537affd MAINTAINERS: Add entry for SAMSUNG THERMAL DRIVER
This patch adds entry for SAMSUNG THERMAL DRIVER in the MAINTAINERS file.
It has been agreed, that pull request are going to be sent to Eduardo
Valentin.

Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
Acked-by: Eduardo Valentin <edubezval@gmail.com>
2015-03-02 10:06:09 +01:00
Lukasz Majewski
0fc83929d0 cpufreq: exynos: Use simple approach to asses if cpu cooling can be used
Commit: e725d26c48 provided possibility to
use device tree to asses if cpu can be used as cooling device. Since the
code was somewhat awkward, simpler approach has been proposed.

Test HW: Exynos 4412 - Odroid U3.

Suggested-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2015-03-02 10:04:52 +01:00
Chanwoo Choi
42b696e808 thermal: exynos: Fix wrong control of power down detection mode for Exynos7
This patch fixes the wrong control of PD_DET_EN (power down detection mode)
for Exynos7 because exynos7_tmu_control() always enables the power down detection
mode regardless 'on' parameter.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Tested-by: Abhilash Kesavan <a.kesavan@samsung.com>
Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
2015-03-02 10:04:51 +01:00
Nicolas PLANEL
aa91def41a USB: ch341: set tty baud speed according to tty struct
The ch341_set_baudrate() function initialize the device baud speed
according to the value on priv->baud_rate. By default the ch341_open() set
it to a hardcoded value (DEFAULT_BAUD_RATE 9600). Unfortunately, the
tty_struct is not initialized with the same default value. (usually 56700)

This means that the tty_struct and the device baud rate generator are not
synchronized after opening the port.

Fixup is done by calling ch341_set_termios() if tty exist.
Remove unnecessary variable priv->baud_rate setup as it's already done by
ch341_port_probe().
Remove unnecessary call to ch341_set_{handshake,baudrate}() in
ch341_open() as there already called in ch341_configure() and
ch341_set_termios()

Signed-off-by: Nicolas PLANEL <nicolas.planel@enovance.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
2015-03-02 09:31:02 +01:00
Trond Myklebust
7c0af9ffb7 NFSv4: Don't call put_rpccred() under the rcu_read_lock()
put_rpccred() can sleep.

Fixes: 8f649c3762 ("NFSv4: Fix the locking in nfs_inode_reclaim_delegation()")
Cc: stable@vger.kernel.org # 2.6.35+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-03-01 23:23:07 -05:00
Trond Myklebust
fa9233699c NFS: Don't require a filehandle to refresh the inode in nfs_prime_dcache()
If the server does not return a valid set of attributes that we can
use to either create a file or refresh the inode, then there is no
value in calling nfs_prime_dcache().

However if we're just refreshing the inode using the attributes that
the server returned, then it shouldn't matter whether or not we have
a filehandle, as long as we check the fsid+fileid combination.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-03-01 23:23:07 -05:00
Trond Myklebust
1ae04b2523 NFSv3: Use the readdir fileid as the mounted-on-fileid
When we call readdirplus, set the fileid normally returned by readdir
as the mounted-on-fileid, since that is commonly the case if there is
a mountpoint. To ensure that we get it right, we only set the flag if
the readdir fileid differs from the one returned in the readdirplus
attributes.

This again means that we can avoid the issues described in commit
2ef47eb1ae ("NFS: Fix use of nfs_attr_use_mounted_on_fileid()"),
which only fixed NFSv4.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-03-01 23:23:07 -05:00
Trond Myklebust
6c441c254e NFS: Don't invalidate a submounted dentry in nfs_prime_dcache()
If we're traversing a directory which contains a submounted filesystem,
or one that has a referral, the NFS server that is processing the READDIR
request will often return information for the underlying (mounted-on)
directory. It may, or may not, also return filehandle information.

If this happens, and the lookup in nfs_prime_dcache() returns the
dentry for the submounted directory, the filehandle comparison will
fail, and we call d_invalidate(). Post-commit 8ed936b567
("vfs: Lazily remove mounts on unlinked files and directories."), this
means the entire subtree is unmounted.

The following minimal patch addresses this problem by punting on
the invalidation if there is a submount.

Kudos to Neil Brown <neilb@suse.de> for having tracked down this
issue (see link).

Reported-by: Nix <nix@esperi.org.uk>
Link: http://lkml.kernel.org/r/87iofju9ht.fsf@spindle.srvr.nix
Cc: stable@vger.kernel.org # 3.18+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-03-01 23:23:06 -05:00
Trond Myklebust
3235b40303 NFSv4: Set a barrier in the update_changeattr() helper
Ensure that we don't regress the changes that were made to the
directory.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
2015-03-01 23:23:06 -05:00
Trond Myklebust
92d64e47b6 NFS: Fix nfs_post_op_update_inode() to set an attribute barrier
nfs_post_op_update_inode() is called after a self-induced attribute
update. Ensure that it also sets the barrier.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
2015-03-01 23:23:06 -05:00
Trond Myklebust
00fb4c9f84 NFS: Remove size hack in nfs_inode_attrs_need_update()
Prior to this patch, we used to always OK attribute updates that extended
the file size on the assumption that we might be performing writeback.
Now that we have attribute barriers to protect the writeback related updates,
we should remove this hack, as it can cause truncate() operations to
apparently be reverted if/when a readahead or getattr RPC call races
with our on-the-wire SETATTR.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
2015-03-01 23:23:06 -05:00
Trond Myklebust
8f8ba1d739 NFSv4: Add attribute update barriers to delegreturn and pNFS layoutcommit
Ensure that other operations that race with delegreturn and layoutcommit
cannot revert the attribute updates that were made on the server.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
2015-03-01 23:23:06 -05:00
Trond Myklebust
a08a8cd375 NFS: Add attribute update barriers to NFS writebacks
Ensure that other operations that race with our write RPC calls
cannot revert the file size updates that were made on the server.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
2015-03-01 23:23:06 -05:00
Trond Myklebust
f506200346 NFS: Set an attribute barrier on all updates
Ensure that we update the attribute barrier even if there were no
invalidations, provided that this value is newer than the old one.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
2015-03-01 23:23:06 -05:00
Trond Myklebust
f044636d97 NFS: Add attribute update barriers to nfs_setattr_update_inode()
Ensure that other operations which raced with our setattr RPC call
cannot revert the file attribute changes that were made on the server.
To do so, we artificially bump the attribute generation counter on
the inode so that all calls to nfs_fattr_init() that precede ours
will be dropped.

The motivation for the patch came from Chuck Lever's reports of readaheads
racing with truncate operations and causing the file size to be reverted.

Reported-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
2015-03-01 23:23:05 -05:00
Trond Myklebust
140e049c64 NFS: Add a helper to set attribute barriers
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
2015-03-01 23:23:05 -05:00
Trond Myklebust
aa5accea40 NFS: Ensure that buffered writes wait for O_DIRECT writes to complete
The O_DIRECT code will grab the inode->i_mutex and flush out buffered
writes, before scheduling a read or a write. However there is no
equivalent in the buffered write code to wait for O_DIRECT to complete.

Fixes a reported issue in xfstests generic/133, when first performing an
O_DIRECT write followed by a buffered write.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
2015-03-01 23:22:40 -05:00
Alexander Usyskin
6c15a8516b mei: make device disabled on stop unconditionally
Set the internal device state to to disabled after hardware reset in stop flow.
This will cover cases when driver was not brought to disabled state because of
an error and in stop flow we wish not to retry the reset.

Cc: <stable@vger.kernel.org> #3.10+
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-01 19:34:50 -08:00