android_kernel_motorola_sm6225/mm
Steven Whitehouse 9fe55eea7e Fix race when checking i_size on direct i/o read
So far I've had one ACK for this, and no other comments. So I think it
is probably time to send this via some suitable tree. I'm guessing that
the vfs tree would be the most appropriate route, but not sure that
there is one at the moment (don't see anything recent at kernel.org)
so in that case I think -mm is the "back up plan". Al, please let me
know if you will take this?

Steve.

---------------------

Following on from the "Re: [PATCH v3] vfs: fix a bug when we do some dio
reads with append dio writes" thread on linux-fsdevel, this patch is my
current version of the fix proposed as option (b) in that thread.

Removing the i_size test from the direct i/o read path at vfs level
means that filesystems now have to deal with requests which are beyond
i_size themselves. These I've divided into three sets:

 a) Those with "no op" ->direct_IO (9p, cifs, ceph)
These are obviously not going to be an issue

 b) Those with "home brew" ->direct_IO (nfs, fuse)
I've been told that NFS should not have any problem with the larger
i_size, however I've added an extra test to FUSE to duplicate the
original behaviour just to be on the safe side.

 c) Those using __blockdev_direct_IO()
These call through to ->get_block() which should deal with the EOF
condition correctly. I've verified that with GFS2 and I believe that
Zheng has verified it for ext4. I've also run the test on XFS and it
passes both before and after this change.
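
As a rough illustration of what "deal with requests which are beyond
i_size themselves" means for a filesystem, here is a minimal user-space
sketch. It is not kernel code and not taken from the patch: struct
toy_inode and toy_direct_read() are hypothetical names invented for this
example. The function samples i_size once inside the "filesystem" and
clamps the read there, rather than relying on the caller to have
filtered out offsets past EOF.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical user-space stand-in for an inode and its backing storage. */
struct toy_inode {
    off_t i_size;        /* current end of file */
    const char *blocks;  /* backing data, at least i_size bytes long */
};

/*
 * Toy model of a direct I/O read handler. The caller does not filter
 * out offsets at or beyond i_size, so the handler checks for EOF itself.
 * Returns the number of bytes copied, 0 at EOF, or -1 for a bad offset.
 */
static ssize_t toy_direct_read(const struct toy_inode *inode, char *buf,
                               size_t count, off_t offset)
{
    off_t i_size = inode->i_size;  /* sampled once, inside the handler */

    if (offset < 0)
        return -1;                 /* invalid request */
    if (offset >= i_size)
        return 0;                  /* starts at or beyond EOF: nothing to read */
    if (count > (size_t)(i_size - offset))
        count = (size_t)(i_size - offset);  /* truncate the read at EOF */

    memcpy(buf, inode->blocks + offset, count);
    return (ssize_t)count;
}
```

With a 5-byte file, a 16-byte read at offset 0 is shortened to 5 bytes,
and a read starting at or past offset 5 returns 0 instead of touching
storage beyond EOF.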

The part of the patch in filemap.c looks a lot larger than it really is
- there are only two lines of real change. The rest is just indentation
of the contained code.

There remains a test of i_size though, which was added for btrfs. It
doesn't cause the other filesystems a problem as the test is performed
after ->direct_IO has been called. It is possible that there is a race
that does matter to btrfs, however this patch doesn't change that, so
it's still an overall improvement.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Reported-by: Zheng Liu <gnehzuil.liu@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Dave Chinner <david@fromorbit.com>
Acked-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-01-26 08:26:42 -05:00
backing-dev.c mm/backing-dev.c: check user buffer length before copying data to the related user buffer 2013-09-11 15:58:03 -07:00
balloon_compaction.c
bootmem.c mm/bootmem.c: remove unused local `map' 2013-11-13 12:09:09 +09:00
bounce.c mm/bounce.c: fix a regression where MS_SNAP_STABLE (stable pages snapshotting) was ignored 2013-09-30 14:31:02 -07:00
cleancache.c
compaction.c mm/compaction: respect ignore_skip_hint in update_pageblock_skip 2013-12-18 19:04:52 -08:00
debug-pagealloc.c
dmapool.c
fadvise.c
failslab.c
filemap.c Fix race when checking i_size on direct i/o read 2014-01-26 08:26:42 -05:00
filemap_xip.c seqcount: Add lockdep functionality to seqcount/seqlock structures 2013-11-06 12:40:26 +01:00
fremap.c mm: fix use-after-free in sys_remap_file_pages 2014-01-02 14:40:30 -08:00
frontswap.c
highmem.c
huge_memory.c thp: fix copy_page_rep GPF by testing is_huge_zero_pmd once only 2014-01-12 16:47:15 +07:00
hugetlb.c mm: hugetlbfs: fix hugetlbfs optimization 2013-11-21 16:42:27 -08:00
hugetlb_cgroup.c
hwpoison-inject.c mm/hwpoison: fix the lack of one reference count against poisoned page 2013-09-30 14:31:03 -07:00
init-mm.c
internal.h mm: vmscan: fix do_try_to_free_pages() livelock 2013-09-11 15:58:01 -07:00
interval_tree.c
Kconfig mm: add missing dependency in Kconfig 2013-12-18 19:04:52 -08:00
Kconfig.debug
kmemcheck.c
kmemleak-test.c
kmemleak.c mm: kmemleak: avoid false negatives on vmalloc'ed objects 2013-11-13 12:09:07 +09:00
ksm.c ksm: remove redundant __GFP_ZERO from kcalloc 2013-11-13 12:09:02 +09:00
list_lru.c mm: list_lru: fix almost infinite loop causing effective livelock 2013-10-30 12:57:46 -07:00
maccess.c
madvise.c mm/hwpoison: fix traversal of hugetlbfs pages to avoid printk flood 2013-09-30 14:31:02 -07:00
Makefile list: add a new LRU list type 2013-09-10 18:56:30 -04:00
memblock.c mm/memblock.c: introduce bottom-up allocation mode 2013-11-13 12:09:08 +09:00
memcontrol.c memcg: fix memcg_size() calculation 2014-01-02 14:40:30 -08:00
memory-failure.c mm/memory-failure.c: transfer page count from head page to tail page after split thp 2014-01-02 14:40:30 -08:00
memory.c mm: fix build of split ptlock code 2013-12-20 15:41:27 -08:00
memory_hotplug.c mem-hotplug: introduce movable_node boot option 2013-11-13 12:09:09 +09:00
mempolicy.c mm/mempolicy: fix !vma in new_vma_page() 2013-12-18 19:04:52 -08:00
mempool.c mm/mempool.c: convert kmalloc_node(...GFP_ZERO...) to kzalloc_node(...) 2013-09-11 15:58:14 -07:00
migrate.c aio/migratepages: make aio migrate pages sane 2013-12-21 17:56:08 -05:00
mincore.c
mlock.c mm: munlock: fix deadlock in __munlock_pagevec() 2014-01-02 14:40:30 -08:00
mm_init.c mm: numa: Change page last {nid,pid} into {cpu,pid} 2013-10-09 14:47:45 +02:00
mmap.c mm: convert mm->nr_ptes to atomic_long_t 2013-11-15 09:32:14 +09:00
mmu_context.c
mmu_notifier.c
mmzone.c mm: numa: Change page last {nid,pid} into {cpu,pid} 2013-10-09 14:47:45 +02:00
mprotect.c mm: fix TLB flush race between migration, and change_protection_range 2013-12-18 19:04:51 -08:00
mremap.c mm: revert mremap pud_free anti-fix 2013-10-16 21:35:53 -07:00
msync.c
nobootmem.c mm/nobootmem.c: have __free_pages_memory() free in larger chunks. 2013-11-13 12:09:04 +09:00
nommu.c Merge branch 'akpm' (patches from Andrew Morton) 2013-11-13 15:45:43 +09:00
oom_kill.c mm: convert mm->nr_ptes to atomic_long_t 2013-11-15 09:32:14 +09:00
page-writeback.c writeback: fix negative bdi max pause 2013-10-16 21:35:53 -07:00
page_alloc.c mm: page_alloc: revert NUMA aspect of fair allocation policy 2013-12-20 12:19:18 -08:00
page_cgroup.c
page_io.c
page_isolation.c mm: memory-hotplug: enable memory hotplug to handle hugepage 2013-09-11 15:57:48 -07:00
pagewalk.c mm/pagewalk.c: fix walk_page_range() access of wrong PTEs 2013-10-30 14:27:03 -07:00
percpu-km.c
percpu-vm.c
percpu.c percpu: fix bootmem error handling in pcpu_page_first_chunk() 2013-09-23 10:51:45 -04:00
pgtable-generic.c mm: fix TLB flush race between migration, and change_protection_range 2013-12-18 19:04:51 -08:00
process_vm_access.c
quicklist.c
readahead.c readahead: fix sequential read cache miss detection 2013-11-13 12:09:09 +09:00
rmap.c mm/hugetlb: check for pte NULL pointer in __page_check_address() 2013-12-18 19:04:52 -08:00
shmem.c fs: remove generic_acl 2014-01-26 08:26:40 -05:00
slab.c Merge branch 'slab/next' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux 2013-11-22 08:10:34 -08:00
slab.h memcg, kmem: rename cache_from_memcg to cache_from_memcg_idx 2013-11-13 12:09:10 +09:00
slab_common.c memcg, kmem: rename cache_from_memcg to cache_from_memcg_idx 2013-11-13 12:09:10 +09:00
slob.c mm/sl[aou]b: Move kmallocXXX functions to common code 2013-09-04 20:51:33 +03:00
slub.c Merge branch 'slab/next' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux 2013-11-22 08:10:34 -08:00
sparse-vmemmap.c
sparse.c mm/sparsemem: fix a bug in free_map_bootmem when CONFIG_SPARSEMEM_VMEMMAP 2013-11-13 12:09:06 +09:00
swap.c mm: hugetlbfs: fix hugetlbfs optimization 2013-11-21 16:42:27 -08:00
swap_state.c lib/radix-tree.c: make radix_tree_node_alloc() work correctly within interrupt 2013-09-11 15:59:36 -07:00
swapfile.c frontswap: enable call to invalidate area on swapoff 2013-11-13 12:09:07 +09:00
truncate.c truncate: drop 'oldsize' truncate_pagecache() parameter 2013-09-12 15:38:02 -07:00
util.c mm: fix crash when using XFS on loopback 2014-01-15 14:19:42 +07:00
vmalloc.c mm: kmemleak: avoid false negatives on vmalloc'ed objects 2013-11-13 12:09:07 +09:00
vmpressure.c Merge branch 'for-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2013-09-03 18:25:03 -07:00
vmscan.c mm/vmscan.c: don't forget to free shrinker->nr_deferred 2013-10-16 21:35:52 -07:00
vmstat.c mm: numa: return the number of base pages altered by protection changes 2013-11-13 12:09:11 +09:00
zbud.c mm/zbud: fix some trivial typos in comments 2013-09-11 15:57:35 -07:00
zswap.c mm/zswap: refactor the get/put routines 2013-11-13 12:09:11 +09:00