Commit graph

804370 commits

Author SHA1 Message Date
Nick Desaulniers
6b80031fa1 Revert "Revert "ANDROID: enable LLVM_IAS=1 for clang's integrated assembler for x86_64""
This reverts commit e9a55d9770.

Re-enables the use of LLVM_IAS=1. This was disabled in aosp/1559459 due
to a single time-based test failure. This test failure is not
reproducible locally. The failure has never been observed on newer
kernel versions either. We should consider disabling this test if we
observe further flakes.

Test: 1. apply this patch
2. build kernel image
  $ BUILD_CONFIG=common/build.config.gki.x86_64 build/build.sh && \
    BUILD_CONFIG=common-modules/virtual-device/build.config.cuttlefish.x86_64 \
    build/build.sh
3. launch cuttlefish with custom kernel image
  $ launch_cvd -kernel_path=/path/to/out/android-4.19-stable/dist/bzImage \
    -initramfs_path=/path/to/out/android-4.19-stable/dist/initramfs.img &
4. run test
  $ atest vts_ltp_test_\
    x86:syscalls.nanosleep01_32bit#syscalls.nanosleep01_32bit -- --abi x86
Test was repeated 10 times locally, 0 failures observed.
Bug: 141693040
Bug: 178427746
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Change-Id: I6e93fafe0477df9c55cc6a0fa145660975927294
2021-02-03 13:34:51 -08:00
Yiwei Zhang
a3eba2009a ANDROID: GKI: Update ABI
Leaf changes summary: 2 artifacts changed
Changed leaf types summary: 0 leaf type changed
Removed/Changed/Added functions summary: 0 Removed, 0 Changed, 1 Added function
Removed/Changed/Added variables summary: 0 Removed, 0 Changed, 1 Added variable

1 Added function:

  [A] 'function bool ns_capable_noaudit(user_namespace*, int)'

1 Added variable:

  [A] 'user_namespace init_user_ns'

Bug: 154525079
Signed-off-by: Yiwei Zhang <zzyiwei@google.com>
Change-Id: I6f2a12664f85f580163849a48b400b67bac57d32
2021-02-02 20:12:40 +00:00
Yiwei Zhang
34cf65e60d ANDROID: GKI: Update cuttlefish symbol list
For GPU tracepoint.

Bug: 154525079
Signed-off-by: Yiwei Zhang <zzyiwei@google.com>
Change-Id: If288a224bace2f9d81e08bf84aa17c836e9cdc37
2021-02-02 19:57:05 +00:00
Greg Kroah-Hartman
c9b5531f57 ANDROID: GKI: fix up abi issues with 4.19.172
The futex changes in 4.19.172 required some additions to struct
task_struct, which of course, is a structure used by just about
everyone.

To preserve the abi, do some gyrations with the reserved fields in order
to handle the growth of the structure.  Given that we are adding a
larger structure than a pointer, carve out a chunk of reserved fields
from the block we were reserving.

These changes fix the genksyms issues, but libabigail is smarter than
that, so we also need to update the .xml file to make it happy with this
change.

The results of libabigail is:

Leaf changes summary: 1 artifact changed
Changed leaf types summary: 1 leaf type changed
Removed/Changed/Added functions summary: 0 Removed, 0 Changed, 0 Added function
Removed/Changed/Added variables summary: 0 Removed, 0 Changed, 0 Added variable

'struct task_struct at sched.h:647:1' changed:
  type size hasn't changed
  3 data member deletions:
    'u64 task_struct::android_kabi_reserved4', at offset 22592 (in bits) at sched.h:1300:1
    'u64 task_struct::android_kabi_reserved5', at offset 22656 (in bits) at sched.h:1301:1
    'u64 task_struct::android_kabi_reserved6', at offset 22720 (in bits) at sched.h:1302:1
  there are data member changes:
    data member u64 task_struct::android_kabi_reserved2 at offset 22464 (in bits) became anonymous data member 'union {unsigned int futex_state; struct {u64 android_kabi_reserved2;} __UNIQUE_ID_android_kabi_hide48; union {};}'
    type 'typedef u64' of 'task_struct::android_kabi_reserved3' changed:
      entity changed from 'typedef u64' to 'struct mutex' at mutex.h:53:1
      type size changed from 64 to 256 (in bits)
    and name of 'task_struct::android_kabi_reserved3' changed to 'task_struct::futex_exit_mutex' at sched.h:1313:1
  1955 impacted interfaces

Bug: 161946584
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Iab623aa5441c1d11e2dc4eb77c7153e4e9517429
2021-02-01 13:56:51 +01:00
Greg Kroah-Hartman
1a02ec69a6 This is the 4.19.172 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmAVUdoACgkQONu9yGCS
 aT6CVw/8CyxF0mRo1B3AdXBy4Tx9a++myhnU0Mg3K2AawKzN2dzF2qWMJq1jz8Ut
 jrRXeaoAQ9Wx9676NCfpkLCDGKPFtRGPq4fel9g4eBbqzSGuhzoe64Qr9mfbIHuU
 3efAiutlTMjQL5FhpeHTqgJTh1j1NsmThIRZA+yIUcL3YbH6HWx7wQGthF1JooWP
 hZ/YQ1MKt9IZlhyafNcO6wvtEfL5DY6DANHSyKsbwY1jMPJIQ0k90Z4zbHRAlwKZ
 HaMdV1vvCXjVNXu6e6Mlto2HcQolzg5l3uNVsc7ZzqHp9yOwrDfPMqRqxNuI2MrP
 r3J38mfRywOV2Woe++aTwOHSj0c/YGTThxbWj/lqJepu3Bc4LBwkACVchWskglY/
 W59XNg5ijxG1PBJy7NW05hkH/d2C6KWilhXvlqe4hRPf6/H3VM1YGTwpHiiVlNsr
 vZYYx0A8ugRo6rigtIrfOBt3xc8ZyQSxlA/mrnzHddH1zzoaZJ7+ecIQgO0lEZh1
 ICV2SY4cinvY5sBGcrgcFYFoQSyCHCjO36h03hHGzVxGVBYIas80DYuqRDes4E9H
 6jEz3TphqCdtSxBsT1D1iIacr+xYyfgAO4YwkpiPhjztRIUaOjAop6U94BHhmPha
 Yz+ia5+odCGo4n6u0k7BYAwSGFlr0+xz/MTMAN5IuFcPWB7w4qA=
 =7rAE
 -----END PGP SIGNATURE-----

Merge 4.19.172 into android-4.19-stable

Changes in 4.19.172
	gpio: mvebu: fix pwm .get_state period calculation
	Revert "mm/slub: fix a memory leak in sysfs_slab_add()"
	futex: Move futex exit handling into futex code
	futex: Replace PF_EXITPIDONE with a state
	exit/exec: Seperate mm_release()
	futex: Split futex_mm_release() for exit/exec
	futex: Set task::futex_state to DEAD right after handling futex exit
	futex: Mark the begin of futex exit explicitly
	futex: Sanitize exit state handling
	futex: Provide state handling for exec() as well
	futex: Add mutex around futex exit
	futex: Provide distinct return value when owner is exiting
	futex: Prevent exit livelock
	futex: Ensure the correct return value from futex_lock_pi()
	futex: Replace pointless printk in fixup_owner()
	futex: Provide and use pi_state_update_owner()
	rtmutex: Remove unused argument from rt_mutex_proxy_unlock()
	futex: Use pi_state_update_owner() in put_pi_state()
	futex: Simplify fixup_pi_state_owner()
	futex: Handle faults correctly for PI futexes
	HID: wacom: Correct NULL dereference on AES pen proximity
	tracing: Fix race in trace_open and buffer resize call
	tools: Factor HOSTCC, HOSTLD, HOSTAR definitions
	dm integrity: conditionally disable "recalculate" feature
	writeback: Drop I_DIRTY_TIME_EXPIRE
	fs: fix lazytime expiration handling in __writeback_single_inode()
	Linux 4.19.172

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I9b5391e9e955a105ab9c144fa6258dcbea234211
2021-02-01 12:59:33 +01:00
Greg Kroah-Hartman
811218ecee Linux 4.19.172
Tested-by: Pavel Machek (CIP) <pavel@denx.de>
Link: https://lore.kernel.org/r/20210129105910.685105711@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-30 13:32:13 +01:00
Eric Biggers
b4da738055 fs: fix lazytime expiration handling in __writeback_single_inode()
commit 1e249cb5b7fc09ff216aa5a12f6c302e434e88f9 upstream.

When lazytime is enabled and an inode is being written due to its
in-memory updated timestamps having expired, either due to a sync() or
syncfs() system call or due to dirtytime_expire_interval having elapsed,
the VFS needs to inform the filesystem so that the filesystem can copy
the inode's timestamps out to the on-disk data structures.

This is done by __writeback_single_inode() calling
mark_inode_dirty_sync(), which then calls ->dirty_inode(I_DIRTY_SYNC).

However, this occurs after __writeback_single_inode() has already
cleared the dirty flags from ->i_state.  This causes two bugs:

- mark_inode_dirty_sync() redirties the inode, causing it to remain
  dirty.  This wastefully causes the inode to be written twice.  But
  more importantly, it breaks cases where sync_filesystem() is expected
  to clean dirty inodes.  This includes the FS_IOC_REMOVE_ENCRYPTION_KEY
  ioctl (as reported at
  https://lore.kernel.org/r/20200306004555.GB225345@gmail.com), as well
  as possibly filesystem freezing (freeze_super()).

- Since ->i_state doesn't contain I_DIRTY_TIME when ->dirty_inode() is
  called from __writeback_single_inode() for lazytime expiration,
  xfs_fs_dirty_inode() ignores the notification.  (XFS only cares about
  lazytime expirations, and it assumes that i_state will contain
  I_DIRTY_TIME during those.)  Therefore, lazy timestamps aren't
  persisted by sync(), syncfs(), or dirtytime_expire_interval on XFS.

Fix this by moving the call to mark_inode_dirty_sync() to earlier in
__writeback_single_inode(), before the dirty flags are cleared from
i_state.  This makes filesystems be properly notified of the timestamp
expiration, and it avoids incorrectly redirtying the inode.

This fixes xfstest generic/580 (which tests
FS_IOC_REMOVE_ENCRYPTION_KEY) when run on ext4 or f2fs with lazytime
enabled.  It also fixes the new lazytime xfstest I've proposed, which
reproduces the above-mentioned XFS bug
(https://lore.kernel.org/r/20210105005818.92978-1-ebiggers@kernel.org).

Alternatively, we could call ->dirty_inode(I_DIRTY_SYNC) directly.  But
due to the introduction of I_SYNC_QUEUED, mark_inode_dirty_sync() is the
right thing to do because mark_inode_dirty_sync() now knows not to move
the inode to a writeback list if it is currently queued for sync.

Fixes: 0ae45f63d4 ("vfs: add support for a lazytime mount option")
Cc: stable@vger.kernel.org
Depends-on: 5afced3bf281 ("writeback: Avoid skipping inode writeback")
Link: https://lore.kernel.org/r/20210112190253.64307-2-ebiggers@kernel.org
Suggested-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-30 13:32:13 +01:00
Jan Kara
9f9623fc93 writeback: Drop I_DIRTY_TIME_EXPIRE
commit 5fcd57505c002efc5823a7355e21f48dd02d5a51 upstream.

The only use of I_DIRTY_TIME_EXPIRE is to detect in
__writeback_single_inode() that inode got there because flush worker
decided it's time to writeback the dirty inode time stamps (either
because we are syncing or because of age). However we can detect this
directly in __writeback_single_inode() and there's no need for the
strange propagation with I_DIRTY_TIME_EXPIRE flag.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-30 13:32:13 +01:00
Mikulas Patocka
26b30d36cb dm integrity: conditionally disable "recalculate" feature
commit 5c02406428d5219c367c5f53457698c58bc5f917 upstream.

Otherwise a malicious user could (ab)use the "recalculate" feature
that makes dm-integrity calculate the checksums in the background
while the device is already usable. When the system restarts before all
checksums have been calculated, the calculation continues where it was
interrupted even if the recalculate feature is not requested the next
time the dm device is set up.

Disable recalculating if we use internal_hash or journal_hash with a
key (e.g. HMAC) and we don't have the "legacy_recalculate" flag.

This may break activation of a volume, created by an older kernel,
that is not yet fully recalculated -- if this happens, the user should
add the "legacy_recalculate" flag to constructor parameters.

Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reported-by: Daniel Glockner <dg@emlix.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-30 13:32:13 +01:00
Jean-Philippe Brucker
440ed9aef5 tools: Factor HOSTCC, HOSTLD, HOSTAR definitions
commit c8a950d0d3b926a02c7b2e713850d38217cec3d1 upstream.

Several Makefiles in tools/ need to define the host toolchain variables.
Move their definition to tools/scripts/Makefile.include

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/bpf/20201110164310.2600671-2-jean-philippe@linaro.org
Cc: Alistair Delva <adelva@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-30 13:32:13 +01:00
Gaurav Kohli
acfa7ad7b7 tracing: Fix race in trace_open and buffer resize call
commit bbeb97464eefc65f506084fd9f18f21653e01137 upstream.

Below race can come, if trace_open and resize of
cpu buffer is running parallely on different cpus
CPUX                                CPUY
				    ring_buffer_resize
				    atomic_read(&buffer->resize_disabled)
tracing_open
tracing_reset_online_cpus
ring_buffer_reset_cpu
rb_reset_cpu
				    rb_update_pages
				    remove/insert pages
resetting pointer

This race can cause data abort or some times infinte loop in
rb_remove_pages and rb_insert_pages while checking pages
for sanity.

Take buffer lock to fix this.

Link: https://lkml.kernel.org/r/1601976833-24377-1-git-send-email-gkohli@codeaurora.org

Cc: stable@vger.kernel.org
Fixes: 83f40318da ("ring-buffer: Make removal of ring buffer pages atomic")
Reported-by: Denis Efremov <efremov@linux.com>
Signed-off-by: Gaurav Kohli <gkohli@codeaurora.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-30 13:32:13 +01:00
Jason Gerecke
1feeef6cc4 HID: wacom: Correct NULL dereference on AES pen proximity
commit 179e8e47c02a1950f1c556f2b854bdb2259078fb upstream.

The recent commit to fix a memory leak introduced an inadvertant NULL
pointer dereference. The `wacom_wac->pen_fifo` variable was never
intialized, resuling in a crash whenever functions tried to use it.
Since the FIFO is only used by AES pens (to buffer events from pen
proximity until the hardware reports the pen serial number) this would
have been easily overlooked without testing an AES device.

This patch converts `wacom_wac->pen_fifo` over to a pointer (since the
call to `devres_alloc` allocates memory for us) and ensures that we assign
it to point to the allocated and initalized `pen_fifo` before the function
returns.

Link: https://github.com/linuxwacom/input-wacom/issues/230
Fixes: 37309f47e2f5 ("HID: wacom: Fix memory leakage caused by kfifo_alloc")
CC: stable@vger.kernel.org # v4.19+
Signed-off-by: Jason Gerecke <jason.gerecke@wacom.com>
Tested-by: Ping Cheng <ping.cheng@wacom.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-30 13:32:13 +01:00
Thomas Gleixner
6e7bfa046d futex: Handle faults correctly for PI futexes
commit 34b1a1ce1458f50ef27c54e28eb9b1947012907a upstream

fixup_pi_state_owner() tries to ensure that the state of the rtmutex,
pi_state and the user space value related to the PI futex are consistent
before returning to user space. In case that the user space value update
faults and the fault cannot be resolved by faulting the page in via
fault_in_user_writeable() the function returns with -EFAULT and leaves
the rtmutex and pi_state owner state inconsistent.

A subsequent futex_unlock_pi() operates on the inconsistent pi_state and
releases the rtmutex despite not owning it which can corrupt the RB tree of
the rtmutex and cause a subsequent kernel stack use after free.

It was suggested to loop forever in fixup_pi_state_owner() if the fault
cannot be resolved, but that results in runaway tasks which is especially
undesired when the problem happens due to a programming error and not due
to malice.

As the user space value cannot be fixed up, the proper solution is to make
the rtmutex and the pi_state consistent so both have the same owner. This
leaves the user space value out of sync. Any subsequent operation on the
futex will fail because the 10th rule of PI futexes (pi_state owner and
user space value are consistent) has been violated.

As a consequence this removes the inept attempts of 'fixing' the situation
in case that the current task owns the rtmutex when returning with an
unresolvable fault by unlocking the rtmutex which left pi_state::owner and
rtmutex::owner out of sync in a different and only slightly less dangerous
way.

Fixes: 1b7558e457 ("futexes: fix fault handling in futex_lock_pi")
Reported-by: gzobqq@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-30 13:32:13 +01:00
Thomas Gleixner
a4649185a9 futex: Simplify fixup_pi_state_owner()
commit f2dac39d93987f7de1e20b3988c8685523247ae2 upstream

Too many gotos already and an upcoming fix would make it even more
unreadable.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-30 13:32:12 +01:00
Thomas Gleixner
9d5dbf57d6 futex: Use pi_state_update_owner() in put_pi_state()
commit 6ccc84f917d33312eb2846bd7b567639f585ad6d upstream

No point in open coding it. This way it gains the extra sanity checks.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-30 13:32:12 +01:00
Thomas Gleixner
29013e4f4b rtmutex: Remove unused argument from rt_mutex_proxy_unlock()
commit 2156ac1934166d6deb6cd0f6ffc4c1076ec63697 upstream

Nothing uses the argument. Remove it as preparation to use
pi_state_update_owner().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-30 13:32:12 +01:00
Thomas Gleixner
0e1501f7b1 futex: Provide and use pi_state_update_owner()
commit c5cade200ab9a2a3be9e7f32a752c8d86b502ec7 upstream

Updating pi_state::owner is done at several places with the same
code. Provide a function for it and use that at the obvious places.

This is also a preparation for a bug fix to avoid yet another copy of the
same code or alternatively introducing a completely unpenetratable mess of
gotos.

Originally-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-30 13:32:12 +01:00
Thomas Gleixner
f03b21494d futex: Replace pointless printk in fixup_owner()
commit 04b79c55201f02ffd675e1231d731365e335c307 upstream

If that unexpected case of inconsistent arguments ever happens then the
futex state is left completely inconsistent and the printk is not really
helpful. Replace it with a warning and make the state consistent.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-30 13:32:12 +01:00
Thomas Gleixner
72f38fffa4 futex: Ensure the correct return value from futex_lock_pi()
commit 12bb3f7f1b03d5913b3f9d4236a488aa7774dfe9 upstream

In case that futex_lock_pi() was aborted by a signal or a timeout and the
task returned without acquiring the rtmutex, but is the designated owner of
the futex due to a concurrent futex_unlock_pi() fixup_owner() is invoked to
establish consistent state. In that case it invokes fixup_pi_state_owner()
which in turn tries to acquire the rtmutex again. If that succeeds then it
does not propagate this success to fixup_owner() and futex_lock_pi()
returns -EINTR or -ETIMEOUT despite having the futex locked.

Return success from fixup_pi_state_owner() in all cases where the current
task owns the rtmutex and therefore the futex and propagate it correctly
through fixup_owner(). Fixup the other callsite which does not expect a
positive return value.

Fixes: c1e2f0eaf0 ("futex: Avoid violating the 10th rule of futex")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-30 13:32:12 +01:00
Thomas Gleixner
7874eee013 futex: Prevent exit livelock
commit 3ef240eaff36b8119ac9e2ea17cbf41179c930ba upstream

Oleg provided the following test case:

int main(void)
{
	struct sched_param sp = {};

	sp.sched_priority = 2;
	assert(sched_setscheduler(0, SCHED_FIFO, &sp) == 0);

	int lock = vfork();
	if (!lock) {
		sp.sched_priority = 1;
		assert(sched_setscheduler(0, SCHED_FIFO, &sp) == 0);
		_exit(0);
	}

	syscall(__NR_futex, &lock, FUTEX_LOCK_PI, 0,0,0);
	return 0;
}

This creates an unkillable RT process spinning in futex_lock_pi() on a UP
machine or if the process is affine to a single CPU. The reason is:

 parent	    	    			child

  set FIFO prio 2

  vfork()			->	set FIFO prio 1
   implies wait_for_child()	 	sched_setscheduler(...)
 			   		exit()
					do_exit()
 					....
					mm_release()
					  tsk->futex_state = FUTEX_STATE_EXITING;
					  exit_futex(); (NOOP in this case)
					  complete() --> wakes parent
  sys_futex()
    loop infinite because
    tsk->futex_state == FUTEX_STATE_EXITING

The same problem can happen just by regular preemption as well:

  task holds futex
  ...
  do_exit()
    tsk->futex_state = FUTEX_STATE_EXITING;

  --> preemption (unrelated wakeup of some other higher prio task, e.g. timer)

  switch_to(other_task)

  return to user
  sys_futex()
	loop infinite as above

Just for the fun of it the futex exit cleanup could trigger the wakeup
itself before the task sets its futex state to DEAD.

To cure this, the handling of the exiting owner is changed so:

   - A refcount is held on the task

   - The task pointer is stored in a caller visible location

   - The caller drops all locks (hash bucket, mmap_sem) and blocks
     on task::futex_exit_mutex. When the mutex is acquired then
     the exiting task has completed the cleanup and the state
     is consistent and can be reevaluated.

This is not a pretty solution, but there is no choice other than returning
an error code to user space, which would break the state consistency
guarantee and open another can of problems including regressions.

For stable backports the preparatory commits ac31c7ff8624 .. ba31c1a48538
are required as well, but for anything older than 5.3.y the backports are
going to be provided when this hits mainline as the other dependencies for
those kernels are definitely not stable material.

Fixes: 778e9a9c3e ("pi-futex: fix exit races and locking problems")
Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Stable Team <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20191106224557.041676471@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-30 13:32:12 +01:00
Thomas Gleixner
7f237695d3 futex: Provide distinct return value when owner is exiting
commit ac31c7ff8624409ba3c4901df9237a616c187a5d upstream

attach_to_pi_owner() returns -EAGAIN for various cases:

 - Owner task is exiting
 - Futex value has changed

The caller drops the held locks (hash bucket, mmap_sem) and retries the
operation. In case of the owner task exiting this can result in a live
lock.

As a preparatory step for seperating those cases, provide a distinct return
value (EBUSY) for the owner exiting case.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20191106224556.935606117@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-30 13:32:12 +01:00
Thomas Gleixner
f9b0c6c556 futex: Add mutex around futex exit
commit 3f186d974826847a07bc7964d79ec4eded475ad9 upstream

The mutex will be used in subsequent changes to replace the busy looping of
a waiter when the futex owner is currently executing the exit cleanup to
prevent a potential live lock.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20191106224556.845798895@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-30 13:32:12 +01:00
Thomas Gleixner
ab89202056 futex: Provide state handling for exec() as well
commit af8cbda2cfcaa5515d61ec500498d46e9a8247e2 upstream

exec() attempts to handle potentially held futexes gracefully by running
the futex exit handling code like exit() does.

The current implementation has no protection against concurrent incoming
waiters. The reason is that the futex state cannot be set to
FUTEX_STATE_DEAD after the cleanup because the task struct is still active
and just about to execute the new binary.

While its arguably buggy when a task holds a futex over exec(), for
consistency sake the state handling can at least cover the actual futex
exit cleanup section. This provides state consistency protection accross
the cleanup. As the futex state of the task becomes FUTEX_STATE_OK after the
cleanup has been finished, this cannot prevent subsequent attempts to
attach to the task in case that the cleanup was not successfull in mopping
up all leftovers.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20191106224556.753355618@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-30 13:32:12 +01:00
Thomas Gleixner
b45696340f futex: Sanitize exit state handling
commit 4a8e991b91aca9e20705d434677ac013974e0e30 upstream

Instead of having a smp_mb() and an empty lock/unlock of task::pi_lock move
the state setting into to the lock section.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20191106224556.645603214@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-30 13:32:12 +01:00
Thomas Gleixner
226eed1ef7 futex: Mark the begin of futex exit explicitly
commit 18f694385c4fd77a09851fd301236746ca83f3cb upstream

Instead of relying on PF_EXITING use an explicit state for the futex exit
and set it in the futex exit function. This moves the smp barrier and the
lock/unlock serialization into the futex code.

As with the DEAD state this is restricted to the exit path as exec
continues to use the same task struct.

This allows to simplify that logic in a next step.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20191106224556.539409004@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-30 13:32:11 +01:00
Thomas Gleixner
8f9a98a0e0 futex: Set task::futex_state to DEAD right after handling futex exit
commit f24f22435dcc11389acc87e5586239c1819d217c upstream

Setting task::futex_state in do_exit() is rather arbitrarily placed for no
reason. Move it into the futex code.

Note, this is only done for the exit cleanup as the exec cleanup cannot set
the state to FUTEX_STATE_DEAD because the task struct is still in active
use.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20191106224556.439511191@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-30 13:32:11 +01:00
Thomas Gleixner
1dd589346a futex: Split futex_mm_release() for exit/exec
commit 150d71584b12809144b8145b817e83b81158ae5f upstream

To allow separate handling of the futex exit state in the futex exit code
for exit and exec, split futex_mm_release() into two functions and invoke
them from the corresponding exit/exec_mm_release() callsites.

Preparatory only, no functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20191106224556.332094221@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-30 13:32:11 +01:00
Thomas Gleixner
9425476fb1 exit/exec: Seperate mm_release()
commit 4610ba7ad877fafc0a25a30c6c82015304120426 upstream

mm_release() contains the futex exit handling. mm_release() is called from
do_exit()->exit_mm() and from exec()->exec_mm().

In the exit_mm() case PF_EXITING and the futex state is updated. In the
exec_mm() case these states are not touched.

As the futex exit code needs further protections against exit races, this
needs to be split into two functions.

Preparatory only, no functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20191106224556.240518241@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-30 13:32:11 +01:00
Thomas Gleixner
095444fad7 futex: Replace PF_EXITPIDONE with a state
commit 3d4775df0a89240f671861c6ab6e8d59af8e9e41 upstream

The futex exit handling relies on PF_ flags. That's suboptimal as it
requires a smp_mb() and an ugly lock/unlock of the exiting tasks pi_lock in
the middle of do_exit() to enforce the observability of PF_EXITING in the
futex code.

Add a futex_state member to task_struct and convert the PF_EXITPIDONE logic
over to the new state. The PF_EXITING dependency will be cleaned up in a
later step.

This prepares for handling various futex exit issues later.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20191106224556.149449274@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-30 13:32:11 +01:00
Thomas Gleixner
3fe0ed7bd7 futex: Move futex exit handling into futex code
commit ba31c1a48538992316cc71ce94fa9cd3e7b427c0 upstream

The futex exit handling is #ifdeffed into mm_release() which is not pretty
to begin with. But upcoming changes to address futex exit races need to add
more functionality to this exit code.

Split it out into a function, move it into futex code and make the various
futex exit functions static.

Preparatory only and no functional change.

Folded build fix from Borislav.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20191106224556.049705556@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-30 13:32:11 +01:00
Wang Hai
08b227d9f3 Revert "mm/slub: fix a memory leak in sysfs_slab_add()"
commit 757fed1d0898b893d7daa84183947c70f27632f3 upstream.

This reverts commit dde3c6b72a16c2db826f54b2d49bdea26c3534a2.

syzbot report a double-free bug. The following case can cause this bug.

 - mm/slab_common.c: create_cache(): if the __kmem_cache_create() fails,
   it does:

	out_free_cache:
		kmem_cache_free(kmem_cache, s);

 - but __kmem_cache_create() - at least for slub() - will have done

	sysfs_slab_add(s)
		-> sysfs_create_group() .. fails ..
		-> kobject_del(&s->kobj); .. which frees s ...

We can't remove the kmem_cache_free() in create_cache(), because other
error cases of __kmem_cache_create() do not free this.

So, revert the commit dde3c6b72a16 ("mm/slub: fix a memory leak in
sysfs_slab_add()") to fix this.

Reported-by: syzbot+d0bd96b4696c1ef67991@syzkaller.appspotmail.com
Fixes: dde3c6b72a16 ("mm/slub: fix a memory leak in sysfs_slab_add()")
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-30 13:32:11 +01:00
Baruch Siach
4e5ee86dcb gpio: mvebu: fix pwm .get_state period calculation
commit e73b0101ae5124bf7cd3fb5d250302ad2f16a416 upstream.

The period is the sum of on and off values. That is, calculate period as

  ($on + $off) / clkrate

instead of

  $off / clkrate - $on / clkrate

that makes no sense.

Reported-by: Russell King <linux@armlinux.org.uk>
Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Fixes: 757642f9a5 ("gpio: mvebu: Add limited PWM support")
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
[baruch: backport to kernels <= v5.10]
Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-30 13:32:11 +01:00
Jaegeuk Kim
d47298cd4d FROMGIT: f2fs: flush data when enabling checkpoint back
During checkpoint=disable period, f2fs bypasses all the synchronous IOs such as
sync and fsync. So, when enabling it back, we must flush all of them in order
to keep the data persistent. Otherwise, suddern power-cut right after enabling
checkpoint will cause data loss.

Bug: 171063590
Fixes: 4354994f097d ("f2fs: checkpoint disabling")
Cc: stable@vger.kernel.org
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
(cherry picked from commit 8d52dbb373579b48f5758dd0cdd2ac0fb4e5be7f git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git dev)
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
Change-Id: Iaca2d6fc1841fffa8677d5d592732c94241fb3fb
2021-01-28 18:05:51 +00:00
linjoey
7a274b68c2 ANDROID: GKI: Added the get_task_pid function
Leaf changes summary: 1 artifact changed
Changed leaf types summary: 0 leaf type changed
Removed/Changed/Added functions summary: 0 Removed, 0 Changed, 1 Added function
Removed/Changed/Added variables summary: 0 Removed, 0 Changed, 0 Added variable

1 Added function:

  [A] 'function pid* get_task_pid(task_struct*, pid_type)'

Bug: 175037520
Signed-off-by: linjoey <linjoey@google.com>
Change-Id: Ibe049a10ec588b4b7b0c601f22499b40c16c0bda
2021-01-27 17:39:19 +00:00
Greg Kroah-Hartman
709f9b702e This is the 4.19.171 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmAROwsACgkQONu9yGCS
 aT4T5w//XnQz3KLe6G/Gu641fwHbOmNmYg/Ob43DRj0JtMRDFsGm7wl5KVYXgf3l
 eI4/K8vMevO0XlSly8Xo8MkYoKiVbv8BX/owsZ9SUTbcGguJ79xKjJ++cXnezsFf
 knwiHKQRg+3KM/PZdT29mgkcYo2T6ndcmeq1Q2Q49Xn7vDdCCJuzLYccKRRaR5zm
 RoWn9AKtobUaFhuhkMm4gi0mkpH9viD4Su62k3EjQW8ysE5l4OPmlMcy6M+dpt5A
 Nh0DMsMnjuqPp1Bm5bSSY5gJ7foHZ9iNAC7UxoD+EIXPMN1jVOoCnqmfSKy+dE65
 ROgex14HP1Zmrn5Zj7spc4zX2zWMbIyJgLQDpp+0dK+4wpINw97yXXoyYW+V5iyY
 ouf5BKXfudv4Du/PiKkAqf/s0Q5Lh19XSns0hygb+WG67h0CLJxQcH3o2wFWKzl3
 JniD5BohTYpqsu5V/zSANqJuugHVxyx2N2SSWtpsYZoFX7GRCtNBOV2NSgH65jW7
 S+N4Md8Hfx6rlp+BOkhKfBtDePev6orrSt4wxZ4JSRH++/ztFKo/1CY23NkLsdGN
 9PIC6dawYIPTNGtrczwvmebEdBUbEYDBLyeKTcCeRLB1yP4sbb4a2yOoCmjdmAeD
 AKUMnx2+20Tl5k67HdrUKxA17KJ1qD5IkurfQkRpUo2uUkSEqpY=
 =Ao4u
 -----END PGP SIGNATURE-----

Merge 4.19.171 into android-4.19-stable

Changes in 4.19.171
	i2c: bpmp-tegra: Ignore unknown I2C_M flags
	ALSA: seq: oss: Fix missing error check in snd_seq_oss_synth_make_info()
	ALSA: hda/via: Add minimum mute flag
	ACPI: scan: Make acpi_bus_get_device() clear return pointer on error
	btrfs: fix lockdep splat in btrfs_recover_relocation
	mmc: core: don't initialize block size from ext_csd if not present
	mmc: sdhci-xenon: fix 1.8v regulator stabilization
	dm: avoid filesystem lookup in dm_get_dev_t()
	dm integrity: fix a crash if "recalculate" used without "internal_hash"
	drm/atomic: put state on error path
	ASoC: Intel: haswell: Add missing pm_ops
	scsi: ufs: Correct the LUN used in eh_device_reset_handler() callback
	scsi: qedi: Correct max length of CHAP secret
	riscv: Fix kernel time_init()
	HID: Ignore battery for Elan touchscreen on ASUS UX550
	clk: tegra30: Add hda clock default rates to clock driver
	xen: Fix event channel callback via INTX/GSI
	drm/nouveau/bios: fix issue shadowing expansion ROMs
	drm/nouveau/privring: ack interrupts the same way as RM
	drm/nouveau/i2c/gm200: increase width of aux semaphore owner fields
	drm/nouveau/mmu: fix vram heap sizing
	drm/nouveau/kms/nv50-: fix case where notifier buffer is at offset 0
	scsi: megaraid_sas: Fix MEGASAS_IOC_FIRMWARE regression
	i2c: octeon: check correct size of maximum RECV_LEN packet
	platform/x86: intel-vbtn: Drop HP Stream x360 Convertible PC 11 from allow-list
	selftests: net: fib_tests: remove duplicate log test
	can: dev: can_restart: fix use after free bug
	can: vxcan: vxcan_xmit: fix use after free bug
	can: peak_usb: fix use after free bugs
	iio: ad5504: Fix setting power-down state
	irqchip/mips-cpu: Set IPI domain parent chip
	intel_th: pci: Add Alder Lake-P support
	stm class: Fix module init return on allocation failure
	serial: mvebu-uart: fix tx lost characters at power off
	ehci: fix EHCI host controller initialization sequence
	USB: ehci: fix an interrupt calltrace error
	usb: gadget: aspeed: fix stop dma register setting.
	usb: udc: core: Use lock when write to soft_connect
	usb: bdc: Make bdc pci driver depend on BROKEN
	xhci: make sure TRB is fully written before giving it to the controller
	xhci: tegra: Delay for disabling LFPS detector
	driver core: Extend device_is_dependent()
	netfilter: rpfilter: mask ecn bits before fib lookup
	sh: dma: fix kconfig dependency for G2_DMA
	sh_eth: Fix power down vs. is_opened flag ordering
	skbuff: back tiny skbs with kmalloc() in __netdev_alloc_skb() too
	kasan: fix unaligned address is unhandled in kasan_remove_zero_shadow
	kasan: fix incorrect arguments passing in kasan_add_zero_shadow
	udp: mask TOS bits in udp_v4_early_demux()
	ipv6: create multicast route with RTPROT_KERNEL
	net_sched: avoid shift-out-of-bounds in tcindex_set_parms()
	net_sched: reject silly cell_log in qdisc_get_rtab()
	ipv6: set multicast flag on the multicast route
	net: mscc: ocelot: allow offloading of bridge on top of LAG
	net: Disable NETIF_F_HW_TLS_RX when RXCSUM is disabled
	net: dsa: b53: fix an off by one in checking "vlan->vid"
	Linux 4.19.171

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I2be7d084a443bfc87e6a3d5753d9c233f54788ac
2021-01-27 11:45:59 +01:00
Greg Kroah-Hartman
c4ff839de1 Linux 4.19.171
Tested-by: Pavel Machek (CIP) <pavel@denx.de>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/faca5e02-cc43-0a14-51dc-2bcb25dafdc0@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-27 11:05:44 +01:00
Dan Carpenter
8e9205fa36 net: dsa: b53: fix an off by one in checking "vlan->vid"
commit 8e4052c32d6b4b39c1e13c652c7e33748d447409 upstream.

The > comparison should be >= to prevent accessing one element beyond
the end of the dev->vlans[] array in the caller function, b53_vlan_add().
The "dev->vlans" array is allocated in the b53_switch_init() function
and it has "dev->num_vlans" elements.

Fixes: a2482d2ce3 ("net: dsa: b53: Plug in VLAN support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/YAbxI97Dl/pmBy5V@mwanda
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-27 11:05:44 +01:00
Tariq Toukan
fffe7ab69d net: Disable NETIF_F_HW_TLS_RX when RXCSUM is disabled
commit a3eb4e9d4c9218476d05c52dfd2be3d6fdce6b91 upstream.

With NETIF_F_HW_TLS_RX packets are decrypted in HW. This cannot be
logically done when RXCSUM offload is off.

Fixes: 14136564c8 ("net: Add TLS RX offload feature")
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Boris Pismenny <borisp@nvidia.com>
Link: https://lore.kernel.org/r/20210117151538.9411-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-27 11:05:44 +01:00
Vladimir Oltean
47ece1b19d net: mscc: ocelot: allow offloading of bridge on top of LAG
commit 79267ae22615496655feee2db0848f6786bcf67a upstream.

The blamed commit was too aggressive, and it made ocelot_netdevice_event
react only to network interface events emitted for the ocelot switch
ports.

In fact, only the PRECHANGEUPPER should have had that check.

When we ignore all events that are not for us, we miss the fact that the
upper of the LAG changes, and the bonding interface gets enslaved to a
bridge. This is an operation we could offload under certain conditions.

Fixes: 7afb3e575e5a ("net: mscc: ocelot: don't handle netdev events for other netdevs")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20210118135210.2666246-1-olteanv@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-27 11:05:44 +01:00
Matteo Croce
47d6a70043 ipv6: set multicast flag on the multicast route
commit ceed9038b2783d14e0422bdc6fd04f70580efb4c upstream.

The multicast route ff00::/8 is created with type RTN_UNICAST:

  $ ip -6 -d route
  unicast ::1 dev lo proto kernel scope global metric 256 pref medium
  unicast fe80::/64 dev eth0 proto kernel scope global metric 256 pref medium
  unicast ff00::/8 dev eth0 proto kernel scope global metric 256 pref medium

Set the type to RTN_MULTICAST which is more appropriate.

Fixes: e8478e80e5 ("net/ipv6: Save route type in rt6_info")
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-27 11:05:43 +01:00
Eric Dumazet
df4646250f net_sched: reject silly cell_log in qdisc_get_rtab()
commit e4bedf48aaa5552bc1f49703abd17606e7e6e82a upstream.

iproute2 probably never goes beyond 8 for the cell exponent,
but stick to the max shift exponent for signed 32bit.

UBSAN reported:
UBSAN: shift-out-of-bounds in net/sched/sch_api.c:389:22
shift exponent 130 is too large for 32-bit type 'int'
CPU: 1 PID: 8450 Comm: syz-executor586 Not tainted 5.11.0-rc3-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x183/0x22e lib/dump_stack.c:120
 ubsan_epilogue lib/ubsan.c:148 [inline]
 __ubsan_handle_shift_out_of_bounds+0x432/0x4d0 lib/ubsan.c:395
 __detect_linklayer+0x2a9/0x330 net/sched/sch_api.c:389
 qdisc_get_rtab+0x2b5/0x410 net/sched/sch_api.c:435
 cbq_init+0x28f/0x12c0 net/sched/sch_cbq.c:1180
 qdisc_create+0x801/0x1470 net/sched/sch_api.c:1246
 tc_modify_qdisc+0x9e3/0x1fc0 net/sched/sch_api.c:1662
 rtnetlink_rcv_msg+0xb1d/0xe60 net/core/rtnetlink.c:5564
 netlink_rcv_skb+0x1f0/0x460 net/netlink/af_netlink.c:2494
 netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
 netlink_unicast+0x7de/0x9b0 net/netlink/af_netlink.c:1330
 netlink_sendmsg+0xaa6/0xe90 net/netlink/af_netlink.c:1919
 sock_sendmsg_nosec net/socket.c:652 [inline]
 sock_sendmsg net/socket.c:672 [inline]
 ____sys_sendmsg+0x5a2/0x900 net/socket.c:2345
 ___sys_sendmsg net/socket.c:2399 [inline]
 __sys_sendmsg+0x319/0x400 net/socket.c:2432
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Cong Wang <cong.wang@bytedance.com>
Link: https://lore.kernel.org/r/20210114160637.1660597-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-27 11:05:43 +01:00
Eric Dumazet
22c1b22672 net_sched: avoid shift-out-of-bounds in tcindex_set_parms()
commit bcd0cf19ef8258ac31b9a20248b05c15a1f4b4b0 upstream.

tc_index being 16bit wide, we need to check that TCA_TCINDEX_SHIFT
attribute is not silly.

UBSAN: shift-out-of-bounds in net/sched/cls_tcindex.c:260:29
shift exponent 255 is too large for 32-bit type 'int'
CPU: 0 PID: 8516 Comm: syz-executor228 Not tainted 5.10.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:120
 ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
 __ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:395
 valid_perfect_hash net/sched/cls_tcindex.c:260 [inline]
 tcindex_set_parms.cold+0x1b/0x215 net/sched/cls_tcindex.c:425
 tcindex_change+0x232/0x340 net/sched/cls_tcindex.c:546
 tc_new_tfilter+0x13fb/0x21b0 net/sched/cls_api.c:2127
 rtnetlink_rcv_msg+0x8b6/0xb80 net/core/rtnetlink.c:5555
 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2494
 netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
 netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1330
 netlink_sendmsg+0x907/0xe40 net/netlink/af_netlink.c:1919
 sock_sendmsg_nosec net/socket.c:652 [inline]
 sock_sendmsg+0xcf/0x120 net/socket.c:672
 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2336
 ___sys_sendmsg+0xf3/0x170 net/socket.c:2390
 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2423
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Link: https://lore.kernel.org/r/20210114185229.1742255-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-27 11:05:43 +01:00
Matteo Croce
be33a52751 ipv6: create multicast route with RTPROT_KERNEL
commit a826b04303a40d52439aa141035fca5654ccaccd upstream.

The ff00::/8 multicast route is created without specifying the fc_protocol
field, so the default RTPROT_BOOT value is used:

  $ ip -6 -d route
  unicast ::1 dev lo proto kernel scope global metric 256 pref medium
  unicast fe80::/64 dev eth0 proto kernel scope global metric 256 pref medium
  unicast ff00::/8 dev eth0 proto boot scope global metric 256 pref medium

As the documentation says, this value identifies routes installed during
boot, but the route is created when interface is set up.
Change the value to RTPROT_KERNEL which is a better value.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-27 11:05:43 +01:00
Guillaume Nault
372d963821 udp: mask TOS bits in udp_v4_early_demux()
commit 8d2b51b008c25240914984208b2ced57d1dd25a5 upstream.

udp_v4_early_demux() is the only function that calls
ip_mc_validate_source() with a TOS that hasn't been masked with
IPTOS_RT_MASK.

This results in different behaviours for incoming multicast UDPv4
packets, depending on if ip_mc_validate_source() is called from the
early-demux path (udp_v4_early_demux) or from the regular input path
(ip_route_input_noref).

ECN would normally not be used with UDP multicast packets, so the
practical consequences should be limited on that side. However,
IPTOS_RT_MASK is used to also masks the TOS' high order bits, to align
with the non-early-demux path behaviour.

Reproducer:

  Setup two netns, connected with veth:
  $ ip netns add ns0
  $ ip netns add ns1
  $ ip -netns ns0 link set dev lo up
  $ ip -netns ns1 link set dev lo up
  $ ip link add name veth01 netns ns0 type veth peer name veth10 netns ns1
  $ ip -netns ns0 link set dev veth01 up
  $ ip -netns ns1 link set dev veth10 up
  $ ip -netns ns0 address add 192.0.2.10 peer 192.0.2.11/32 dev veth01
  $ ip -netns ns1 address add 192.0.2.11 peer 192.0.2.10/32 dev veth10

  In ns0, add route to multicast address 224.0.2.0/24 using source
  address 198.51.100.10:
  $ ip -netns ns0 address add 198.51.100.10/32 dev lo
  $ ip -netns ns0 route add 224.0.2.0/24 dev veth01 src 198.51.100.10

  In ns1, define route to 198.51.100.10, only for packets with TOS 4:
  $ ip -netns ns1 route add 198.51.100.10/32 tos 4 dev veth10

  Also activate rp_filter in ns1, so that incoming packets not matching
  the above route get dropped:
  $ ip netns exec ns1 sysctl -wq net.ipv4.conf.veth10.rp_filter=1

  Now try to receive packets on 224.0.2.11:
  $ ip netns exec ns1 socat UDP-RECVFROM:1111,ip-add-membership=224.0.2.11:veth10,ignoreeof -

  In ns0, send packet to 224.0.2.11 with TOS 4 and ECT(0) (that is,
  tos 6 for socat):
  $ echo test0 | ip netns exec ns0 socat - UDP-DATAGRAM:224.0.2.11:1111,bind=:1111,tos=6

  The "test0" message is properly received by socat in ns1, because
  early-demux has no cached dst to use, so source address validation
  is done by ip_route_input_mc(), which receives a TOS that has the
  ECN bits masked.

  Now send another packet to 224.0.2.11, still with TOS 4 and ECT(0):
  $ echo test1 | ip netns exec ns0 socat - UDP-DATAGRAM:224.0.2.11:1111,bind=:1111,tos=6

  The "test1" message isn't received by socat in ns1, because, now,
  early-demux has a cached dst to use and calls ip_mc_validate_source()
  immediately, without masking the ECN bits.

Fixes: bc044e8db7 ("udp: perform source validation for mcast early demux")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-27 11:05:43 +01:00
Lecopzer Chen
1ad3d65c19 kasan: fix incorrect arguments passing in kasan_add_zero_shadow
commit 5dabd1712cd056814f9ab15f1d68157ceb04e741 upstream.

kasan_remove_zero_shadow() shall use original virtual address, start and
size, instead of shadow address.

Link: https://lkml.kernel.org/r/20210103063847.5963-1-lecopzer@gmail.com
Fixes: 0207df4fa1 ("kernel/memremap, kasan: make ZONE_DEVICE with work with KASAN")
Signed-off-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-27 11:05:43 +01:00
Lecopzer Chen
5cb8624526 kasan: fix unaligned address is unhandled in kasan_remove_zero_shadow
commit a11a496ee6e2ab6ed850233c96b94caf042af0b9 upstream.

During testing kasan_populate_early_shadow and kasan_remove_zero_shadow,
if the shadow start and end address in kasan_remove_zero_shadow() is not
aligned to PMD_SIZE, the remain unaligned PTE won't be removed.

In the test case for kasan_remove_zero_shadow():

    shadow_start: 0xffffffb802000000, shadow end: 0xffffffbfbe000000

    3-level page table:
      PUD_SIZE: 0x40000000 PMD_SIZE: 0x200000 PAGE_SIZE: 4K

0xffffffbf80000000 ~ 0xffffffbfbdf80000 will not be removed because in
kasan_remove_pud_table(), kasan_pmd_table(*pud) is true but the next
address is 0xffffffbfbdf80000 which is not aligned to PUD_SIZE.

In the correct condition, this should fallback to the next level
kasan_remove_pmd_table() but the condition flow always continue to skip
the unaligned part.

Fix by correcting the condition when next and addr are neither aligned.

Link: https://lkml.kernel.org/r/20210103135621.83129-1-lecopzer@gmail.com
Fixes: 0207df4fa1 ("kernel/memremap, kasan: make ZONE_DEVICE with work with KASAN")
Signed-off-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: YJ Chiang <yj.chiang@mediatek.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-27 11:05:42 +01:00
Alexander Lobakin
66fb76f3a8 skbuff: back tiny skbs with kmalloc() in __netdev_alloc_skb() too
commit 66c556025d687dbdd0f748c5e1df89c977b6c02a upstream.

Commit 3226b158e67c ("net: avoid 32 x truesize under-estimation for
tiny skbs") ensured that skbs with data size lower than 1025 bytes
will be kmalloc'ed to avoid excessive page cache fragmentation and
memory consumption.
However, the fix adressed only __napi_alloc_skb() (primarily for
virtio_net and napi_get_frags()), but the issue can still be achieved
through __netdev_alloc_skb(), which is still used by several drivers.
Drivers often allocate a tiny skb for headers and place the rest of
the frame to frags (so-called copybreak).
Mirror the condition to __netdev_alloc_skb() to handle this case too.

Since v1 [0]:
 - fix "Fixes:" tag;
 - refine commit message (mention copybreak usecase).

[0] https://lore.kernel.org/netdev/20210114235423.232737-1-alobakin@pm.me

Fixes: a1c7fff7e1 ("net: netdev_alloc_skb() use build_skb()")
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Link: https://lore.kernel.org/r/20210115150354.85967-1-alobakin@pm.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-27 11:05:42 +01:00
Geert Uytterhoeven
a45a83301a sh_eth: Fix power down vs. is_opened flag ordering
commit f6a2e94b3f9d89cb40771ff746b16b5687650cbb upstream.

sh_eth_close() does a synchronous power down of the device before
marking it closed.  Revert the order, to make sure the device is never
marked opened while suspended.

While at it, use pm_runtime_put() instead of pm_runtime_put_sync(), as
there is no reason to do a synchronous power down.

Fixes: 7fa2955ff7 ("sh_eth: Fix sleeping function called from invalid context")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@gmail.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Link: https://lore.kernel.org/r/20210118150812.796791-1-geert+renesas@glider.be
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-27 11:05:42 +01:00
Necip Fazil Yildiran
1dd5b858e4 sh: dma: fix kconfig dependency for G2_DMA
commit f477a538c14d07f8c45e554c8c5208d588514e98 upstream.

When G2_DMA is enabled and SH_DMA is disabled, it results in the following
Kbuild warning:

WARNING: unmet direct dependencies detected for SH_DMA_API
  Depends on [n]: SH_DMA [=n]
  Selected by [y]:
  - G2_DMA [=y] && SH_DREAMCAST [=y]

The reason is that G2_DMA selects SH_DMA_API without depending on or
selecting SH_DMA while SH_DMA_API depends on SH_DMA.

When G2_DMA was first introduced with commit 40f49e7ed7
("sh: dma: Make G2 DMA configurable."), this wasn't an issue since
SH_DMA_API didn't have such dependency, and this way was the only way to
enable it since SH_DMA_API was non-visible. However, later SH_DMA_API was
made visible and dependent on SH_DMA with commit d8902adcc1
("dmaengine: sh: Add Support SuperH DMA Engine driver").

Let G2_DMA depend on SH_DMA_API instead to avoid Kbuild issues.

Fixes: d8902adcc1 ("dmaengine: sh: Add Support SuperH DMA Engine driver")
Signed-off-by: Necip Fazil Yildiran <fazilyildiran@gmail.com>
Signed-off-by: Rich Felker <dalias@libc.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-27 11:05:42 +01:00
Guillaume Nault
c9d1f70298 netfilter: rpfilter: mask ecn bits before fib lookup
commit 2e5a6266fbb11ae93c468dfecab169aca9c27b43 upstream.

RT_TOS() only masks one of the two ECN bits. Therefore rpfilter_mt()
treats Not-ECT or ECT(1) packets in a different way than those with
ECT(0) or CE.

Reproducer:

  Create two netns, connected with a veth:
  $ ip netns add ns0
  $ ip netns add ns1
  $ ip link add name veth01 netns ns0 type veth peer name veth10 netns ns1
  $ ip -netns ns0 link set dev veth01 up
  $ ip -netns ns1 link set dev veth10 up
  $ ip -netns ns0 address add 192.0.2.10/32 dev veth01
  $ ip -netns ns1 address add 192.0.2.11/32 dev veth10

  Add a route to ns1 in ns0:
  $ ip -netns ns0 route add 192.0.2.11/32 dev veth01

  In ns1, only packets with TOS 4 can be routed to ns0:
  $ ip -netns ns1 route add 192.0.2.10/32 tos 4 dev veth10

  Ping from ns0 to ns1 works regardless of the ECN bits, as long as TOS
  is 4:
  $ ip netns exec ns0 ping -Q 4 192.0.2.11   # TOS 4, Not-ECT
    ... 0% packet loss ...
  $ ip netns exec ns0 ping -Q 5 192.0.2.11   # TOS 4, ECT(1)
    ... 0% packet loss ...
  $ ip netns exec ns0 ping -Q 6 192.0.2.11   # TOS 4, ECT(0)
    ... 0% packet loss ...
  $ ip netns exec ns0 ping -Q 7 192.0.2.11   # TOS 4, CE
    ... 0% packet loss ...

  Now use iptable's rpfilter module in ns1:
  $ ip netns exec ns1 iptables-legacy -t raw -A PREROUTING -m rpfilter --invert -j DROP

  Not-ECT and ECT(1) packets still pass:
  $ ip netns exec ns0 ping -Q 4 192.0.2.11   # TOS 4, Not-ECT
    ... 0% packet loss ...
  $ ip netns exec ns0 ping -Q 5 192.0.2.11   # TOS 4, ECT(1)
    ... 0% packet loss ...

  But ECT(0) and ECN packets are dropped:
  $ ip netns exec ns0 ping -Q 6 192.0.2.11   # TOS 4, ECT(0)
    ... 100% packet loss ...
  $ ip netns exec ns0 ping -Q 7 192.0.2.11   # TOS 4, CE
    ... 100% packet loss ...

After this patch, rpfilter doesn't drop ECT(0) and CE packets anymore.

Fixes: 8f97339d3f ("netfilter: add ipv4 reverse path filter match")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-27 11:05:42 +01:00