28846 commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Christian Brauner
|
5eecbc68ed |
UPSTREAM: signal: support CLONE_PIDFD with pidfd_send_signal
Let pidfd_send_signal() use pidfds retrieved via CLONE_PIDFD. With this patch pidfd_send_signal() becomes independent of procfs. This fulfills the request made when we merged the pidfd_send_signal() patchset. The pidfd_send_signal() syscall is now always available, so it can be used without procfs mounted or even without procfs support compiled into the kernel. Signed-off-by: Christian Brauner <christian@brauner.io> Co-developed-by: Jann Horn <jannh@google.com> Signed-off-by: Jann Horn <jannh@google.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Kees Cook <keescook@chromium.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: David Howells <dhowells@redhat.com> Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> Cc: Andy Lutomirsky <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Aleksa Sarai <cyphar@cyphar.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> (cherry picked from commit 2151ad1b067275730de1b38c7257478cae47d29e) Bug: 135608568 Test: test program using syscall(__NR_sys_pidfd_open,..) and poll() Change-Id: If6ee59279f41e0ae3c6d9e24f7b5481f23aab469 Signed-off-by: Suren Baghdasaryan <surenb@google.com> |
||
Christian Brauner
|
66faab946a |
UPSTREAM: clone: add CLONE_PIDFD
This patchset makes it possible to retrieve pid file descriptors at process creation time by introducing the new flag CLONE_PIDFD to the clone() system call. Linus originally suggested implementing this as a new flag to clone() instead of making it a separate system call. As spotted by Linus, there is exactly one bit for clone() left. CLONE_PIDFD creates file descriptors based on the anonymous inode implementation in the kernel that will also be used to implement the new mount api. They serve as a simple opaque handle on pids. Logically, this makes it possible to interpret a pidfd differently, narrowing or widening the scope of various operations (e.g. signal sending). Thus, a pidfd can refer not just to a tgid but also to a tid, or in theory - given appropriate flag arguments in relevant syscalls - a process group or session. A pidfd does not represent a privilege. This does not imply it cannot ever be that way but for now this is not the case. A pidfd comes with additional information in fdinfo if the kernel supports procfs. The fdinfo file contains the pid of the process in the caller's pid namespace in the same format as the procfs status file, i.e. "Pid:\t%d". As suggested by Oleg, with CLONE_PIDFD the pidfd is returned in the parent_tidptr argument of clone. This has the advantage that we can give back the associated pid and the pidfd at the same time. To remove worries about missing metadata access this patchset comes with a sample program that illustrates how a combination of CLONE_PIDFD and pidfd_send_signal() can be used to gain race-free access to process metadata through /proc/<pid>. The sample program can easily be translated into a helper that would be suitable for inclusion in libc so that users don't have to worry about writing it themselves. 
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Christian Brauner <christian@brauner.io> Co-developed-by: Jann Horn <jannh@google.com> Signed-off-by: Jann Horn <jannh@google.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Kees Cook <keescook@chromium.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: David Howells <dhowells@redhat.com> Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> Cc: Andy Lutomirsky <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Aleksa Sarai <cyphar@cyphar.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> (cherry picked from commit b3e5838252665ee4cfa76b82bdf1198dca81e5be) Bug: 135608568 Test: test program using syscall(__NR_sys_pidfd_open,..) and poll() Change-Id: I8a8f87e8fb23de0adb6d6acf2e622926b7a9f55c Signed-off-by: Suren Baghdasaryan <surenb@google.com> |
||
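The combination described in the commit above (a pidfd handed back through parent_tidptr, then used for signaling) can be sketched in userspace roughly as follows. This is a minimal sketch, not the sample program shipped with the patchset: `child_fn` and `spawn_and_kill_via_pidfd` are hypothetical names, the fallback values for `CLONE_PIDFD` and `__NR_pidfd_send_signal` are assumptions for older headers, and initialising the pidfd slot to 0 reflects my understanding that the original patch rejects a non-zero value there.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef CLONE_PIDFD
#define CLONE_PIDFD 0x00001000 /* assumed value, as in the upstream uapi header */
#endif
#ifndef __NR_pidfd_send_signal
#define __NR_pidfd_send_signal 424 /* number quoted in the pidfd_send_signal() commit */
#endif

/* Child body (hypothetical helper): block until a signal arrives. */
static int child_fn(void *arg)
{
    (void)arg;
    pause();
    return 0;
}

/*
 * Create a child with CLONE_PIDFD and kill it through the returned pidfd.
 * Returns 0 if the child died from SIGKILL, -1 with errno set otherwise.
 */
static int spawn_and_kill_via_pidfd(void)
{
    static char stack[64 * 1024] __attribute__((aligned(16)));
    int pidfd = 0; /* kept at 0 so an untouched slot can be detected */
    int status, ret, saved_errno;
    pid_t pid;

    /* The pidfd is written to the parent_tidptr argument of clone(). */
    pid = clone(child_fn, stack + sizeof(stack), CLONE_PIDFD | SIGCHLD,
                NULL, &pidfd);
    if (pid < 0)
        return -1;

    if (pidfd <= 0) {
        /* Assumption: stdin occupies fd 0, so 0 means the flag was ignored. */
        kill(pid, SIGKILL);
        waitpid(pid, NULL, 0);
        errno = ENOSYS;
        return -1;
    }

    ret = syscall(__NR_pidfd_send_signal, pidfd, SIGKILL, NULL, 0);
    saved_errno = errno;
    if (ret < 0)
        kill(pid, SIGKILL); /* fall back so the child never lingers */
    close(pidfd);

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (ret < 0) {
        errno = saved_errno;
        return -1;
    }
    if (WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL)
        return 0;
    errno = EIO;
    return -1;
}
```

Because the handle comes straight from clone(), there is no window in which the pid could be recycled between process creation and obtaining the fd, which is the point the commit message makes about returning the pid and the pidfd at the same time.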
Christian Brauner
|
c46b05132a |
UPSTREAM: signal: use fdget() since we don't allow O_PATH
As stated in the original commit for pidfd_send_signal() we don't allow to signal processes through O_PATH file descriptors since it is semantically equivalent to a write on the pidfd. We already correctly error out right now and return EBADF if an O_PATH fd is passed. This is because we use file->f_op to detect whether a pidfd is passed and O_PATH fds have their file->f_op set to empty_fops in do_dentry_open() and thus fail the test. Thus, there is no regression. It's just semantically correct to use fdget() and return an error right from there instead of taking a reference and returning an error later. Signed-off-by: Christian Brauner <christian@brauner.io> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Kees Cook <keescook@chromium.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jann Horn <jann@thejh.net> Cc: David Howells <dhowells@redhat.com> Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> Cc: Andy Lutomirsky <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Aleksa Sarai <cyphar@cyphar.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 738a7832d21e3d911fcddab98ce260b79010b461) Bug: 135608568 Test: test program using syscall(__NR_pidfd_send_signal,..) to send SIGKILL Change-Id: Ib102922f9793e8610940d34ad5fb1256d4b07476 Signed-off-by: Suren Baghdasaryan <surenb@google.com> |
||
Jann Horn
|
74c14d6081 |
UPSTREAM: signal: don't silently convert SI_USER signals to non-current pidfd
The current sys_pidfd_send_signal() silently turns signals with explicit SI_USER context that are sent to non-current tasks into signals with kernel-generated siginfo. This is unlike do_rt_sigqueueinfo(), which returns -EPERM in this case. If a user actually wants to send a signal with kernel-provided siginfo, they can do that with pidfd_send_signal(pidfd, sig, NULL, 0); so allowing this case is unnecessary. Instead of silently replacing the siginfo, just bail out with an error; this is consistent with other interfaces and avoids special-casing behavior based on security checks. Fixes: 3eb39f47934f ("signal: add pidfd_send_signal() syscall") Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Christian Brauner <christian@brauner.io> (cherry picked from commit 556a888a14afe27164191955618990fb3ccc9aad) Bug: 135608568 Test: test program using syscall(__NR_pidfd_send_signal,..) to send SIGKILL Change-Id: I004452c19c50296730a2c6852a5ef47abd69d819 Signed-off-by: Suren Baghdasaryan <surenb@google.com> |
||
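The behavior change described in the commit above can be probed from userspace along these lines. This is a hedged sketch: `errno_for_explicit_si_user` is a hypothetical helper, the pidfd is obtained by opening /proc/<pid> (the only kind of pidfd available at the time of that commit), and the fallback syscall number is an assumption for older headers. On a kernel with the fix one would expect the explicit SI_USER attempt to fail with EPERM, while the NULL-siginfo call is the supported way to get kernel-provided siginfo.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef __NR_pidfd_send_signal
#define __NR_pidfd_send_signal 424 /* number quoted in the pidfd_send_signal() commit */
#endif

/*
 * Send SIGTERM with a caller-built SI_USER siginfo to a child process.
 * Returns the resulting errno (0 if the kernel accepted the call), or -1
 * if the probe could not be set up.
 */
static int errno_for_explicit_si_user(void)
{
    char path[64];
    siginfo_t info;
    int fd, ret, err;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        pause();
        _exit(0);
    }

    snprintf(path, sizeof(path), "/proc/%d", (int)pid);
    fd = open(path, O_DIRECTORY | O_CLOEXEC);
    if (fd < 0) {
        kill(pid, SIGKILL);
        waitpid(pid, NULL, 0);
        return -1;
    }

    /* Explicit SI_USER context aimed at a task other than the caller. */
    memset(&info, 0, sizeof(info));
    info.si_signo = SIGTERM;
    info.si_code = SI_USER;
    info.si_pid = getpid();
    info.si_uid = getuid();
    ret = syscall(__NR_pidfd_send_signal, fd, SIGTERM, &info, 0);
    err = ret < 0 ? errno : 0;

    /* NULL siginfo asks the kernel to fill in the siginfo itself. */
    syscall(__NR_pidfd_send_signal, fd, SIGKILL, NULL, 0);
    kill(pid, SIGKILL); /* fallback in case the syscall is unavailable */
    close(fd);
    waitpid(pid, NULL, 0);
    return err;
}
```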
Christian Brauner
|
1f27ef8d9b |
BACKPORT: signal: add pidfd_send_signal() syscall
The kill() syscall operates on process identifiers (pid). After a process has exited its pid can be reused by another process. If a caller sends a signal to a reused pid it will end up signaling the wrong process. This issue has often surfaced and there has been a push to address this problem [1]. This patch uses file descriptors (fd) from /proc/<pid> as stable handles on struct pid. Even if a pid is recycled the handle will not change. The fd can be used to send signals to the process it refers to. Thus, the new syscall pidfd_send_signal() is introduced to solve this problem. Instead of pids it operates on process fds (pidfd). /* prototype and argument */ long pidfd_send_signal(int pidfd, int sig, siginfo_t *info, unsigned int flags); /* syscall number 424 */ The syscall number was chosen to be 424 to align with Arnd's rework in his y2038 to minimize merge conflicts (cf. [25]). In addition to the pidfd and signal argument it takes an additional siginfo_t and flags argument. If the siginfo_t argument is NULL then pidfd_send_signal() is equivalent to kill(<positive-pid>, <signal>). If it is not NULL pidfd_send_signal() is equivalent to rt_sigqueueinfo(). The flags argument is added to allow for future extensions of this syscall. It currently needs to be passed as 0. Failing to do so will cause EINVAL. /* pidfd_send_signal() replaces multiple pid-based syscalls */ The pidfd_send_signal() syscall currently takes on the job of rt_sigqueueinfo(2) and parts of the functionality of kill(2), namely when a positive pid is passed to kill(2). It will however be possible to also replace tgkill(2) and rt_tgsigqueueinfo(2) if this syscall is extended. /* sending signals to threads (tid) and process groups (pgid) */ Specifically, the pidfd_send_signal() syscall does currently not operate on process groups or threads. This is left for future extensions. In order to extend the syscall to allow sending signals to threads and process groups appropriately named flags (e.g. 
PIDFD_TYPE_PGID, and PIDFD_TYPE_TID) should be added. This implies that the flags argument will determine what is signaled and not the file descriptor itself. Put in other words, grouping in this api is a property of the flags argument not a property of the file descriptor (cf. [13]). Clarification for this has been requested by Eric (cf. [19]). When appropriate extensions through the flags argument are added then pidfd_send_signal() can additionally replace the part of kill(2) which operates on process groups as well as the tgkill(2) and rt_tgsigqueueinfo(2) syscalls. How such an extension could be implemented has been very roughly sketched in [14], [15], and [16]. However, this should not be taken as a commitment to a particular implementation. There might be better ways to do it. Right now this is intentionally left out to keep this patchset as simple as possible (cf. [4]). /* naming */ The syscall had various names throughout iterations of this patchset: - procfd_signal() - procfd_send_signal() - taskfd_send_signal() In the last round of reviews it was pointed out that, given that the flags argument decides the scope of the signal instead of different types of fds, it might make sense to settle for either "procfd_" or "pidfd_" as prefix. The community was willing to accept either (cf. [17] and [18]). Given that one developer expressed strong preference for the "pidfd_" prefix (cf. [13]) and with other developers less opinionated about the name we should settle for "pidfd_" to avoid further bikeshedding. The "_send_signal" suffix was chosen to reflect the fact that the syscall takes on the job of multiple syscalls. It is therefore intentional that the name is reminiscent of neither kill(2) nor rt_sigqueueinfo(2). 
Not the former because it might imply that pidfd_send_signal() is a replacement for kill(2), and not the latter because it is a hassle to remember the correct spelling - especially for non-native speakers - and because it is not descriptive enough of what the syscall actually does. The name "pidfd_send_signal" makes it very clear that its job is to send signals. /* zombies */ Zombies can be signaled just as any other process. No special error will be reported since a zombie state is an unreliable state (cf. [3]). However, this can be added as an extension through the @flags argument if the need ever arises. /* cross-namespace signals */ The patch currently enforces that the signaler and signalee either are in the same pid namespace or that the signaler's pid namespace is an ancestor of the signalee's pid namespace. This is done for the sake of simplicity and because it is unclear what values certain members of struct siginfo_t would need to be set to (cf. [5], [6]). /* compat syscalls */ It became clear that we would like to avoid adding compat syscalls (cf. [7]). The compat syscall handling is now done in kernel/signal.c itself by adding __copy_siginfo_from_user_generic() which lets us avoid compat syscalls (cf. [8]). It should be noted that the addition of __copy_siginfo_from_user_any() is caused by a bug in the original implementation of rt_sigqueueinfo(2) (cf. [12]). With upcoming rework for syscall handling things might improve significantly (cf. [11]) and __copy_siginfo_from_user_any() will not gain any additional callers. /* testing */ This patch was tested on x64 and x86. /* userspace usage */ An asciinema recording for the basic functionality can be found under [9]. 
With this patch a process can be killed via:
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

static inline int do_pidfd_send_signal(int pidfd, int sig, siginfo_t *info, unsigned int flags)
{
#ifdef __NR_pidfd_send_signal
    return syscall(__NR_pidfd_send_signal, pidfd, sig, info, flags);
#else
    /* main() reports strerror(errno), so set errno instead of returning -ENOSYS */
    errno = ENOSYS;
    return -1;
#endif
}

int main(int argc, char *argv[])
{
    int fd, ret, saved_errno, sig;

    if (argc < 3)
        exit(EXIT_FAILURE);

    /* argv[1] is a /proc/<pid> directory; the open fd acts as the pidfd */
    fd = open(argv[1], O_DIRECTORY | O_CLOEXEC);
    if (fd < 0) {
        printf("%s - Failed to open \"%s\"\n", strerror(errno), argv[1]);
        exit(EXIT_FAILURE);
    }

    sig = atoi(argv[2]);
    printf("Sending signal %d to process %s\n", sig, argv[1]);
    ret = do_pidfd_send_signal(fd, sig, NULL, 0);

    /* close() may clobber errno, so preserve it for the error message */
    saved_errno = errno;
    close(fd);
    errno = saved_errno;

    if (ret < 0) {
        printf("%s - Failed to send signal %d to process %s\n", strerror(errno), sig, argv[1]);
        exit(EXIT_FAILURE);
    }

    exit(EXIT_SUCCESS);
}

/* Q&A
 * Given that it seems the same questions get asked again by people who are
 * late to the party it makes sense to add a Q&A section to the commit
 * message so it's hopefully easier to avoid duplicate threads.
 *
 * For the sake of progress please consider these arguments settled unless
 * there is a new point that desperately needs to be addressed. Please make
 * sure to check the links to the threads in this commit message whether
 * this has not already been covered.
 */
Q-01: (Florian Weimer [20], Andrew Morton [21]) What happens when the target process has exited?
A-01: Sending the signal will fail with ESRCH (cf. [22]).
Q-02: (Andrew Morton [21]) Is the task_struct pinned by the fd?
A-02: No. A reference to struct pid is kept. struct pid - as far as I understand - was created exactly for the reason to not require to pin struct task_struct (cf. [22]).
Q-03: (Andrew Morton [21]) Does the entire procfs directory remain visible? Just one entry within it?
A-03: The same thing that happens right now when you hold a file descriptor to /proc/<pid> open (cf. [22]).
Q-04: (Andrew Morton [21]) Does the pid remain reserved?
A-04: No. This patchset guarantees a stable handle not that pids are not recycled (cf. [22]).
Q-05: (Andrew Morton [21]) Do attempts to signal that fd return errors?
A-05: See {Q,A}-01.
Q-06: (Andrew Morton [22]) Is there a cleaner way of obtaining the fd? Another syscall perhaps.
A-06: Userspace can already trivially retrieve file descriptors from procfs so this is something that we will need to support anyway. Hence, there's no immediate need to add another syscall just to make pidfd_send_signal() not dependent on the presence of procfs. However, adding a syscall to get such file descriptors is planned for a future patchset (cf. [22]).
Q-07: (Andrew Morton [21] and others) This fd-for-a-process sounds like a handy thing and people may well think up other uses for it in the future, probably unrelated to signals. Are the code and the interface designed to permit such future applications?
A-07: Yes (cf. [22]).
Q-08: (Andrew Morton [21] and others) Now I think about it, why a new syscall? This thing is looking rather like an ioctl?
A-08: This has been extensively discussed. It was agreed that a syscall is preferred for a variety of reasons. Here are just a few taken from prior threads. Syscalls are safer than ioctl()s especially when signaling to fds. Processes are a core kernel concept so a syscall seems more appropriate. The layout of the syscall with its four arguments would require the addition of a custom struct for the ioctl() thereby causing at least the same amount or even more complexity for userspace than a simple syscall. The new syscall will replace multiple other pid-based syscalls (see description above). 
The file-descriptors-for-processes concept introduced with this syscall will be extended with other syscalls in the future. See also [22], [23] and various other threads already linked in here. Q-09: (Florian Weimer [24]) What happens if you use the new interface with an O_PATH descriptor? A-09: pidfds opened as O_PATH fds cannot be used to send signals to a process (cf. [2]). Signaling processes through pidfds is the equivalent of writing to a file. Thus, this is not an operation that operates "purely at the file descriptor level" as required by the open(2) manpage. See also [4]. /* References */ [1]: https://lore.kernel.org/lkml/20181029221037.87724-1-dancol@google.com/ [2]: https://lore.kernel.org/lkml/874lbtjvtd.fsf@oldenburg2.str.redhat.com/ [3]: https://lore.kernel.org/lkml/20181204132604.aspfupwjgjx6fhva@brauner.io/ [4]: https://lore.kernel.org/lkml/20181203180224.fkvw4kajtbvru2ku@brauner.io/ [5]: https://lore.kernel.org/lkml/20181121213946.GA10795@mail.hallyn.com/ [6]: https://lore.kernel.org/lkml/20181120103111.etlqp7zop34v6nv4@brauner.io/ [7]: https://lore.kernel.org/lkml/36323361-90BD-41AF-AB5B-EE0D7BA02C21@amacapital.net/ [8]: https://lore.kernel.org/lkml/87tvjxp8pc.fsf@xmission.com/ [9]: https://asciinema.org/a/IQjuCHew6bnq1cr78yuMv16cy [11]: https://lore.kernel.org/lkml/F53D6D38-3521-4C20-9034-5AF447DF62FF@amacapital.net/ [12]: https://lore.kernel.org/lkml/87zhtjn8ck.fsf@xmission.com/ [13]: https://lore.kernel.org/lkml/871s6u9z6u.fsf@xmission.com/ [14]: https://lore.kernel.org/lkml/20181206231742.xxi4ghn24z4h2qki@brauner.io/ [15]: https://lore.kernel.org/lkml/20181207003124.GA11160@mail.hallyn.com/ [16]: https://lore.kernel.org/lkml/20181207015423.4miorx43l3qhppfz@brauner.io/ [17]: https://lore.kernel.org/lkml/CAGXu5jL8PciZAXvOvCeCU3wKUEB_dU-O3q0tDw4uB_ojMvDEew@mail.gmail.com/ [18]: https://lore.kernel.org/lkml/20181206222746.GB9224@mail.hallyn.com/ [19]: https://lore.kernel.org/lkml/20181208054059.19813-1-christian@brauner.io/ [20]: 
https://lore.kernel.org/lkml/8736rebl9s.fsf@oldenburg.str.redhat.com/ [21]: https://lore.kernel.org/lkml/20181228152012.dbf0508c2508138efc5f2bbe@linux-foundation.org/ [22]: https://lore.kernel.org/lkml/20181228233725.722tdfgijxcssg76@brauner.io/ [23]: https://lwn.net/Articles/773459/ [24]: https://lore.kernel.org/lkml/8736rebl9s.fsf@oldenburg.str.redhat.com/ [25]: https://lore.kernel.org/lkml/CAK8P3a0ej9NcJM8wXNPbcGUyOUZYX+VLoDFdbenW3s3114oQZw@mail.gmail.com/ Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Jann Horn <jannh@google.com> Cc: Andy Lutomirsky <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Florian Weimer <fweimer@redhat.com> Signed-off-by: Christian Brauner <christian@brauner.io> Reviewed-by: Tycho Andersen <tycho@tycho.ws> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Serge Hallyn <serge@hallyn.com> Acked-by: Aleksa Sarai <cyphar@cyphar.com> (cherry picked from commit 3eb39f47934f9d5a3027fe00d906a45fe3a15fad) Conflicts: include/linux/proc_fs.h - trivial manual merge include/uapi/asm-generic/unistd.h - trivial manual merge kernel/signal.c (1. manual merges because of 4.19 differences 2. change prepare_kill_siginfo() to use struct siginfo instead of kernel_siginfo 3. change copy_siginfo_from_user_any() to use struct siginfo instead of kernel_siginfo 4. change pidfd_send_signal() to use struct siginfo instead of kernel_siginfo 5. use copy_from_user() instead of copy_siginfo_from_user() in copy_siginfo_from_user_any()) Bug: 135608568 Test: test program using syscall(__NR_pidfd_send_signal,..) to send SIGKILL Change-Id: I24e6298ecf036d1822f3fa6c5286984b4e195c16 Signed-off-by: Suren Baghdasaryan <surenb@google.com> |
||
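The answers about exited targets (A-01) and zombies in the commit above can be checked with a small probe such as the one below. It is only a sketch: `probe_pidfd` and `probe_exited_child` are hypothetical helpers, the pidfd is a /proc/<pid> directory fd as in the commit's own example, and the fallback syscall number is an assumption for older headers. Signal 0 is used so that only the existence and permission checks run.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef __NR_pidfd_send_signal
#define __NR_pidfd_send_signal 424 /* number quoted in the commit message */
#endif

/* Send signal 0 through fd; return 0 on success or the errno on failure. */
static int probe_pidfd(int fd)
{
    return syscall(__NR_pidfd_send_signal, fd, 0, NULL, 0) < 0 ? errno : 0;
}

/*
 * Probe a child once while it is an unreaped zombie and once after it has
 * been reaped. Returns 0 if both probes ran, -1 if setup failed.
 */
static int probe_exited_child(int *zombie_err, int *reaped_err)
{
    char path[64];
    siginfo_t si;
    int fd;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(0);

    /* The /proc entry exists until the child is reaped. */
    snprintf(path, sizeof(path), "/proc/%d", (int)pid);
    fd = open(path, O_DIRECTORY | O_CLOEXEC);
    if (fd < 0) {
        waitpid(pid, NULL, 0);
        return -1;
    }

    /* Wait until the child has exited, but leave it unreaped (WNOWAIT). */
    if (waitid(P_PID, (id_t)pid, &si, WEXITED | WNOWAIT) < 0) {
        close(fd);
        waitpid(pid, NULL, 0);
        return -1;
    }
    *zombie_err = probe_pidfd(fd);

    /* Reap the child; the fd now refers to a pid with no task attached. */
    waitpid(pid, NULL, 0);
    *reaped_err = probe_pidfd(fd);

    close(fd);
    return 0;
}
```

On a kernel that supports the syscall, the commit message suggests the zombie probe should report no error and the post-reap probe should report ESRCH, even if the numeric pid has since been handed to another process.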
Greg Kroah-Hartman
|
487e61785a |
This is the 4.19.66 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl1NlswACgkQONu9yGCS aT6bJBAAhlElptL5xPWWrkZdYBVyfcat3VaeEmtO7ilHxyvhFMFOE8zIYEg/NU5s ySM1yqdYvZY1ANawzIEH3es7eXy++GBBIVcRUPBuEVG/yv4/XyeX6MQYlDn3LuzI KsTAkkkqy4LLieqZTF5cqavu1EuUkQWkPxKW1ps9O5Dv24FRmKhj+EiIoEGt43Ln 6d8vChzS+ZWNhWMJd4zP8x5ZXjUUTYlBxveRQQ9yatTbzrQdfkxyD7k6+1K8se8T m9koOVJC9LtbR7W2WbVVq4L6ik3l3LQTr592kZVKiwyVgJtlul3LvqS0hNXcQxSj pwfbz+4b8UP4aRgRbMxzSRLOkb+hUOvYR+CuGLKx827b9FQcLNAbseRsA0MlPENF jJJUQSDollhx4knbYU+8Y2V1WtDi7Dnjt5gRvCvQ6rdrJMp1mFdqRGxfsdBaR10N kU+LaSfIheEvuRoTv535NpxvPQhmEqY6NTGwMYDP1xlvu4vQbWF+HAMhCLmTWrif i5ITtxYxSZOHNy7lvR7dNTMWsLoYg0KvG/oqJo8kSLmlBPcga+VK1YlPsc6JrfEG d4ft7rMz/Cx2oeUQfiURqi/XPwyIBaU6MbovLuQkVx9CJcwgYcmdFD0HRHZIEwgk z0q965POxm3DUoJF8HjBFNUOmlaYn9JZBmrwg5aFgwCy5Kot8g0= =aeth -----END PGP SIGNATURE----- Merge 4.19.66 into android-4.19 Changes in 4.19.66 scsi: fcoe: Embed fc_rport_priv in fcoe_rport structure gcc-9: don't warn about uninitialized variable driver core: Establish order of operations for device_add and device_del via bitflag drivers/base: Introduce kill_device() libnvdimm/bus: Prevent duplicate device_unregister() calls libnvdimm/region: Register badblocks before namespaces libnvdimm/bus: Prepare the nd_ioctl() path to be re-entrant libnvdimm/bus: Fix wait_nvdimm_bus_probe_idle() ABBA deadlock HID: wacom: fix bit shift for Cintiq Companion 2 HID: Add quirk for HP X1200 PIXART OEM mouse IB: directly cast the sockaddr union to aockaddr atm: iphase: Fix Spectre v1 vulnerability bnx2x: Disable multi-cos feature. 
ife: error out when nla attributes are empty ip6_gre: reload ipv6h in prepare_ip6gre_xmit_ipv6 ip6_tunnel: fix possible use-after-free on xmit ipip: validate header length in ipip_tunnel_xmit mlxsw: spectrum: Fix error path in mlxsw_sp_module_init() mvpp2: fix panic on module removal mvpp2: refactor MTU change code net: bridge: delete local fdb on device init failure net: bridge: mcast: don't delete permanent entries when fast leave is enabled net: fix ifindex collision during namespace removal net/mlx5e: always initialize frag->last_in_page net/mlx5: Use reversed order when unregister devices net: phylink: Fix flow control for fixed-link net: qualcomm: rmnet: Fix incorrect UL checksum offload logic net: sched: Fix a possible null-pointer dereference in dequeue_func() net sched: update vlan action for batched events operations net: sched: use temporary variable for actions indexes net/smc: do not schedule tx_work in SMC_CLOSED state NFC: nfcmrvl: fix gpio-handling regression ocelot: Cancel delayed work before wq destruction tipc: compat: allow tipc commands without arguments tun: mark small packets as owned by the tap sock net/mlx5: Fix modify_cq_in alignment net/mlx5e: Prevent encap flow counter update async to user query r8169: don't use MSI before RTL8168d compat_ioctl: pppoe: fix PPPOEIOCSFWD handling cgroup: Call cgroup_release() before __exit_signal() cgroup: Implement css_task_iter_skip() cgroup: Include dying leaders with live threads in PROCS iterations cgroup: css_task_iter_skip()'d iterators must be advanced before accessed cgroup: Fix css_task_iter_advance_css_set() cset skip condition spi: bcm2835: Fix 3-wire mode if DMA is enabled Linux 4.19.66 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Id33ce169af8bf14a3791040b4cf923832ce84f6c |
||
Tejun Heo
|
ebda41dd17 |
cgroup: Fix css_task_iter_advance_css_set() cset skip condition
commit c596687a008b579c503afb7a64fcacc7270fae9e upstream. While adding handling for dying task group leaders c03cd7738a83 ("cgroup: Include dying leaders with live threads in PROCS iterations") added an inverted cset skip condition to css_task_iter_advance_css_set(). It should skip cset if it's completely empty but was incorrectly testing for the inverse condition for the dying_tasks list. Fix it. Signed-off-by: Tejun Heo <tj@kernel.org> Fixes: c03cd7738a83 ("cgroup: Include dying leaders with live threads in PROCS iterations") Reported-by: syzbot+d4bba5ccd4f9a2a68681@syzkaller.appspotmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
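The inverted condition described in the commit above can be illustrated with a toy model. This is not the kernel code: `struct toy_cset` and both predicates are hypothetical simplifications that replace the kernel's lists with counters, written from my reading of the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct css_set. */
struct toy_cset {
    size_t nr_live_tasks;  /* models the cset being populated */
    size_t nr_dying_tasks; /* models the cset's dying_tasks list */
};

/* Inverted test: skips unpopulated csets that still hold dying tasks. */
static bool skip_cset_buggy(const struct toy_cset *cset)
{
    return cset->nr_live_tasks == 0 && cset->nr_dying_tasks != 0;
}

/* Intended test: skip a cset only when it is completely empty. */
static bool skip_cset_fixed(const struct toy_cset *cset)
{
    return cset->nr_live_tasks == 0 && cset->nr_dying_tasks == 0;
}
```

With the buggy predicate an entirely empty cset is not skipped, while a cset holding only dying tasks is, which is the opposite of what the iterator needs.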
Tejun Heo
|
0a9abd2778 |
cgroup: css_task_iter_skip()'d iterators must be advanced before accessed
commit cee0c33c546a93957a52ae9ab6bebadbee765ec5 upstream. b636fd38dc40 ("cgroup: Implement css_task_iter_skip()") introduced css_task_iter_skip() which is used to fix task iterations skipping dying threadgroup leaders with live threads. Skipping is implemented as a subportion of full advancing but css_task_iter_next() forgot to fully advance a skipped iterator before determining the next task to visit causing it to return invalid task pointers. Fix it by making css_task_iter_next() fully advance the iterator if it has been skipped since the previous iteration. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: syzbot Link: http://lkml.kernel.org/r/00000000000097025d058a7fd785@google.com Fixes: b636fd38dc40 ("cgroup: Implement css_task_iter_skip()") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
Tejun Heo
|
4340d175b8 |
cgroup: Include dying leaders with live threads in PROCS iterations
commit c03cd7738a83b13739f00546166969342c8ff014 upstream. CSS_TASK_ITER_PROCS currently iterates live group leaders; however, this means that a process with dying leader and live threads will be skipped. IOW, cgroup.procs might be empty while cgroup.threads isn't, which is confusing to say the least. Fix it by making cset track dying tasks and include dying leaders with live threads in PROCS iteration. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-and-tested-by: Topi Miettinen <toiwoton@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
Tejun Heo
|
370b9e6399 |
cgroup: Implement css_task_iter_skip()
commit b636fd38dc40113f853337a7d2a6885ad23b8811 upstream. When a task is moved out of a cset, task iterators pointing to the task are advanced using the normal css_task_iter_advance() call. This is fine but we'll be tracking dying tasks on csets and thus moving tasks from cset->tasks to (to be added) cset->dying_tasks. When we remove a task from cset->tasks, if we advance the iterators, they may move over to the next cset before we had the chance to add the task back on the dying list, which can allow the task to escape iteration. This patch separates out skipping from advancing. Skipping only moves the affected iterators to the next pointer rather than fully advancing it and the following advancing will recognize that the cursor has already been moved forward and do the rest of advancing. This ensures that when a task moves from one list to another in its cset, as long as it moves in the right direction, it's always visible to iteration. This doesn't cause any visible behavior changes. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
Tejun Heo
|
7528e95b75 |
cgroup: Call cgroup_release() before __exit_signal()
commit 6b115bf58e6f013ca75e7115aabcbd56c20ff31d upstream. cgroup_release() calls cgroup_subsys->release() which is used by the pids controller to uncharge its pid. We want to use it to manage iteration of dying tasks which requires putting it before __unhash_process(). Move cgroup_release() above __exit_signal(). While this makes it uncharge before the pid is freed, pid is RCU freed anyway and the window is very narrow. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
Greg Kroah-Hartman
|
de4c70d6a9 |
This is the 4.19.65 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl1Js7MACgkQONu9yGCS aT4PQxAAo7xa4kYvDxc1RjUY/yIlp6lQ3rpYAAfZB0t8vN+dqivnJZ7m6JHeWX1Y CMcxg85zxLVFeuiXdP821Zj68AB5zqlWMhX0bXm2lhw/Eo9+XHzXtnrLZHhz0/Xd M5cmfIPmoyPCUQQfzSfUMvch+ZpwzEt5op5pUfSjckSpjHQZ0HFj1WJ4D8Hn9jAJ y4+DAKDZgtqhb3GvpS6MoVnBJgcPk9+mBiDkSb12L392+FvHqfeBi3tDRhvyiZAO iJrk747SPds7NlNmuRnj7YyUSDhBzaceRCz0Jsv9FT5EKXoPErXdsL3Bkfa9TREM pH0OaMgNr6WSXLO9qIMcfxMeaKVIvIbotqBTkBTzhEAGPkHA75dhi0lpixXXFExg MaqhLfmHO0dOEr9FrvYGe7f2wUA1Rdw/qRTM3KPEKmHxMqBS7eufIWMHwie1n9Oe cYoP6UkxUIvhUyFV2BlMRFdMfaDbtR0iqy8Dqh36NISD6PAYaUGSoVeSO1fEg4Jy 5GgrKPg6rcz2XNY2cVbsm2zLpqY4dY58SFK9ORfuULdKUQvScvFGrdSSW0CgX+uc F/5NmPutUoboHVxFraDPx7yo46pHf1RW0Me4xZ0aJ3e9ituLAN4fmJ9u46nofb5M thPelQlMVt30O41uViJ0ADkOjCsiBr3AxOFvc76Ct9Q/BJVxhLk= =JVBv -----END PGP SIGNATURE----- Merge 4.19.65 into android-4.19 Changes in 4.19.65 ARM: riscpc: fix DMA ARM: dts: rockchip: Make rk3288-veyron-minnie run at hs200 ARM: dts: rockchip: Make rk3288-veyron-mickey's emmc work again ARM: dts: rockchip: Mark that the rk3288 timer might stop in suspend ftrace: Enable trampoline when rec count returns back to one dmaengine: tegra-apb: Error out if DMA_PREP_INTERRUPT flag is unset arm64: dts: rockchip: fix isp iommu clocks and power domain kernel/module.c: Only return -EEXIST for modules that have finished loading firmware/psci: psci_checker: Park kthreads before stopping them MIPS: lantiq: Fix bitfield masking dmaengine: rcar-dmac: Reject zero-length slave DMA requests clk: tegra210: fix PLLU and PLLU_OUT1 fs/adfs: super: fix use-after-free bug clk: sprd: Add check for return value of sprd_clk_regmap_init() btrfs: fix minimum number of chunk errors for DUP btrfs: qgroup: Don't hold qgroup_ioctl_lock in btrfs_qgroup_inherit() cifs: Fix a race condition with cifs_echo_request ceph: fix improper use of smp_mb__before_atomic() ceph: return -ERANGE if virtual xattr value didn't fit in buffer ACPI: blacklist: fix clang warning for unused DMI 
table scsi: zfcp: fix GCC compiler warning emitted with -Wmaybe-uninitialized perf version: Fix segfault due to missing OPT_END() x86: kvm: avoid constant-conversion warning ACPI: fix false-positive -Wuninitialized warning be2net: Signal that the device cannot transmit during reconfiguration x86/apic: Silence -Wtype-limits compiler warnings x86: math-emu: Hide clang warnings for 16-bit overflow mm/cma.c: fail if fixed declaration can't be honored lib/test_overflow.c: avoid tainting the kernel and fix wrap size lib/test_string.c: avoid masking memset16/32/64 failures coda: add error handling for fget coda: fix build using bare-metal toolchain uapi linux/coda_psdev.h: move upc_req definition from uapi to kernel side headers drivers/rapidio/devices/rio_mport_cdev.c: NUL terminate some strings ipc/mqueue.c: only perform resource calculation if user valid mlxsw: spectrum_dcb: Configure DSCP map as the last rule is removed xen/pv: Fix a boot up hang revealed by int3 self test x86/kvm: Don't call kvm_spurious_fault() from .fixup x86/paravirt: Fix callee-saved function ELF sizes x86, boot: Remove multiple copy of static function sanitize_boot_params() drm/nouveau: fix memory leak in nouveau_conn_reset() kconfig: Clear "written" flag to avoid data loss kbuild: initialize CLANG_FLAGS correctly in the top Makefile Btrfs: fix incremental send failure after deduplication Btrfs: fix race leading to fs corruption after transaction abort mmc: dw_mmc: Fix occasional hang after tuning on eMMC mmc: meson-mx-sdio: Fix misuse of GENMASK macro gpiolib: fix incorrect IRQ requesting of an active-low lineevent IB/hfi1: Fix Spectre v1 vulnerability mtd: rawnand: micron: handle on-die "ECC-off" devices correctly selinux: fix memory leak in policydb_init() ALSA: hda: Fix 1-minute detection delay when i915 module is not available mm: vmscan: check if mem cgroup is disabled or not before calling memcg slab shrinker s390/dasd: fix endless loop after read unit address configuration cgroup: 
kselftest: relax fs_spec checks parisc: Fix build of compressed kernel even with debug enabled drivers/perf: arm_pmu: Fix failure path in PM notifier arm64: compat: Allow single-byte watchpoints on all addresses arm64: cpufeature: Fix feature comparison for CTR_EL0.{CWG,ERG} nbd: replace kill_bdev() with __invalidate_device() again xen/swiotlb: fix condition for calling xen_destroy_contiguous_region() IB/mlx5: Fix unreg_umr to ignore the mkey state IB/mlx5: Use direct mkey destroy command upon UMR unreg failure IB/mlx5: Move MRs to a kernel PD when freeing them to the MR cache IB/mlx5: Fix clean_mr() to work in the expected order IB/mlx5: Fix RSS Toeplitz setup to be aligned with the HW specification IB/hfi1: Check for error on call to alloc_rsm_map_table drm/i915/gvt: fix incorrect cache entry for guest page mapping eeprom: at24: make spd world-readable again ARC: enable uboot support unconditionally objtool: Support GCC 9 cold subfunction naming scheme gcc-9: properly declare the {pv,hv}clock_page storage x86/vdso: Prevent segfaults due to hoisted vclock reads scsi: mpt3sas: Use 63-bit DMA addressing on SAS35 HBA x86/cpufeatures: Carve out CQM features retrieval x86/cpufeatures: Combine word 11 and 12 into a new scattered features word x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations x86/speculation: Enable Spectre v1 swapgs mitigations x86/entry/64: Use JMP instead of JMPQ x86/speculation/swapgs: Exclude ATOMs from speculation through SWAPGS Documentation: Add swapgs description to the Spectre v1 documentation Linux 4.19.65 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Iceeabdb164657e0a616db618e6aa8445d56b0dc1 |
||
Prarit Bhargava
|
09ec6c6783 |
kernel/module.c: Only return -EEXIST for modules that have finished loading
[ Upstream commit 6e6de3dee51a439f76eb73c22ae2ffd2c9384712 ] Microsoft HyperV disables the X86_FEATURE_SMCA bit on AMD systems, and Linux guests boot with repeated errors: amd64_edac_mod: Unknown symbol amd_unregister_ecc_decoder (err -2) amd64_edac_mod: Unknown symbol amd_register_ecc_decoder (err -2) amd64_edac_mod: Unknown symbol amd_report_gart_errors (err -2) amd64_edac_mod: Unknown symbol amd_unregister_ecc_decoder (err -2) amd64_edac_mod: Unknown symbol amd_register_ecc_decoder (err -2) amd64_edac_mod: Unknown symbol amd_report_gart_errors (err -2) The warnings occur because the module code erroneously returns -EEXIST for modules that have failed to load and are in the process of being removed from the module list. Module amd64_edac_mod has a dependency on module edac_mce_amd. Using modules.dep, systemd will load edac_mce_amd for every request of amd64_edac_mod. When the edac_mce_amd module loads, the module has state MODULE_STATE_UNFORMED; once the module load fails, the state becomes MODULE_STATE_GOING. Another request for the edac_mce_amd module executes and add_unformed_module() will erroneously return -EEXIST even though the previous instance of edac_mce_amd has MODULE_STATE_GOING. Upon receiving -EEXIST, systemd attempts to load amd64_edac_mod, which fails because of unknown symbols from edac_mce_amd. add_unformed_module() must wait before returning in any case other than MODULE_STATE_LIVE to prevent a race between multiple loads of dependent modules. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Barret Rhoden <brho@google.com> Cc: David Arcari <darcari@redhat.com> Cc: Jessica Yu <jeyu@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Jessica Yu <jeyu@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Cheng Jian
|
f486088d38 |
ftrace: Enable trampoline when rec count returns back to one
[ Upstream commit a124692b698b00026a58d89831ceda2331b2e1d0 ] Custom trampolines can only be enabled if there is only a single ops attached to it. If there's only a single callback registered to a function, and the ops has a trampoline registered for it, then we can call the trampoline directly. This is very useful for improving the performance of ftrace and livepatch. If more than one callback is registered to a function, the general trampoline is used, and the custom trampoline is not restored back to the direct call even if all the other callbacks were unregistered and we are back to one callback for the function. To fix this, set FTRACE_FL_TRAMP flag if rec count is decremented to one, and the ops that left has a trampoline. Testing After this patch : insmod livepatch_unshare_files.ko cat /sys/kernel/debug/tracing/enabled_functions unshare_files (1) R I tramp: 0xffffffffc0000000(klp_ftrace_handler+0x0/0xa0) ->ftrace_ops_assist_func+0x0/0xf0 echo unshare_files > /sys/kernel/debug/tracing/set_ftrace_filter echo function > /sys/kernel/debug/tracing/current_tracer cat /sys/kernel/debug/tracing/enabled_functions unshare_files (2) R I ->ftrace_ops_list_func+0x0/0x150 echo nop > /sys/kernel/debug/tracing/current_tracer cat /sys/kernel/debug/tracing/enabled_functions unshare_files (1) R I tramp: 0xffffffffc0000000(klp_ftrace_handler+0x0/0xa0) ->ftrace_ops_assist_func+0x0/0xf0 Link: http://lkml.kernel.org/r/1556969979-111047-1-git-send-email-cj.chengjian@huawei.com Signed-off-by: Cheng Jian <cj.chengjian@huawei.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Greg Kroah-Hartman
|
844ecc4634 |
This is the 4.19.64 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl1GibIACgkQONu9yGCS aT7z2hAAmv8AsH9IG43m7t6zLroJVswr/9594xk7yPBQgcY3/PW2aTFBCFbsdOL4 yXcj2PSwRiq9K6qAJULrvOvncR9fIILHqzWzyXnoaZ30lR/FxaaFmuHZX/5Ix1tB e5EEE/EA49UAEjEDaMLq8g2IvibsReDxmSpnXyBJWoyRAdFIElVnMJ2+zvP/wRhF NKzQj/bj/qecCbis2lUCaVWJFZ6+P/52UbD8lvIwqR3nk2TKsGDcLU6eY3yg4KrB rEHl5T8KIPrkX3KNIEB8EcFREene+rdpZLLVe4fYwf+gOqfiFXSzZZvweauMkplq ehlVHkykvQvlsVM2tjBD379z3C4aasZDuMVNMCbAy2FlruLeBQ7gEn77mCJB9VH5 /n/mlc2yizdoowtARCLWOUMfASpdSbqu2SQ7A/3kwG7l6GrpzKSIU2nQgm+41sUZ QJVtZ3IYsPoYjnU4B3JZzgJnf3M9jcRz/3JegviqhSEbF1gaScJX0cqN8C1idN/v ZAGCJK9S20/EEEsp5jn+bq2grUehvmD4TVDfot4P+5yRYyBIhMFpbM2RpjydOpwy +x8D1Q34LYPFgZfQ0vF62vcSBhMBiJ/7j41rUeo44K+Lg00F3yCOyL6FxK6S8h6j wsD0xLbllMrhV5KRYFizb3QbCHoHYiROIJk76uLvB+Tqq2Jg9VQ= =qIi2 -----END PGP SIGNATURE----- Merge 4.19.64 into android-4.19 Changes in 4.19.64 hv_sock: Add support for delayed close vsock: correct removal of socket from the list NFS: Fix dentry revalidation on NFSv4 lookup NFS: Refactor nfs_lookup_revalidate() NFSv4: Fix lookup revalidate of regular files usb: dwc2: Disable all EP's on disconnect usb: dwc2: Fix disable all EP's on disconnect arm64: compat: Provide definition for COMPAT_SIGMINSTKSZ binder: fix possible UAF when freeing buffer ISDN: hfcsusb: checking idx of ep configuration media: au0828: fix null dereference in error path ath10k: Change the warning message string media: cpia2_usb: first wake up, then free in disconnect media: pvrusb2: use a different format for warnings NFS: Cleanup if nfs_match_client is interrupted media: radio-raremono: change devm_k*alloc to k*alloc iommu/vt-d: Don't queue_iova() if there is no flush queue iommu/iova: Fix compilation error with !CONFIG_IOMMU_IOVA Bluetooth: hci_uart: check for missing tty operations vhost: introduce vhost_exceeds_weight() vhost_net: fix possible infinite loop vhost: vsock: add weight support vhost: scsi: add weight support sched/fair: Don't free p->numa_faults with 
concurrent readers sched/fair: Use RCU accessors consistently for ->numa_group /proc/<pid>/cmdline: remove all the special cases /proc/<pid>/cmdline: add back the setproctitle() special case drivers/pps/pps.c: clear offset flags in PPS_SETPARAMS ioctl Fix allyesconfig output. ceph: hold i_ceph_lock when removing caps for freeing inode block, scsi: Change the preempt-only flag into a counter scsi: core: Avoid that a kernel warning appears during system resume ip_tunnel: allow not to count pkts on tstats by setting skb's dev to NULL Linux 4.19.64 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I3e9055b677bd8ad9d5070307fae0bc765d444e9d |
||
Jann Horn
|
a5a3915f17 |
sched/fair: Use RCU accessors consistently for ->numa_group
commit cb361d8cdef69990f6b4504dc1fd9a594d983c97 upstream.
The old code used RCU annotations and accessors inconsistently for
->numa_group, which can lead to use-after-frees and NULL dereferences.
Let all accesses to ->numa_group use proper RCU helpers to prevent such
issues.
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Fixes:
|
||
Jann Horn
|
48046e092a |
sched/fair: Don't free p->numa_faults with concurrent readers
commit 16d51a590a8ce3befb1308e0e7ab77f3b661af33 upstream.
When going through execve(), zero out the NUMA fault statistics instead of
freeing them.
During execve, the task is reachable through procfs and the scheduler. A
concurrent /proc/*/sched reader can read data from a freed ->numa_faults
allocation (confirmed by KASAN) and write it back to userspace.
I believe that it would also be possible for a use-after-free read to occur
through a race between a NUMA fault and execve(): task_numa_fault() can
lead to task_numa_compare(), which invokes task_weight() on the currently
running task of a different CPU.
Another way to fix this would be to make ->numa_faults RCU-managed or add
extra locking, but it seems easier to wipe the NUMA fault statistics on
execve.
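The "wipe instead of free" idea described above can be sketched in userspace C. This is only an illustrative model, not the kernel patch: `struct task_model`, its fields, and `task_model_reset_numa_stats()` are hypothetical names chosen for this example, and the sketch does not model the concurrent readers themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for the NUMA fields of a task. */
struct task_model {
    unsigned long *numa_faults;      /* counter array; NULL if never allocated */
    size_t nr_numa_faults;           /* number of entries in numa_faults */
    unsigned long total_numa_faults; /* running total across all entries */
};

/*
 * Reset the statistics in place instead of freeing them. The allocation
 * stays valid, so a reader that already holds the pointer sees zeros
 * rather than freed memory.
 */
static void task_model_reset_numa_stats(struct task_model *t)
{
    assert(t != NULL);
    if (t->numa_faults == NULL)
        return;
    memset(t->numa_faults, 0, t->nr_numa_faults * sizeof(*t->numa_faults));
    t->total_numa_faults = 0;
}
```

The point of the design is that the pointer stored in the task never changes across the reset, so no reader can be left holding a dangling reference; the actual free is deferred until the task itself goes away.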
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Fixes:
|
||
Greg Kroah-Hartman
|
75ff56e1a2 |
This is the 4.19.63 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl1BJrAACgkQONu9yGCS aT6MpRAAt2nuozm5Z/MshFAuGFzddAwPoYtkIPSy8BiPHYjf7x0+D5Ew4dz5OihS ElfbA94hMOpvhhXlzBU3ZFJsWZIK78gzV6+LiHyb5R97Jdzj/zT4h40y0kxKw+pS gghnZ6zx+pGSIXm/EsODW2gg98yTrmhBFpUhXpAGoC/71c1vxVlj3jHcuhd778YK NRlj2tFWJGIBpmXrApo1Eg7qQRj4tzbOjthfgANWPq+EP68PgiOaxMDMMp7IUwju KrXEFcXlk5aHvfaJ06FSBRBnn45XMyGXPYV/76HsaqkBNmg1r7o02dZKy/0S84Fn YXoEFxHt6NFOJ52LiO95z7cQ0xeTSYygNCtYXJ70uyDMnrCPXCrKp7DRyka+vDPs RCrcpB1QjcCb3xTL//SPkNWM3oZEW9CawpRFh9bmqbw/h7ZjaUEuwNIJnunISulu 2fvOjUmFWfUVIARiwKFVuIkXzgf3cSLYZTtiDFC5/yBpkGVNnXqyO7YtWZwnrMHq L3DC3pOKuYXMa03KWGEzZoCZEXjtoRhRwCSgV7wbK5o90sZeRj/HC1zXEyDPPD/R 7A1rePTuwlAH3gHCJGhYkmYqULx62ZdvV6IC2N7xNxeTL1Y7OVNBT7ZUqexxY6WC OG1vVxUKNJIvBLYmc6cmQATgR6XH5/B9H2p1YRBLuAQuqHk+jqo= =ZQf3 -----END PGP SIGNATURE----- Merge 4.19.63 into android-4.19 Changes in 4.19.63 hvsock: fix epollout hang from race condition drm/panel: simple: Fix panel_simple_dsi_probe iio: adc: stm32-dfsdm: manage the get_irq error case iio: adc: stm32-dfsdm: missing error case during probe staging: vt6656: use meaningful error code during buffer allocation usb: core: hub: Disable hub-initiated U1/U2 tty: max310x: Fix invalid baudrate divisors calculator pinctrl: rockchip: fix leaked of_node references tty: serial: cpm_uart - fix init when SMC is relocated drm/amd/display: Fill prescale_params->scale for RGB565 drm/amdgpu/sriov: Need to initialize the HDP_NONSURFACE_BAStE drm/amd/display: Disable ABM before destroy ABM struct drm/amdkfd: Fix a potential memory leak drm/amdkfd: Fix sdma queue map issue drm/edid: Fix a missing-check bug in drm_load_edid_firmware() PCI: Return error if cannot probe VF drm/bridge: tc358767: read display_props in get_modes() drm/bridge: sii902x: pixel clock unit is 10kHz instead of 1kHz gpu: host1x: Increase maximum DMA segment size drm/crc-debugfs: User irqsafe spinlock in drm_crtc_add_crc_entry drm/crc-debugfs: Also sprinkle irqrestore over early exits 
memstick: Fix error cleanup path of memstick_init tty/serial: digicolor: Fix digicolor-usart already registered warning tty: serial: msm_serial: avoid system lockup condition serial: 8250: Fix TX interrupt handling condition drm/amd/display: Always allocate initial connector state state drm/virtio: Add memory barriers for capset cache. phy: renesas: rcar-gen2: Fix memory leak at error paths drm/amd/display: fix compilation error powerpc/pseries/mobility: prevent cpu hotplug during DT update drm/rockchip: Properly adjust to a true clock in adjusted_mode serial: imx: fix locking in set_termios() tty: serial_core: Set port active bit in uart_port_activate usb: gadget: Zero ffs_io_data mmc: sdhci: sdhci-pci-o2micro: Check if controller supports 8-bit width powerpc/pci/of: Fix OF flags parsing for 64bit BARs drm/msm: Depopulate platform on probe failure serial: mctrl_gpio: Check if GPIO property exisits before requesting it PCI: sysfs: Ignore lockdep for remove attribute i2c: stm32f7: fix the get_irq error cases kbuild: Add -Werror=unknown-warning-option to CLANG_FLAGS genksyms: Teach parser about 128-bit built-in types PCI: xilinx-nwl: Fix Multi MSI data programming iio: iio-utils: Fix possible incorrect mask calculation powerpc/cacheflush: fix variable set but not used powerpc/xmon: Fix disabling tracing while in xmon recordmcount: Fix spurious mcount entries on powerpc mfd: madera: Add missing of table registration mfd: core: Set fwnode for created devices mfd: arizona: Fix undefined behavior mfd: hi655x-pmic: Fix missing return value check for devm_regmap_init_mmio_clk mm/swap: fix release_pages() when releasing devmap pages um: Silence lockdep complaint about mmap_sem powerpc/4xx/uic: clear pending interrupt after irq type/pol change RDMA/i40iw: Set queue pair state when being queried serial: sh-sci: Terminate TX DMA during buffer flushing serial: sh-sci: Fix TX DMA buffer flushing and workqueue races IB/mlx5: Fixed reporting counters on 2nd port for Dual port RoCE 
powerpc/mm: Handle page table allocation failures IB/ipoib: Add child to parent list only if device initialized arm64: assembler: Switch ESB-instruction with a vanilla nop if !ARM64_HAS_RAS PCI: mobiveil: Fix PCI base address in MEM/IO outbound windows PCI: mobiveil: Fix the Class Code field kallsyms: exclude kasan local symbols on s390 PCI: mobiveil: Initialize Primary/Secondary/Subordinate bus numbers PCI: mobiveil: Use the 1st inbound window for MEM inbound transactions perf test mmap-thread-lookup: Initialize variable to suppress memory sanitizer warning perf stat: Fix use-after-freed pointer detected by the smatch tool perf top: Fix potential NULL pointer dereference detected by the smatch tool perf session: Fix potential NULL pointer dereference found by the smatch tool perf annotate: Fix dereferencing freed memory found by the smatch tool perf hists browser: Fix potential NULL pointer dereference found by the smatch tool RDMA/rxe: Fill in wc byte_len with IB_WC_RECV_RDMA_WITH_IMM PCI: dwc: pci-dra7xx: Fix compilation when !CONFIG_GPIOLIB powerpc/boot: add {get, put}_unaligned_be32 to xz_config.h block: init flush rq ref count to 1 f2fs: avoid out-of-range memory access mailbox: handle failed named mailbox channel request dlm: check if workqueues are NULL before flushing/destroying powerpc/eeh: Handle hugepages in ioremap space block/bio-integrity: fix a memory leak bug sh: prevent warnings when using iounmap mm/kmemleak.c: fix check for softirq context 9p: pass the correct prototype to read_cache_page mm/gup.c: mark undo_dev_pagemap as __maybe_unused mm/gup.c: remove some BUG_ONs from get_gate_page() memcg, fsnotify: no oom-kill for remote memcg charging mm/mmu_notifier: use hlist_add_head_rcu() proc: use down_read_killable mmap_sem for /proc/pid/smaps_rollup proc: use down_read_killable mmap_sem for /proc/pid/pagemap proc: use down_read_killable mmap_sem for /proc/pid/clear_refs proc: use down_read_killable mmap_sem for /proc/pid/map_files cxgb4: reduce 
kernel stack usage in cudbg_collect_mem_region() proc: use down_read_killable mmap_sem for /proc/pid/maps locking/lockdep: Fix lock used or unused stats error mm: use down_read_killable for locking mmap_sem in access_remote_vm locking/lockdep: Hide unused 'class' variable usb: wusbcore: fix unbalanced get/put cluster_id usb: pci-quirks: Correct AMD PLL quirk detection btrfs: inode: Don't compress if NODATASUM or NODATACOW set x86/sysfb_efi: Add quirks for some devices with swapped width and height x86/speculation/mds: Apply more accurate check on hypervisor platform binder: prevent transactions to context manager from its own process. fpga-manager: altera-ps-spi: Fix build error mei: me: add mule creek canyon (EHL) device ids hpet: Fix division by zero in hpet_time_div() ALSA: ac97: Fix double free of ac97_codec_device ALSA: line6: Fix wrong altsetting for LINE6_PODHD500_1 ALSA: hda - Add a conexant codec entry to let mute led work powerpc/xive: Fix loop exit-condition in xive_find_target_in_mask() powerpc/tm: Fix oops on sigreturn on systems without TM libnvdimm/bus: Stop holding nvdimm_bus_list_mutex over __nd_ioctl() access: avoid the RCU grace period for the temporary subjective credentials Linux 4.19.63 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Ic31529aa6fd283d16d6bfb182187a9402a4db44f |
||
Linus Torvalds
|
408af82309 |
access: avoid the RCU grace period for the temporary subjective credentials
commit d7852fbd0f0423937fa287a598bfde188bb68c22 upstream. It turns out that 'access()' (and 'faccessat()') can cause a lot of RCU work because it installs a temporary credential that gets allocated and freed for each system call. The allocation and freeing overhead is mostly benign, but because credentials can be accessed under the RCU read lock, the freeing involves an RCU grace period. Which is not a huge deal normally, but if you have a lot of access() calls, this causes a fair amount of secondary damage: instead of having a nice alloc/free pattern that hits in hot per-CPU slab caches, you have all those delayed free's, and on big machines with hundreds of cores, the RCU overhead can end up being enormous. But it turns out that all of this is entirely unnecessary. Exactly because access() only installs the credential as the thread-local subjective credential, the temporary cred pointer doesn't actually need to be RCU free'd at all. Once we're done using it, we can just free it synchronously and avoid all the RCU overhead. So add a 'non_rcu' flag to 'struct cred', which can be set by users that know they only use it in non-RCU context (there are other potential users for this). We can make it a union with the rcu freeing list head that we need for the RCU case, so this doesn't need any extra storage. Note that this also makes 'get_current_cred()' clear the new non_rcu flag, in case we have filesystems that take a long-term reference to the cred and then expect the RCU delayed freeing afterwards. It's not entirely clear that this is required, but it makes for clear semantics: the subjective cred remains non-RCU as long as you only access it synchronously using the thread-local accessors, but you _can_ use it as a generic cred if you want to. It is possible that we should just remove the whole RCU markings for ->cred entirely. 
Only ->real_cred is really supposed to be accessed through RCU, and the long-term cred copies that nfs uses might want to explicitly re-enable RCU freeing if required, rather than have get_current_cred() do it implicitly. But this is a "minimal semantic changes" change for the immediate problem. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Paul E. McKenney <paulmck@linux.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Jan Glauber <jglauber@marvell.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Jayachandran Chandrasekharan Nair <jnair@marvell.com> Cc: Greg KH <greg@kroah.com> Cc: Kees Cook <keescook@chromium.org> Cc: David Howells <dhowells@redhat.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
Arnd Bergmann
|
148959cc64 |
locking/lockdep: Hide unused 'class' variable
[ Upstream commit 68037aa78208f34bda4e5cd76c357f718b838cbb ] The usage is now hidden in an #ifdef, so we need to move the variable itself in there as well to avoid this warning: kernel/locking/lockdep_proc.c:203:21: error: unused variable 'class' [-Werror,-Wunused-variable] Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qian Cai <cai@lca.pw> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Waiman Long <longman@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Yuyang Du <duyuyang@gmail.com> Cc: frederic@kernel.org Fixes: 68d41d8c94a3 ("locking/lockdep: Fix lock used or unused stats error") Link: https://lkml.kernel.org/r/20190715092809.736834-1-arnd@arndb.de Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Yuyang Du
|
4acb04ef5e |
locking/lockdep: Fix lock used or unused stats error
[ Upstream commit 68d41d8c94a31dfb8233ab90b9baf41a2ed2da68 ] The stats variable nr_unused_locks is incremented every time a new lock class is registered and decremented when the lock is first used in __lock_acquire(). It is then shown and checked in lockdep_stats. However, under configurations in which either CONFIG_TRACE_IRQFLAGS or CONFIG_PROVE_LOCKING is not defined: The commit: 091806515124b20 ("locking/lockdep: Consolidate lock usage bit initialization") missed marking the LOCK_USED flag at IRQ usage initialization because mark_usage() is not called. And the commit: 886532aee3cd42d ("locking/lockdep: Move mark_lock() inside CONFIG_TRACE_IRQFLAGS && CONFIG_PROVE_LOCKING") further made mark_lock() not defined such that the LOCK_USED cannot be marked at all when the lock is first acquired. As a result, we fix this by not showing and checking the stats under such configurations for lockdep_stats. Reported-by: Qian Cai <cai@lca.pw> Signed-off-by: Yuyang Du <duyuyang@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: arnd@arndb.de Cc: frederic@kernel.org Link: https://lkml.kernel.org/r/20190709101522.9117-1-duyuyang@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Greg Kroah-Hartman
|
f232ce65ba |
This is the 4.19.62 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl09QMoACgkQONu9yGCS aT7bBQ//dApu9DLQ3tyfc+tucIc3ViOcif3zlfLWRqClcLXw1HF63/Q/u40kYuf1 TOnMkVXFWmpeGPgz2/tWw9TN+ibHaPBfyMoPGw/Ehs5dB8qn/F4AnLqkePFimIYQ +YOZTbvPqW/xSh8I79i6y2NxrZLGvO9FDY9YUblBKz4lKWc+HULXn9RmjtxlBOLE s6epoM0tWKPPeM5MyJIwT+9MZ0o3Cj89YNv+PqYLQSzcvbUbHDUiMt+DO4rqSyyf 6fQe4mSX4tk+tbviyEruvKH3kJg0rWS1Maw1/h+AOJ8Gmr2WZ+vYBorPEdwdYu5K eootc3Jkj4wtcPsvKtNzWVPqJxdd5j651vS8/bxA0DmOVQM7A8dUBWKtMJGGG7nM nIRvxoMo+kv9QE/DJpeQ/apE9WnBflSqw6/DYdqs11gTX4E+beR75aRoVD1ue0lS if5CfnM0BTiOO1rMdMo42rzcBfh2DLt5a18WXsMmYOaSMGEJuF0KjgaGc5E4N00A QlqIJ7PKk+Kb53jz5oLV2vB/SXNvAQRLAvMTMH2Mst4veDonYyiVTfp6C+r8DJYc hvyaF1FoCMTWV5XsINmM2mlJ2+/G0nV5kLDmVPUHiFxE0JXnPQ8iCx9a96Ub+Zim F//muwohNIUa6EEN6AwrEUsoyVZ7cP/aR5wHQ9PAY3RV5nis474= =zWcX -----END PGP SIGNATURE----- Merge 4.19.62 into android-4.19 Changes in 4.19.62 bnx2x: Prevent load reordering in tx completion processing caif-hsi: fix possible deadlock in cfhsi_exit_module() hv_netvsc: Fix extra rcu_read_unlock in netvsc_recv_callback() igmp: fix memory leak in igmpv3_del_delrec() ipv4: don't set IPv6 only flags to IPv4 addresses ipv6: rt6_check should return NULL if 'from' is NULL ipv6: Unlink sibling route in case of failure net: bcmgenet: use promisc for unsupported filters net: dsa: mv88e6xxx: wait after reset deactivation net: make skb_dst_force return true when dst is refcounted net: neigh: fix multiple neigh timer scheduling net: openvswitch: fix csum updates for MPLS actions net: phy: sfp: hwmon: Fix scaling of RX power net: stmmac: Re-work the queue selection for TSO packets nfc: fix potential illegal memory access r8169: fix issue with confused RX unit after PHY power-down on RTL8411b rxrpc: Fix send on a connected, but unbound socket sctp: fix error handling on stream scheduler initialization sky2: Disable MSI on ASUS P6T tcp: be more careful in tcp_fragment() tcp: fix tcp_set_congestion_control() use from bpf hook tcp: Reset bytes_acked and 
bytes_received when disconnecting vrf: make sure skb->data contains ip header to make routing net/mlx5e: IPoIB, Add error path in mlx5_rdma_setup_rn macsec: fix use-after-free of skb during RX macsec: fix checksumming after decryption netrom: fix a memory leak in nr_rx_frame() netrom: hold sock when setting skb->destructor net_sched: unset TCQ_F_CAN_BYPASS when adding filters net/tls: make sure offload also gets the keys wiped sctp: not bind the socket in sctp_connect net: bridge: mcast: fix stale nsrcs pointer in igmp3/mld2 report handling net: bridge: mcast: fix stale ipv6 hdr pointer when handling v6 query net: bridge: don't cache ether dest pointer on input net: bridge: stp: don't cache eth dest pointer before skb pull dma-buf: balance refcount inbalance dma-buf: Discard old fence_excl on retrying get_fences_rcu for realloc gpio: davinci: silence error prints in case of EPROBE_DEFER MIPS: lb60: Fix pin mappings perf/core: Fix exclusive events' grouping perf/core: Fix race between close() and fork() ext4: don't allow any modifications to an immutable file ext4: enforce the immutable flag on open files mm: add filemap_fdatawait_range_keep_errors() jbd2: introduce jbd2_inode dirty range scoping ext4: use jbd2_inode dirty range scoping ext4: allow directory holes KVM: nVMX: do not use dangling shadow VMCS after guest reset KVM: nVMX: Clear pending KVM_REQ_GET_VMCS12_PAGES when leaving nested mm: vmscan: scan anonymous pages on file refaults net: sched: verify that q!=NULL before setting q->flags Linux 4.19.62 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I2eb23bda9d5294a5c874fe4f403934fd99e84661 |
||
Peter Zijlstra
|
4a5cc64d8a |
perf/core: Fix race between close() and fork()
commit 1cf8dfe8a661f0462925df943140e9f6d1ea5233 upstream. Syzkaller reported the following Use-after-Free bug: close() clone() copy_process() perf_event_init_task() perf_event_init_context() mutex_lock(parent_ctx->mutex) inherit_task_group() inherit_group() inherit_event() mutex_lock(event->child_mutex) // expose event on child list list_add_tail() mutex_unlock(event->child_mutex) mutex_unlock(parent_ctx->mutex) ... goto bad_fork_* bad_fork_cleanup_perf: perf_event_free_task() perf_release() perf_event_release_kernel() list_for_each_entry() mutex_lock(ctx->mutex) mutex_lock(event->child_mutex) // event is from the failing inherit // on the other CPU perf_remove_from_context() list_move() mutex_unlock(event->child_mutex) mutex_unlock(ctx->mutex) mutex_lock(ctx->mutex) list_for_each_entry_safe() // event already stolen mutex_unlock(ctx->mutex) delayed_free_task() free_task() list_for_each_entry_safe() list_del() free_event() _free_event() // and so event->hw.target // is the already freed failed clone() if (event->hw.target) put_task_struct(event->hw.target) // WHOOPSIE, already quite dead Which puts the lie to the comment on perf_event_free_task(): 'unexposed, unused context' not so much. Which is a 'fun' confluence of fail; copy_process() doing an unconditional free_task() and not respecting refcounts, and perf having creative locking. In particular: |
||
Alexander Shishkin
|
75100ec5f0 |
perf/core: Fix exclusive events' grouping
commit 8a58ddae23796c733c5dfbd717538d89d036c5bd upstream.
So far, we tried to disallow grouping exclusive events for the fear of
complications they would cause with moving between contexts. Specifically,
moving a software group to a hardware context would violate the exclusivity
rules if both groups contain matching exclusive events.
This attempt was, however, unsuccessful: the check that we have in the
perf_event_open() syscall is both wrong (looks at wrong PMU) and
insufficient (group leader may still be exclusive), as can be illustrated
by running:
$ perf record -e '{intel_pt//,cycles}' uname
$ perf record -e '{cycles,intel_pt//}' uname
ultimately successfully.
Furthermore, we are completely free to trigger the exclusivity violation
by:
perf -e '{cycles,intel_pt//}' -e '{intel_pt//,instructions}'
even though the helpful perf record will not allow that, the ABI will.
The warning later in the perf_event_open() path will also not trigger, because
it's also wrong.
Fix all this by validating the original group before moving, getting rid
of broken safeguards and placing a useful one to perf_install_in_context().
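The "validate the whole group before moving" step described above can be modeled roughly as follows. This is a simplified userspace sketch under stated assumptions, not the kernel code: `struct pmu_model`, `struct event_model`, and the three helper functions are hypothetical names invented for this example, and the matching rule (same PMU, overlapping CPU, with -1 meaning any CPU) is an assumption about what "matching exclusive events" means.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for a PMU and a perf event. */
struct pmu_model {
    const char *name;
    bool exclusive;   /* models an exclusive PMU such as intel_pt */
};

struct event_model {
    const struct pmu_model *pmu;
    int cpu;          /* -1 means "any CPU" */
};

/* Assumed rule: two events match if they share a PMU and their CPUs overlap. */
static bool events_match(const struct event_model *a,
                         const struct event_model *b)
{
    return a->pmu == b->pmu &&
           (a->cpu == b->cpu || a->cpu == -1 || b->cpu == -1);
}

/* An exclusive event is installable only if no matching event is in the context. */
static bool event_installable(const struct event_model *ev,
                              const struct event_model *ctx, size_t n_ctx)
{
    if (!ev->pmu->exclusive)
        return true;
    for (size_t i = 0; i < n_ctx; i++)
        if (events_match(ev, &ctx[i]))
            return false;
    return true;
}

/* Check every group member, leader included, before any event is moved. */
static bool group_movable(const struct event_model *group, size_t n_group,
                          const struct event_model *ctx, size_t n_ctx)
{
    assert(group != NULL || n_group == 0);
    for (size_t i = 0; i < n_group; i++)
        if (!event_installable(&group[i], ctx, n_ctx))
            return false;
    return true;
}
```

Checking the full group up front, rather than only the event being opened, is what covers the case named in the message where the group leader itself is the exclusive event.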
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: mathieu.poirier@linaro.org
Cc: will.deacon@arm.com
Fixes:
|
||
Greg Kroah-Hartman
|
71ce27c31a |
This is the 4.19.61 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl06qFcACgkQONu9yGCS aT6O9A/+JZqoVYnItpOnT8Hu//0mYEKvREWqsoTJNpZJhLWtGjPTT9ospHNpVgfC GUkFqngWzXHpzCgTYHUV3Mm+SIiVXCM3nkCU1+2YOsPzrKo/lJSfFt3wOYGpKO5V qratAQLra5TqR0teR00aQblqKqfmrux05uL9dNcVIwve813m00jFALcpjrXnanpP tx5cqCo3uHOou5XLraHx/CMPnfJI/mLegBUTM4DxAmN2vG4gQck2gnrU7s1eg4cy 1Fqh0Oo2Ycj5p9yoGss02JqR3wGZHOEmF55j2JcTZAPvW6/c55iPd52Trn8kPOHB Awq/VwJmP4p10a4TWoZpv7VqpL3PzO8/AW7QWOER8QnDzfOTHGae7YT8LVp5Xqj5 1NqowuP/Tm0yaZSaDLqkdvhVqTi0oGL8OCYLErpeR9PQ3P+p3paaswopsPqnXURj Q4Pahe1vm9WG2NpKh2bHVmmVkQmvwuxxxnaa31HI/IyLd5bYFV1/LbEa/XrSK36W VJtO+0AjERO9uTVP/YDloDkQ4R3+3W+m520jYsgf1OwY7v/Kc6iLb7cDwci/ZWMy YSMm8hrO0nzuT0SI25TKLDvxjGbANKvxytzOQMOTb8NsIWwaoEKWh+4r9XkdUXNa +dx72I5J2Be+3hk+eaDNzCdEae5pgVTxBpwJbzI4RfnK1Doa4uE= =hJdd -----END PGP SIGNATURE----- Merge 4.19.61 into android-4.19 Changes in 4.19.61 MIPS: ath79: fix ar933x uart parity mode MIPS: fix build on non-linux hosts arm64/efi: Mark __efistub_stext_offset as an absolute symbol explicitly scsi: iscsi: set auth_protocol back to NULL if CHAP_A value is not supported dmaengine: imx-sdma: fix use-after-free on probe error path wil6210: fix potential out-of-bounds read ath10k: Do not send probe response template for mesh ath9k: Check for errors when reading SREV register ath6kl: add some bounds checking ath10k: add peer id check in ath10k_peer_find_by_id wil6210: fix spurious interrupts in 3-msi ath: DFS JP domain W56 fixed pulse type 3 RADAR detection regmap: debugfs: Fix memory leak in regmap_debugfs_init batman-adv: fix for leaked TVLV handler. 
media: dvb: usb: fix use after free in dvb_usb_device_exit media: spi: IR LED: add missing of table registration crypto: talitos - fix skcipher failure due to wrong output IV media: ov7740: avoid invalid framesize setting media: marvell-ccic: fix DMA s/g desc number calculation media: vpss: fix a potential NULL pointer dereference media: media_device_enum_links32: clean a reserved field net: stmmac: dwmac1000: Clear unused address entries net: stmmac: dwmac4/5: Clear unused address entries qed: Set the doorbell address correctly signal/pid_namespace: Fix reboot_pid_ns to use send_sig not force_sig af_key: fix leaks in key_pol_get_resp and dump_sp. xfrm: Fix xfrm sel prefix length validation fscrypt: clean up some BUG_ON()s in block encryption/decryption perf annotate TUI browser: Do not use member from variable within its own initialization media: mc-device.c: don't memset __user pointer contents media: saa7164: fix remove_proc_entry warning media: staging: media: davinci_vpfe: - Fix for memory leak if decoder initialization fails. net: phy: Check against net_device being NULL crypto: talitos - properly handle split ICV. crypto: talitos - Align SEC1 accesses to 32 bits boundaries. tua6100: Avoid build warnings. 
batman-adv: Fix duplicated OGMs on NETDEV_UP locking/lockdep: Fix merging of hlocks with non-zero references media: wl128x: Fix some error handling in fm_v4l2_init_video_device() net: hns3: set ops to null when unregister ad_dev cpupower : frequency-set -r option misses the last cpu in related cpu list arm64: mm: make CONFIG_ZONE_DMA32 configurable perf jvmti: Address gcc string overflow warning for strncpy() net: stmmac: dwmac4: fix flow control issue net: stmmac: modify default value of tx-frames crypto: inside-secure - do not rely on the hardware last bit for result descriptors net: fec: Do not use netdev messages too early net: axienet: Fix race condition causing TX hang s390/qdio: handle PENDING state for QEBSM devices RAS/CEC: Fix pfn insertion net: sfp: add mutex to prevent concurrent state checks ipset: Fix memory accounting for hash types on resize perf cs-etm: Properly set the value of 'old' and 'head' in snapshot mode perf test 6: Fix missing kvm module load for s390 perf report: Fix OOM error in TUI mode on s390 irqchip/meson-gpio: Add support for Meson-G12A SoC media: uvcvideo: Fix access to uninitialized fields on probe error media: fdp1: Support M3N and E3 platforms iommu: Fix a leak in iommu_insert_resv_region gpio: omap: fix lack of irqstatus_raw0 for OMAP4 gpio: omap: ensure irq is enabled before wakeup regmap: fix bulk writes on paged registers bpf: silence warning messages in core media: s5p-mfc: fix reading min scratch buffer size on MFC v6/v7 selinux: fix empty write to keycreate file x86/cpu: Add Ice Lake NNPI to Intel family ASoC: meson: axg-tdm: fix sample clock inversion rcu: Force inlining of rcu_read_lock() x86/cpufeatures: Add FDP_EXCPTN_ONLY and ZERO_FCS_FDS qed: iWARP - Fix tc for MPA ll2 connection net: hns3: fix for skb leak when doing selftest block: null_blk: fix race condition for null_del_dev blkcg, writeback: dead memcgs shouldn't contribute to writeback ownership arbitration xfrm: fix sa selector validation sched/core: Add 
__sched tag for io_schedule() sched/fair: Fix "runnable_avg_yN_inv" not used warnings perf/x86/intel/uncore: Handle invalid event coding for free-running counter x86/atomic: Fix smp_mb__{before,after}_atomic() perf evsel: Make perf_evsel__name() accept a NULL argument vhost_net: disable zerocopy by default ipoib: correcly show a VF hardware address x86/cacheinfo: Fix a -Wtype-limits warning blk-iolatency: only account submitted bios ACPICA: Clear status of GPEs on first direct enable EDAC/sysfs: Fix memory leak when creating a csrow object nvme: fix possible io failures when removing multipathed ns nvme-pci: properly report state change failure in nvme_reset_work nvme-pci: set the errno on ctrl state change error lightnvm: pblk: fix freeing of merged pages arm64: Do not enable IRQs for ct_user_exit ipsec: select crypto ciphers for xfrm_algo ipvs: defer hook registration to avoid leaks media: s5p-mfc: Make additional clocks optional media: i2c: fix warning same module names ntp: Limit TAI-UTC offset timer_list: Guard procfs specific code acpi/arm64: ignore 5.1 FADTs that are reported as 5.0 media: coda: fix mpeg2 sequence number handling media: coda: fix last buffer handling in V4L2_ENC_CMD_STOP media: coda: increment sequence offset for the last returned frame media: vimc: cap: check v4l2_fill_pixfmt return value media: hdpvr: fix locking and a missing msleep net: stmmac: sun8i: force select external PHY when no internal one rtlwifi: rtl8192cu: fix error handle when usb probe failed mt7601u: do not schedule rx_tasklet when the device has been disconnected x86/build: Add 'set -e' to mkcapflags.sh to delete broken capflags.c mt7601u: fix possible memory leak when the device is disconnected ipvs: fix tinfo memory leak in start_sync_thread ath10k: add missing error handling ath10k: fix PCIE device wake up failed perf tools: Increase MAX_NR_CPUS and MAX_CACHES ASoC: Intel: hdac_hdmi: Set ops to NULL on remove libata: don't request sense data on !ZAC ATA devices 
clocksource/drivers/exynos_mct: Increase priority over ARM arch timer xsk: Properly terminate assignment in xskq_produce_flush_desc rslib: Fix decoding of shortened codes rslib: Fix handling of of caller provided syndrome ixgbe: Check DDM existence in transceiver before access crypto: serpent - mark __serpent_setkey_sbox noinline crypto: asymmetric_keys - select CRYPTO_HASH where needed wil6210: drop old event after wmi_call timeout EDAC: Fix global-out-of-bounds write when setting edac_mc_poll_msec bcache: check CACHE_SET_IO_DISABLE in allocator code bcache: check CACHE_SET_IO_DISABLE bit in bch_journal() bcache: acquire bch_register_lock later in cached_dev_free() bcache: check c->gc_thread by IS_ERR_OR_NULL in cache_set_flush() bcache: fix potential deadlock in cached_def_free() net: hns3: fix a -Wformat-nonliteral compile warning net: hns3: add some error checking in hclge_tm module ath10k: destroy sdio workqueue while remove sdio module net: mvpp2: prs: Don't override the sign bit in SRAM parser shift igb: clear out skb->tstamp after reading the txtime iwlwifi: mvm: Drop large non sta frames bpf: fix uapi bpf_prog_info fields alignment perf stat: Make metric event lookup more robust perf stat: Fix group lookup for metric group bnx2x: Prevent ptp_task to be rescheduled indefinitely net: usb: asix: init MAC address buffers rxrpc: Fix oops in tracepoint bpf, libbpf, smatch: Fix potential NULL pointer dereference selftests: bpf: fix inlines in test_lwt_seg6local bonding: validate ip header before check IPPROTO_IGMP gpiolib: Fix references to gpiod_[gs]et_*value_cansleep() variants tools: bpftool: Fix json dump crash on powerpc Bluetooth: hci_bcsp: Fix memory leak in rx_skb Bluetooth: Add new 13d3:3491 QCA_ROME device Bluetooth: Add new 13d3:3501 QCA_ROME device Bluetooth: 6lowpan: search for destination address in all peers perf tests: Fix record+probe_libc_inet_pton.sh for powerpc64 Bluetooth: Check state in l2cap_disconnect_rsp gtp: add missing 
gtp_encap_disable_sock() in gtp_encap_enable() Bluetooth: validate BLE connection interval updates gtp: fix suspicious RCU usage gtp: fix Illegal context switch in RCU read-side critical section. gtp: fix use-after-free in gtp_encap_destroy() gtp: fix use-after-free in gtp_newlink() net: mvmdio: defer probe of orion-mdio if a clock is not ready iavf: fix dereference of null rx_buffer pointer floppy: fix div-by-zero in setup_format_params floppy: fix out-of-bounds read in next_valid_format floppy: fix invalid pointer dereference in drive_name floppy: fix out-of-bounds read in copy_buffer xen: let alloc_xenballooned_pages() fail if not enough memory free scsi: NCR5380: Reduce goto statements in NCR5380_select() scsi: NCR5380: Always re-enable reselection interrupt Revert "scsi: ncr5380: Increase register polling limit" scsi: core: Fix race on creating sense cache scsi: megaraid_sas: Fix calculation of target ID scsi: mac_scsi: Increase PIO/PDMA transfer length threshold scsi: mac_scsi: Fix pseudo DMA implementation, take 2 crypto: ghash - fix unaligned memory access in ghash_setkey() crypto: ccp - Validate the the error value used to index error messages crypto: arm64/sha1-ce - correct digest for empty data in finup crypto: arm64/sha2-ce - correct digest for empty data in finup crypto: chacha20poly1305 - fix atomic sleep when using async algorithm crypto: crypto4xx - fix AES CTR blocksize value crypto: crypto4xx - fix blocksize for cfb and ofb crypto: crypto4xx - block ciphers should only accept complete blocks crypto: ccp - memset structure fields to zero before reuse crypto: ccp/gcm - use const time tag comparison. 
crypto: crypto4xx - fix a potential double free in ppc4xx_trng_probe Revert "bcache: set CACHE_SET_IO_DISABLE in bch_cached_dev_error()" bcache: Revert "bcache: fix high CPU occupancy during journal" bcache: Revert "bcache: free heap cache_set->flush_btree in bch_journal_free" bcache: ignore read-ahead request failure on backing device bcache: fix mistaken sysfs entry for io_error counter bcache: destroy dc->writeback_write_wq if failed to create dc->writeback_thread Input: gtco - bounds check collection indent level Input: alps - don't handle ALPS cs19 trackpoint-only device Input: synaptics - whitelist Lenovo T580 SMBus intertouch Input: alps - fix a mismatch between a condition check and its comment regulator: s2mps11: Fix buck7 and buck8 wrong voltages arm64: tegra: Update Jetson TX1 GPU regulator timings iwlwifi: pcie: don't service an interrupt that was masked iwlwifi: pcie: fix ALIVE interrupt handling for gen2 devices w/o MSI-X iwlwifi: don't WARN when calling iwl_get_shared_mem_conf with RF-Kill iwlwifi: fix RF-Kill interrupt while FW load for gen2 devices NFSv4: Handle the special Linux file open access mode pnfs/flexfiles: Fix PTR_ERR() dereferences in ff_layout_track_ds_error pNFS: Fix a typo in pnfs_update_layout pnfs: Fix a problem where we gratuitously start doing I/O through the MDS lib/scatterlist: Fix mapping iterator when sg->offset is greater than PAGE_SIZE ASoC: dapm: Adapt for debugfs API change raid5-cache: Need to do start() part job after adding journal device ALSA: seq: Break too long mutex context in the write loop ALSA: hda/realtek - Fixed Headphone Mic can't record on Dell platform ALSA: hda/realtek: apply ALC891 headset fixup to one Dell machine media: v4l2: Test type instead of cfg->type in v4l2_ctrl_new_custom() media: coda: Remove unbalanced and unneeded mutex unlock media: videobuf2-core: Prevent size alignment wrapping buffer size to 0 media: videobuf2-dma-sg: Prevent size from overflowing KVM: x86/vPMU: refine kvm_pmu err msg 
when event creation failed arm64: tegra: Fix AGIC register range fs/proc/proc_sysctl.c: fix the default values of i_uid/i_gid on /proc/sys inodes. kconfig: fix missing choice values in auto.conf drm/nouveau/i2c: Enable i2c pads & busses during preinit padata: use smp_mb in padata_reorder to avoid orphaned padata jobs dm zoned: fix zone state management race xen/events: fix binding user event channels to cpus 9p/xen: Add cleanup path in p9_trans_xen_init 9p/virtio: Add cleanup path in p9_virtio_init x86/boot: Fix memory leak in default_get_smp_config() perf/x86/intel: Fix spurious NMI on fixed counter perf/x86/amd/uncore: Do not set 'ThreadMask' and 'SliceMask' for non-L3 PMCs perf/x86/amd/uncore: Set the thread mask for F17h L3 PMCs drm/edid: parse CEA blocks embedded in DisplayID intel_th: pci: Add Ice Lake NNPI support PCI: hv: Fix a use-after-free bug in hv_eject_device_work() PCI: Do not poll for PME if the device is in D3cold PCI: qcom: Ensure that PERST is asserted for at least 100 ms Btrfs: fix data loss after inode eviction, renaming it, and fsync it Btrfs: fix fsync not persisting dentry deletions due to inode evictions Btrfs: add missing inode version, ctime and mtime updates when punching hole IB/mlx5: Report correctly tag matching rendezvous capability HID: wacom: generic: only switch the mode on devices with LEDs HID: wacom: generic: Correct pad syncing HID: wacom: correct touch resolution x/y typo libnvdimm/pfn: fix fsdax-mode namespace info-block zero-fields coda: pass the host file in vma->vm_file on mmap include/asm-generic/bug.h: fix "cut here" for WARN_ON for __WARN_TAINT architectures xfs: fix pagecache truncation prior to reflink xfs: flush removing page cache in xfs_reflink_remap_prep xfs: don't overflow xattr listent buffer xfs: rename m_inotbt_nores to m_finobt_nores xfs: don't ever put nlink > 0 inodes on the unlinked list xfs: reserve blocks for ifree transaction during log recovery xfs: fix reporting supported extra file attributes for 
statx() xfs: serialize unaligned dio writes against all other dio writes xfs: abort unaligned nowait directio early gpu: ipu-v3: ipu-ic: Fix saturation bit offset in TPMEM crypto: caam - limit output IV to CBC to work around CTR mode DMA issue parisc: Ensure userspace privilege for ptraced processes in regset functions parisc: Fix kernel panic due invalid values in IAOQ0 or IAOQ1 powerpc/32s: fix suspend/resume when IBATs 4-7 are used powerpc/watchpoint: Restore NV GPRs while returning from exception powerpc/powernv/npu: Fix reference leak powerpc/pseries: Fix oops in hotplug memory notifier mmc: sdhci-msm: fix mutex while in spinlock eCryptfs: fix a couple type promotion bugs mtd: rawnand: mtk: Correct low level time calculation of r/w cycle mtd: spinand: read returns badly if the last page has bitflips intel_th: msu: Fix single mode with disabled IOMMU Bluetooth: Add SMP workaround Microsoft Surface Precision Mouse bug usb: Handle USB3 remote wakeup for LPM enabled devices correctly blk-throttle: fix zero wait time for iops throttled group blk-iolatency: clear use_delay when io.latency is set to zero blkcg: update blkcg_print_stat() to handle larger outputs net: mvmdio: allow up to four clocks to be specified for orion-mdio dt-bindings: allow up to four clocks for orion-mdio dm bufio: fix deadlock with loop device Linux 4.19.61 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I2f565111b1c16f369fa86e0481527fcc6357fe1b |
||
Daniel Jordan
|
1e4247d795 |
padata: use smp_mb in padata_reorder to avoid orphaned padata jobs
commit cf144f81a99d1a3928f90b0936accfd3f45c9a0a upstream.
Testing padata with the tcrypt module on a 5.2 kernel...
# modprobe tcrypt alg="pcrypt(rfc4106(gcm(aes)))" type=3
# modprobe tcrypt mode=211 sec=1
...produces this splat:
INFO: task modprobe:10075 blocked for more than 120 seconds.
Not tainted 5.2.0-base+ #16
modprobe D 0 10075 10064 0x80004080
Call Trace:
? __schedule+0x4dd/0x610
? ring_buffer_unlock_commit+0x23/0x100
schedule+0x6c/0x90
schedule_timeout+0x3b/0x320
? trace_buffer_unlock_commit_regs+0x4f/0x1f0
wait_for_common+0x160/0x1a0
? wake_up_q+0x80/0x80
{ crypto_wait_req } # entries in braces added by hand
{ do_one_aead_op }
{ test_aead_jiffies }
test_aead_speed.constprop.17+0x681/0xf30 [tcrypt]
do_test+0x4053/0x6a2b [tcrypt]
? 0xffffffffa00f4000
tcrypt_mod_init+0x50/0x1000 [tcrypt]
...
The second modprobe command never finishes because in padata_reorder,
CPU0's load of reorder_objects is executed before the unlocking store in
spin_unlock_bh(pd->lock), causing CPU0 to miss CPU1's increment:
CPU0                                 CPU1
padata_reorder                       padata_do_serial
  LOAD reorder_objects  // 0
                                       INC reorder_objects  // 1
                                       padata_reorder
                                         TRYLOCK pd->lock   // failed
  UNLOCK pd->lock
CPU0 deletes the timer before returning from padata_reorder and since no
other job is submitted to padata, modprobe waits indefinitely.
Add a pair of full barriers to guarantee proper ordering:
CPU0                                 CPU1
padata_reorder                       padata_do_serial
  UNLOCK pd->lock
  smp_mb()
  LOAD reorder_objects
                                       INC reorder_objects
                                       smp_mb__after_atomic()
                                       padata_reorder
                                         TRYLOCK pd->lock
smp_mb__after_atomic is needed so the read part of the trylock operation
comes after the INC, as Andrea points out. Thanks also to Andrea for
help with writing a litmus test.
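The pairing above can be sketched in user-space C. This is a minimal model under stated assumptions, not the kernel code: C11 sequentially consistent fences stand in for smp_mb() and smp_mb__after_atomic(), an atomic_flag stands in for pd->lock, and `struct pd_model` and all function names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the padata state involved in the race. */
struct pd_model {
    atomic_int  reorder_objects;  /* objects queued for reordering */
    atomic_flag lock;             /* models pd->lock */
};

static void pd_model_init(struct pd_model *pd)
{
    atomic_init(&pd->reorder_objects, 0);
    atomic_flag_clear(&pd->lock);
}

/*
 * CPU0 side: drop the lock, then re-check for queued objects.
 * The full fence keeps the load from being satisfied before the
 * unlocking store becomes visible. Returns true if work was queued.
 */
static bool reorder_unlock_and_recheck(struct pd_model *pd)
{
    atomic_flag_clear_explicit(&pd->lock, memory_order_release);
    atomic_thread_fence(memory_order_seq_cst);  /* models smp_mb() */
    return atomic_load_explicit(&pd->reorder_objects,
                                memory_order_relaxed) != 0;
}

/*
 * CPU1 side: publish a new object, then try to take the lock.
 * The fence orders the increment before the read part of the trylock.
 * Returns true if the lock was acquired.
 */
static bool serial_publish_and_trylock(struct pd_model *pd)
{
    atomic_fetch_add_explicit(&pd->reorder_objects, 1,
                              memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);  /* models smp_mb__after_atomic() */
    return !atomic_flag_test_and_set_explicit(&pd->lock,
                                              memory_order_acquire);
}
```

With both fences in place, at least one side makes progress: either CPU1's trylock succeeds, or CPU0's re-check observes the increment.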
Fixes:
|
||
Nathan Huckleberry
|
b9f547b7bd |
timer_list: Guard procfs specific code
[ Upstream commit a9314773a91a1d3b36270085246a6715a326ff00 ] With CONFIG_PROC_FS=n the following warning is emitted: kernel/time/timer_list.c:361:36: warning: unused variable 'timer_list_sops' [-Wunused-const-variable] static const struct seq_operations timer_list_sops = { Add #ifdef guard around procfs specific code. Signed-off-by: Nathan Huckleberry <nhuck@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Cc: john.stultz@linaro.org Cc: sboyd@kernel.org Cc: clang-built-linux@googlegroups.com Link: https://github.com/ClangBuiltLinux/linux/issues/534 Link: https://lkml.kernel.org/r/20190614181604.112297-1-nhuck@google.com Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Miroslav Lichvar
|
d86c0b73f7 |
ntp: Limit TAI-UTC offset
[ Upstream commit d897a4ab11dc8a9fda50d2eccc081a96a6385998 ] Don't allow the TAI-UTC offset of the system clock to be set by adjtimex() to a value larger than 100000 seconds. This prevents an overflow in the conversion to int, prevents the CLOCK_TAI clock from getting too far ahead of the CLOCK_REALTIME clock, and it is still large enough to allow leap seconds to be inserted at the maximum rate currently supported by the kernel (once per day) for the next ~270 years, however unlikely it is that someone can survive a catastrophic event which slowed down the rotation of the Earth so much. Reported-by: Weikang shi <swkhack@gmail.com> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Stephen Boyd <sboyd@kernel.org> Link: https://lkml.kernel.org/r/20190618154713.20929-1-mlichvar@redhat.com Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Qian Cai
|
7fc96cd2b0 |
sched/fair: Fix "runnable_avg_yN_inv" not used warnings
[ Upstream commit 509466b7d480bc5d22e90b9fbe6122ae0e2fbe39 ] runnable_avg_yN_inv[] is only used in kernel/sched/pelt.c but is pulled into several other files because they need other macros that all come from kernel/sched/sched-pelt.h, which is generated by Documentation/scheduler/sched-pelt. As a result, compilation produces a lot of warnings: kernel/sched/sched-pelt.h:4:18: warning: 'runnable_avg_yN_inv' defined but not used [-Wunused-const-variable=] kernel/sched/sched-pelt.h:4:18: warning: 'runnable_avg_yN_inv' defined but not used [-Wunused-const-variable=] kernel/sched/sched-pelt.h:4:18: warning: 'runnable_avg_yN_inv' defined but not used [-Wunused-const-variable=] ... Silence them by adding the __maybe_unused attribute to the array, so all generated variables and macros can still be kept in the same file. Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/1559596304-31581-1-git-send-email-cai@lca.pw Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Gao Xiang
|
d8b7db6c50 |
sched/core: Add __sched tag for io_schedule()
[ Upstream commit e3b929b0a184edb35531153c5afcaebb09014f9d ] Non-inline io_schedule() was introduced in: commit |
||
Valdis Klētnieks
|
7c10f8941b |
bpf: silence warning messages in core
[ Upstream commit aee450cbe482a8c2f6fa5b05b178ef8b8ff107ca ] Compiling kernel/bpf/core.c with W=1 causes a flood of warnings: kernel/bpf/core.c:1198:65: warning: initialized field overwritten [-Woverride-init] 1198 | #define BPF_INSN_3_TBL(x, y, z) [BPF_##x | BPF_##y | BPF_##z] = true | ^~~~ kernel/bpf/core.c:1087:2: note: in expansion of macro 'BPF_INSN_3_TBL' 1087 | INSN_3(ALU, ADD, X), \ | ^~~~~~ kernel/bpf/core.c:1202:3: note: in expansion of macro 'BPF_INSN_MAP' 1202 | BPF_INSN_MAP(BPF_INSN_2_TBL, BPF_INSN_3_TBL), | ^~~~~~~~~~~~ kernel/bpf/core.c:1198:65: note: (near initialization for 'public_insntable[12]') 1198 | #define BPF_INSN_3_TBL(x, y, z) [BPF_##x | BPF_##y | BPF_##z] = true | ^~~~ kernel/bpf/core.c:1087:2: note: in expansion of macro 'BPF_INSN_3_TBL' 1087 | INSN_3(ALU, ADD, X), \ | ^~~~~~ kernel/bpf/core.c:1202:3: note: in expansion of macro 'BPF_INSN_MAP' 1202 | BPF_INSN_MAP(BPF_INSN_2_TBL, BPF_INSN_3_TBL), | ^~~~~~~~~~~~ 98 copies of the above. The attached patch silences the warnings, because we *know* we're overwriting the default initializer. That leaves bpf/core.c with only 6 other warnings, which become more visible in comparison. Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Imre Deak
|
2fbde27465 |
locking/lockdep: Fix merging of hlocks with non-zero references
[ Upstream commit d9349850e188b8b59e5322fda17ff389a1c0cd7d ] The sequence static DEFINE_WW_CLASS(test_ww_class); struct ww_acquire_ctx ww_ctx; struct ww_mutex ww_lock_a; struct ww_mutex ww_lock_b; struct ww_mutex ww_lock_c; struct mutex lock_c; ww_acquire_init(&ww_ctx, &test_ww_class); ww_mutex_init(&ww_lock_a, &test_ww_class); ww_mutex_init(&ww_lock_b, &test_ww_class); ww_mutex_init(&ww_lock_c, &test_ww_class); mutex_init(&lock_c); ww_mutex_lock(&ww_lock_a, &ww_ctx); mutex_lock(&lock_c); ww_mutex_lock(&ww_lock_b, &ww_ctx); ww_mutex_lock(&ww_lock_c, &ww_ctx); mutex_unlock(&lock_c); (*) ww_mutex_unlock(&ww_lock_c); ww_mutex_unlock(&ww_lock_b); ww_mutex_unlock(&ww_lock_a); ww_acquire_fini(&ww_ctx); (**) will trigger the following error in __lock_release() when calling mutex_release() at **: DEBUG_LOCKS_WARN_ON(depth <= 0) The problem is that the hlock merging happening at * updates the references for test_ww_class incorrectly to 3 whereas it should've updated it to 4 (representing all the instances for ww_ctx and ww_lock_[abc]). Fix this by updating the references during merging correctly taking into account that we can have non-zero references (both for the hlock that we merge into another hlock or for the hlock we are merging into). Signed-off-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: =?UTF-8?q?Ville=20Syrj=C3=A4l=C3=A4?= <ville.syrjala@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Link: https://lkml.kernel.org/r/20190524201509.9199-2-imre.deak@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Eric W. Biederman
|
b397462a01 |
signal/pid_namespace: Fix reboot_pid_ns to use send_sig not force_sig
[ Upstream commit f9070dc94542093fd516ae4ccea17ef46a4362c5 ]
The locking in force_sig_info is not prepared to deal with a task that
exits or execs (as sighand may change). This is not a locking problem
in force_sig as force_sig is only built to handle synchronous
exceptions.
Further the function force_sig_info changes the signal state if the
signal is ignored, or blocked or if SIGNAL_UNKILLABLE will prevent the
delivery of the signal. The signal SIGKILL can not be ignored and can
not be blocked and SIGNAL_UNKILLABLE won't prevent it from being
delivered.
So using force_sig rather than send_sig for SIGKILL is confusing
and pointless.
Because it won't impact the sending of the signal and because
using force_sig is wrong, replace force_sig with send_sig.
Cc: Daniel Lezcano <daniel.lezcano@free.fr>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Fixes:
|
||
Sami Tolvanen
|
9a8a0e6fd9 |
ANDROID: kernel/Makefile: do not disable LTO for sys_ni.c with CFI
sys_ni.c compiles fine with CONFIG_LTO_CLANG, and disabling LTO for it breaks indirect call checking to functions in this file. Bug: 138254717 Change-Id: I7947cf3d0283ad37431860739665fee7fb0dfbdb Signed-off-by: Sami Tolvanen <samitolvanen@google.com> |
||
Greg Kroah-Hartman
|
bafa20fa20 |
This is the 4.19.60 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl00DjYACgkQONu9yGCS aT7O6hAAqqs1jm+vztbAJTyZPR+Vu7yGO1BukoyoqA3iUm7JPW0/Xamp+e/nOjq3 UrRKcn6WvIdDv22ikrR1qfFTFZYYCZfe4LWvzuUNsscr0dixW6iYoiSr5RDypH0C VIYZfEMaZ5G1R07jO7u8HWXAjAm+xqvxZRgARu9H0tk9As1+yW1kYFnQubdpIyoA 3zsTTQ+Dsyzc5mQQXBi88VnNpnI2PZGDAyaYmqfe7iuiIZ6qvjYZ245GygVb5Qlo 9yGKuxqRc7Lrd34f6t/0w2CwZuj8lbpt7twcdLXOjg/EjcouwBnX5smoq8oo5UIK kNSRsV0pfxhLt7EXViuRFduJIinViaYJY7guzWon3O9HXjO6OlUIhM65WRvwuxhz NuM1ctOfDqiyDzJ0NEp7tROsmkV3Un/DFHrePWGvcQ25lFxJMLtXUQDf/39cNkP2 iiWDSDOAXzgskfzpxmfRYyXO2/u2cjnmdUil27+/B54vYYM4XemBn07uc6zJZhJ/ spq2Hg/i/7gaAaoqRgoHvYLajlUytvetJMhdAZYhEpHL2/1gxE6SXI9LypV3096a FgdEfAghf0yY6FzaOXVb1PlqgbkigWtf8vo7Wmr25mNrg01678UTqGi2soCMhLXz OAGtOvPKcmD6wTY3gZlEzzVxoX0eCXUUVgK6TZFsMbmJb3+Y9yA= =Uqvz -----END PGP SIGNATURE----- Merge 4.19.60 into android-4.19 Changes in 4.19.60 Revert "e1000e: fix cyclic resets at link up with active tx" e1000e: start network tx queue only when link is up Input: synaptics - enable SMBUS on T480 thinkpad trackpad nilfs2: do not use unexported cpu_to_le32()/le32_to_cpu() in uapi header drivers: base: cacheinfo: Ensure cpu hotplug work is done before Intel RDT firmware: improve LSM/IMA security behaviour irqchip/gic-v3-its: Fix command queue pointer comparison bug clk: ti: clkctrl: Fix returning uninitialized data efi/bgrt: Drop BGRT status field reserved bits check perf/core: Fix perf_sample_regs_user() mm check ARM: dts: gemini Fix up DNS-313 compatible string ARM: omap2: remove incorrect __init annotation afs: Fix uninitialised spinlock afs_volume::cb_break_lock x86/apic: Fix integer overflow on 10 bit left shift of cpu_khz be2net: fix link failure after ethtool offline test ppp: mppe: Add softdep to arc4 sis900: fix TX completion ARM: dts: imx6ul: fix PWM[1-4] interrupts pinctrl: mcp23s08: Fix add_data and irqchip_add_nested call order dm table: don't copy from a NULL pointer in realloc_argv() dm verity: use message limit for data block 
corruption message x86/boot/64: Fix crash if kernel image crosses page table boundary x86/boot/64: Add missing fixup_pointer() for next_early_pgt access HID: chicony: add another quirk for PixArt mouse HID: multitouch: Add pointstick support for ALPS Touchpad pinctrl: mediatek: Ignore interrupts that are wake only during resume cpu/hotplug: Fix out-of-bounds read when setting fail state pinctrl: mediatek: Update cur_mask in mask/mask ops linux/kernel.h: fix overflow for DIV_ROUND_UP_ULL genirq: Delay deactivation in free_irq() genirq: Fix misleading synchronize_irq() documentation genirq: Add optional hardware synchronization for shutdown x86/ioapic: Implement irq_get_irqchip_state() callback x86/irq: Handle spurious interrupt after shutdown gracefully x86/irq: Seperate unused system vectors from spurious entry again ARC: hide unused function unw_hdr_alloc s390: fix stfle zero padding s390/qdio: (re-)initialize tiqdio list entries s390/qdio: don't touch the dsci in tiqdio_add_input_queues() crypto: talitos - move struct talitos_edesc into talitos.h crypto: talitos - fix hash on SEC1. crypto/NX: Set receive window credits to max number of CRBs in RxFIFO regmap-irq: do not write mask register if mask_base is zero drm/udl: introduce a macro to convert dev to udl. drm/udl: Replace drm_dev_unref with drm_dev_put drm/udl: move to embedding drm device inside udl device. x86/entry/32: Fix ENDPROC of common_spurious Linux 4.19.60 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I283306f8640e06b3ffe8bcdca1478a0fd3af77db |
||
Thomas Gleixner
|
6074f6043c |
genirq: Add optional hardware synchronization for shutdown
commit 62e0468650c30f0298822c580f382b16328119f6 upstream
free_irq() ensures that no hardware interrupt handler is executing on a
different CPU before actually releasing resources and deactivating the
interrupt completely in a domain hierarchy.
But that does not catch the case where the interrupt is in flight at the
hardware level but not yet serviced by the target CPU. That creates an
interesting race condition:
CPU 0                         CPU 1                   IRQ CHIP
                                                      interrupt is raised
                                                      sent to CPU1
                              Unable to handle
                              immediately
                              (interrupts off,
                               deep idle delay)
mask()
...
free()
  shutdown()
  synchronize_irq()
  release_resources()
                              do_IRQ()
                                -> resources are not available
That might be harmless and just trigger a spurious interrupt warning, but
some interrupt chips might get into a wedged state.
Utilize the existing irq_get_irqchip_state() callback for the
synchronization in free_irq().
synchronize_hardirq() is not using this mechanism as it might actually
deadlock under certain conditions, e.g. when called with interrupts
disabled and the target CPU is the one on which the synchronization is
invoked. synchronize_irq() uses it because that function cannot be called
from non-preemptible contexts as it might sleep.
No functional change intended and according to Marc the existing GIC
implementations where the driver supports the callback should be able
to cope with that core change. Famous last words.
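The extra synchronization step can be modelled as a polling loop in plain C: keep asking the chip whether the interrupt is still in flight before releasing resources. This is only a sketch under assumptions. `struct chip_model`, its `get_active` callback, and `struct fake_chip` are hypothetical names for illustration; the kernel itself goes through the irq_get_irqchip_state() callback named above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an interrupt chip that may report in-flight state. */
struct chip_model {
    /* Returns 0 and fills *active, or non-zero if the state is unavailable. */
    int (*get_active)(void *ctx, bool *active);
    void *ctx;
};

/*
 * Poll the chip until it no longer reports the interrupt as in flight.
 * Chips without the callback are treated as "nothing more to wait for".
 * Returns the number of polls that still saw the interrupt in flight.
 */
static unsigned int wait_until_not_in_flight(const struct chip_model *chip)
{
    unsigned int polls = 0;
    bool active = false;

    if (chip->get_active == NULL)
        return 0;

    for (;;) {
        if (chip->get_active(chip->ctx, &active) != 0 || !active)
            break;
        polls++;
    }
    return polls;
}

/* Fake chip for demonstration: reports "in flight" a fixed number of times. */
struct fake_chip {
    int pending_polls;
};

static int fake_get_active(void *ctx, bool *active)
{
    struct fake_chip *c = ctx;

    *active = c->pending_polls > 0;
    if (c->pending_polls > 0)
        c->pending_polls--;
    return 0;
}
```

Only after this loop returns would the caller go on to release resources, which is the ordering the race diagram above requires.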
Fixes:
|
||
Thomas Gleixner
|
3f10ccc297 |
genirq: Fix misleading synchronize_irq() documentation
commit 1d21f2af8571c6a6a44e7c1911780614847b0253 upstream The function might sleep, so it cannot be called from interrupt context. Not even with care. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Marc Zyngier <marc.zyngier@arm.com> Link: https://lkml.kernel.org/r/20190628111440.189241552@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
Thomas Gleixner
|
578db1aa59 |
genirq: Delay deactivation in free_irq()
commit 4001d8e8762f57d418b66e4e668601791900a1dd upstream
When interrupts are shut down, they are immediately deactivated in the
irqdomain hierarchy. While this looks obviously correct there is a subtle
issue:
There might be an interrupt in flight when free_irq() is invoking the
shutdown. This is properly handled at the irq descriptor / primary handler
level, but the deactivation might completely disable resources which are
required to acknowledge the interrupt.
Split the shutdown code and deactivate the interrupt after synchronization
in free_irq(). Fixup all other usage sites where this is not an issue to
invoke the combined shutdown_and_deactivate() function instead.
This still might be an issue if the interrupt in flight servicing is
delayed on a remote CPU beyond the invocation of synchronize_irq(), but
that cannot be handled at that level and needs to be handled in the
synchronize_irq() context.
Fixes:
|
||
Eiichi Tsukata
|
f6e01328cb |
cpu/hotplug: Fix out-of-bounds read when setting fail state
[ Upstream commit 33d4a5a7a5b4d02915d765064b2319e90a11cbde ]
Writing an invalid value to /sys/devices/system/cpu/cpuX/hotplug/fail
lets user space control the `struct cpuhp_step *sp` address, which results
in the following global-out-of-bounds read.
Reproducer:
# echo -2 > /sys/devices/system/cpu/cpu0/hotplug/fail
KASAN report:
BUG: KASAN: global-out-of-bounds in write_cpuhp_fail+0x2cd/0x2e0
Read of size 8 at addr ffffffff89734438 by task bash/1941
CPU: 0 PID: 1941 Comm: bash Not tainted 5.2.0-rc6+ #31
Call Trace:
write_cpuhp_fail+0x2cd/0x2e0
dev_attr_store+0x58/0x80
sysfs_kf_write+0x13d/0x1a0
kernfs_fop_write+0x2bc/0x460
vfs_write+0x1e1/0x560
ksys_write+0x126/0x250
do_syscall_64+0xc1/0x390
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7f05e4f4c970
The buggy address belongs to the variable:
cpu_hotplug_lock+0x98/0xa0
Memory state around the buggy address:
ffffffff89734300: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 00
ffffffff89734380: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 00
>ffffffff89734400: 00 00 00 00 fa fa fa fa 00 00 00 00 fa fa fa fa
^
ffffffff89734480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffffffff89734500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Add a sanity check for the value written from user space.
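A range check of this kind can be sketched in user-space C. The state names, the `step_names` table, and `set_fail_state()` are hypothetical stand-ins chosen for illustration, not the kernel's cpuhp definitions.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical state range; the real cpuhp enum is much larger. */
enum model_state {
    MODEL_OFFLINE = 0,
    MODEL_PREPARE,
    MODEL_STARTING,
    MODEL_ACTIVE,
    MODEL_ONLINE,
    MODEL_NR_STATES
};

static const char *const step_names[MODEL_NR_STATES] = {
    "offline", "prepare", "starting", "active", "online",
};

/*
 * Validate a user-supplied state before it is used as a table index.
 * Without the range check, a value such as -2 would index memory
 * in front of step_names[].
 */
static int set_fail_state(long requested, int *fail_slot)
{
    if (requested < MODEL_OFFLINE || requested > MODEL_ONLINE)
        return -EINVAL;

    /* The index is known to be in range from here on. */
    if (step_names[requested] == NULL)
        return -EINVAL;

    *fail_slot = (int)requested;
    return 0;
}
```

Rejecting the value before any lookup keeps the reproducer's `-2` from ever reaching the table.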
Fixes:
|
||
Peter Zijlstra
|
afda29dc5a |
perf/core: Fix perf_sample_regs_user() mm check
[ Upstream commit 085ebfe937d7a7a5df1729f35a12d6d655fea68c ] perf_sample_regs_user() uses 'current->mm' to test for the presence of userspace, but this is insufficient, consider use_mm(). A better test is: '!(current->flags & PF_KTHREAD)', exec() clears PF_KTHREAD after it sets the new ->mm but before it drops to userspace for the first time. Possibly obsoletes: |
||
Greg Kroah-Hartman
|
0f653d9aa3 |
This is the 4.19.59 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl0qx4sACgkQONu9yGCS aT7Wzw/+Ixgza5VeJICnFgLZ80bYEQP5fDDcTD8psGi8fg/yKpUcHM0tv2Fi/ScQ dKNKN1zrWtn8e5bC8HE7V5rVFH3iT9gJXL4tebmFg9IOaBoce9wSaDMaptnv4OEw Ikb8apdrO2cHRWFhyIj9f35d3WE2OWUA4QYhrL17rptyP+k0eBBdyo572qfnheuf 4Yp4X6u8pnSR3fl4sgxzcfNLPXfrF8BMAKEx8/I1YyhUORpeJ/QxZkyFKNLMbUHm OWIHcw0O4Sfqtx9zWzwmpLk/aF8b98rCieJUDxYakVYD/iLsrdkkCx3IHlvMWdZF UtNVQbA26KIIFpXYe5gD1My+56grJaSCxAsO6M+c4PRCZ2BP+e6t+k3eASueadqs Ihq2qZyq1cMBQCeT1Sc3zQZgzwTE7lgzqQLVHiMmMukWv1Sx2xyio3GvN0i51gqz PCIxslzNhQnpmswCnDXgwaSp7W3YlT6+/zpQnzK1spZsfp8Ab/PkB41WyiPCWBtJ /Zx+lkdUd8HU8ZoKBoNMPWErX//MKa3NhKvakliPklVkSUfF12+4aB+Iil9H8vag ie4qmJrGvwg0t5PvRqRqy35fij/kcnJnFJJLlywkzRdTXlFUqqV+09N6hhS0BRgf YJibc8VptLWXgYRQoQD1J/xF87bcmB7HBnC4jBpdDzCkbTEHoI8= =zCPG -----END PGP SIGNATURE----- Merge 4.19.59 into android-4.19 Changes in 4.19.59 crypto: talitos - rename alternative AEAD algos. soc: brcmstb: Fix error path for unsupported CPUs soc: bcm: brcmstb: biuctrl: Register writes require a barrier Input: elantech - enable middle button support on 2 ThinkPads samples, bpf: fix to change the buffer size for read() samples, bpf: suppress compiler warning mac80211: fix rate reporting inside cfg80211_calculate_bitrate_he() bpf: sockmap, fix use after free from sleep in psock backlog workqueue soundwire: stream: fix out of boundary access on port properties staging:iio:ad7150: fix threshold mode config bit mac80211: mesh: fix RCU warning mac80211: free peer keys before vif down in mesh mwifiex: Fix possible buffer overflows at parsing bss descriptor iwlwifi: Fix double-free problems in iwl_req_fw_callback() mwifiex: Fix heap overflow in mwifiex_uap_parse_tail_ies() soundwire: intel: set dai min and max channels correctly dt-bindings: can: mcp251x: add mcp25625 support can: mcp251x: add support for mcp25625 can: m_can: implement errata "Needless activation of MRAF irq" can: af_can: Fix error path of can_init() net: phy: rename Asix 
Electronics PHY driver ibmvnic: Do not close unopened driver during reset ibmvnic: Refresh device multicast list after reset ibmvnic: Fix unchecked return codes of memory allocations ARM: dts: am335x phytec boards: Fix cd-gpios active level s390/boot: disable address-of-packed-member warning drm/vmwgfx: Honor the sg list segment size limitation drm/vmwgfx: fix a warning due to missing dma_parms riscv: Fix udelay in RV32. Input: imx_keypad - make sure keyboard can always wake up system KVM: arm/arm64: vgic: Fix kvm_device leak in vgic_its_destroy mlxsw: spectrum: Disallow prio-tagged packets when PVID is removed ARM: davinci: da850-evm: call regulator_has_full_constraints() ARM: davinci: da8xx: specify dma_coherent_mask for lcdc mac80211: only warn once on chanctx_conf being NULL mac80211: do not start any work during reconfigure flow bpf, devmap: Fix premature entry free on destroying map bpf, devmap: Add missing bulk queue free bpf, devmap: Add missing RCU read lock on flush bpf, x64: fix stack layout of JITed bpf code qmi_wwan: add support for QMAP padding in the RX path qmi_wwan: avoid RCU stalls on device disconnect when in QMAP mode qmi_wwan: extend permitted QMAP mux_id value range mmc: core: complete HS400 before checking status md: fix for divide error in status_resync bnx2x: Check if transceiver implements DDM before access drm: return -EFAULT if copy_to_user() fails ip6_tunnel: allow not to count pkts on tstats by passing dev as NULL net: lio_core: fix potential sign-extension overflow on large shift scsi: qedi: Check targetname while finding boot target information quota: fix a problem about transfer quota net: dsa: mv88e6xxx: fix shift of FID bits in mv88e6185_g1_vtu_loadpurge() NFS4: Only set creation opendata if O_CREAT net :sunrpc :clnt :Fix xps refcount imbalance on the error path fscrypt: don't set policy for a dead directory udf: Fix incorrect final NOT_ALLOCATED (hole) extent length media: stv0297: fix frequency range limit ALSA: usb-audio: Fix 
parse of UAC2 Extension Units ALSA: hda/realtek - Headphone Mic can't record after S3 block, bfq: NULL out the bic when it's no longer valid perf pmu: Fix uncore PMU alias list for ARM64 x86/ptrace: Fix possible spectre-v1 in ptrace_get_debugreg() x86/tls: Fix possible spectre-v1 in do_get_thread_area() Documentation: Add section about CPU vulnerabilities for Spectre Documentation/admin: Remove the vsyscall=native documentation mwifiex: Abort at too short BSS descriptor element mwifiex: Don't abort on small, spec-compliant vendor IEs USB: serial: ftdi_sio: add ID for isodebug v1 USB: serial: option: add support for GosunCn ME3630 RNDIS mode Revert "serial: 8250: Don't service RX FIFO if interrupts are disabled" p54usb: Fix race between disconnect and firmware loading usb: gadget: ether: Fix race between gether_disconnect and rx_submit usb: dwc2: use a longer AHB idle timeout in dwc2_core_reset() usb: renesas_usbhs: add a workaround for a race condition of workqueue drivers/usb/typec/tps6598x.c: fix portinfo width drivers/usb/typec/tps6598x.c: fix 4CC cmd write staging: comedi: dt282x: fix a null pointer deref on interrupt staging: comedi: amplc_pci230: fix null pointer deref on interrupt HID: Add another Primax PIXART OEM mouse quirk lkdtm: support llvm-objcopy binder: fix memory leak in error path carl9170: fix misuse of device driver API VMCI: Fix integer overflow in VMCI handle arrays MIPS: Remove superfluous check for __linux__ staging: fsl-dpaa2/ethsw: fix memory leak of switchdev_work staging: bcm2835-camera: Replace spinlock protecting context_map with mutex staging: bcm2835-camera: Ensure all buffers are returned on disable staging: bcm2835-camera: Remove check of the number of buffers supplied staging: bcm2835-camera: Handle empty EOS buffers whilst streaming staging: rtl8712: reduce stack usage, again Linux 4.19.59 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I650890ad9d984de0fc729677bd29506cd21338be |
||
Toshiaki Makita
|
4c2ce7addd |
bpf, devmap: Add missing RCU read lock on flush
[ Upstream commit 86723c8640633bee4b4588d3c7784ee7a0032f65 ]
.ndo_xdp_xmit() assumes it is called under RCU. For example virtio_net
uses RCU to detect it has setup the resources for tx. The assumption
accidentally broke when introducing bulk queue in devmap.
Fixes:
|
||
Toshiaki Makita
|
ab44f8bcf2 |
bpf, devmap: Add missing bulk queue free
[ Upstream commit edabf4d9dd905acd60048ea1579943801e3a4876 ]
dev_map_free() forgot to free bulk queue when freeing its entries.
Fixes:
|
||
Toshiaki Makita
|
8d09e86210 |
bpf, devmap: Fix premature entry free on destroying map
[ Upstream commit d4dd153d551634683fccf8881f606fa9f3dfa1ef ]
dev_map_free() waits for flush_needed bitmap to be empty in order to
ensure all flush operations have completed before freeing its entries.
However the corresponding clear_bit() was called before using the
entries, so the entries could be used after free.
All access to the entries needs to be done before clearing the bit.
It seems commit |
||
Greg Kroah-Hartman
|
5ad6eeba58 |
This is the 4.19.58 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl0lmYwACgkQONu9yGCS aT4h5w//ZG0BYEwxoa4Qc8rwvncnk78miK/VRH5JVTiToDqTuttHZQoMp+NLD2fQ V679f/2+VqEPn8o6yJsrbM8uea0iIratI8U6L2OEt6TKPbar3CPcRUPJeqlPWkej tf3qjAtvNNjLcl7xCYt9JNvpF4RwA8rLWWP5hZyYMi7xcMiB0FOriTlVJYHJ0PLK Iqg+edkBxKwx7mvFlZnJkT0ln5hCqT4QBq2XrOYGUfy2Ans5Ytg5dhhp41QDD6iu oE4mS+fybCzNOR3BWl7pfpeJRg8TKq4XNzYsQr9ftt2e3OZxOi3Jg+RLsgzjJB9P 1aTsuSzSeMXVGrAwRpBAot7TC+8F88sci0gibh4pg5N0ujGdvRW4gyzYHtdKhsTc wmjYMKbAxJWwz0vkRp1aSnUMSRur4Wo3qCWaOWpjkP4xhSBTTER5e5cqeuVSWde5 FaD8s0yjnQsUaH3oxZ7zDL//MR0N+C4Izs9c2A8HkdksWTdTvI7YX8c766iIZgrm JFV0FIZYIHAyuXT04W9n3VSvV4tLS+ouwYZpgG09oK0lBA8NT6RyZWzijY3VE0ed Kl+t6iu02qZgZrvnq4pHUVnLQtw7KfyL3mzeljVxEeaTbGODPOJfypY1OMfhWYw+ dIlmsmfa2aANf5wttl8CjLkAIIG3JmuWO2exMQidvXlGCE+rKVM= =u7q2 -----END PGP SIGNATURE----- Merge 4.19.58 into android-4.19 Changes in 4.19.58 Bluetooth: Fix faulty expression for minimum encryption key size check block: Fix a NULL pointer dereference in generic_make_request() md/raid0: Do not bypass blocking queue entered for raid0 bios netfilter: nf_flow_table: ignore DF bit setting netfilter: nft_flow_offload: set liberal tracking mode for tcp netfilter: nft_flow_offload: don't offload when sequence numbers need adjustment netfilter: nft_flow_offload: IPCB is only valid for ipv4 family ASoC : cs4265 : readable register too low ASoC: ak4458: add return value for ak4458_probe ASoC: soc-pcm: BE dai needs prepare when pause release after resume ASoC: ak4458: rstn_control - return a non-zero on error only spi: bitbang: Fix NULL pointer dereference in spi_unregister_master drm/mediatek: fix unbind functions drm/mediatek: unbind components in mtk_drm_unbind() drm/mediatek: call drm_atomic_helper_shutdown() when unbinding driver drm/mediatek: clear num_pipes when unbind driver drm/mediatek: call mtk_dsi_stop() after mtk_drm_crtc_atomic_disable() ASoC: max98090: remove 24-bit format support if RJ is 0 ASoC: sun4i-i2s: Fix sun8i tx channel offset 
mask ASoC: sun4i-i2s: Add offset to RX channel select x86/CPU: Add more Icelake model numbers usb: gadget: fusb300_udc: Fix memory leak of fusb300->ep[i] usb: gadget: udc: lpc32xx: allocate descriptor with GFP_ATOMIC ALSA: hdac: fix memory release for SST and SOF drivers SoC: rt274: Fix internal jack assignment in set_jack callback scsi: hpsa: correct ioaccel2 chaining drm: panel-orientation-quirks: Add quirk for GPD pocket2 drm: panel-orientation-quirks: Add quirk for GPD MicroPC platform/x86: asus-wmi: Only Tell EC the OS will handle display hotkeys from asus_nb_wmi platform/x86: intel-vbtn: Report switch events when event wakes device platform/x86: mlx-platform: Fix parent device in i2c-mux-reg device registration platform/mellanox: mlxreg-hotplug: Add devm_free_irq call to remove flow i2c: pca-platform: Fix GPIO lookup code cpuset: restore sanity to cpuset_cpus_allowed_fallback() scripts/decode_stacktrace.sh: prefix addr2line with $CROSS_COMPILE mm/mlock.c: change count_mm_mlocked_page_nr return type tracing: avoid build warning with HAVE_NOP_MCOUNT module: Fix livepatch/ftrace module text permissions race ftrace: Fix NULL pointer dereference in free_ftrace_func_mapper() drm/i915/dmc: protect against reading random memory ptrace: Fix ->ptracer_cred handling for PTRACE_TRACEME crypto: user - prevent operating on larval algorithms crypto: cryptd - Fix skcipher instance memory leak ALSA: seq: fix incorrect order of dest_client/dest_ports arguments ALSA: firewire-lib/fireworks: fix miss detection of received MIDI messages ALSA: line6: Fix write on zero-sized buffer ALSA: usb-audio: fix sign unintended sign extension on left shifts ALSA: hda/realtek: Add quirks for several Clevo notebook barebones ALSA: hda/realtek - Change front mic location for Lenovo M710q lib/mpi: Fix karactx leak in mpi_powm fs/userfaultfd.c: disable irqs for fault_pending and event locks tracing/snapshot: Resize spare buffer if size changed ARM: dts: armada-xp-98dx3236: Switch to 
armada-38x-uart serial node arm64: kaslr: keep modules inside module region when KASAN is enabled drm/amd/powerplay: use hardware fan control if no powerplay fan table drm/amdgpu/gfx9: use reset default for PA_SC_FIFO_SIZE drm/etnaviv: add missing failure path to destroy suballoc drm/imx: notify drm core before sending event during crtc disable drm/imx: only send event on crtc disable if kept disabled ftrace/x86: Remove possible deadlock between register_kprobe() and ftrace_run_update_code() mm/vmscan.c: prevent useless kswapd loops btrfs: Ensure replaced device doesn't have pending chunk allocation tty: rocket: fix incorrect forward declaration of 'rp_init()' mlxsw: spectrum: Handle VLAN device unlinking net/smc: move unhash before release of clcsock media: s5p-mfc: fix incorrect bus assignment in virtual child device drm/fb-helper: generic: Don't take module ref for fbcon f2fs: don't access node/meta inode mapping after iput mac80211: mesh: fix missing unlock on error in table_path_del() scsi: tcmu: fix use after free selftests: fib_rule_tests: Fix icmp proto with ipv6 x86/boot/compressed/64: Do not corrupt EDX on EFER.LME=1 setting net: hns: Fixes the missing put_device in positive leg for roce reset ALSA: hda: Initialize power_state field properly rds: Fix warning. ip6: fix skb leak in ip6frag_expire_frag_queue() netfilter: ipv6: nf_defrag: fix leakage of unqueued fragments sc16is7xx: move label 'err_spi' to correct section net: hns: fix unsigned comparison to less than zero bpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K netfilter: ipv6: nf_defrag: accept duplicate fragments again KVM: x86: degrade WARN to pr_warn_ratelimited KVM: LAPIC: Fix pending interrupt in IRR blocked by software disable LAPIC nfsd: Fix overflow causing non-working mounts on 1 TB machines svcrdma: Ignore source port when computing DRC hash MIPS: Fix bounds check virt_addr_valid MIPS: Add missing EHB in mtc0 -> mfc0 sequence. 
MIPS: have "plain" make calls build dtbs for selected platforms dmaengine: qcom: bam_dma: Fix completed descriptors count dmaengine: imx-sdma: remove BD_INTR for channel0 Linux 4.19.58 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Daniel Borkmann
|
54e8cf41b2 |
bpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K
[ Upstream commit fdadd04931c2d7cd294dc5b2b342863f94be53a3 ]
Michael and Sandipan report:
Commit ede95a63b5 introduced a bpf_jit_limit tuneable to limit BPF
JIT allocations. At compile time it defaults to PAGE_SIZE * 40000,
and is adjusted again at init time if MODULES_VADDR is defined.
For ppc64 kernels, MODULES_VADDR isn't defined, so we're stuck with
the compile-time default at boot-time, which is 0x9c400000 when
using 64K page size. This overflows the signed 32-bit bpf_jit_limit
value:
root@ubuntu:/tmp# cat /proc/sys/net/core/bpf_jit_limit
-1673527296
and can cause various unexpected failures throughout the network
stack. In one case `strace dhclient eth0` reported:
setsockopt(5, SOL_SOCKET, SO_ATTACH_FILTER, {len=11, filter=0x105dd27f8}, 16) = -1 ENOTSUPP (Unknown error 524)
and similar failures can be seen with tools like tcpdump. This doesn't
always reproduce however, and I'm not sure why. The more consistent
failure I've seen is an Ubuntu 18.04 KVM guest booted on a POWER9
host would time out on systemd/netplan configuring a virtio-net NIC
with no noticeable errors in the logs.
Given this and also given that in near future some architectures like
arm64 will have a custom area for BPF JIT image allocations we should
get rid of the BPF_JIT_LIMIT_DEFAULT fallback / default entirely. For
4.21, we have an overridable bpf_jit_alloc_exec(), bpf_jit_free_exec()
so therefore add another overridable bpf_jit_alloc_exec_limit() helper
function which returns the possible size of the memory area for deriving
the default heuristic in bpf_jit_charge_init().
Like bpf_jit_alloc_exec() and bpf_jit_free_exec(), the new
bpf_jit_alloc_exec_limit() assumes that module_alloc() is the default
JIT memory provider, and therefore in case archs implement their custom
module_alloc() we use MODULES_{END,_VADDR} for limits and otherwise for
vmalloc_exec() cases like on ppc64 we use VMALLOC_{END,_START}.
Additionally, for archs supporting large page sizes, we should change
the sysctl to be handled as long to not run into sysctl restrictions
in future.
Fixes: ede95a63b5e8 ("bpf: add bpf_jit_limit knob to restrict unpriv allocations")
Reported-by: Sandipan Das <sandipan@linux.ibm.com>
Reported-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org> |
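The overflow quoted in this message can be checked with plain integer arithmetic. The following is a minimal userspace sketch, not kernel code: `jit_limit_default()` and `as_s32()` are hypothetical helper names, while the factor 40000 and the 64K page size come from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the compile-time default described in the commit
 * message, PAGE_SIZE * 40000, computed in 64 bits so it cannot wrap. */
uint64_t jit_limit_default(uint64_t page_size)
{
	return page_size * 40000ULL;
}

/* Hypothetical helper: reinterpret the low 32 bits of v as a signed
 * 32-bit number, which is how a 32-bit int knob would display it.
 * The conversion is spelled out to avoid implementation-defined casts. */
int32_t as_s32(uint64_t v)
{
	uint32_t low = (uint32_t)v;

	if (low <= (uint32_t)INT32_MAX)
		return (int32_t)low;
	return (int32_t)((int64_t)low - 4294967296LL);
}

/* Cross-check against the two numbers quoted in the commit message. */
void check_commit_numbers(void)
{
	assert(jit_limit_default(65536) == 0x9c400000ULL);
	assert(as_s32(jit_limit_default(65536)) == -1673527296);
}
```

With a 4K page size the same product is 163840000, which still fits in a signed 32-bit value, so the wrap-around only appears with larger pages.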
||
Petr Mladek
|
c854d9b6ef |
ftrace/x86: Remove possible deadlock between register_kprobe() and ftrace_run_update_code()
commit d5b844a2cf507fc7642c9ae80a9d585db3065c28 upstream.
The commit 9f255b632bf12c4dd7 ("module: Fix livepatch/ftrace module text
permissions race") causes a possible deadlock between register_kprobe()
and ftrace_run_update_code() when ftrace is using stop_machine().
The existing dependency chain (in reverse order) is:
-> #1 (text_mutex){+.+.}:
validate_chain.isra.21+0xb32/0xd70
__lock_acquire+0x4b8/0x928
lock_acquire+0x102/0x230
__mutex_lock+0x88/0x908
mutex_lock_nested+0x32/0x40
register_kprobe+0x254/0x658
init_kprobes+0x11a/0x168
do_one_initcall+0x70/0x318
kernel_init_freeable+0x456/0x508
kernel_init+0x22/0x150
ret_from_fork+0x30/0x34
kernel_thread_starter+0x0/0xc
-> #0 (cpu_hotplug_lock.rw_sem){++++}:
check_prev_add+0x90c/0xde0
validate_chain.isra.21+0xb32/0xd70
__lock_acquire+0x4b8/0x928
lock_acquire+0x102/0x230
cpus_read_lock+0x62/0xd0
stop_machine+0x2e/0x60
arch_ftrace_update_code+0x2e/0x40
ftrace_run_update_code+0x40/0xa0
ftrace_startup+0xb2/0x168
register_ftrace_function+0x64/0x88
klp_patch_object+0x1a2/0x290
klp_enable_patch+0x554/0x980
do_one_initcall+0x70/0x318
do_init_module+0x6e/0x250
load_module+0x1782/0x1990
__s390x_sys_finit_module+0xaa/0xf0
system_call+0xd8/0x2d0
Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(text_mutex);
                               lock(cpu_hotplug_lock.rw_sem);
                               lock(text_mutex);
  lock(cpu_hotplug_lock.rw_sem);
It is a similar problem to the one that has been solved by the commit
|
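The "possible unsafe locking scenario" in this message is a classic two-lock ordering inversion: one path takes text_mutex and then cpu_hotplug_lock, the other takes them in the opposite order. As a rough illustration of how such an inversion can be detected, here is a toy order tracker. It is an assumption-laden sketch and not lockdep: `order_graph`, `order_graph_acquire()` and the `toy_lock` ids are hypothetical, and only direct two-lock inversions are caught, not longer cycles.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_LOCKS 8

/* Hypothetical lock ids named after the two locks in the report. */
enum toy_lock { TEXT_MUTEX, CPU_HOTPLUG_LOCK };

/* Toy model: after[a][b] is true once lock b has been acquired
 * while lock a was already held. */
struct order_graph {
	bool after[MAX_LOCKS][MAX_LOCKS];
};

void order_graph_init(struct order_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Record the order "held, then wanted". Returns false if the opposite
 * order (wanted, then held) was recorded earlier, i.e. an inversion. */
bool order_graph_acquire(struct order_graph *g, int held, int wanted)
{
	assert(held >= 0 && held < MAX_LOCKS);
	assert(wanted >= 0 && wanted < MAX_LOCKS);

	if (g->after[wanted][held])
		return false;
	g->after[held][wanted] = true;
	return true;
}
```

Recording the CPU1 order first (cpu_hotplug_lock, then text_mutex) succeeds; recording the CPU0 order afterwards (text_mutex, then cpu_hotplug_lock) is reported as an inversion.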
||
Eiichi Tsukata
|
c8790d7f76 |
tracing/snapshot: Resize spare buffer if size changed
commit 46cc0b44428d0f0e81f11ea98217fc0edfbeab07 upstream.
The current snapshot implementation swaps two ring_buffers even though
their sizes differ from each other, which can cause an inconsistency
between the contents of buffer_size_kb file and the current buffer size.
For example:
# cat buffer_size_kb
7 (expanded: 1408)
# echo 1 > events/enable
# grep bytes per_cpu/cpu0/stats
bytes: 1441020
# echo 1 > snapshot // current:1408, spare:1408
# echo 123 > buffer_size_kb // current:123, spare:1408
# echo 1 > snapshot // current:1408, spare:123
# grep bytes per_cpu/cpu0/stats
bytes: 1443700
# cat buffer_size_kb
123 // != current:1408
And also, a similar per-cpu case hits the following WARNING:
Reproducer:
# echo 1 > per_cpu/cpu0/snapshot
# echo 123 > buffer_size_kb
# echo 1 > per_cpu/cpu0/snapshot
WARNING:
WARNING: CPU: 0 PID: 1946 at kernel/trace/trace.c:1607 update_max_tr_single.part.0+0x2b8/0x380
Modules linked in:
CPU: 0 PID: 1946 Comm: bash Not tainted 5.2.0-rc6 #20
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014
RIP: 0010:update_max_tr_single.part.0+0x2b8/0x380
Code: ff e8 dc da f9 ff 0f 0b e9 88 fe ff ff e8 d0 da f9 ff 44 89 ee bf f5 ff ff ff e8 33 dc f9 ff 41 83 fd f5 74 96 e8 b8 da f9 ff <0f> 0b eb 8d e8 af da f9 ff 0f 0b e9 bf fd ff ff e8 a3 da f9 ff 48
RSP: 0018:ffff888063e4fca0 EFLAGS: 00010093
RAX: ffff888066214380 RBX: ffffffff99850fe0 RCX: ffffffff964298a8
RDX: 0000000000000000 RSI: 00000000fffffff5 RDI: 0000000000000005
RBP: 1ffff1100c7c9f96 R08: ffff888066214380 R09: ffffed100c7c9f9b
R10: ffffed100c7c9f9a R11: 0000000000000003 R12: 0000000000000000
R13: 00000000ffffffea R14: ffff888066214380 R15: ffffffff99851060
FS: 00007f9f8173c700(0000) GS:ffff88806d000000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000714dc0 CR3: 0000000066fa6000 CR4: 00000000000006f0
Call Trace:
? trace_array_printk_buf+0x140/0x140
? __mutex_lock_slowpath+0x10/0x10
tracing_snapshot_write+0x4c8/0x7f0
? trace_printk_init_buffers+0x60/0x60
? selinux_file_permission+0x3b/0x540
? tracer_preempt_off+0x38/0x506
? trace_printk_init_buffers+0x60/0x60
__vfs_write+0x81/0x100
vfs_write+0x1e1/0x560
ksys_write+0x126/0x250
? __ia32_sys_read+0xb0/0xb0
? do_syscall_64+0x1f/0x390
do_syscall_64+0xc1/0x390
entry_SYSCALL_64_after_hwframe+0x49/0xbe
This patch adds resize_buffer_duplicate_size() to check if there is a
difference between current/spare buffer sizes and resize a spare buffer
if necessary.
Link: http://lkml.kernel.org/r/20190625012910.13109-1-devel@etsukata.com
Cc: stable@vger.kernel.org
Fixes:
|
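The idea described in this message, making the spare buffer match the current buffer's size before the two are swapped, can be sketched with a toy model. This is not the kernel's ring buffer code: `toy_buffer`, `toy_trace`, `toy_match_spare_size()` and `toy_snapshot()` are hypothetical names, and a buffer is reduced to its size in KB.

```c
#include <stddef.h>

/* Toy stand-in for a ring buffer: only the size matters here. */
struct toy_buffer {
	size_t size_kb;
};

struct toy_trace {
	struct toy_buffer *current;	/* buffer being written to */
	struct toy_buffer *spare;	/* buffer used for snapshots */
};

/* Hypothetical analogue of the check the message describes: if the
 * sizes differ, resize the spare to the current buffer's size. */
void toy_match_spare_size(struct toy_trace *t)
{
	if (t->spare->size_kb != t->current->size_kb)
		t->spare->size_kb = t->current->size_kb;
}

/* Take a snapshot: equalize sizes first, then swap the two buffers. */
void toy_snapshot(struct toy_trace *t)
{
	struct toy_buffer *tmp;

	toy_match_spare_size(t);
	tmp = t->current;
	t->current = t->spare;
	t->spare = tmp;
}
```

Replaying the sequence from the example (snapshot, shrink the current buffer to 123, snapshot again) leaves both buffers at 123 in this model, so the reported size and the actual size no longer diverge.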
||
Jann Horn
|
54435b7fff |
ptrace: Fix ->ptracer_cred handling for PTRACE_TRACEME
commit 6994eefb0053799d2e07cd140df6c2ea106c41ee upstream.
Fix two issues:
When called for PTRACE_TRACEME, ptrace_link() would obtain an RCU
reference to the parent's objective credentials, then give that pointer
to get_cred(). However, the object lifetime rules for things like
struct cred do not permit unconditionally turning an RCU reference into
a stable reference.
PTRACE_TRACEME records the parent's credentials as if the parent was
acting as the subject, but that's not the case. If a malicious
unprivileged child uses PTRACE_TRACEME and the parent is privileged, and
at a later point, the parent process becomes attacker-controlled
(because it drops privileges and calls execve()), the attacker ends up
with control over two processes with a privileged ptrace relationship,
which can be abused to ptrace a suid binary and obtain root privileges.
Fix both of these by always recording the credentials of the process
that is requesting the creation of the ptrace relationship:
current_cred() can't change under us, and current is the proper subject
for access control.
This change is theoretically userspace-visible, but I am not aware of
any code that it will actually break.
Fixes:
|
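The first issue in this message is about turning an RCU reference into a stable reference unconditionally. A common general pattern for doing that safely is to take a reference only if the count has not already dropped to zero. The sketch below illustrates that pattern with C11 atomics; `toy_obj` and `toy_get_unless_zero()` are hypothetical names. It is not what this patch does: per the message, the patch avoids the problem by always recording current_cred().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy refcounted object; a count of zero means it is being freed. */
struct toy_obj {
	atomic_int refcount;
};

/* Take a reference only if the object is still live. Returns false
 * when the count has already reached zero, in which case the caller
 * must not use the object. */
bool toy_get_unless_zero(struct toy_obj *o)
{
	int old = atomic_load(&o->refcount);

	while (old > 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&o->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Small self-check of both outcomes. */
void toy_get_selfcheck(void)
{
	struct toy_obj o;

	atomic_init(&o.refcount, 1);
	assert(toy_get_unless_zero(&o));
	atomic_store(&o.refcount, 0);
	assert(!toy_get_unless_zero(&o));
}
```

An unconditional increment would happily bump a count from zero back to one, reviving an object that is already on its way to being freed.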
||
Wei Li
|
2b39351e38 |
ftrace: Fix NULL pointer dereference in free_ftrace_func_mapper()
[ Upstream commit 04e03d9a616c19a47178eaca835358610e63a1dd ]
The mapper may be NULL when called from register_ftrace_function_probe()
with probe->data == NULL.
This issue can be reproduced as follows (it may sometimes be masked by
compiler optimization):
/ # cat /sys/kernel/debug/tracing/set_ftrace_filter
#### all functions enabled ####
/ # echo foo_bar:dump > /sys/kernel/debug/tracing/set_ftrace_filter
[ 206.949100] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 206.952402] Mem abort info:
[ 206.952819] ESR = 0x96000006
[ 206.955326] Exception class = DABT (current EL), IL = 32 bits
[ 206.955844] SET = 0, FnV = 0
[ 206.956272] EA = 0, S1PTW = 0
[ 206.956652] Data abort info:
[ 206.957320] ISV = 0, ISS = 0x00000006
[ 206.959271] CM = 0, WnR = 0
[ 206.959938] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000419f3a000
[ 206.960483] [0000000000000000] pgd=0000000411a87003, pud=0000000411a83003, pmd=0000000000000000
[ 206.964953] Internal error: Oops: 96000006 [#1] SMP
[ 206.971122] Dumping ftrace buffer:
[ 206.973677] (ftrace buffer empty)
[ 206.975258] Modules linked in:
[ 206.976631] Process sh (pid: 281, stack limit = 0x(____ptrval____))
[ 206.978449] CPU: 10 PID: 281 Comm: sh Not tainted 5.2.0-rc1+ #17
[ 206.978955] Hardware name: linux,dummy-virt (DT)
[ 206.979883] pstate: 60000005 (nZCv daif -PAN -UAO)
[ 206.980499] pc : free_ftrace_func_mapper+0x2c/0x118
[ 206.980874] lr : ftrace_count_free+0x68/0x80
[ 206.982539] sp : ffff0000182f3ab0
[ 206.983102] x29: ffff0000182f3ab0 x28: ffff8003d0ec1700
[ 206.983632] x27: ffff000013054b40 x26: 0000000000000001
[ 206.984000] x25: ffff00001385f000 x24: 0000000000000000
[ 206.984394] x23: ffff000013453000 x22: ffff000013054000
[ 206.984775] x21: 0000000000000000 x20: ffff00001385fe28
[ 206.986575] x19: ffff000013872c30 x18: 0000000000000000
[ 206.987111] x17: 0000000000000000 x16: 0000000000000000
[ 206.987491] x15: ffffffffffffffb0 x14: 0000000000000000
[ 206.987850] x13: 000000000017430e x12: 0000000000000580
[ 206.988251] x11: 0000000000000000 x10: cccccccccccccccc
[ 206.988740] x9 : 0000000000000000 x8 : ffff000013917550
[ 206.990198] x7 : ffff000012fac2e8 x6 : ffff000012fac000
[ 206.991008] x5 : ffff0000103da588 x4 : 0000000000000001
[ 206.991395] x3 : 0000000000000001 x2 : ffff000013872a28
[ 206.991771] x1 : 0000000000000000 x0 : 0000000000000000
[ 206.992557] Call trace:
[ 206.993101] free_ftrace_func_mapper+0x2c/0x118
[ 206.994827] ftrace_count_free+0x68/0x80
[ 206.995238] release_probe+0xfc/0x1d0
[ 206.995555] register_ftrace_function_probe+0x4a8/0x868
[ 206.995923] ftrace_trace_probe_callback.isra.4+0xb8/0x180
[ 206.996330] ftrace_dump_callback+0x50/0x70
[ 206.996663] ftrace_regex_write.isra.29+0x290/0x3a8
[ 206.997157] ftrace_filter_write+0x44/0x60
[ 206.998971] __vfs_write+0x64/0xf0
[ 206.999285] vfs_write+0x14c/0x2f0
[ 206.999591] ksys_write+0xbc/0x1b0
[ 206.999888] __arm64_sys_write+0x3c/0x58
[ 207.000246] el0_svc_common.constprop.0+0x408/0x5f0
[ 207.000607] el0_svc_handler+0x144/0x1c8
[ 207.000916] el0_svc+0x8/0xc
[ 207.003699] Code: aa0003f8 a9025bf5 aa0103f5 f946ea80 (f9400303)
[ 207.008388] ---[ end trace 7b6d11b5f542bdf1 ]---
[ 207.010126] Kernel panic - not syncing: Fatal exception
[ 207.011322] SMP: stopping secondary CPUs
[ 207.013956] Dumping ftrace buffer:
[ 207.014595] (ftrace buffer empty)
[ 207.015632] Kernel Offset: disabled
[ 207.017187] CPU features: 0x002,20006008
[ 207.017985] Memory Limit: none
[ 207.019825] ---[ end Kernel panic - not syncing: Fatal exception ]---
Link: http://lkml.kernel.org/r/20190606031754.10798-1-liwei391@huawei.com
Signed-off-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org> |
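The crash above comes from a free routine dereferencing a pointer that can legitimately be NULL. One common way to harden such a routine is to return early when there is nothing to free. The sketch below shows that pattern on a toy structure; `toy_mapper` and `toy_free_mapper()` are hypothetical and only loosely modeled on the situation described, not taken from the kernel source.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy container: an array of heap-allocated entries. */
struct toy_mapper {
	void **entries;
	size_t count;
};

/* Free every entry via free_entry (if given), then the container.
 * A NULL mapper is accepted and treated as "nothing to do".
 * Returns the number of entries that were walked. */
size_t toy_free_mapper(struct toy_mapper *mapper, void (*free_entry)(void *))
{
	size_t i, walked = 0;

	if (!mapper)
		return 0;

	for (i = 0; i < mapper->count; i++) {
		if (free_entry)
			free_entry(mapper->entries[i]);
		walked++;
	}
	free(mapper->entries);
	free(mapper);
	return walked;
}
```

Accepting NULL in the free routine keeps every caller simple, in the same spirit as free(NULL) being a no-op in standard C.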