commit 7f3bd8db99746a60bcae1ec4059a4756d19b63c2 Author: Greg Kroah-Hartman Date: Tue Feb 13 12:36:03 2018 +0100 Linux 4.9.81 commit 2760f452a71899af860355abe818a8b2bd32b2cb Author: Borislav Petkov Date: Thu Oct 12 13:23:16 2017 +0200 x86/microcode: Do the family check first commit 1f161f67a272cc4f29f27934dd3f74cb657eb5c4 upstream with adjustments. On CPUs like AMD's Geode, for example, we shouldn't even try to load microcode because they do not support the modern microcode loading interface. However, we do the family check *after* the other checks whether the loader has been disabled on the command line or whether we're running in a guest. So move the family checks first in order to exit early if we're being loaded on an unsupported family. Reported-and-tested-by: Sven Glodowski Signed-off-by: Borislav Petkov Cc: # 4.11.. Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://bugzilla.suse.com/show_bug.cgi?id=1061396 Link: http://lkml.kernel.org/r/20171012112316.977-1-bp@alien8.de Signed-off-by: Ingo Molnar Signed-off-by: Rolf Neugebauer Signed-off-by: Greg Kroah-Hartman commit 230ca8fb951528f298e06a9c15257df5df85ecbd Author: Laurent Pinchart Date: Sat Jul 29 02:31:33 2017 +0300 drm: rcar-du: Fix race condition when disabling planes at CRTC stop commit 641307df71fe77d7b38a477067495ede05d47295 upstream. When stopping the CRTC the driver must disable all planes and wait for the change to take effect at the next vblank. Merely calling drm_crtc_wait_one_vblank() is not enough, as the function doesn't include any mechanism to handle the race with vblank interrupts. Replace the drm_crtc_wait_one_vblank() call with a manual mechanism that handles the vblank interrupt race. 
Signed-off-by: Laurent Pinchart Reviewed-by: Kieran Bingham Signed-off-by: thongsyho Signed-off-by: Nhan Nguyen Signed-off-by: Greg Kroah-Hartman commit 758e22acf4fd9aaf51c7bae93a47401b8f792d56 Author: Laurent Pinchart Date: Mon Jul 10 23:46:39 2017 +0300 drm: rcar-du: Use the VBK interrupt for vblank events commit cbbb90b0c084d7dfb2ed8e3fecf8df200fbdd2a0 upstream. When implementing support for interlaced modes, the driver switched from reporting vblank events on the vertical blanking (VBK) interrupt to the frame end interrupt (FRM). This incorrectly divided the reported refresh rate by two. Fix it by moving back to the VBK interrupt. Fixes: 906eff7fcada ("drm: rcar-du: Implement support for interlaced modes") Signed-off-by: Laurent Pinchart Reviewed-by: Kieran Bingham Signed-off-by: thongsyho Signed-off-by: Nhan Nguyen Signed-off-by: Greg Kroah-Hartman commit 1cb145c67260edf54f106f031756a6ae780b7a32 Author: Kuninori Morimoto Date: Wed Aug 9 02:16:20 2017 +0000 ASoC: rsnd: avoid duplicate free_irq() commit e0936c3471a8411a5df327641fa3ffe12a2fb07b upstream. Commit 1f8754d4daea5f ("ASoC: rsnd: don't call free_irq() on Parent SSI") fixed the duplicate free_irq() for Parent SSI. But on Renesas Sound, not only Parent SSI but also Multi SSI has the same issue. This patch avoids the duplicate free_irq() if it is not a pure SSI. Fixes: 1f8754d4daea5f ("ASoC: rsnd: don't call free_irq() on Parent SSI") Signed-off-by: Kuninori Morimoto Signed-off-by: Mark Brown Signed-off-by: thongsyho Signed-off-by: Nhan Nguyen Signed-off-by: Greg Kroah-Hartman commit 24978c21f7ed183e7bb9745a295d066483b6bd48 Author: Kuninori Morimoto Date: Tue May 16 01:48:24 2017 +0000 ASoC: rsnd: don't call free_irq() on Parent SSI commit 1f8754d4daea5f257370a52a30fcb22798c54516 upstream. If an SSI uses a shared pin, some SSI will be used as the parent SSI. The normal SSI's remove and the Parent SSI's remove (these are the same SSI) will then both be called at unbind or remove time. In this case, free_irq() will be called twice.
This patch solves this issue. Signed-off-by: Kuninori Morimoto Tested-by: Hiroyuki Yokoyama Reported-by: Hiroyuki Yokoyama Signed-off-by: Mark Brown Signed-off-by: thongsyho Signed-off-by: Nhan Nguyen Signed-off-by: Greg Kroah-Hartman commit a7de0e9718c3062ed5fad916ceb65f387a29447b Author: Julian Scheel Date: Wed May 24 12:28:23 2017 +0200 ASoC: simple-card: Fix misleading error message commit 7ac45d1635a4cd2e99a4b11903d4a2815ca1b27b upstream. If the cpu node could not be found, the error message would always refer to /codec/ not being found in DT. Fix this by catching the cpu node not found case explicitly. Signed-off-by: Julian Scheel Signed-off-by: Mark Brown Signed-off-by: thongsyho Signed-off-by: Nhan Nguyen Signed-off-by: Greg Kroah-Hartman commit 7c17a1e5852a0c1c67a2ac65569270619fe1d576 Author: Robert Baronescu Date: Tue Oct 10 13:21:59 2017 +0300 crypto: tcrypt - fix S/G table for test_aead_speed() commit 5c6ac1d4f8fbdbed65dbeb8cf149d736409d16a1 upstream. If the buffer length is a multiple of PAGE_SIZE, the S/G table is incorrectly generated. Fix this by handling buflen = k * PAGE_SIZE separately. Signed-off-by: Robert Baronescu Signed-off-by: Herbert Xu Signed-off-by: Horia Geantă Signed-off-by: Greg Kroah-Hartman commit fc00dde96099a1a331a683a87b23ccb4626d816a Author: KarimAllah Ahmed Date: Sat Feb 3 15:56:23 2018 +0100 KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRL (cherry picked from commit b2ac58f90540e39324e7a29a7ad471407ae0bf48) [ Based on a patch from Paolo Bonzini ] ... basically doing exactly what we do for VMX: - Passthrough SPEC_CTRL to guests (if enabled in guest CPUID) - Save and restore SPEC_CTRL around VMExit and VMEntry only if the guest actually used it.
Signed-off-by: KarimAllah Ahmed Signed-off-by: David Woodhouse Signed-off-by: Thomas Gleixner Reviewed-by: Darren Kenny Reviewed-by: Konrad Rzeszutek Wilk Cc: Andrea Arcangeli Cc: Andi Kleen Cc: Jun Nakajima Cc: kvm@vger.kernel.org Cc: Dave Hansen Cc: Tim Chen Cc: Andy Lutomirski Cc: Asit Mallick Cc: Arjan Van De Ven Cc: Greg KH Cc: Paolo Bonzini Cc: Dan Williams Cc: Linus Torvalds Cc: Ashok Raj Link: https://lkml.kernel.org/r/1517669783-20732-1-git-send-email-karahmed@amazon.de Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit e5a83419c957edff9290a7e09b951f44af7fa2e2 Author: KarimAllah Ahmed Date: Thu Feb 1 22:59:45 2018 +0100 KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL (cherry picked from commit d28b387fb74da95d69d2615732f50cceb38e9a4d) [ Based on a patch from Ashok Raj ] Add direct access to MSR_IA32_SPEC_CTRL for guests. This is needed for guests that will only mitigate Spectre V2 through IBRS+IBPB and will not be using a retpoline+IBPB based approach. To avoid the overhead of saving and restoring the MSR_IA32_SPEC_CTRL for guests that do not actually use the MSR, only start saving and restoring when a non-zero is written to it. No attempt is made to handle STIBP here, intentionally. Filtering STIBP may be added in a future patch, which may require trapping all writes if we don't want to pass it through directly to the guest. 
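The lazy save/restore described in the KVM/VMX MSR_IA32_SPEC_CTRL message above can be modelled roughly as follows. This is a user-space sketch, not the kernel code: `hw_spec_ctrl` stands in for the physical MSR, `struct vcpu_model` and all function names are hypothetical, and it assumes the host itself runs with the MSR at zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the physical MSR; kernel code would read and write the MSR itself. */
static uint64_t hw_spec_ctrl;

/* Hypothetical per-vCPU state for this sketch. */
struct vcpu_model {
	uint64_t spec_ctrl;  /* guest's view of the MSR */
	bool passthrough;    /* false: guest writes are still intercepted */
};

/* Intercepted guest write: record it, and stop intercepting once a
 * non-zero value has been written. */
static void guest_wrmsr_intercepted(struct vcpu_model *v, uint64_t val)
{
	v->spec_ctrl = val;
	if (val != 0)
		v->passthrough = true;
}

/* Before entering the guest: touch the MSR only for a non-zero guest value. */
static void vm_entry(struct vcpu_model *v)
{
	if (v->spec_ctrl != 0)
		hw_spec_ctrl = v->spec_ctrl;
}

/* After leaving the guest: read back only if the guest could have written
 * the MSR directly, then restore the (assumed zero) host value. */
static void vm_exit(struct vcpu_model *v)
{
	if (v->passthrough)
		v->spec_ctrl = hw_spec_ctrl;
	if (v->spec_ctrl != 0)
		hw_spec_ctrl = 0;
}
```

A guest that never writes a non-zero value never causes an MSR access on entry or exit in this model, which is the overhead the commit message sets out to avoid.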
[dwmw2: Clean up CPUID bits, save/restore manually, handle reset] Signed-off-by: KarimAllah Ahmed Signed-off-by: David Woodhouse Signed-off-by: Thomas Gleixner Reviewed-by: Darren Kenny Reviewed-by: Konrad Rzeszutek Wilk Reviewed-by: Jim Mattson Cc: Andrea Arcangeli Cc: Andi Kleen Cc: Jun Nakajima Cc: kvm@vger.kernel.org Cc: Dave Hansen Cc: Tim Chen Cc: Andy Lutomirski Cc: Asit Mallick Cc: Arjan Van De Ven Cc: Greg KH Cc: Paolo Bonzini Cc: Dan Williams Cc: Linus Torvalds Cc: Ashok Raj Link: https://lkml.kernel.org/r/1517522386-18410-5-git-send-email-karahmed@amazon.de Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 755502f810c646a1d828fbc66364c252430b6064 Author: KarimAllah Ahmed Date: Thu Feb 1 22:59:44 2018 +0100 KVM/VMX: Emulate MSR_IA32_ARCH_CAPABILITIES (cherry picked from commit 28c1c9fabf48d6ad596273a11c46e0d0da3e14cd) Intel processors use MSR_IA32_ARCH_CAPABILITIES MSR to indicate RDCL_NO (bit 0) and IBRS_ALL (bit 1). This is a read-only MSR. By default the contents will come directly from the hardware, but user-space can still override it. 
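As a small illustration of the MSR_IA32_ARCH_CAPABILITIES layout named above, the two bits can be decoded like this. The bit positions come from the commit text; the macro and helper names, and the `arch_caps_model` structure that mimics "hardware default, user-space may override, guest may not write", are assumptions made for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define ARCH_CAP_RDCL_NO  (1ULL << 0)  /* bit 0: RDCL_NO */
#define ARCH_CAP_IBRS_ALL (1ULL << 1)  /* bit 1: IBRS_ALL */

/* Hypothetical per-vCPU copy of the MSR value. */
struct arch_caps_model {
	uint64_t value;
};

/* Default: take the contents straight from the (simulated) hardware value. */
static void arch_caps_init(struct arch_caps_model *m, uint64_t hw_value)
{
	m->value = hw_value;
}

/* Read-only for the guest: only a host-initiated (user-space) write is
 * accepted. Returns 0 on success, -1 if the write is refused. */
static int arch_caps_set(struct arch_caps_model *m, uint64_t val, bool host_initiated)
{
	if (!host_initiated)
		return -1;
	m->value = val;
	return 0;
}

static bool arch_caps_rdcl_no(const struct arch_caps_model *m)
{
	return (m->value & ARCH_CAP_RDCL_NO) != 0;
}

static bool arch_caps_ibrs_all(const struct arch_caps_model *m)
{
	return (m->value & ARCH_CAP_IBRS_ALL) != 0;
}
```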
[dwmw2: The bit in kvm_cpuid_7_0_edx_x86_features can be unconditional] Signed-off-by: KarimAllah Ahmed Signed-off-by: David Woodhouse Signed-off-by: Thomas Gleixner Reviewed-by: Paolo Bonzini Reviewed-by: Darren Kenny Reviewed-by: Jim Mattson Reviewed-by: Konrad Rzeszutek Wilk Cc: Andrea Arcangeli Cc: Andi Kleen Cc: Jun Nakajima Cc: kvm@vger.kernel.org Cc: Dave Hansen Cc: Linus Torvalds Cc: Andy Lutomirski Cc: Asit Mallick Cc: Arjan Van De Ven Cc: Greg KH Cc: Dan Williams Cc: Tim Chen Cc: Ashok Raj Link: https://lkml.kernel.org/r/1517522386-18410-4-git-send-email-karahmed@amazon.de Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 7013129a4034ea2168a0ccb32d7ddfefe9123333 Author: Ashok Raj Date: Thu Feb 1 22:59:43 2018 +0100 KVM/x86: Add IBPB support (cherry picked from commit 15d45071523d89b3fb7372e2135fbd72f6af9506) The Indirect Branch Predictor Barrier (IBPB) is an indirect branch control mechanism. It keeps earlier branches from influencing later ones. Unlike IBRS and STIBP, IBPB does not define a new mode of operation. It's a command that ensures predicted branch targets aren't used after the barrier. Although IBRS and IBPB are enumerated by the same CPUID enumeration, IBPB is very different. IBPB helps mitigate against three potential attacks: * Mitigate guests from being attacked by other guests. - This is addressed by issuing IBPB when we do a guest switch. * Mitigate attacks from guest/ring3->host/ring3. These would require an IBPB during context switch in host, or after VMEXIT. The host process has two ways to mitigate - Either it can be compiled with retpoline - If it's going through context switch, and has set !dumpable then there is an IBPB in that path. (Tim's patch: https://patchwork.kernel.org/patch/10192871) - The case where after a VMEXIT you return back to Qemu might make Qemu attackable from guest when Qemu isn't compiled with retpoline.
There are issues reported when doing IBPB on every VMEXIT that resulted in some tsc calibration woes in guest. * Mitigate guest/ring0->host/ring0 attacks. When host kernel is using retpoline it is safe against these attacks. If host kernel isn't using retpoline we might need to do an IBPB flush on every VMEXIT. Even when using retpoline for indirect calls, in certain conditions 'ret' can use the BTB on Skylake-era CPUs. There are other mitigations available like RSB stuffing/clearing. * IBPB is issued only for SVM during svm_free_vcpu(). VMX has a vmclear and SVM doesn't. Follow discussion here: https://lkml.org/lkml/2018/1/15/146 Please refer to the following spec for more details on the enumeration and control. Refer here to get documentation about mitigations. https://software.intel.com/en-us/side-channel-security-support [peterz: rebase and changelog rewrite] [karahmed: - rebase - vmx: expose PRED_CMD if guest has it in CPUID - svm: only pass through IBPB if guest has it in CPUID - vmx: support !cpu_has_vmx_msr_bitmap() - vmx: support nested] [dwmw2: Expose CPUID bit too (AMD IBPB only for now as we lack IBRS) PRED_CMD is a write-only MSR] Signed-off-by: Ashok Raj Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: David Woodhouse Signed-off-by: KarimAllah Ahmed Signed-off-by: Thomas Gleixner Reviewed-by: Konrad Rzeszutek Wilk Cc: Andrea Arcangeli Cc: Andi Kleen Cc: kvm@vger.kernel.org Cc: Asit Mallick Cc: Linus Torvalds Cc: Andy Lutomirski Cc: Dave Hansen Cc: Arjan Van De Ven Cc: Greg KH Cc: Jun Nakajima Cc: Paolo Bonzini Cc: Dan Williams Cc: Tim Chen Link: http://lkml.kernel.org/r/1515720739-43819-6-git-send-email-ashok.raj@intel.com Link: https://lkml.kernel.org/r/1517522386-18410-3-git-send-email-karahmed@amazon.de Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 6236b782eba37a028972bdfd654773ff2e283a22 Author: Paolo Bonzini Date: Tue Jan 16 16:51:18 2018 +0100 KVM: VMX: make MSR bitmaps per-VCPU (cherry picked from commit
904e14fb7cb96401a7dc803ca2863fd5ba32ffe6) Place the MSR bitmap in struct loaded_vmcs, and update it in place every time the x2apic or APICv state can change. This is rare and the loop can handle 64 MSRs per iteration, in a similar fashion as nested_vmx_prepare_msr_bitmap. This prepares for choosing, on a per-VM basis, whether to intercept the SPEC_CTRL and PRED_CMD MSRs. Cc: stable@vger.kernel.org # prereq for Spectre mitigation Suggested-by: Jim Mattson Signed-off-by: Paolo Bonzini Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit ff546f9d83d320bc81b72bd2ccd4ebae45d3d714 Author: Paolo Bonzini Date: Thu Jan 11 12:16:15 2018 +0100 KVM: VMX: introduce alloc_loaded_vmcs (cherry picked from commit f21f165ef922c2146cc5bdc620f542953c41714b) Group together the calls to alloc_vmcs and loaded_vmcs_init. Soon we'll also allocate an MSR bitmap there. Cc: stable@vger.kernel.org # prereq for Spectre mitigation Signed-off-by: Paolo Bonzini Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 46e24dfc2dfe0062a77538698c84f13b24ce9b2c Author: Jim Mattson Date: Mon Nov 27 17:22:25 2017 -0600 KVM: nVMX: Eliminate vmcs02 pool (cherry picked from commit de3a0021a60635de96aa92713c1a31a96747d72c) The potential performance advantages of a vmcs02 pool have never been realized. To simplify the code, eliminate the pool. Instead, a single vmcs02 is allocated per VCPU when the VCPU enters VMX operation. 
Cc: stable@vger.kernel.org # prereq for Spectre mitigation Signed-off-by: Jim Mattson Signed-off-by: Mark Kanda Reviewed-by: Ameya More Reviewed-by: David Hildenbrand Reviewed-by: Paolo Bonzini Signed-off-by: Radim Krčmář Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit b7649e1776706c28d837d423412c1763900e521e Author: David Matlack Date: Tue Aug 1 14:00:40 2017 -0700 KVM: nVMX: mark vmcs12 pages dirty on L2 exit (cherry picked from commit c9f04407f2e0b3fc9ff7913c65fcfcb0a4b61570) The host physical addresses of L1's Virtual APIC Page and Posted Interrupt descriptor are loaded into the VMCS02. The CPU may write to these pages via their host physical address while L2 is running, bypassing address-translation-based dirty tracking (e.g. EPT write protection). Mark them dirty on every exit from L2 to prevent them from getting out of sync with dirty tracking. Also mark the virtual APIC page and the posted interrupt descriptor dirty when KVM is virtualizing posted interrupt processing. Signed-off-by: David Matlack Reviewed-by: Paolo Bonzini Signed-off-by: Radim Krčmář Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 1edccf20b9d82d318f0003ad67b8afed299ae93e Author: David Hildenbrand Date: Wed Jan 25 11:58:58 2017 +0100 KVM: nVMX: vmx_complete_nested_posted_interrupt() can't fail (cherry picked from commit 6342c50ad12e8ce0736e722184a7dbdea4a3477f) vmx_complete_nested_posted_interrupt() can't fail, let's turn it into a void function. Signed-off-by: David Hildenbrand Signed-off-by: Paolo Bonzini Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 19b1d4bdfe5ce543e54292fb1a45c7300398e6aa Author: David Hildenbrand Date: Wed Jan 25 11:58:57 2017 +0100 KVM: nVMX: kmap() can't fail commit 42cf014d38d8822cce63703a467e00f65d000952 upstream. kmap() can't fail, therefore it will always return a valid pointer. Let's just get rid of the unnecessary checks. 
Signed-off-by: David Hildenbrand Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit 34900390e96663cfe5c23e254baaf49664be8836 Author: Darren Kenny Date: Fri Feb 2 19:12:20 2018 +0000 x86/speculation: Fix typo IBRS_ATT, which should be IBRS_ALL (cherry picked from commit af189c95a371b59f493dbe0f50c0a09724868881) Fixes: 117cc7a908c83 ("x86/retpoline: Fill return stack buffer on vmexit") Signed-off-by: Darren Kenny Signed-off-by: Thomas Gleixner Reviewed-by: Konrad Rzeszutek Wilk Cc: Tom Lendacky Cc: Andi Kleen Cc: Borislav Petkov Cc: Masami Hiramatsu Cc: Arjan van de Ven Cc: David Woodhouse Link: https://lkml.kernel.org/r/20180202191220.blvgkgutojecxr3b@starbug-vm.ie.oracle.com Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 4b234a253e52b1ae98449d390cd00f07b96ca7e7 Author: Arnd Bergmann Date: Fri Feb 2 22:39:23 2018 +0100 x86/pti: Mark constant arrays as __initconst (cherry picked from commit 4bf5d56d429cbc96c23d809a08f63cd29e1a702e) I'm seeing build failures from the two newly introduced arrays that are marked 'const' and '__initdata', which are mutually exclusive: arch/x86/kernel/cpu/common.c:882:43: error: 'cpu_no_speculation' causes a section type conflict with 'e820_table_firmware_init' arch/x86/kernel/cpu/common.c:895:43: error: 'cpu_no_meltdown' causes a section type conflict with 'e820_table_firmware_init' The correct annotation is __initconst. 
Fixes: fec9434a12f3 ("x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown") Signed-off-by: Arnd Bergmann Signed-off-by: Thomas Gleixner Cc: Ricardo Neri Cc: Andy Lutomirski Cc: Borislav Petkov Cc: Thomas Garnier Cc: David Woodhouse Link: https://lkml.kernel.org/r/20180202213959.611210-1-arnd@arndb.de Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 961cb14c615d13a823963585dafdc19ee9e9ba85 Author: KarimAllah Ahmed Date: Thu Feb 1 11:27:21 2018 +0000 x86/spectre: Simplify spectre_v2 command line parsing (cherry picked from commit 9005c6834c0ffdfe46afa76656bd9276cca864f6) [dwmw2: Use ARRAY_SIZE] Signed-off-by: KarimAllah Ahmed Signed-off-by: David Woodhouse Signed-off-by: Thomas Gleixner Cc: peterz@infradead.org Cc: bp@alien8.de Link: https://lkml.kernel.org/r/1517484441-1420-3-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit fe4333893936daf7c02c3b8afc379bc600b5e286 Author: David Woodhouse Date: Thu Feb 1 11:27:20 2018 +0000 x86/retpoline: Avoid retpolines for built-in __init functions (cherry picked from commit 66f793099a636862a71c59d4a6ba91387b155e0c) There's no point in building init code with retpolines, since it runs before any potentially hostile userspace does. And before the retpoline is actually ALTERNATIVEd into place, for much of it. 
Signed-off-by: David Woodhouse Signed-off-by: Thomas Gleixner Cc: karahmed@amazon.de Cc: peterz@infradead.org Cc: bp@alien8.de Link: https://lkml.kernel.org/r/1517484441-1420-2-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit eb99bd6341cb3039d0b9d2b7f89d5f2ff98e9676 Author: Dan Williams Date: Wed Jan 31 17:47:03 2018 -0800 x86/kvm: Update spectre-v1 mitigation (cherry picked from commit 085331dfc6bbe3501fb936e657331ca943827600) Commit 75f139aaf896 "KVM: x86: Add memory barrier on vmcs field lookup" added a raw 'asm("lfence");' to prevent a bounds check bypass of 'vmcs_field_to_offset_table'. The lfence can be avoided in this path by using the array_index_nospec() helper designed for these types of fixes. Signed-off-by: Dan Williams Signed-off-by: Thomas Gleixner Acked-by: Paolo Bonzini Cc: Andrew Honig Cc: kvm@vger.kernel.org Cc: Jim Mattson Link: https://lkml.kernel.org/r/151744959670.6342.3001723920950249067.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 7552556f65af7db6b2bad828beb4c633cb7d8533 Author: Josh Poimboeuf Date: Tue Jan 30 22:13:33 2018 -0600 x86/paravirt: Remove 'noreplace-paravirt' cmdline option (cherry picked from commit 12c69f1e94c89d40696e83804dd2f0965b5250cd) The 'noreplace-paravirt' option disables paravirt patching, leaving the original pv indirect calls in place. That's highly incompatible with retpolines, unless we want to uglify paravirt even further and convert the paravirt calls to retpolines. As far as I can tell, the option doesn't seem to be useful for much other than introducing surprising corner cases and making the kernel vulnerable to Spectre v2. It was probably a debug option from the early paravirt days. So just remove it. 
Signed-off-by: Josh Poimboeuf Signed-off-by: Thomas Gleixner Reviewed-by: Juergen Gross Cc: Andrea Arcangeli Cc: Peter Zijlstra Cc: Andi Kleen Cc: Ashok Raj Cc: Greg KH Cc: Jun Nakajima Cc: Tim Chen Cc: Rusty Russell Cc: Dave Hansen Cc: Asit Mallick Cc: Andy Lutomirski Cc: Linus Torvalds Cc: Jason Baron Cc: Paolo Bonzini Cc: Alok Kataria Cc: Arjan Van De Ven Cc: David Woodhouse Cc: Dan Williams Link: https://lkml.kernel.org/r/20180131041333.2x6blhxirc2kclrq@treble Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit cda6b6074cc6f94ba4cb69a6c8928c1cd19fe291 Author: David Woodhouse Date: Tue Jan 30 14:30:23 2018 +0000 x86/cpuid: Fix up "virtual" IBRS/IBPB/STIBP feature bits on Intel (cherry picked from commit 7fcae1118f5fd44a862aa5c3525248e35ee67c3b) Despite the fact that all the other code there seems to be doing it, just using set_cpu_cap() in early_intel_init() doesn't actually work. For CPUs with PKU support, setup_pku() calls get_cpu_cap() after c->c_init() has set those feature bits. That resets those bits back to what was queried from the hardware. Turning the bits off for bad microcode is easy to fix. That can just use setup_clear_cpu_cap() to force them off for all CPUs. I was less keen on forcing the feature bits *on* that way, just in case of inconsistencies. I appreciate that the kernel is going to get this utterly wrong if CPU features are not consistent, because it has already applied alternatives by the time secondary CPUs are brought up. But at least if setup_force_cpu_cap() isn't being used, we might have a chance of *detecting* the lack of the corresponding bit and either panicking or refusing to bring the offending CPU online. So ensure that the appropriate feature bits are set within get_cpu_cap() regardless of how many extra times it's called. 
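The problem described in the "virtual" IBRS/IBPB/STIBP message above, where a later get_cpu_cap() call wipes bits that were set once by the caller, can be shown with a toy model. Everything here is hypothetical (bit positions, `struct cpu_model`, function names); it only illustrates why deriving the bits inside the re-read function makes repeated calls harmless.

```c
#include <stdint.h>

#define HW_SPEC_CTRL (1u << 0)  /* bit as reported by hardware (position made up) */
#define VIRT_IBRS    (1u << 8)  /* derived, software-defined bits */
#define VIRT_IBPB    (1u << 9)

struct cpu_model {
	uint32_t hw_cpuid;  /* what the hardware reports */
	uint32_t caps;      /* the kernel's capability word */
};

/* Fragile variant: every call resets caps to the raw hardware view, so any
 * derived bit the caller set afterwards is lost on the next call. */
static void get_cpu_cap_fragile(struct cpu_model *c)
{
	c->caps = c->hw_cpuid;
}

/* Robust variant: the derived bits are recomputed as part of the re-read,
 * so the result is the same no matter how often it is called. */
static void get_cpu_cap_robust(struct cpu_model *c)
{
	c->caps = c->hw_cpuid;
	if (c->caps & HW_SPEC_CTRL)
		c->caps |= VIRT_IBRS | VIRT_IBPB;
}
```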
Fixes: 2961298e ("x86/cpufeatures: Clean up Spectre v2 related CPUID flags") Signed-off-by: David Woodhouse Signed-off-by: Thomas Gleixner Cc: karahmed@amazon.de Cc: peterz@infradead.org Cc: bp@alien8.de Link: https://lkml.kernel.org/r/1517322623-15261-1-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit f67e05d1506a24f329589c27952d776f977161b7 Author: Colin Ian King Date: Tue Jan 30 19:32:18 2018 +0000 x86/spectre: Fix spelling mistake: "vunerable"-> "vulnerable" (cherry picked from commit e698dcdfcda41efd0984de539767b4cddd235f1e) Trivial fix to spelling mistake in pr_err error message text. Signed-off-by: Colin Ian King Signed-off-by: Thomas Gleixner Cc: Andi Kleen Cc: Greg Kroah-Hartman Cc: kernel-janitors@vger.kernel.org Cc: Andy Lutomirski Cc: Borislav Petkov Cc: David Woodhouse Link: https://lkml.kernel.org/r/20180130193218.9271-1-colin.king@canonical.com Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 359fde6bd0ec91c284641965e5c5c45a1fab6d51 Author: Dan Williams Date: Mon Jan 29 17:03:21 2018 -0800 x86/spectre: Report get_user mitigation for spectre_v1 (cherry picked from commit edfbae53dab8348fca778531be9f4855d2ca0360) Reflect the presence of get_user(), __get_user(), and 'syscall' protections in sysfs. The expectation is that new and better tooling will allow the kernel to grow more usages of array_index_nospec(), for now, only claim mitigation for __user pointer de-references. 
Reported-by: Jiri Slaby Signed-off-by: Dan Williams Signed-off-by: Thomas Gleixner Cc: linux-arch@vger.kernel.org Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: torvalds@linux-foundation.org Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727420158.33451.11658324346540434635.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 0781a50a30d37230147954a3dd15bbfcebfc2398 Author: Dan Williams Date: Mon Jan 29 17:03:15 2018 -0800 nl80211: Sanitize array index in parse_txq_params (cherry picked from commit 259d8c1e984318497c84eef547bbb6b1d9f4eb05) Wireless drivers rely on parse_txq_params to validate that txq_params->ac is less than NL80211_NUM_ACS by the time the low-level driver's ->conf_tx() handler is called. Use a new helper, array_index_nospec(), to sanitize txq_params->ac with respect to speculation. I.e. ensure that any speculation into ->conf_tx() handlers is done with a value of txq_params->ac that is within the bounds of [0, NL80211_NUM_ACS). Reported-by: Christian Lamparter Reported-by: Elena Reshetova Signed-off-by: Dan Williams Signed-off-by: Thomas Gleixner Acked-by: Johannes Berg Cc: linux-arch@vger.kernel.org Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: linux-wireless@vger.kernel.org Cc: torvalds@linux-foundation.org Cc: "David S. Miller" Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727419584.33451.7700736761686184303.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit c26ceec69576cb61157d2487812fb2776e125260 Author: Dan Williams Date: Mon Jan 29 17:03:05 2018 -0800 vfs, fdtable: Prevent bounds-check bypass via speculative execution (cherry picked from commit 56c30ba7b348b90484969054d561f711ba196507) 'fd' is a user controlled value that is used as a data dependency to read from the 'fdt->fd' array. 
In order to avoid potential leaks of kernel memory values, block speculative execution of the instruction stream that could issue reads based on an invalid 'file *' returned from __fcheck_files. Co-developed-by: Elena Reshetova Signed-off-by: Dan Williams Signed-off-by: Thomas Gleixner Cc: linux-arch@vger.kernel.org Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: Al Viro Cc: torvalds@linux-foundation.org Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727418500.33451.17392199002892248656.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit c3193fd49f6f0a9b378379f8fd3a95f01172d477 Author: Dan Williams Date: Mon Jan 29 17:02:59 2018 -0800 x86/syscall: Sanitize syscall table de-references under speculation (cherry picked from commit 2fbd7af5af8665d18bcefae3e9700be07e22b681) The syscall table base is a user controlled function pointer in kernel space. Use array_index_nospec() to prevent any out of bounds speculation. While retpoline prevents speculating into a userspace directed target it does not stop the pointer de-reference, the concern is leaking memory relative to the syscall table base, by observing instruction cache behavior. 
Reported-by: Linus Torvalds Signed-off-by: Dan Williams Signed-off-by: Thomas Gleixner Cc: linux-arch@vger.kernel.org Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: Andy Lutomirski Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727417984.33451.1216731042505722161.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 398a39311c0b67ff1d886f9861ae5251f8a7cad4 Author: Dan Williams Date: Mon Jan 29 17:02:54 2018 -0800 x86/get_user: Use pointer masking to limit speculation (cherry picked from commit c7f631cb07e7da06ac1d231ca178452339e32a94) Quoting Linus: I do think that it would be a good idea to very expressly document the fact that it's not that the user access itself is unsafe. I do agree that things like "get_user()" want to be protected, but not because of any direct bugs or problems with get_user() and friends, but simply because get_user() is an excellent source of a pointer that is obviously controlled from a potentially attacking user space. So it's a prime candidate for then finding _subsequent_ accesses that can then be used to perturb the cache. Unlike the __get_user() case get_user() includes the address limit check near the pointer de-reference. With that locality the speculation can be mitigated with pointer narrowing rather than a barrier, i.e. array_index_nospec(). Where the narrowing is performed by: cmp %limit, %ptr sbb %mask, %mask and %mask, %ptr With respect to speculation the value of %ptr is either less than %limit or NULL. 
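The cmp/sbb/and narrowing quoted above can be approximated in portable C as below. This is a sketch only: the function names are invented, the borrow is computed with bit arithmetic instead of the CPU carry flag, and a compiler is free to transform the code, which is one reason the kernel uses assembly for the real thing.

```c
#include <stdint.h>

/* Returns ~0 if ptr < limit and 0 otherwise, mirroring
 *   cmp %limit, %ptr ; sbb %mask, %mask
 * The borrow out of the top bit of (ptr - limit) plays the role of CF. */
static uint64_t user_ptr_mask_sketch(uint64_t ptr, uint64_t limit)
{
	uint64_t diff = ptr - limit;
	uint64_t borrow = ((~ptr & limit) | ((~ptr | limit) & diff)) >> 63;

	return 0 - borrow;  /* 1 -> all ones, 0 -> zero */
}

/* Mirrors "and %mask, %ptr": the result is either the original pointer
 * value (below the limit) or 0, i.e. NULL. */
static uint64_t narrow_user_ptr_sketch(uint64_t ptr, uint64_t limit)
{
	return ptr & user_ptr_mask_sketch(ptr, limit);
}
```

Because the result depends on the comparison through data flow rather than a branch, a mispredicted bounds check cannot leave an out-of-range pointer value in the register.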
Co-developed-by: Linus Torvalds Signed-off-by: Dan Williams Signed-off-by: Thomas Gleixner Cc: linux-arch@vger.kernel.org Cc: Kees Cook Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: Al Viro Cc: Andy Lutomirski Cc: torvalds@linux-foundation.org Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727417469.33451.11804043010080838495.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 065eae4be83d8d2bea325c11610ebaf836e1cebb Author: Dan Williams Date: Mon Jan 29 17:02:49 2018 -0800 x86/uaccess: Use __uaccess_begin_nospec() and uaccess_try_nospec (cherry picked from commit 304ec1b050310548db33063e567123fae8fd0301) Quoting Linus: I do think that it would be a good idea to very expressly document the fact that it's not that the user access itself is unsafe. I do agree that things like "get_user()" want to be protected, but not because of any direct bugs or problems with get_user() and friends, but simply because get_user() is an excellent source of a pointer that is obviously controlled from a potentially attacking user space. So it's a prime candidate for then finding _subsequent_ accesses that can then be used to perturb the cache. __uaccess_begin_nospec() covers __get_user() and copy_from_iter() where the limit check is far away from the user pointer de-reference. In those cases a barrier_nospec() prevents speculation with a potential pointer to privileged memory. uaccess_try_nospec covers get_user_try. 
Suggested-by: Linus Torvalds Suggested-by: Andi Kleen Signed-off-by: Dan Williams Signed-off-by: Thomas Gleixner Cc: linux-arch@vger.kernel.org Cc: Kees Cook Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: Al Viro Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727416953.33451.10508284228526170604.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit ae75f83e79e4b2de7981d34ddcceb54178098bea Author: Dan Williams Date: Mon Jan 29 17:02:44 2018 -0800 x86/usercopy: Replace open coded stac/clac with __uaccess_{begin, end} (cherry picked from commit b5c4ae4f35325d520b230bab6eb3310613b72ac1) In preparation for converting some __uaccess_begin() instances to __uaccess_begin_nospec(), make sure all 'from user' uaccess paths are using the _begin(), _end() helpers rather than open-coded stac() and clac(). No functional changes. Suggested-by: Ingo Molnar Signed-off-by: Dan Williams Signed-off-by: Thomas Gleixner Cc: linux-arch@vger.kernel.org Cc: Tom Lendacky Cc: Kees Cook Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: Al Viro Cc: torvalds@linux-foundation.org Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727416438.33451.17309465232057176966.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit e06d7bfb223e6babd7a7f8586e9c25b87272b841 Author: Dan Williams Date: Mon Jan 29 17:02:39 2018 -0800 x86: Introduce __uaccess_begin_nospec() and uaccess_try_nospec (cherry picked from commit b3bbfb3fb5d25776b8e3f361d2eedaabb0b496cd) For __get_user() paths, do not allow the kernel to speculate on the value of a user controlled pointer. In addition to the 'stac' instruction for Supervisor Mode Access Protection (SMAP), a barrier_nospec() causes the access_ok() result to resolve in the pipeline before the CPU might take any speculative action on the pointer value.
Given the cost of 'stac' the speculation barrier is placed after 'stac' to hopefully overlap the cost of disabling SMAP with the cost of flushing the instruction pipeline. Since __get_user is a major kernel interface that deals with user controlled pointers, the __uaccess_begin_nospec() mechanism will prevent speculative execution past an access_ok() permission check. While speculative execution past access_ok() is not enough to lead to a kernel memory leak, it is a necessary precondition. To be clear, __uaccess_begin_nospec() is addressing a class of potential problems near __get_user() usages. Note, that while the barrier_nospec() in __uaccess_begin_nospec() is used to protect __get_user(), pointer masking similar to array_index_nospec() will be used for get_user() since it incorporates a bounds check near the usage. uaccess_try_nospec provides the same mechanism for get_user_try. No functional changes. Suggested-by: Linus Torvalds Suggested-by: Andi Kleen Suggested-by: Ingo Molnar Signed-off-by: Dan Williams Signed-off-by: Thomas Gleixner Cc: linux-arch@vger.kernel.org Cc: Tom Lendacky Cc: Kees Cook Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: Al Viro Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727415922.33451.5796614273104346583.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 1f03d140e2f509ae27f21d229930e707436a07ac Author: Dan Williams Date: Mon Jan 29 17:02:33 2018 -0800 x86: Introduce barrier_nospec (cherry picked from commit b3d7ad85b80bbc404635dca80f5b129f6242bc7a) Rename the open coded form of this instruction sequence from rdtsc_ordered() into a generic barrier primitive, barrier_nospec(). One of the mitigations for Spectre variant1 vulnerabilities is to fence speculative execution after successfully validating a bounds check. I.e. 
force the result of a bounds check to resolve in the instruction pipeline to ensure speculative execution honors that result before potentially operating on out-of-bounds data. No functional changes. Suggested-by: Linus Torvalds Suggested-by: Andi Kleen Suggested-by: Ingo Molnar Signed-off-by: Dan Williams Signed-off-by: Thomas Gleixner Cc: linux-arch@vger.kernel.org Cc: Tom Lendacky Cc: Kees Cook Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: Al Viro Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727415361.33451.9049453007262764675.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 8c33e2d23a6821cb7d608011c3d2f54accf4212c Author: Dan Williams Date: Mon Jan 29 17:02:28 2018 -0800 x86: Implement array_index_mask_nospec (cherry picked from commit babdde2698d482b6c0de1eab4f697cf5856c5859) array_index_nospec() uses a mask to sanitize user controllable array indexes, i.e. generate a 0 mask if 'index' >= 'size', and a ~0 mask otherwise. The default array_index_mask_nospec() handles the carry-bit from the (index - size) result in software, while the x86 array_index_mask_nospec() does the same but handles the carry-bit in the processor CF flag, without conditional instructions in the control flow.
Suggested-by: Linus Torvalds Signed-off-by: Dan Williams Signed-off-by: Thomas Gleixner Cc: linux-arch@vger.kernel.org Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727414808.33451.1873237130672785331.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 579ef9ea20d67a80a0bc5d093259ff6e97087d59 Author: Dan Williams Date: Mon Jan 29 17:02:22 2018 -0800 array_index_nospec: Sanitize speculative array de-references (cherry picked from commit f3804203306e098dae9ca51540fcd5eb700d7f40) array_index_nospec() is proposed as a generic mechanism to mitigate against Spectre-variant-1 attacks, i.e. an attack that bypasses boundary checks via speculative execution. The array_index_nospec() implementation is expected to be safe for current generation CPUs across multiple architectures (ARM, x86). Based on an original implementation by Linus Torvalds, tweaked to remove speculative flows by Alexei Starovoitov, and tweaked again by Linus to introduce an x86 assembly implementation for the mask generation. 
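The masking scheme described above can be sketched in portable user-space C. This is only an illustration modelled on the generic (non-x86) mask computation, not the kernel header itself: the names index_mask_nospec(), index_nospec(), load_entry() and table are hypothetical, and the sketch assumes 'size' does not exceed LONG_MAX and that the compiler implements right shift of a negative signed long as an arithmetic shift (as GCC and Clang do).

```c
#include <limits.h>

#define BITS_PER_LONG (sizeof(long) * CHAR_BIT)

/*
 * Hypothetical stand-in for the generic mask helper: returns ~0UL when
 * index < size and 0 when index >= size, without a conditional branch.
 * The top bit of (index | (size - 1 - index)) is set exactly when
 * index >= size; inverting it and shifting arithmetically smears that
 * bit across the whole word. Assumes size <= LONG_MAX and an
 * arithmetic right shift for signed long.
 */
static inline unsigned long index_mask_nospec(unsigned long index,
                                              unsigned long size)
{
    return ~(long)(index | (size - 1UL - index)) >> (BITS_PER_LONG - 1);
}

/* Clamp index to [0, size): out-of-bounds values collapse to 0. */
static inline unsigned long index_nospec(unsigned long index,
                                         unsigned long size)
{
    return index & index_mask_nospec(index, size);
}

static const int table[4] = { 10, 20, 30, 40 };

/*
 * Usage pattern: the architectural bounds check comes first, then the
 * index is sanitized so that a mispredicted branch can only ever load
 * table[0] rather than out-of-bounds data.
 */
static inline int load_entry(unsigned long idx)
{
    if (idx >= 4)
        return -1;
    idx = index_nospec(idx, 4);
    return table[idx];
}
```

Because the clamp is pure data flow (an AND with the mask), it adds no branch for the CPU to mispredict, which is the property the commit message relies on.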
Co-developed-by: Linus Torvalds Co-developed-by: Alexei Starovoitov Suggested-by: Cyril Novikov Signed-off-by: Dan Williams Signed-off-by: Thomas Gleixner Cc: linux-arch@vger.kernel.org Cc: kernel-hardening@lists.openwall.com Cc: Peter Zijlstra Cc: Catalin Marinas Cc: Will Deacon Cc: Russell King Cc: gregkh@linuxfoundation.org Cc: torvalds@linux-foundation.org Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727414229.33451.18411580953862676575.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 899ab2cf91389aa3e27c341a9858dc22985dae2f Author: Mark Rutland Date: Mon Jan 29 17:02:16 2018 -0800 Documentation: Document array_index_nospec (cherry picked from commit f84a56f73dddaeac1dba8045b007f742f61cd2da) Document the rationale and usage of the new array_index_nospec() helper. Signed-off-by: Mark Rutland Signed-off-by: Will Deacon Signed-off-by: Dan Williams Signed-off-by: Thomas Gleixner Reviewed-by: Kees Cook Cc: linux-arch@vger.kernel.org Cc: Jonathan Corbet Cc: Peter Zijlstra Cc: gregkh@linuxfoundation.org Cc: kernel-hardening@lists.openwall.com Cc: torvalds@linux-foundation.org Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727413645.33451.15878817161436755393.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit f03d00ba0b478963fc96c975d3b27fc1b9bc3a43 Author: Andy Lutomirski Date: Sun Jan 28 10:38:50 2018 -0800 x86/asm: Move 'status' from thread_struct to thread_info (cherry picked from commit 37a8f7c38339b22b69876d6f5a0ab851565284e3) The TS_COMPAT bit is very hot and is accessed from code paths that mostly also touch thread_info::flags. Move it into struct thread_info to improve cache locality. The only reason it was in thread_struct is that there was a brief period during which arch-specific fields were not allowed in struct thread_info. 
Linus suggested further changing: ti->status &= ~(TS_COMPAT|TS_I386_REGS_POKED); to: if (unlikely(ti->status & (TS_COMPAT|TS_I386_REGS_POKED))) ti->status &= ~(TS_COMPAT|TS_I386_REGS_POKED); on the theory that frequently dirtying the cacheline even in pure 64-bit code that never needs to modify status hurts performance. That could be a reasonable followup patch, but I suspect it matters less on top of this patch. Suggested-by: Linus Torvalds Signed-off-by: Andy Lutomirski Signed-off-by: Thomas Gleixner Reviewed-by: Ingo Molnar Acked-by: Linus Torvalds Cc: Borislav Petkov Cc: Kernel Hardening Link: https://lkml.kernel.org/r/03148bcc1b217100e6e8ecf6a5468c45cf4304b6.1517164461.git.luto@kernel.org Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 572e509178112895f6162ead5d4bf39d2b981729 Author: Andy Lutomirski Date: Sun Jan 28 10:38:49 2018 -0800 x86/entry/64: Push extra regs right away (cherry picked from commit d1f7732009e0549eedf8ea1db948dc37be77fd46) With the fast path removed there is no point in splitting the push of the normal and the extra register set. Just push the extra regs right away. [ tglx: Split out from 'x86/entry/64: Remove the SYSCALL64 fast path' ] Signed-off-by: Andy Lutomirski Signed-off-by: Thomas Gleixner Acked-by: Ingo Molnar Cc: Borislav Petkov Cc: Linus Torvalds Cc: Kernel Hardening Link: https://lkml.kernel.org/r/462dff8d4d64dfbfc851fbf3130641809d980ecd.1517164461.git.luto@kernel.org Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit d7f8d17406d62f0c8b20a9100d34d0e203557fe1 Author: Andy Lutomirski Date: Sun Jan 28 10:38:49 2018 -0800 x86/entry/64: Remove the SYSCALL64 fast path (cherry picked from commit 21d375b6b34ff511a507de27bf316b3dde6938d9) The SYSCALL64 fast path was a nice, if small, optimization back in the good old days when syscalls were actually reasonably fast. Now there is PTI to slow everything down, and indirect branches are verboten, making everything messier.
The retpoline code in the fast path is particularly nasty. Just get rid of the fast path. The slow path is barely slower. [ tglx: Split out the 'push all extra regs' part ] Signed-off-by: Andy Lutomirski Signed-off-by: Thomas Gleixner Acked-by: Ingo Molnar Cc: Borislav Petkov Cc: Linus Torvalds Cc: Kernel Hardening Link: https://lkml.kernel.org/r/462dff8d4d64dfbfc851fbf3130641809d980ecd.1517164461.git.luto@kernel.org Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 9eedeb72c4127726b2f3d62b4c7a163f67d721ab Author: Dou Liyang Date: Tue Jan 30 14:13:50 2018 +0800 x86/spectre: Check CONFIG_RETPOLINE in command line parser (cherry picked from commit 9471eee9186a46893726e22ebb54cade3f9bc043) The spectre_v2 option 'auto' does not check whether CONFIG_RETPOLINE is enabled. As a consequence it fails to emit the appropriate warning and sets feature flags which have no effect at all. Add the missing IS_ENABLED() check. Fixes: da285121560e ("x86/spectre: Add boot time option to select Spectre v2 mitigation") Signed-off-by: Dou Liyang Signed-off-by: Thomas Gleixner Cc: ak@linux.intel.com Cc: peterz@infradead.org Cc: Tomohiro Cc: dave.hansen@intel.com Cc: bp@alien8.de Cc: arjan@linux.intel.com Cc: dwmw@amazon.co.uk Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/f5892721-7528-3647-08fb-f8d10e65ad87@cn.fujitsu.com Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 77d1424d2fb84ba2d3ce60a0ec4c3913e2e59abd Author: Borislav Petkov Date: Sat Jan 27 16:24:33 2018 +0000 x86/retpoline: Simplify vmexit_fill_RSB() (cherry picked from commit 1dde7415e99933bb7293d6b2843752cbdb43ec11) Simplify it to call an asm-function instead of pasting 41 insn bytes at every call site. 
Also, add alignment to the macro as suggested here: https://support.google.com/faqs/answer/7625886 [dwmw2: Clean up comments, let it clobber %ebx and just tell the compiler] Signed-off-by: Borislav Petkov Signed-off-by: David Woodhouse Signed-off-by: Thomas Gleixner Cc: ak@linux.intel.com Cc: dave.hansen@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1517070274-12128-3-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 77b3b3ee238664df176ed44dd4658e578186c6d3 Author: David Woodhouse Date: Sat Jan 27 16:24:32 2018 +0000 x86/cpufeatures: Clean up Spectre v2 related CPUID flags (cherry picked from commit 2961298efe1ea1b6fc0d7ee8b76018fa6c0bcef2) We want to expose the hardware features simply in /proc/cpuinfo as "ibrs", "ibpb" and "stibp". Since AMD has separate CPUID bits for those, use them as the user-visible bits. When the Intel SPEC_CTRL bit is set which indicates both IBRS and IBPB capability, set those (AMD) bits accordingly. Likewise if the Intel STIBP bit is set, set the AMD STIBP that's used for the generic hardware capability. Hide the rest from /proc/cpuinfo by putting "" in the comments. Including RETPOLINE and RETPOLINE_AMD which shouldn't be visible there. There are patches to make the sysfs vulnerabilities information non-readable by non-root, and the same should apply to all information about which mitigations are actually in use. Those *shouldn't* appear in /proc/cpuinfo. The feature bit for whether IBPB is actually used, which is needed for ALTERNATIVEs, is renamed to X86_FEATURE_USE_IBPB. 
Originally-by: Borislav Petkov Signed-off-by: David Woodhouse Signed-off-by: Thomas Gleixner Cc: ak@linux.intel.com Cc: dave.hansen@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1517070274-12128-2-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 98911226d51ea50dd8cb33cfb5a90be39872401b Author: Thomas Gleixner Date: Sat Jan 27 15:45:14 2018 +0100 x86/cpu/bugs: Make retpoline module warning conditional (cherry picked from commit e383095c7fe8d218e00ec0f83e4b95ed4e627b02) If sysfs is disabled and RETPOLINE not defined: arch/x86/kernel/cpu/bugs.c:97:13: warning: ‘spectre_v2_bad_module’ defined but not used [-Wunused-variable] static bool spectre_v2_bad_module; Hide it. Fixes: caf7501a1b4e ("module/retpoline: Warn about missing retpoline in module") Reported-by: Borislav Petkov Signed-off-by: Thomas Gleixner Cc: Andi Kleen Cc: David Woodhouse Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 557cbfa2221110761d6c26085c6df43d48425598 Author: Borislav Petkov Date: Fri Jan 26 13:11:39 2018 +0100 x86/bugs: Drop one "mitigation" from dmesg (cherry picked from commit 55fa19d3e51f33d9cd4056d25836d93abf9438db) Make [ 0.031118] Spectre V2 mitigation: Mitigation: Full generic retpoline into [ 0.031118] Spectre V2: Mitigation: Full generic retpoline to reduce the mitigation mitigations strings. 
Signed-off-by: Borislav Petkov Signed-off-by: Thomas Gleixner Reviewed-by: Greg Kroah-Hartman Cc: riel@redhat.com Cc: ak@linux.intel.com Cc: peterz@infradead.org Cc: David Woodhouse Cc: jikos@kernel.org Cc: luto@amacapital.net Cc: dave.hansen@intel.com Cc: torvalds@linux-foundation.org Cc: keescook@google.com Cc: Josh Poimboeuf Cc: tim.c.chen@linux.intel.com Cc: pjt@google.com Link: https://lkml.kernel.org/r/20180126121139.31959-5-bp@alien8.de Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 18bc71dff630283333ccea760efb398dbe282a72 Author: Borislav Petkov Date: Fri Jan 26 13:11:37 2018 +0100 x86/nospec: Fix header guards names (cherry picked from commit 7a32fc51ca938e67974cbb9db31e1a43f98345a9) ... to adhere to the _ASM_X86_ naming scheme. No functional change. Signed-off-by: Borislav Petkov Signed-off-by: Thomas Gleixner Cc: riel@redhat.com Cc: ak@linux.intel.com Cc: peterz@infradead.org Cc: David Woodhouse Cc: jikos@kernel.org Cc: luto@amacapital.net Cc: dave.hansen@intel.com Cc: torvalds@linux-foundation.org Cc: keescook@google.com Cc: Josh Poimboeuf Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Cc: pjt@google.com Link: https://lkml.kernel.org/r/20180126121139.31959-3-bp@alien8.de Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 31fd9eda7f69e0be0d0410682fc0d4cd76fe3699 Author: David Woodhouse Date: Thu Jan 25 16:14:15 2018 +0000 x86/speculation: Add basic IBPB (Indirect Branch Prediction Barrier) support (cherry picked from commit 20ffa1caecca4db8f79fe665acdeaa5af815a24d) Expose indirect_branch_prediction_barrier() for use in subsequent patches. 
[ tglx: Add IBPB status to spectre_v2 sysfs file ] Co-developed-by: KarimAllah Ahmed Signed-off-by: KarimAllah Ahmed Signed-off-by: David Woodhouse Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: dave.hansen@intel.com Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-8-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 6c5e49150a51b0210710246b61a972300a274dcd Author: David Woodhouse Date: Thu Jan 25 16:14:14 2018 +0000 x86/cpufeature: Blacklist SPEC_CTRL/PRED_CMD on early Spectre v2 microcodes (cherry picked from commit a5b2966364538a0e68c9fa29bc0a3a1651799035) This doesn't refuse to load the affected microcodes; it just refuses to use the Spectre v2 mitigation features if they're detected, by clearing the appropriate feature bits. The AMD CPUID bits are handled here too, because hypervisors *may* have been exposing those bits even on Intel chips, for fine-grained control of what's available. It is non-trivial to use x86_match_cpu() for this table because that doesn't handle steppings. And the approach taken in commit bd9240a18 almost made me lose my lunch. 
Signed-off-by: David Woodhouse Signed-off-by: Thomas Gleixner Reviewed-by: Greg Kroah-Hartman Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: dave.hansen@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-7-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit a8799fd14d9f7f385a5a5c86cde247caf4bb0320 Author: David Woodhouse Date: Thu Jan 25 16:14:13 2018 +0000 x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown (cherry picked from commit fec9434a12f38d3aeafeb75711b71d8a1fdef621) Also, for CPUs which don't speculate at all, don't report that they're vulnerable to the Spectre variants either. Leave the cpu_no_meltdown[] match table with just X86_VENDOR_AMD in it for now, even though that could be done with a simple comparison, on the assumption that we'll have more to add. Based on suggestions from Dave Hansen and Alan Cox. 
Signed-off-by: David Woodhouse Signed-off-by: Thomas Gleixner Reviewed-by: Greg Kroah-Hartman Reviewed-by: Borislav Petkov Acked-by: Dave Hansen Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-6-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit af57d43c908fe9b06dc555b400eca68562262eca Author: David Woodhouse Date: Thu Jan 25 16:14:12 2018 +0000 x86/msr: Add definitions for new speculation control MSRs (cherry picked from commit 1e340c60d0dd3ae07b5bedc16a0469c14b9f3410) Add MSR and bit definitions for SPEC_CTRL, PRED_CMD and ARCH_CAPABILITIES. See Intel's 336996-Speculative-Execution-Side-Channel-Mitigations.pdf Signed-off-by: David Woodhouse Signed-off-by: Thomas Gleixner Reviewed-by: Greg Kroah-Hartman Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: dave.hansen@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-5-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit c26a6bea26b356fb3539cdf5b2e348a5e528aa7b Author: David Woodhouse Date: Thu Jan 25 16:14:11 2018 +0000 x86/cpufeatures: Add AMD feature bits for Speculation Control (cherry picked from commit 5d10cbc91d9eb5537998b65608441b592eec65e7) AMD exposes the PRED_CMD/SPEC_CTRL MSRs slightly differently to Intel. 
See http://lkml.kernel.org/r/2b3e25cc-286d-8bd0-aeaf-9ac4aae39de8@amd.com Signed-off-by: David Woodhouse Signed-off-by: Thomas Gleixner Reviewed-by: Greg Kroah-Hartman Cc: Tom Lendacky Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: dave.hansen@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-4-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 40532f65cccc5056b50cf1ab07a9a41445b24aa8 Author: David Woodhouse Date: Thu Jan 25 16:14:10 2018 +0000 x86/cpufeatures: Add Intel feature bits for Speculation Control (cherry picked from commit fc67dd70adb711a45d2ef34e12d1a8be75edde61) Add three feature bits exposed by new microcode on Intel CPUs for speculation control. Signed-off-by: David Woodhouse Signed-off-by: Thomas Gleixner Reviewed-by: Greg Kroah-Hartman Reviewed-by: Borislav Petkov Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: dave.hansen@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-3-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit d3eba7744075dc55f367364b5ed2055b5e3d5687 Author: David Woodhouse Date: Thu Jan 25 16:14:09 2018 +0000 x86/cpufeatures: Add CPUID_7_EDX CPUID leaf (cherry picked from commit 95ca0ee8636059ea2800dfbac9ecac6212d6b38f) This is a pure feature bits leaf. There are two AVX512 feature bits in it already which were handled as scattered bits, and three more from this leaf are going to be added for speculation control features. 
Signed-off-by: David Woodhouse Signed-off-by: Thomas Gleixner Reviewed-by: Greg Kroah-Hartman Reviewed-by: Borislav Petkov Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: dave.hansen@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-2-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit a1745ad92f50e95b6c2d101cd9b77253969ddccc Author: Andi Kleen Date: Thu Jan 25 15:50:28 2018 -0800 module/retpoline: Warn about missing retpoline in module (cherry picked from commit caf7501a1b4ec964190f31f9c3f163de252273b8) There's a risk that a kernel which has full retpoline mitigations becomes vulnerable when a module gets loaded that hasn't been compiled with the right compiler or the right option. To enable detection of that mismatch at module load time, add a module info string "retpoline" at build time when the module was compiled with retpoline support. This only covers compiled C source, but assembler source or prebuilt object files are not checked. If a retpoline enabled kernel detects a non retpoline protected module at load time, print a warning and report it in the sysfs vulnerability file. 
[ tglx: Massaged changelog ] Signed-off-by: Andi Kleen Signed-off-by: Thomas Gleixner Cc: David Woodhouse Cc: gregkh@linuxfoundation.org Cc: torvalds@linux-foundation.org Cc: jeyu@kernel.org Cc: arjan@linux.intel.com Link: https://lkml.kernel.org/r/20180125235028.31211-1-andi@firstfloor.org Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit ec86a1dad0c00757ceb49e86b4c10dd7fbc380cb Author: Peter Zijlstra Date: Thu Jan 25 10:58:14 2018 +0100 KVM: VMX: Make indirect call speculation safe (cherry picked from commit c940a3fb1e2e9b7d03228ab28f375fb5a47ff699) Replace indirect call with CALL_NOSPEC. Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Thomas Gleixner Reviewed-by: David Woodhouse Cc: Andrea Arcangeli Cc: Andi Kleen Cc: Ashok Raj Cc: Greg KH Cc: Jun Nakajima Cc: David Woodhouse Cc: Linus Torvalds Cc: rga@amazon.de Cc: Dave Hansen Cc: Asit Mallick Cc: Andy Lutomirski Cc: Josh Poimboeuf Cc: Jason Baron Cc: Paolo Bonzini Cc: Dan Williams Cc: Arjan Van De Ven Cc: Tim Chen Link: https://lkml.kernel.org/r/20180125095843.645776917@infradead.org Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit fea3c9a54012227b30e73f5fb3d132c103c7aa84 Author: Peter Zijlstra Date: Thu Jan 25 10:58:13 2018 +0100 KVM: x86: Make indirect calls in emulator speculation safe (cherry picked from commit 1a29b5b7f347a1a9230c1e0af5b37e3e571588ab) Replace the indirect calls with CALL_NOSPEC. 
Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Thomas Gleixner Reviewed-by: David Woodhouse Cc: Andrea Arcangeli Cc: Andi Kleen Cc: Ashok Raj Cc: Greg KH Cc: Jun Nakajima Cc: David Woodhouse Cc: Linus Torvalds Cc: rga@amazon.de Cc: Dave Hansen Cc: Asit Mallick Cc: Andy Lutomirski Cc: Josh Poimboeuf Cc: Jason Baron Cc: Paolo Bonzini Cc: Dan Williams Cc: Arjan Van De Ven Cc: Tim Chen Link: https://lkml.kernel.org/r/20180125095843.595615683@infradead.org [dwmw2: Use ASM_CALL_CONSTRAINT like upstream, now we have it] Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 734e687d1d7bcddf2548eeb5a4935a186f452b69 Author: Waiman Long Date: Mon Jan 22 17:09:34 2018 -0500 x86/retpoline: Remove the esp/rsp thunk (cherry picked from commit 1df37383a8aeabb9b418698f0bcdffea01f4b1b2) It doesn't make sense to have an indirect call thunk with esp/rsp as retpoline code won't work correctly with the stack pointer register. Removing it will help compiler writers to catch error in case such a thunk call is emitted incorrectly. Fixes: 76b043848fd2 ("x86/retpoline: Add initial retpoline support") Suggested-by: Jeff Law Signed-off-by: Waiman Long Signed-off-by: Thomas Gleixner Acked-by: David Woodhouse Cc: Tom Lendacky Cc: Kees Cook Cc: Andi Kleen Cc: Tim Chen Cc: Peter Zijlstra Cc: Linus Torvalds Cc: Jiri Kosina Cc: Andy Lutomirski Cc: Dave Hansen Cc: Josh Poimboeuf Cc: Arjan van de Ven Cc: Greg Kroah-Hartman Cc: Paul Turner Link: https://lkml.kernel.org/r/1516658974-27852-1-git-send-email-longman@redhat.com Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 9692602ab850eec484d8cc5a740803d34f00016c Author: Eric Biggers Date: Thu Jun 8 14:48:18 2017 +0100 KEYS: encrypted: fix buffer overread in valid_master_desc() commit 794b4bc292f5d31739d89c0202c54e7dc9bc3add upstream. With the 'encrypted' key type it was possible for userspace to provide a data blob ending with a master key description shorter than expected, e.g. 
'keyctl add encrypted desc "new x" @s'. When validating such a master key description, valid_master_desc() could read beyond the end of the buffer. Fix this by using strncmp() instead of memcmp(). [Also clean up the code to deduplicate some logic.] Cc: Mimi Zohar Signed-off-by: Eric Biggers Signed-off-by: David Howells Signed-off-by: James Morris Signed-off-by: Jin Qian Signed-off-by: Greg Kroah-Hartman commit 0a01ecbd23a9547b60b1a1d7e83f60704b176925 Author: Takashi Iwai Date: Thu May 4 11:27:52 2017 +0200 b43: Add missing MODULE_FIRMWARE() commit 3c89a72ad80c64bdbd5ff851ee9c328a191f7e01 upstream. Some firmware entries were never added via MODULE_FIRMWARE(), which may leave the driver non-functional when it is loaded from an initrd. Link: http://bugzilla.opensuse.org/show_bug.cgi?id=1037344 Fixes: 15be8e89cdd9 ("b43: add more bcma cores") Signed-off-by: Takashi Iwai Signed-off-by: Kalle Valo Signed-off-by: Greg Kroah-Hartman commit 113d22965c78a79210d4da2d455dc9bcff5e2fb6 Author: Jesse Chan Date: Mon Nov 20 15:56:28 2017 -0500 media: soc_camera: soc_scale_crop: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE commit 5331aec1bf9c9da557668174e0a4bfcee39f1121 upstream. This change resolves a new compile-time warning when built as a loadable module: WARNING: modpost: missing MODULE_LICENSE() in drivers/media/platform/soc_camera/soc_scale_crop.o see include/linux/module.h for more information This adds the license as "GPL", which matches the header of the file. MODULE_DESCRIPTION and MODULE_AUTHOR are also added. Signed-off-by: Jesse Chan Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit dd7b14c3e05ef49bc2062ccc18a703604063fe1d Author: Borislav Petkov Date: Sun Dec 18 17:44:13 2016 +0100 x86/microcode/AMD: Do not load when running on a hypervisor commit a15a753539eca8ba243d576f02e7ca9c4b7d7042 upstream. Doing so is completely void of sense for multiple reasons so prevent it.
Set dis_ucode_ldr to true and thus disable the microcode loader by default to address xen pv guests which execute the AP path but not the BSP path. By having it turned off by default, the APs won't run into the loader either. Also, check CPUID(1).ECX[31] which hypervisors set. Well almost, not the xen pv one. That one gets the aforementioned "fix". Also, improve the detection method by caching the final decision whether to continue loading in dis_ucode_ldr and do it once on the BSP. The APs then simply test that value. Signed-off-by: Borislav Petkov Tested-by: Juergen Gross Tested-by: Boris Ostrovsky Acked-by: Juergen Gross Link: http://lkml.kernel.org/r/20161218164414.9649-4-bp@alien8.de Signed-off-by: Thomas Gleixner Cc: Rolf Neugebauer Signed-off-by: Greg Kroah-Hartman commit 0a9b2dec6c12e30895ab7478000420ff2e811dce Author: Josh Poimboeuf Date: Thu Sep 28 16:58:26 2017 -0500 x86/asm: Fix inline asm call constraints for GCC 4.4 commit 520a13c530aeb5f63e011d668c42db1af19ed349 upstream. The kernel test bot (run by Xiaolong Ye) reported that the following commit: f5caf621ee35 ("x86/asm: Fix inline asm call constraints for Clang") is causing double faults in a kernel compiled with GCC 4.4. Linus subsequently diagnosed the crash pattern and the buggy commit and found that the issue is with this code: register unsigned int __asm_call_sp asm("esp"); #define ASM_CALL_CONSTRAINT "+r" (__asm_call_sp) Even on a 64-bit kernel, it's using ESP instead of RSP. That causes GCC to produce the following bogus code: ffffffff8147461d: 89 e0 mov %esp,%eax ffffffff8147461f: 4c 89 f7 mov %r14,%rdi ffffffff81474622: 4c 89 fe mov %r15,%rsi ffffffff81474625: ba 20 00 00 00 mov $0x20,%edx ffffffff8147462a: 89 c4 mov %eax,%esp ffffffff8147462c: e8 bf 52 05 00 callq ffffffff814c98f0 Despite the absurdity of it backing up and restoring the stack pointer for no reason, the bug is actually the fact that it's only backing up and restoring the lower 32 bits of the stack pointer. 
The upper 32 bits are getting cleared out, corrupting the stack pointer. So change the '__asm_call_sp' register variable to be associated with the actual full-size stack pointer. This also requires changing the __ASM_SEL() macro to be based on the actual compiled arch size, rather than the CONFIG value, because CONFIG_X86_64 compiles some files with '-m32' (e.g., realmode and vdso). Otherwise Clang fails to build the kernel because it complains about the use of a 64-bit register (RSP) in a 32-bit file. Reported-and-Bisected-and-Tested-by: kernel test robot Diagnosed-by: Linus Torvalds Signed-off-by: Josh Poimboeuf Cc: Alexander Potapenko Cc: Andrey Ryabinin Cc: Andy Lutomirski Cc: Arnd Bergmann Cc: Dmitriy Vyukov Cc: LKP Cc: Linus Torvalds Cc: Matthias Kaehlcke Cc: Miguel Bernal Marin Cc: Peter Zijlstra Cc: Thomas Gleixner Fixes: f5caf621ee35 ("x86/asm: Fix inline asm call constraints for Clang") Link: http://lkml.kernel.org/r/20170928215826.6sdpmwtkiydiytim@treble Signed-off-by: Ingo Molnar Cc: Matthias Kaehlcke Signed-off-by: Greg Kroah-Hartman commit b671f40419bb0d59c5af69c17af5a86bc467a273 Author: Eric Dumazet Date: Fri Feb 2 10:27:27 2018 -0800 soreuseport: fix mem leak in reuseport_add_sock() [ Upstream commit 4db428a7c9ab07e08783e0fcdc4ca0f555da0567 ] reuseport_add_sock() needs to deal with attaching a socket having its own sk_reuseport_cb, after a prior setsockopt(SO_ATTACH_REUSEPORT_?BPF) Without this fix, not only a WARN_ONCE() was issued, but we were also leaking memory. Thanks to sysbot and Eric Biggers for providing us nice C repros. ------------[ cut here ]------------ socket already in reuseport group WARNING: CPU: 0 PID: 3496 at net/core/sock_reuseport.c:119   reuseport_add_sock+0x742/0x9b0 net/core/sock_reuseport.c:117 Kernel panic - not syncing: panic_on_warn set ... 
CPU: 0 PID: 3496 Comm: syzkaller869503 Not tainted 4.15.0-rc6+ #245 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS   Google 01/01/2011 Call Trace:   __dump_stack lib/dump_stack.c:17 [inline]   dump_stack+0x194/0x257 lib/dump_stack.c:53   panic+0x1e4/0x41c kernel/panic.c:183   __warn+0x1dc/0x200 kernel/panic.c:547   report_bug+0x211/0x2d0 lib/bug.c:184   fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178   fixup_bug arch/x86/kernel/traps.c:247 [inline]   do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296   do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315   invalid_op+0x22/0x40 arch/x86/entry/entry_64.S:1079 Fixes: ef456144da8e ("soreuseport: define reuseport groups") Signed-off-by: Eric Dumazet Reported-by: syzbot+c0ea2226f77a42936bf7@syzkaller.appspotmail.com Acked-by: Craig Gallek Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 5771415d24bf66f199a4a16d900d2c3afaaa6423 Author: Martin KaFai Lau Date: Wed Jan 24 23:15:27 2018 -0800 ipv6: Fix SO_REUSEPORT UDP socket with implicit sk_ipv6only [ Upstream commit 7ece54a60ee2ba7a386308cae73c790bd580589c ] If a sk_v6_rcv_saddr is !IPV6_ADDR_ANY and !IPV6_ADDR_MAPPED, it implicitly implies it is an ipv6only socket. However, in inet6_bind(), this addr_type checking and setting sk->sk_ipv6only to 1 are only done after sk->sk_prot->get_port(sk, snum) has been completed successfully. This inconsistency between sk_v6_rcv_saddr and sk_ipv6only confuses the 'get_port()'. In particular, when binding SO_REUSEPORT UDP sockets, udp_reuseport_add_sock(sk,...) is called. udp_reuseport_add_sock() checks "ipv6_only_sock(sk2) == ipv6_only_sock(sk)" before adding sk to sk2->sk_reuseport_cb. In this case, ipv6_only_sock(sk2) could be 1 while ipv6_only_sock(sk) is still 0 here. The end result is, reuseport_alloc(sk) is called instead of adding sk to the existing sk2->sk_reuseport_cb. It can be reproduced by binding two SO_REUSEPORT UDP sockets on an IPv6 address (!ANY and !MAPPED). 
Only one of the sockets will receive packets.

The fix is to set the implicit sk_ipv6only before calling get_port(). The original sk_ipv6only has to be saved such that it can be restored in case get_port() fails. The situation is similar to the inet_reset_saddr(sk) after get_port() has failed.

Thanks to Calvin Owens who created an easy reproduction which leads to a fix.

Fixes: e32ea7e74727 ("soreuseport: fast reuseport UDP socket selection")
Signed-off-by: Martin KaFai Lau
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

commit fa46d1437fabe9da3a3c084cef6ac310ecadb9f8
Author: Paolo Abeni
Date: Fri Feb 2 16:02:22 2018 +0100

cls_u32: add missing RCU annotation.

[ Upstream commit 058a6c033488494a6b1477b05fe8e1a16e344462 ]

In a couple of points of the control path, n->ht_down is currently accessed without the required RCU annotation. The accesses are safe, but sparse complains. Since we already hold the rtnl lock, let's use rtnl_dereference().

Fixes: a1b7c5fd7fe9 ("net: sched: add cls_u32 offload hooks for netdevs")
Fixes: de5df63228fc ("net: sched: cls_u32 changes to knode must appear atomic to readers")
Signed-off-by: Paolo Abeni
Acked-by: Cong Wang
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

commit b980f718f52561b58072e0e675320fd21d71b94a
Author: Neal Cardwell
Date: Wed Jan 31 15:43:05 2018 -0500

tcp_bbr: fix pacing_gain to always be unity when using lt_bw

[ Upstream commit 3aff3b4b986e51bcf4ab249e5d48d39596e0df6a ]

This commit fixes the pacing_gain to remain at BBR_UNIT (1.0) when using lt_bw and returning from the PROBE_RTT state to PROBE_BW.

Previously, when using lt_bw, upon exiting PROBE_RTT and entering PROBE_BW the bbr_reset_probe_bw_mode() code could sometimes randomly end up with a cycle_idx of 0 and hence have bbr_advance_cycle_phase() set a pacing gain above 1.0.
In such cases this would result in a pacing rate that is 1.25x higher than intended, potentially resulting in a high loss rate for a little while until we stop using the lt_bw a bit later.

This commit is a stable candidate for kernels back as far as 4.9.

Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control")
Signed-off-by: Neal Cardwell
Signed-off-by: Yuchung Cheng
Signed-off-by: Soheil Hassas Yeganeh
Reported-by: Beyers Cronje
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

commit 73adb3b74efd2809735d2063b8fdaaf762bdd071
Author: Jason Wang
Date: Thu Jan 25 22:03:52 2018 +0800

vhost_net: stop device during reset owner

[ Upstream commit 4cd879515d686849eec5f718aeac62a70b067d82 ]

We don't stop the device before resetting the owner, which means we could try to serve a virtqueue kick before dev->worker is reset. This results in a warning, since the work was still pending on the llist during the owner reset. Fix this by stopping the device during owner reset.

Reported-by: syzbot+eb17c6162478cc50632c@syzkaller.appspotmail.com
Fixes: 3a4d5c94e9593 ("vhost_net: a kernel-level virtio server")
Signed-off-by: Jason Wang
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

commit ee46a8614204e136e7ec6426e9c696bd21901405
Author: Li RongQing
Date: Fri Jan 26 16:40:41 2018 +0800

tcp: release sk_frag.page in tcp_disconnect

[ Upstream commit 9b42d55a66d388e4dd5550107df051a9637564fc ]

A socket can be disconnected and transformed back into a listening socket. If sk_frag.page is not released, it will be cloned into a new socket by sk_clone_lock, but the reference count of this page is increased, leading to a use-after-free or double-free issue.

Signed-off-by: Li RongQing
Cc: Eric Dumazet
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

commit 5db5cabbf09dd793bf42e65cea9abe66fad62c13
Author: Chunhao Lin
Date: Wed Jan 31 01:32:36 2018 +0800

r8169: fix RTL8168EP take too long to complete driver initialization.
[ Upstream commit 086ca23d03c0d2f4088f472386778d293e15c5f6 ]

The driver checks the wrong register bit in rtl_ocp_tx_cond(), which keeps the driver waiting until timeout. Fix this by waiting for the right register bit.

Signed-off-by: Chunhao Lin
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

commit 9f2f873d5a1c3f6b22b18db0e4c9f3a30164b8f0
Author: Kristian Evensen
Date: Tue Jan 30 14:12:55 2018 +0100

qmi_wwan: Add support for Quectel EP06

[ Upstream commit c0b91a56a2e57a5a370655b25d677ae0ebf8a2d0 ]

The Quectel EP06 is a Cat. 6 LTE modem. It uses the same interface as the EC20/EC25 for QMI, and requires the same "set DTR"-quirk to work.

Signed-off-by: Kristian Evensen
Acked-by: Bjørn Mork
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

commit 97fe899816a68e3970b877e766bcb8d66013076b
Author: Junxiao Bi
Date: Mon Jan 29 17:53:42 2018 +0800

qlcnic: fix deadlock bug

[ Upstream commit 233ac3891607f501f08879134d623b303838f478 ]

The following soft lockup was caught. This is a deadlock caused by recursive locking. Process kworker/u40:1:28016 was holding spin lock "mbx->queue_lock" in qlcnic_83xx_mailbox_worker(), while a softirq came in and asked for the same spin lock in qlcnic_83xx_enqueue_mbx_cmd(). This lock should be held with bottom halves disabled.

[161846.962125] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s!
[kworker/u40:1:28016] [161846.962367] Modules linked in: tun ocfs2 xen_netback xen_blkback xen_gntalloc xen_gntdev xen_evtchn xenfs xen_privcmd autofs4 ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs bnx2fc fcoe libfcoe libfc sunrpc 8021q mrp garp bridge stp llc bonding dm_round_robin dm_multipath iTCO_wdt iTCO_vendor_support pcspkr sb_edac edac_core i2c_i801 shpchp lpc_ich mfd_core ioatdma ipmi_devintf ipmi_si ipmi_msghandler sg ext4 jbd2 mbcache2 sr_mod cdrom sd_mod igb i2c_algo_bit i2c_core ahci libahci megaraid_sas ixgbe dca ptp pps_core vxlan udp_tunnel ip6_udp_tunnel qla2xxx scsi_transport_fc qlcnic crc32c_intel be2iscsi bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi ipv6 cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi dm_mirror dm_region_hash dm_log dm_mod [161846.962454] [161846.962460] CPU: 1 PID: 28016 Comm: kworker/u40:1 Not tainted 4.1.12-94.5.9.el6uek.x86_64 #2 [161846.962463] Hardware name: Oracle Corporation SUN SERVER X4-2L /ASSY,MB,X4-2L , BIOS 26050100 09/19/2017 [161846.962489] Workqueue: qlcnic_mailbox qlcnic_83xx_mailbox_worker [qlcnic] [161846.962493] task: ffff8801f2e34600 ti: ffff88004ca5c000 task.ti: ffff88004ca5c000 [161846.962496] RIP: e030:[] [] xen_hypercall_sched_op+0xa/0x20 [161846.962506] RSP: e02b:ffff880202e43388 EFLAGS: 00000206 [161846.962509] RAX: 0000000000000000 RBX: ffff8801f6996b70 RCX: ffffffff810013aa [161846.962511] RDX: ffff880202e433cc RSI: ffff880202e433b0 RDI: 0000000000000003 [161846.962513] RBP: ffff880202e433d0 R08: 0000000000000000 R09: ffff8801fe893200 [161846.962516] R10: ffff8801fe400538 R11: 0000000000000206 R12: ffff880202e4b000 [161846.962518] R13: 0000000000000050 R14: 0000000000000001 R15: 000000000000020d [161846.962528] FS: 0000000000000000(0000) GS:ffff880202e40000(0000) knlGS:ffff880202e40000 [161846.962531] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033 [161846.962533] CR2: 0000000002612640 CR3: 00000001bb796000 CR4: 0000000000042660 
[161846.962536] Stack: [161846.962538] ffff880202e43608 0000000000000000 ffffffff813f0442 ffff880202e433b0 [161846.962543] 0000000000000000 ffff880202e433cc ffffffff00000001 0000000000000000 [161846.962547] 00000009813f03d6 ffff880202e433e0 ffffffff813f0460 ffff880202e43440 [161846.962552] Call Trace: [161846.962555] [161846.962565] [] ? xen_poll_irq_timeout+0x42/0x50 [161846.962570] [] xen_poll_irq+0x10/0x20 [161846.962578] [] xen_lock_spinning+0xe2/0x110 [161846.962583] [] __raw_callee_save_xen_lock_spinning+0x11/0x20 [161846.962592] [] ? _raw_spin_lock+0x57/0x80 [161846.962609] [] qlcnic_83xx_enqueue_mbx_cmd+0x7c/0xe0 [qlcnic] [161846.962623] [] qlcnic_83xx_issue_cmd+0x58/0x210 [qlcnic] [161846.962636] [] qlcnic_83xx_sre_macaddr_change+0x162/0x1d0 [qlcnic] [161846.962649] [] qlcnic_83xx_change_l2_filter+0x2b/0x30 [qlcnic] [161846.962657] [] ? __skb_flow_dissect+0x18b/0x650 [161846.962670] [] qlcnic_send_filter+0x205/0x250 [qlcnic] [161846.962682] [] qlcnic_xmit_frame+0x547/0x7b0 [qlcnic] [161846.962691] [] xmit_one+0x82/0x1a0 [161846.962696] [] dev_hard_start_xmit+0x50/0xa0 [161846.962701] [] sch_direct_xmit+0x112/0x220 [161846.962706] [] __dev_queue_xmit+0x1df/0x5e0 [161846.962710] [] dev_queue_xmit_sk+0x13/0x20 [161846.962721] [] bond_dev_queue_xmit+0x35/0x80 [bonding] [161846.962729] [] __bond_start_xmit+0x1cb/0x210 [bonding] [161846.962736] [] bond_start_xmit+0x31/0x60 [bonding] [161846.962740] [] xmit_one+0x82/0x1a0 [161846.962745] [] dev_hard_start_xmit+0x50/0xa0 [161846.962749] [] __dev_queue_xmit+0x4ee/0x5e0 [161846.962754] [] dev_queue_xmit_sk+0x13/0x20 [161846.962760] [] vlan_dev_hard_start_xmit+0xb2/0x150 [8021q] [161846.962764] [] xmit_one+0x82/0x1a0 [161846.962769] [] dev_hard_start_xmit+0x50/0xa0 [161846.962773] [] __dev_queue_xmit+0x4ee/0x5e0 [161846.962777] [] dev_queue_xmit_sk+0x13/0x20 [161846.962789] [] br_dev_queue_push_xmit+0x54/0xa0 [bridge] [161846.962797] [] br_forward_finish+0x2f/0x90 [bridge] [161846.962807] [] ? 
ttwu_do_wakeup+0x1d/0x100 [161846.962811] [] ? __alloc_skb+0x8b/0x1f0 [161846.962818] [] __br_forward+0x8d/0x120 [bridge] [161846.962822] [] ? __kmalloc_reserve+0x3b/0xa0 [161846.962829] [] ? update_rq_runnable_avg+0xee/0x230 [161846.962836] [] br_forward+0x96/0xb0 [bridge] [161846.962845] [] br_handle_frame_finish+0x1ae/0x420 [bridge] [161846.962853] [] br_handle_frame+0x17f/0x260 [bridge] [161846.962862] [] ? br_handle_frame_finish+0x420/0x420 [bridge] [161846.962867] [] __netif_receive_skb_core+0x1f7/0x870 [161846.962872] [] __netif_receive_skb+0x22/0x70 [161846.962877] [] netif_receive_skb_internal+0x23/0x90 [161846.962884] [] ? xenvif_idx_release+0xea/0x100 [xen_netback] [161846.962889] [] ? _raw_spin_unlock_irqrestore+0x20/0x50 [161846.962893] [] netif_receive_skb_sk+0x24/0x90 [161846.962899] [] xenvif_tx_submit+0x2ca/0x3f0 [xen_netback] [161846.962906] [] xenvif_tx_action+0x9c/0xd0 [xen_netback] [161846.962915] [] xenvif_poll+0x35/0x70 [xen_netback] [161846.962920] [] napi_poll+0xcb/0x1e0 [161846.962925] [] net_rx_action+0x90/0x1c0 [161846.962931] [] __do_softirq+0x10a/0x350 [161846.962938] [] irq_exit+0x125/0x130 [161846.962943] [] xen_evtchn_do_upcall+0x39/0x50 [161846.962950] [] xen_do_hypervisor_callback+0x1e/0x40 [161846.962952] [161846.962959] [] ? _raw_spin_lock+0x4a/0x80 [161846.962964] [] ? _raw_spin_lock_irqsave+0x1e/0xa0 [161846.962978] [] ? qlcnic_83xx_mailbox_worker+0xb9/0x2a0 [qlcnic] [161846.962991] [] ? process_one_work+0x151/0x4b0 [161846.962995] [] ? check_events+0x12/0x20 [161846.963001] [] ? worker_thread+0x120/0x480 [161846.963005] [] ? __schedule+0x30b/0x890 [161846.963010] [] ? process_one_work+0x4b0/0x4b0 [161846.963015] [] ? process_one_work+0x4b0/0x4b0 [161846.963021] [] ? kthread+0xce/0xf0 [161846.963025] [] ? kthread_freezable_should_stop+0x70/0x70 [161846.963031] [] ? ret_from_fork+0x42/0x70 [161846.963035] [] ? 
kthread_freezable_should_stop+0x70/0x70
[161846.963037] Code: cc 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc

Signed-off-by: Junxiao Bi
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

commit ce43c07fcef8cf5cdbc7a45db7a6b4a1a7473905
Author: Eric Dumazet
Date: Thu Feb 1 10:26:57 2018 -0800

net: igmp: add a missing rcu locking section

[ Upstream commit e7aadb27a5415e8125834b84a74477bfbee4eff5 ]

Newly added igmpv3_get_srcaddr() needs to be called under rcu lock.

Timer callbacks do not ensure this locking.

=============================
WARNING: suspicious RCU usage
4.15.0+ #200 Not tainted
-----------------------------
./include/linux/inetdevice.h:216 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
3 locks held by syzkaller616973/4074:
#0: (&mm->mmap_sem){++++}, at: [<00000000bfce669e>] __do_page_fault+0x32d/0xc90 arch/x86/mm/fault.c:1355
#1: ((&im->timer)){+.-.}, at: [<00000000619d2f71>] lockdep_copy_map include/linux/lockdep.h:178 [inline]
#1: ((&im->timer)){+.-.}, at: [<00000000619d2f71>] call_timer_fn+0x1c6/0x820 kernel/time/timer.c:1316
#2: (&(&im->lock)->rlock){+.-.}, at: [<000000005f833c5c>] spin_lock_bh include/linux/spinlock.h:315 [inline]
#2: (&(&im->lock)->rlock){+.-.}, at: [<000000005f833c5c>] igmpv3_send_report+0x98/0x5b0 net/ipv4/igmp.c:600

stack backtrace:
CPU: 0 PID: 4074 Comm: syzkaller616973 Not tainted 4.15.0+ #200
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4592
__in_dev_get_rcu include/linux/inetdevice.h:216 [inline]
igmpv3_get_srcaddr net/ipv4/igmp.c:329 [inline]
igmpv3_newpack+0xeef/0x12e0 net/ipv4/igmp.c:389
add_grhead.isra.27+0x235/0x300 net/ipv4/igmp.c:432
add_grec+0xbd3/0x1170 net/ipv4/igmp.c:565
igmpv3_send_report+0xd5/0x5b0 net/ipv4/igmp.c:605
igmp_send_report+0xc43/0x1050 net/ipv4/igmp.c:722
igmp_timer_expire+0x322/0x5c0 net/ipv4/igmp.c:831
call_timer_fn+0x228/0x820 kernel/time/timer.c:1326
expire_timers kernel/time/timer.c:1363 [inline]
__run_timers+0x7ee/0xb70 kernel/time/timer.c:1666
run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
__do_softirq+0x2d7/0xb85 kernel/softirq.c:285
invoke_softirq kernel/softirq.c:365 [inline]
irq_exit+0x1cc/0x200 kernel/softirq.c:405
exiting_irq arch/x86/include/asm/apic.h:541 [inline]
smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052
apic_timer_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:938

Fixes: a46182b00290 ("net: igmp: Use correct source address on IGMPv3 reports")
Signed-off-by: Eric Dumazet
Reported-by: syzbot
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

commit 7d3d60ef2256f5fcfd4929963dd67f58028ebeee
Author: Nikolay Aleksandrov
Date: Wed Jan 31 16:29:30 2018 +0200

ip6mr: fix stale iterator

[ Upstream commit 4adfa79fc254efb7b0eb3cd58f62c2c3f805f1ba ]

When we dump the ip6mr mfc entries via proc, we initialize an iterator with the table to dump but we don't clear the cache pointer which might be initialized from a prior read on the same descriptor that ended. This can result in lock imbalance (an unnecessary unlock) leading to other crashes and hangs. Clear the cache pointer like ipmr does to fix the issue.

Thanks for the reliable reproducer.

Here's syzbot's trace:
WARNING: bad unlock balance detected!
4.15.0-rc3+ #128 Not tainted
syzkaller971460/3195 is trying to release lock (mrt_lock) at:
[<000000006898068d>] ipmr_mfc_seq_stop+0xe1/0x130 net/ipv6/ip6mr.c:553
but there are no more locks to release!
other info that might help us debug this:
1 lock held by syzkaller971460/3195:
#0: (&p->lock){+.+.}, at: [<00000000744a6565>] seq_read+0xd5/0x13d0 fs/seq_file.c:165

stack backtrace:
CPU: 1 PID: 3195 Comm: syzkaller971460 Not tainted 4.15.0-rc3+ #128
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
print_unlock_imbalance_bug+0x12f/0x140 kernel/locking/lockdep.c:3561
__lock_release kernel/locking/lockdep.c:3775 [inline]
lock_release+0x5f9/0xda0 kernel/locking/lockdep.c:4023
__raw_read_unlock include/linux/rwlock_api_smp.h:225 [inline]
_raw_read_unlock+0x1a/0x30 kernel/locking/spinlock.c:255
ipmr_mfc_seq_stop+0xe1/0x130 net/ipv6/ip6mr.c:553
traverse+0x3bc/0xa00 fs/seq_file.c:135
seq_read+0x96a/0x13d0 fs/seq_file.c:189
proc_reg_read+0xef/0x170 fs/proc/inode.c:217
do_loop_readv_writev fs/read_write.c:673 [inline]
do_iter_read+0x3db/0x5b0 fs/read_write.c:897
compat_readv+0x1bf/0x270 fs/read_write.c:1140
do_compat_preadv64+0xdc/0x100 fs/read_write.c:1189
C_SYSC_preadv fs/read_write.c:1209 [inline]
compat_SyS_preadv+0x3b/0x50 fs/read_write.c:1203
do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
entry_SYSENTER_compat+0x51/0x60 arch/x86/entry/entry_64_compat.S:125
RIP: 0023:0xf7f73c79
RSP: 002b:00000000e574a15c EFLAGS: 00000292 ORIG_RAX: 000000000000014d
RAX: ffffffffffffffda RBX: 000000000000000f RCX: 0000000020a3afb0
RDX: 0000000000000001 RSI: 0000000000000067 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
BUG: sleeping function called from invalid context at lib/usercopy.c:25
in_atomic(): 1, irqs_disabled(): 0, pid: 3195, name: syzkaller971460
INFO: lockdep is turned off.
CPU: 1 PID: 3195 Comm: syzkaller971460 Not tainted 4.15.0-rc3+ #128
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
___might_sleep+0x2b2/0x470 kernel/sched/core.c:6060
__might_sleep+0x95/0x190 kernel/sched/core.c:6013
__might_fault+0xab/0x1d0 mm/memory.c:4525
_copy_to_user+0x2c/0xc0 lib/usercopy.c:25
copy_to_user include/linux/uaccess.h:155 [inline]
seq_read+0xcb4/0x13d0 fs/seq_file.c:279
proc_reg_read+0xef/0x170 fs/proc/inode.c:217
do_loop_readv_writev fs/read_write.c:673 [inline]
do_iter_read+0x3db/0x5b0 fs/read_write.c:897
compat_readv+0x1bf/0x270 fs/read_write.c:1140
do_compat_preadv64+0xdc/0x100 fs/read_write.c:1189
C_SYSC_preadv fs/read_write.c:1209 [inline]
compat_SyS_preadv+0x3b/0x50 fs/read_write.c:1203
do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
entry_SYSENTER_compat+0x51/0x60 arch/x86/entry/entry_64_compat.S:125
RIP: 0023:0xf7f73c79
RSP: 002b:00000000e574a15c EFLAGS: 00000292 ORIG_RAX: 000000000000014d
RAX: ffffffffffffffda RBX: 000000000000000f RCX: 0000000020a3afb0
RDX: 0000000000000001 RSI: 0000000000000067 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
WARNING: CPU: 1 PID: 3195 at lib/usercopy.c:26 _copy_to_user+0xb5/0xc0 lib/usercopy.c:26

Reported-by: syzbot
Signed-off-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

commit ffcf167d348ee5e8ddbe2c4116b71ebfff2258e0
Author: Sebastian Andrzej Siewior
Date: Thu Jan 11 18:57:26 2018 +0100

serial: core: mark port as initialized after successful IRQ change

commit 44117a1d1732c513875d5a163f10d9adbe866c08 upstream.

setserial changes the IRQ via uart_set_info().
It invokes uart_shutdown(), which frees the currently used IRQ and clears TTY_PORT_INITIALIZED. It then updates the IRQ number and invokes uart_startup() before returning to the caller, leaving TTY_PORT_INITIALIZED cleared.

The next open will crash with
| list_add double add: new=ffffffff839fcc98, prev=ffffffff839fcc98, next=ffffffff839fcc98.
since the close from the IOCTL won't free the IRQ (and clean the list) due to the TTY_PORT_INITIALIZED check in uart_shutdown().

There is the same pattern in uart_do_autoconfig() and I *think* it also needs to set TTY_PORT_INITIALIZED there. Is there a reason why uart_startup() does not set the flag by itself after the IRQ has been acquired (since it is cleared in uart_shutdown)?

Signed-off-by: Sebastian Andrzej Siewior
Signed-off-by: Greg Kroah-Hartman

commit 400d3c8b0c7f6c942cd9bf201da1f32da9bdcc9f
Author: Hugh Dickins
Date: Mon Jan 29 18:17:58 2018 -0800

kaiser: allocate pgd with order 0 when pti=off

The 4.9.77 version of "x86/pti/efi: broken conversion from efi to kernel page table" looked nicer than the 4.4.112 version, but was suboptimal on machines booted with "pti=off" (or on AMD machines): it allocated pgd with an order 1 page whatever the setting of kaiser_enabled.

Fix that by moving the definition of PGD_ALLOCATION_ORDER from asm/pgalloc.h to asm/pgtable.h, which already defines kaiser_enabled.

Fixes: 1b92c48a2eeb ("x86/pti/efi: broken conversion from efi to kernel page table")
Reviewed-by: Pavel Tatashin
Cc: Steven Sistare
Cc: Jiri Kosina
Signed-off-by: Hugh Dickins
Signed-off-by: Greg Kroah-Hartman

commit ae1fc8de51b10dba40cbd54959b5ba0a311c0861
Author: Dave Hansen
Date: Mon Jan 29 18:17:26 2018 -0800

x86/pti: Make unpoison of pgd for trusted boot work for real

commit 445b69e3b75e42362a5bdc13c8b8f61599e2228a upstream

The initial fix for trusted boot and PTI potentially misses the pgd clearing if pud_alloc() sets a PGD.
It probably works in *practice* because for two adjacent calls to map_tboot_page() that share a PGD entry, the first will clear NX, *then* allocate and set the PGD (without NX clear). The second call will *not* allocate but will clear the NX bit.

Defer the NX clearing to a point after it is known that all top-level allocations have occurred. Add a comment to clarify why.

[ tglx: Massaged changelog ]

[ hughd notes: I have not tested tboot, but this looks to me as necessary and as safe in old-Kaiser backports as it is upstream; I'm not submitting the commit-to-be-fixed 262b6b30087, since it was undone by 445b69e3b75e, and makes conflict trouble because of 5-level's p4d versus 4-level's pgd.]

Fixes: 262b6b30087 ("x86/tboot: Unbreak tboot with PTI enabled")
Signed-off-by: Dave Hansen
Signed-off-by: Thomas Gleixner
Reviewed-by: Andrea Arcangeli
Cc: Jon Masters
Cc: Tim Chen
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: peterz@infradead.org
Cc: ning.sun@intel.com
Cc: tboot-devel@lists.sourceforge.net
Cc: andi@firstfloor.org
Cc: luto@kernel.org
Cc: law@redhat.com
Cc: pbonzini@redhat.com
Cc: torvalds@linux-foundation.org
Cc: gregkh@linux-foundation.org
Cc: dwmw@amazon.co.uk
Cc: nickc@redhat.com
Link: https://lkml.kernel.org/r/20180110224939.2695CD47@viggo.jf.intel.com
Cc: Jiri Kosina
Signed-off-by: Hugh Dickins
Signed-off-by: Greg Kroah-Hartman

commit 0a61cd6caed72ecac8cbb637f6498e17a33d73b1
Author: Hugh Dickins
Date: Mon Jan 29 18:16:55 2018 -0800

kaiser: fix intel_bts perf crashes

Vince reported perf_fuzzer quickly locks up on 4.15-rc7 with PTI; Robert reported Bad RIP with KPTI and Intel BTS also on 4.15-rc7:
honggfuzz -f /tmp/somedirectorywithatleastonefile \
--linux_perf_bts_edge -s -- /bin/true
(honggfuzz from https://github.com/google/honggfuzz) crashed with
BUG: unable to handle kernel paging request at ffff9d3215100000
(then narrowed it down to
perf record --per-thread -e intel_bts//u -- /bin/ls).
The intel_bts driver does not use the 'normal' BTS buffer which is exposed through kaiser_add_mapping(), but instead uses the memory allocated for the perf AUX buffer.

This obviously comes apart when using PTI, because then the kernel mapping, which includes that AUX buffer memory, disappears while switched to user page tables.

Easily fixed in old-Kaiser backports, by applying kaiser_add_mapping() to those pages; perhaps not so easy for upstream, where 4.15-rc8 commit 99a9dc98ba52 ("x86,perf: Disable intel_bts when PTI") disables it for now.

Slightly reorganized surrounding code in bts_buffer_setup_aux(), so it can better match bts_buffer_free_aux(): free_aux with an #ifdef to avoid the loop when PTI is off, but setup_aux needs to loop anyway (and kaiser_add_mapping() is cheap when PTI config is off or "pti=off").

Reported-by: Vince Weaver
Reported-by: Robert Święcki
Analyzed-by: Peter Zijlstra
Analyzed-by: Stephane Eranian
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Andy Lutomirski
Cc: Alexander Shishkin
Cc: Linus Torvalds
Cc: Vince Weaver
Cc: Jiri Kosina
Signed-off-by: Hugh Dickins
Signed-off-by: Greg Kroah-Hartman

commit 374c84de94af798cad81b5e09b81e901966a4eb0
Author: Jesse Chan
Date: Sun Nov 19 23:45:49 2017 -0800

ASoC: pcm512x: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE

commit 0cab20cec0b663b7be8e2be5998d5a4113647f86 upstream.

This change resolves a new compile-time warning when built as a loadable module:

WARNING: modpost: missing MODULE_LICENSE() in sound/soc/codecs/snd-soc-pcm512x-spi.o
see include/linux/module.h for more information

This adds the license as "GPL v2", which matches the header of the file.

MODULE_DESCRIPTION and MODULE_AUTHOR are also added.
Signed-off-by: Jesse Chan
Signed-off-by: Mark Brown
Signed-off-by: Greg Kroah-Hartman

commit 0ee4f5e7bbffc07d98cb7626446175700ef5fcce
Author: Jesse Chan
Date: Mon Nov 20 12:58:03 2017 -0800

pinctrl: pxa: pxa2xx: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE

commit 0b9335cbd38e3bd2025bcc23b5758df4ac035f75 upstream.

This change resolves a new compile-time warning when built as a loadable module:

WARNING: modpost: missing MODULE_LICENSE() in drivers/pinctrl/pxa/pinctrl-pxa2xx.o
see include/linux/module.h for more information

This adds the license as "GPL v2", which matches the header of the file.

MODULE_DESCRIPTION and MODULE_AUTHOR are also added.

Signed-off-by: Jesse Chan
Signed-off-by: Linus Walleij
Signed-off-by: Greg Kroah-Hartman

commit 781a2d683110864fc136d85df7a2fb3fc145e5e8
Author: Jesse Chan
Date: Wed Jan 10 17:41:10 2018 +0100

auxdisplay: img-ascii-lcd: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE

commit 09c479f7f1fbfaf848e5813996793966cd50be81 upstream.

This change resolves a new compile-time warning when built as a loadable module:

WARNING: modpost: missing MODULE_LICENSE() in drivers/auxdisplay/img-ascii-lcd.o
see include/linux/module.h for more information

This adds the license as "GPL", which matches the header of the file.

MODULE_DESCRIPTION and MODULE_AUTHOR are also added.

Signed-off-by: Jesse Chan
Signed-off-by: Arnd Bergmann
Signed-off-by: Greg Kroah-Hartman

commit 9fed3978c39b27b27eeb676301b7ba960a74170c
Author: Michael Ellerman
Date: Tue Jan 16 22:17:18 2018 +1100

powerpc/64s: Allow control of RFI flush via debugfs

commit 236003e6b5443c45c18e613d2b0d776a9f87540e upstream.

Expose the state of the RFI flush (enabled/disabled) via debugfs, and allow it to be enabled/disabled at runtime.
eg: $ cat /sys/kernel/debug/powerpc/rfi_flush
1
$ echo 0 > /sys/kernel/debug/powerpc/rfi_flush
$ cat /sys/kernel/debug/powerpc/rfi_flush
0

Signed-off-by: Michael Ellerman
Reviewed-by: Nicholas Piggin
Signed-off-by: Greg Kroah-Hartman

commit 1f0c936f431d98611fff5ef7082380f087da1578
Author: Michael Ellerman
Date: Tue Jan 16 21:20:05 2018 +1100

powerpc/64s: Wire up cpu_show_meltdown()

commit fd6e440f20b1a4304553775fc55938848ff617c9 upstream.

The recent commit 87590ce6e373 ("sysfs/cpu: Add vulnerability folder") added a generic folder and set of files for reporting information on CPU vulnerabilities. One of those was for meltdown:

/sys/devices/system/cpu/vulnerabilities/meltdown

This commit wires up that file for 64-bit Book3S powerpc.

For now we default to "Vulnerable" unless the RFI flush is enabled. That may not actually be true on all hardware; further patches will refine the reporting based on the CPU/platform etc. But for now we default to being pessimists.

Signed-off-by: Michael Ellerman
Signed-off-by: Greg Kroah-Hartman

commit 6aec12e1869e31839f317c02f81a92c393222f71
Author: Oliver O'Halloran
Date: Wed Jan 10 03:07:15 2018 +1100

powerpc/powernv: Check device-tree for RFI flush settings

commit 6e032b350cd1fdb830f18f8320ef0e13b4e24094 upstream.

New device-tree properties are available which tell the hypervisor settings related to the RFI flush. Use them to determine the appropriate flush instruction to use, and whether the flush is required.

Signed-off-by: Oliver O'Halloran
Signed-off-by: Michael Ellerman
Signed-off-by: Greg Kroah-Hartman

commit 7db0fff62f52c3f23c39262b9e037d8b43dfc88d
Author: Michael Neuling
Date: Wed Jan 10 03:07:15 2018 +1100

powerpc/pseries: Query hypervisor for RFI flush settings

commit 8989d56878a7735dfdb234707a2fee6faf631085 upstream.

A new hypervisor call is available which tells the guest settings related to the RFI flush. Use it to query the appropriate flush instruction(s), and whether the flush is required.
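
As a rough illustration of the decision described above, the following Python model sketches how a guest might turn the two words returned by the hypervisor into a set of flush instructions and an enable flag. It is a sketch under stated assumptions, not the kernel code: the names FlushType, choose_rfi_flush and the CHAR_/BEHAV_ constants are hypothetical, the bit positions are illustrative only, and the software fallback mirrors the displacement flush mentioned in the L1-D flush commit elsewhere in this log.

```python
from enum import IntFlag


class FlushType(IntFlag):
    """Hypothetical names for the flush mechanisms a guest could patch in."""
    NONE = 0
    FALLBACK = 1   # software displacement flush
    ORI = 2        # special nop-style flush instruction
    MTTRIG = 4     # a second, alternative flush instruction


# Illustrative bit positions only; the real hcall flag layout is an assumption.
CHAR_L1D_FLUSH_ORI = 1 << 61
CHAR_L1D_FLUSH_TRIG = 1 << 60
BEHAV_L1D_FLUSH_REQUIRED = 1 << 62


def choose_rfi_flush(hcall_ok, character=0, behaviour=0):
    """Return (flush types to patch in, whether the flush is enabled)."""
    # Pessimistic default: if the hypervisor cannot be asked, flush in software.
    types = FlushType.FALLBACK
    enable = True
    if hcall_ok:
        types = FlushType.NONE
        # More than one instruction may be selected, e.g. to allow migration
        # between CPUs that use different flush instructions.
        if character & CHAR_L1D_FLUSH_TRIG:
            types |= FlushType.MTTRIG
        if character & CHAR_L1D_FLUSH_ORI:
            types |= FlushType.ORI
        # Nothing advertised: fall back to the software flush.
        if types == FlushType.NONE:
            types = FlushType.FALLBACK
        # The hypervisor can also report that no flush is required.
        if not behaviour & BEHAV_L1D_FLUSH_REQUIRED:
            enable = False
    return types, enable
```

Keeping the fallback as the default means a failed or unsupported query still leaves the mitigation active, which matches the pessimistic stance taken elsewhere in this series.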
Signed-off-by: Michael Neuling
Signed-off-by: Michael Ellerman
Signed-off-by: Greg Kroah-Hartman

commit 0ef9f8289edf1d335fb2bd3c162521528823b585
Author: Michael Ellerman
Date: Wed Jan 10 03:07:15 2018 +1100

powerpc/64s: Support disabling RFI flush with no_rfi_flush and nopti

commit bc9c9304a45480797e13a8e1df96ffcf44fb62fe upstream.

Because there may be some performance overhead of the RFI flush, add kernel command line options to disable it.

We add a sensibly named 'no_rfi_flush' option, but we also hijack the x86 option 'nopti'. The RFI flush is not the same as KPTI, but if we see 'nopti' we can guess that the user is trying to avoid any overhead of Meltdown mitigations, and it means we don't have to educate everyone about a different command line option.

Signed-off-by: Michael Ellerman
Signed-off-by: Greg Kroah-Hartman

commit c3b82ebee6e0d92431c92ee80393c023d550c8a1
Author: Michael Ellerman
Date: Wed Jan 10 03:07:15 2018 +1100

powerpc/64s: Add support for RFI flush of L1-D cache

commit aa8a5e0062ac940f7659394f4817c948dc8c0667 upstream.

On some CPUs we can prevent the Meltdown vulnerability by flushing the L1-D cache on exit from kernel to user mode, and from hypervisor to guest.

This is known to be the case on at least Power7, Power8 and Power9. At this time we do not know the status of the vulnerability on other CPUs such as the 970 (Apple G5), pasemi CPUs (AmigaOne X1000) or Freescale CPUs. As more information comes to light we can enable this, or other mechanisms on those CPUs.

The vulnerability occurs when the load of an architecturally inaccessible memory region (eg. userspace load of kernel memory) is speculatively executed to the point where its result can influence the address of a subsequent speculatively executed load.

In order for that to happen, the first load must hit in the L1, because before the load is sent to the L2 the permission check is performed. Therefore if no kernel addresses hit in the L1 the vulnerability can not occur.
We can ensure that is the case by flushing the L1 whenever we return to userspace. Similarly for hypervisor vs guest.

In order to flush the L1-D cache on exit, we add a section of nops at each (h)rfi location that returns to a lower privileged context, and patch that with some sequence. Newer firmwares are able to advertise to us that there is a special nop instruction that flushes the L1-D. If we do not see that advertised, we fall back to doing a displacement flush in software.

For guest kernels we support migration between some CPU versions, and different CPUs may use different flush instructions. So that we are prepared to migrate to a machine with a different flush instruction activated, we may have to patch more than one flush instruction at boot if the hypervisor tells us to.

In the end this patch is mostly the work of Nicholas Piggin and Michael Ellerman. However a cast of thousands contributed to analysis of the issue, earlier versions of the patch, backports, testing etc. Many thanks to all of them.

Tested-by: Jon Masters
Signed-off-by: Nicholas Piggin
Signed-off-by: Michael Ellerman
[Balbir - back ported to stable with changes]
Signed-off-by: Balbir Singh
Signed-off-by: Greg Kroah-Hartman

commit 48cc95d4e4d6a1265b7f728182d6dc62de849b05
Author: Nicholas Piggin
Date: Wed Jan 10 03:07:15 2018 +1100

powerpc/64s: Convert slb_miss_common to use RFI_TO_USER/KERNEL

commit c7305645eb0c1621351cfc104038831ae87c0053 upstream.

In the SLB miss handler we may be returning to user or kernel. We need to add a check early on and save the result in the cr4 register, and then we bifurcate the return path based on that.
Signed-off-by: Nicholas Piggin
Signed-off-by: Michael Ellerman
Signed-off-by: Nicholas Piggin
[mpe: Backport to 4.4 based on patch from Balbir]
Signed-off-by: Michael Ellerman
Signed-off-by: Greg Kroah-Hartman

commit 00e40620a51ebee4ea002ec2efcd64f1960cb964
Author: Nicholas Piggin
Date: Wed Jan 10 03:07:15 2018 +1100

powerpc/64: Convert the syscall exit path to use RFI_TO_USER/KERNEL

commit b8e90cb7bc04a509e821e82ab6ed7a8ef11ba333 upstream.

In the syscall exit path we may be returning to user or kernel context. We already have a test for that, because we conditionally restore r13. So use that existing test and branch, and bifurcate the return based on that.

Signed-off-by: Nicholas Piggin
Signed-off-by: Michael Ellerman
Signed-off-by: Greg Kroah-Hartman

commit 9d914324d966497f4d40cfa9333cbe55150cc09b
Author: Nicholas Piggin
Date: Wed Jan 10 03:07:15 2018 +1100

powerpc/64: Convert fast_exception_return to use RFI_TO_USER/KERNEL

commit a08f828cf47e6c605af21d2cdec68f84e799c318 upstream.

Similar to the syscall return path, in fast_exception_return we may be returning to user or kernel context. We already have a test for that, because we conditionally restore r13. So use that existing test and branch, and bifurcate the return based on that.

Signed-off-by: Nicholas Piggin
Signed-off-by: Michael Ellerman
Signed-off-by: Greg Kroah-Hartman

commit 8fd3f98d0f4d5a9044b900de950ab02164968d27
Author: Nicholas Piggin
Date: Wed Jan 10 03:07:15 2018 +1100

powerpc/64: Add macros for annotating the destination of rfid/hrfid

commit 50e51c13b3822d14ff6df4279423e4b7b2269bc3 upstream.

The rfid/hrfid ((Hypervisor) Return From Interrupt) instruction is used for switching from the kernel to userspace, and from the hypervisor to the guest kernel. However it can be, and is, also used for other transitions, eg. from real mode kernel code to virtual mode kernel code, and it's not always clear from the code what the destination context is.
To make it clearer when reading the code, add macros which encode the expected destination context.

Signed-off-by: Nicholas Piggin
Signed-off-by: Michael Ellerman
Signed-off-by: Greg Kroah-Hartman

commit be6641a7e6f79b69446e7f1c44bab75bf20f1665
Author: Michael Neuling
Date: Tue Jan 9 03:52:05 2018 +1100

powerpc/pseries: Add H_GET_CPU_CHARACTERISTICS flags & wrapper

commit 191eccb1580939fb0d47deb405b82a85b0379070 upstream.

A new hypervisor call has been defined to communicate various characteristics of the CPU to guests. Add definitions for the hcall number, flags and a wrapper function.

Signed-off-by: Michael Neuling
Signed-off-by: Michael Ellerman
[Balbir fixed conflicts in backport]
Signed-off-by: Balbir Singh
Signed-off-by: Greg Kroah-Hartman
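
The shape of such a wrapper can be sketched in a few lines of Python. This is only a model under assumptions, not the kernel implementation: CpuCharResult, get_cpu_characteristics, fake_hypervisor and the H_SUCCESS value are hypothetical names chosen for illustration, and the idea that the call returns a status plus a small buffer whose first two words carry the characteristics and behaviour flags is assumed rather than taken from the commit text.

```python
from dataclasses import dataclass

H_SUCCESS = 0  # assumed "call succeeded" return code


@dataclass
class CpuCharResult:
    """Hypothetical container for the two flag words reported to the guest."""
    character: int = 0
    behaviour: int = 0


def get_cpu_characteristics(hcall, result):
    """Issue the (modelled) hypervisor call and unpack its return buffer.

    `hcall` is a hypothetical callable returning (rc, retbuf); `result` is
    only filled in when the call reports success.
    """
    rc, retbuf = hcall()
    if rc == H_SUCCESS:
        result.character = retbuf[0]
        result.behaviour = retbuf[1]
    return rc


def fake_hypervisor():
    """Stand-in for the real hcall, returning made-up flag words."""
    return H_SUCCESS, [0x3, 0x1, 0, 0]
```

A caller would allocate a CpuCharResult, pass it to get_cpu_characteristics(fake_hypervisor, result), and inspect the two fields only if the return code signals success; leaving the result untouched on failure lets the caller keep its own defaults.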