commit a97a16f18c4895e41951a44d27af8af0b8f1d897 Author: Greg Kroah-Hartman Date: Wed Aug 16 13:44:13 2017 -0700 Linux 4.9.44 commit eea1ec08f8a5f8ab57b66f47d2673089c9ebea69 Author: Maciej W. Rozycki Date: Sun Jul 30 21:28:15 2017 +0100 MIPS: DEC: Fix an int-handler.S CPU_DADDI_WORKAROUNDS regression commit 68fe55680d0f3342969f49412fceabb90bdfadba upstream. Fix a commit 3021773c7c3e ("MIPS: DEC: Avoid la pseudo-instruction in delay slots") regression and remove assembly errors: arch/mips/dec/int-handler.S: Assembler messages: arch/mips/dec/int-handler.S:162: Error: Macro used $at after ".set noat" arch/mips/dec/int-handler.S:163: Error: Macro used $at after ".set noat" arch/mips/dec/int-handler.S:229: Error: Macro used $at after ".set noat" arch/mips/dec/int-handler.S:230: Error: Macro used $at after ".set noat" triggering with with the CPU_DADDI_WORKAROUNDS option set and the DADDIU instruction. This is because with that option in place the instruction becomes a macro, which expands to an LI/DADDU (or actually ADDIU/DADDU) sequence that uses $at as a temporary register. With CPU_DADDI_WORKAROUNDS we only support `-msym32' compilation though, and this is already enforced in arch/mips/Makefile, so choose the 32-bit expansion variant for the supported configurations and then replace the 64-bit variant with #error just in case. Fixes: 3021773c7c3e ("MIPS: DEC: Avoid la pseudo-instruction in delay slots") Signed-off-by: Maciej W. Rozycki Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/16893/ Signed-off-by: Ralf Baechle Signed-off-by: Greg Kroah-Hartman commit 5e5a510455323634dc3df7a2d58c985b730c31ad Author: Neil Armstrong Date: Tue May 23 16:09:19 2017 +0200 pinctrl: meson-gxbb: Add missing GPIODV_18 pin entry commit 34e61801a3b9df74b69f0e359d64a197a77dd6ac upstream. GPIODV_18 entry was missing in the original driver push. Fixes: 468c234f9ed7 ("pinctrl: amlogic: Add support for Amlogic Meson GXBB SoC") Signed-off-by: Neil Armstrong Reviewed-by: Jerome Brunet Signed-off-by: Linus Walleij Signed-off-by: Greg Kroah-Hartman commit 8cbc0b49ca8db1f83835f86d337195947632aa3f Author: Thomas Gleixner Date: Thu Jun 29 23:33:35 2017 +0200 pinctrl: samsung: Remove bogus irq_[un]mask from resource management commit 3fa53ec2ed885b0aec3f0472e3b4a8a6f1cd748c upstream. The irq chip callbacks irq_request/release_resources() have absolutely no business with masking and unmasking the irq. The core code unmasks the interrupt after complete setup and masks it before invoking irq_release_resources(). The unmask is actually harmful as it happens before the interrupt is completely initialized in __setup_irq(). Remove it. Fixes: f6a8249f9e55 ("pinctrl: exynos: Lock GPIOs as interrupts when used as EINTs") Signed-off-by: Thomas Gleixner Cc: Krzysztof Kozlowski Cc: Sylwester Nawrocki Cc: Linus Walleij Cc: Kukjin Kim Cc: linux-arm-kernel@lists.infradead.org Cc: linux-samsung-soc@vger.kernel.org Cc: linux-gpio@vger.kernel.org Acked-by: Tomasz Figa Signed-off-by: Linus Walleij Signed-off-by: Greg Kroah-Hartman commit 8495ab6ef94a92118ff50b27768bbcc10e1adc14 Author: Masahiro Yamada Date: Wed Jun 14 13:49:30 2017 +0900 pinctrl: uniphier: fix WARN_ON() of pingroups dump on LD20 commit 1bd303dc04c3f744474e77c153575087b657f7e1 upstream. The pingroups dump of debugfs hits WARN_ON() in pinctrl_groups_show(). Filling non-existing ports with '-1' turned out a bad idea. Fixes: 336306ee1f2d ("pinctrl: uniphier: add UniPhier PH1-LD20 pinctrl driver") Signed-off-by: Masahiro Yamada Signed-off-by: Linus Walleij Signed-off-by: Greg Kroah-Hartman commit f642d29c23883f0d700f9f87107f1786dd2d7363 Author: Masahiro Yamada Date: Wed Jun 14 13:49:29 2017 +0900 pinctrl: uniphier: fix WARN_ON() of pingroups dump on LD11 commit 9592bc256d50481dfcdba93890e576a728fb373c upstream. The pingroups dump of debugfs hits WARN_ON() in pinctrl_groups_show(). Filling non-existing ports with '-1' turned out a bad idea. Fixes: 70f2f9c4cf25 ("pinctrl: uniphier: add UniPhier PH1-LD11 pinctrl driver") Signed-off-by: Masahiro Yamada Signed-off-by: Linus Walleij Signed-off-by: Greg Kroah-Hartman commit 877fe62863d0d1dcac837bef9a7b0028f0cd8951 Author: Andy Shevchenko Date: Fri Aug 4 19:26:34 2017 +0300 pinctrl: intel: merrifield: Correct UART pin lists commit 5d996132d921c391af5f267123eca1a6a3148ecd upstream. UART pin lists consist GPIO numbers which is simply wrong. Replace it by pin numbers. Fixes: 4e80c8f50574 ("pinctrl: intel: Add Intel Merrifield pin controller support") Signed-off-by: Andy Shevchenko Acked-by: Mika Westerberg Signed-off-by: Linus Walleij Signed-off-by: Greg Kroah-Hartman commit 7b6fff65ecf92b265f48249f041b8c06557960ed Author: Icenowy Zheng Date: Sat Jul 22 10:50:53 2017 +0800 pinctrl: sunxi: add a missing function of A10/A20 pinctrl driver commit d81ece747d8727bb8b1cfc9a20dbe62f09a4e35a upstream. The PH16 pin has a function with mux id 0x5, which is the DET pin of the "sim" (smart card reader) IP block. This function is missing in old versions of A10/A20 SoCs' datasheets and user manuals, so it's also missing in the old drivers. The newest A10 Datasheet V1.70 and A20 Datasheet V1.41 contain this pin function, and it's discovered during implementing R40 pinctrl driver. Add it to the driver. As we now merged A20 pinctrl driver to the A10 one, we need to only fix the A10 driver now. Fixes: f2821b1ca3a2 ("pinctrl: sunxi: Move Allwinner A10 pinctrl driver to a driver of its own") Signed-off-by: Icenowy Zheng Reviewed-by: Chen-Yu Tsai Signed-off-by: Linus Walleij Signed-off-by: Greg Kroah-Hartman commit a68978bb949a9be075b75a045250ce89d7550604 Author: Christoph Hellwig Date: Sat Aug 5 10:59:14 2017 +0200 pnfs/blocklayout: require 64-bit sector_t commit 8a9d6e964d318533ba3d2901ce153ba317c99a89 upstream. The blocklayout code does not compile cleanly for a 32-bit sector_t, and also has no reliable checks for devices sizes, which makes it unsafe to use with a kernel that doesn't support large block devices. Signed-off-by: Christoph Hellwig Reported-by: Arnd Bergmann Fixes: 5c83746a0cf2 ("pnfs/blocklayout: in-kernel GETDEVICEINFO XDR parsing") Signed-off-by: Anna Schumaker Signed-off-by: Greg Kroah-Hartman commit eda1b3d42fad388bf285e8d7b45009b61b0e6f6e Author: Stefan-Gabriel Mirea Date: Thu Jul 6 10:06:41 2017 +0100 iio: adc: vf610_adc: Fix VALT selection value for REFSEL bits commit d466d3c1217406b14b834335b5b4b33c0d45bd09 upstream. In order to select the alternate voltage reference pair (VALTH/VALTL), the right value for the REFSEL field in the ADCx_CFG register is "01", leading to 0x800 as register mask. See section 8.2.6.4 in the reference manual[1]. [1] http://www.nxp.com/docs/en/reference-manual/VFXXXRM.pdf Fixes: a775427632fd ("iio:adc:imx: add Freescale Vybrid vf610 adc driver") Signed-off-by: Stefan-Gabriel Mirea Signed-off-by: Jonathan Cameron Signed-off-by: Greg Kroah-Hartman commit 4cae4a23d9a4d2e0515d22590c23147809f7ea8e Author: Sandeep Singh Date: Fri Aug 4 16:35:56 2017 +0530 usb:xhci:Add quirk for Certain failing HP keyboard on reset after resume commit e788787ef4f9c24aafefc480a8da5f92b914e5e6 upstream. Certain HP keyboards would keep inputting a character automatically which is the wake-up key after S3 resume On some AMD platforms USB host fails to respond (by holding resume-K) to USB device (an HP keyboard) resume request within 1ms (TURSM) and ensures that resume is signaled for at least 20 ms (TDRSMDN), which is defined in USB 2.0 spec. The result is that the keyboard is out of function. In SNPS USB design, the host responds to the resume request only after system gets back to S0 and the host gets to functional after the internal HW restore operation that is more than 1 second after the initial resume request from the USB device. As a workaround for specific keyboard ID(HP Keyboards), applying port reset after resume when the keyboard is plugged in. Signed-off-by: Sandeep Singh Signed-off-by: Shyam Sundar S K cc: Nehal Shah Reviewed-by: Felipe Balbi Signed-off-by: Greg Kroah-Hartman commit f4bbed570aef42ccd08852b05ace99d488ed3ddf Author: Kai-Heng Feng Date: Tue Aug 8 17:51:27 2017 +0800 usb: quirks: Add no-lpm quirk for Moshi USB to Ethernet Adapter commit 7496cfe5431f21da5d27a8388c326397e3f0a5db upstream. Moshi USB to Ethernet Adapter internally uses a Genesys Logic hub to connect to Realtek r8153. The Realtek r8153 ethernet does not work on the internal hub, no-lpm quirk can make it work. Since another r8153 dongle at my hand does not have the issue, so add the quirk to the Genesys Logic hub instead. Signed-off-by: Kai-Heng Feng Signed-off-by: Greg Kroah-Hartman commit 42d65cc89a2338c8d488cba8a8625951fac7b0a8 Author: Bin Liu Date: Tue Jul 25 09:31:33 2017 -0500 usb: core: unlink urbs from the tail of the endpoint's urb_list commit 2eac13624364db5b5e1666ae0bb3a4d36bc56b6e upstream. While unlink an urb, if the urb has been programmed in the controller, the controller driver might do some hw related actions to tear down the urb. Currently usb_hcd_flush_endpoint() passes each urb from the head of the endpoint's urb_list to the controller driver, which could make the controller driver think each urb has been programmed and take the unnecessary actions for each urb. This patch changes the behavior in usb_hcd_flush_endpoint() to pass the urbs from the tail of the list, to avoid any unnecessary actions in an controller driver. Acked-by: Alan Stern Signed-off-by: Bin Liu Signed-off-by: Greg Kroah-Hartman commit 7c2beb1c44326ecf61c49df4a87a7c75fc255b8a Author: Alan Stern Date: Tue Aug 1 10:41:56 2017 -0400 USB: Check for dropped connection before switching to full speed commit 94c43b9897abf4ea366ed4dba027494e080c7050 upstream. Some buggy USB disk adapters disconnect and reconnect multiple times during the enumeration procedure. This may lead to a device connecting at full speed instead of high speed, because when the USB stack sees that a device isn't able to enumerate at high speed, it tries to hand the connection over to a full-speed companion controller. The logic for doing this is careful to check that the device is still connected. But this check is inadequate if the device disconnects and reconnects before the check is done. The symptom is that a device works, but much more slowly than it is capable of operating. The situation was made worse recently by commit 22547c4cc4fe ("usb: hub: Wait for connection to be reestablished after port reset"), which increases the delay following a reset before a disconnect is recognized, thus giving the device more time to reconnect. This patch makes the check more robust. If the device was disconnected at any time during enumeration, we will now skip the full-speed handover. Signed-off-by: Alan Stern Reported-and-tested-by: Zdenek Kabelac Reviewed-by: Guenter Roeck Signed-off-by: Greg Kroah-Hartman commit 7f737f10c1ee1749f32e77ac98d5bc3ffde876ce Author: Yoshihiro Shimoda Date: Wed Aug 2 13:21:45 2017 +0900 usb: renesas_usbhs: Fix UGCTRL2 value for R-Car Gen3 commit 2acecd58969897795cf015c9057ebd349a3fda8a upstream. The latest HW manual (Rev.0.55) shows us this UGCTRL2.VBUSSEL bit. If the bit sets to 1, the VBUS drive is controlled by phy related registers (called "UCOM Registers" on the manual). Since R-Car Gen3 environment will control VBUS by phy-rcar-gen3-usb2 driver, the UGCTRL2.VBUSSEL bit should be set to 1. So, this patch fixes the register's value. Otherwise, even if the ID pin indicates to peripheral, the R-Car will output USBn_PWEN to 1 when a host driver is running. Fixes: de18757e272d ("usb: renesas_usbhs: add R-Car Gen3 power control" Signed-off-by: Yoshihiro Shimoda Signed-off-by: Felipe Balbi Signed-off-by: Greg Kroah-Hartman commit 2db03a7fa0ddb671d2508325e1243bd46ab4c6dc Author: Yoshihiro Shimoda Date: Wed Aug 2 21:06:35 2017 +0900 usb: gadget: udc: renesas_usb3: Fix usb_gadget_giveback_request() calling commit aca5b9ebd096039657417c321a9252c696b359c2 upstream. According to the gadget.h, a "complete" function will always be called with interrupts disabled. However, sometimes usb3_request_done() function is called with interrupts enabled. So, this function should be held by spin_lock_irqsave() to disable interruption. Also, this driver has to call spin_unlock() to avoid spinlock recursion by this driver before calling usb_gadget_giveback_request(). Reported-by: Kazuya Mizuguchi Tested-by: Kazuya Mizuguchi Fixes: 746bfe63bba3 ("usb: gadget: renesas_usb3: add support for Renesas USB3.0 peripheral controller") Signed-off-by: Yoshihiro Shimoda Signed-off-by: Felipe Balbi Signed-off-by: Greg Kroah-Hartman commit a09ecc9345b6bf07812e55070328d7c0bdb2c498 Author: Alan Swanson Date: Wed Jul 26 12:03:33 2017 +0100 uas: Add US_FL_IGNORE_RESIDUE for Initio Corporation INIC-3069 commit 89f23d51defcb94a5026d4b5da13faf4e1150a6f upstream. Similar to commit d595259fbb7a ("usb-storage: Add ignore-residue quirk for Initio INIC-3619") for INIC-3169 in unusual_devs.h but INIC-3069 already present in unusual_uas.h. Both in same controller IC family. Issue is that MakeMKV fails during key exchange with installed bluray drive with following error: 002004:0000 Error 'Scsi error - ILLEGAL REQUEST:COPY PROTECTION KEY EXCHANGE FAILURE - KEY NOT ESTABLISHED' occurred while issuing SCSI command AD010..080002400 to device 'SG:dev_11:0' Signed-off-by: Alan Swanson Acked-by: Oliver Neukum Signed-off-by: Greg Kroah-Hartman commit b189f8eb27153daaaa464d7bfec9c5a8ed2989a0 Author: Ian Abbott Date: Fri Jul 28 16:22:31 2017 +0100 staging: comedi: comedi_fops: do not call blocking ops when !TASK_RUNNING commit cef988642cdac44e910a27cb6e8166c96f86a0df upstream. Comedi's read and write file operation handlers (`comedi_read()` and `comedi_write()`) currently call `copy_to_user()` or `copy_from_user()` whilst in the `TASK_INTERRUPTIBLE` state, which falls foul of the `might_fault()` checks when enabled. Fix it by setting the current task state back to `TASK_RUNNING` a bit earlier before calling these functions. Reported-by: Piotr Gregor Signed-off-by: Ian Abbott Signed-off-by: Greg Kroah-Hartman commit bbae08213e6e9811a2d3ece7ba95b1ac0e21f47c Author: Akinobu Mita Date: Wed Jun 21 01:46:37 2017 +0900 iio: light: tsl2563: use correct event code commit a3507e48d3f99a93a3056a34a5365f310434570f upstream. The TSL2563 driver provides three iio channels, two of which are raw ADC channels (channel 0 and channel 1) in the device and the remaining one is calculated by the two. The ADC channel 0 only supports programmable interrupt with threshold settings and this driver supports the event but the generated event code does not contain the corresponding iio channel type. This is going to change userspace ABI. Hopefully fixing this to be what it should always have been won't break any userspace code. Cc: Jonathan Cameron Signed-off-by: Akinobu Mita Signed-off-by: Jonathan Cameron Signed-off-by: Greg Kroah-Hartman commit 1ca3869234d300b9a0543cf55dccce42c2fc77c1 Author: Hans de Goede Date: Thu Jul 13 15:13:41 2017 +0200 iio: accel: bmc150: Always restore device to normal mode after suspend-resume commit e59e18989c68a8d7941005f81ad6abc4ca682de0 upstream. After probe we would put the device in normal mode, after a runtime suspend-resume we would put it back in normal mode. But for a regular suspend-resume we would only put it back in normal mode if triggers or events have been requested. This is not consistent and breaks reading raw values after a suspend-resume. This commit changes the regular resume path to also unconditionally put the device back in normal mode, fixing reading of raw values not working after a regular suspend-resume cycle. Signed-off-by: Hans de Goede Reviewed-by: Srinivas Pandruvada Signed-off-by: Jonathan Cameron Signed-off-by: Greg Kroah-Hartman commit c5347390e57a19b4f25fc58f3c77e2cc15d2e47a Author: Arnd Bergmann Date: Fri Jul 14 11:31:03 2017 +0200 staging:iio:resolver:ad2s1210 fix negative IIO_ANGL_VEL read commit 105967ad68d2eb1a041bc041f9cf96af2a653b65 upstream. gcc-7 points out an older regression: drivers/staging/iio/resolver/ad2s1210.c: In function 'ad2s1210_read_raw': drivers/staging/iio/resolver/ad2s1210.c:515:42: error: '<<' in boolean context, did you mean '<' ? [-Werror=int-in-bool-context] The original code had 'unsigned short' here, but incorrectly got converted to 'bool'. This reverts the regression and uses a normal type instead. Fixes: 29148543c521 ("staging:iio:resolver:ad2s1210 minimal chan spec conversion.") Signed-off-by: Arnd Bergmann Signed-off-by: Jonathan Cameron Signed-off-by: Greg Kroah-Hartman commit 199a3f26e9d8c5712833084be362004c959ebb59 Author: Rafael J. Wysocki Date: Tue Jul 25 23:58:50 2017 +0200 USB: hcd: Mark secondary HCD as dead if the primary one died commit cd5a6a4fdaba150089af2afc220eae0fef74878a upstream. Make usb_hc_died() clear the HCD_FLAG_RH_RUNNING flag for the shared HCD and set HCD_FLAG_DEAD for it, in analogy with what is done for the primary one. Among other thigs, this prevents check_root_hub_suspended() from returning -EBUSY for dead HCDs which helps to work around system suspend issues in some situations. This actually fixes occasional suspend failures on one of my test machines. Suggested-by: Alan Stern Signed-off-by: Rafael J. Wysocki Acked-by: Alan Stern Signed-off-by: Greg Kroah-Hartman commit 821ccbe2937ece265dfc087c4fe272242b70350a Author: Bin Liu Date: Tue Jul 25 09:31:34 2017 -0500 usb: musb: fix tx fifo flush handling again commit 45d73860530a14c608f410b91c6c341777bfa85d upstream. commit 68fe05e2a451 ("usb: musb: fix tx fifo flush handling") drops the 1ms delay trying to solve the long disconnect time issue when application queued many tx urbs. However, the 1ms delay is needed for some use cases, for example, without the delay, reconnecting AR9271 WIFI dongle no longer works if the connection is dropped from the AP. So let's add back the 1ms delay in musb_h_tx_flush_fifo(), and solve the long disconnect time problem with a separate patch for usb_hcd_flush_endpoint(). Signed-off-by: Bin Liu Signed-off-by: Greg Kroah-Hartman commit 4fd8c366acac7a888b83a25d349476a111e49ac7 Author: Greg Kroah-Hartman Date: Thu Aug 10 11:54:12 2017 -0700 USB: serial: pl2303: add new ATEN device id commit 3b6bcd3d093c698d32e93d4da57679b8fbc5e01e upstream. This adds a new ATEN device id for a new pl2303-based device. Reported-by: Peter Kuo Cc: Johan Hovold Signed-off-by: Greg Kroah-Hartman commit 5665164015017a8abeb4b401656db678b43dafac Author: Stefan Triller Date: Fri Jun 30 14:44:03 2017 +0200 USB: serial: cp210x: add support for Qivicon USB ZigBee dongle commit 9585e340db9f6cc1c0928d82c3a23cc4460f0a3f upstream. The German Telekom offers a ZigBee USB Stick under the brand name Qivicon for their SmartHome Home Base in its 1. Generation. The productId is not known by the according kernel module, this patch adds support for it. Signed-off-by: Stefan Triller Reviewed-by: Frans Klaver Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman commit e27f58cd130bae882e6bd66e094c036056a03278 Author: Hector Martin Date: Wed Aug 2 00:45:06 2017 +0900 USB: serial: option: add D-Link DWM-222 device ID commit fd1b8668af59a11bb754a6c9b0051c6c5ce73b74 upstream. Add device id for D-Link DWM-222. Signed-off-by: Hector Martin Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman commit 2b3bf207b2a2465115d2bded0eef8a3c4ee0c6c2 Author: Maarten Lankhorst Date: Mon Jul 24 11:14:31 2017 +0200 drm/i915: Fix out-of-bounds array access in bdw_load_gamma_lut commit 5279fc7724ae3a82c9cfe5b09c1fb07ff0e41056 upstream. bdw_load_gamma_lut is writing beyond the array to the maximum value. The intend of the function is to clamp values > 1 to 1, so write the intended color to the max register. This fixes the following KASAN warning: [ 197.020857] [IGT] kms_pipe_color: executing [ 197.063434] [IGT] kms_pipe_color: starting subtest ctm-0-25-pipe0 [ 197.078989] ================================================================== [ 197.079127] BUG: KASAN: slab-out-of-bounds in bdw_load_gamma_lut.isra.2+0x3b9/0x570 [i915] [ 197.079188] Read of size 2 at addr ffff8800d38db150 by task kms_pipe_color/1839 [ 197.079208] CPU: 2 PID: 1839 Comm: kms_pipe_color Tainted: G U 4.13.0-rc1-patser+ #5211 [ 197.079215] Hardware name: NUC5i7RYB, BIOS RYBDWi35.86A.0246.2015.0309.1355 03/09/2015 [ 197.079220] Call Trace: [ 197.079230] dump_stack+0x68/0x9e [ 197.079239] print_address_description+0x6f/0x250 [ 197.079251] kasan_report+0x216/0x370 [ 197.079374] ? bdw_load_gamma_lut.isra.2+0x3b9/0x570 [i915] [ 197.079451] ? gen8_write16+0x4e0/0x4e0 [i915] [ 197.079460] __asan_report_load2_noabort+0x14/0x20 [ 197.079535] bdw_load_gamma_lut.isra.2+0x3b9/0x570 [i915] [ 197.079612] broadwell_load_luts+0x1df/0x550 [i915] [ 197.079690] intel_color_load_luts+0x7b/0x80 [i915] [ 197.079764] intel_begin_crtc_commit+0x138/0x760 [i915] [ 197.079783] drm_atomic_helper_commit_planes_on_crtc+0x1a3/0x820 [drm_kms_helper] [ 197.079859] ? intel_pre_plane_update+0x571/0x580 [i915] [ 197.079937] intel_update_crtc+0x238/0x330 [i915] [ 197.080016] intel_update_crtcs+0x10f/0x210 [i915] [ 197.080092] intel_atomic_commit_tail+0x1552/0x3340 [i915] [ 197.080101] ? _raw_spin_unlock+0x3c/0x40 [ 197.080110] ? __queue_work+0xb40/0xbf0 [ 197.080188] ? skl_update_crtcs+0xc00/0xc00 [i915] [ 197.080195] ? trace_hardirqs_on+0xd/0x10 [ 197.080269] ? intel_atomic_commit_ready+0x128/0x13c [i915] [ 197.080329] ? __i915_sw_fence_complete+0x5b8/0x6d0 [i915] [ 197.080336] ? debug_object_activate+0x39e/0x580 [ 197.080397] ? i915_sw_fence_await+0x30/0x30 [i915] [ 197.080409] ? __might_sleep+0x15b/0x180 [ 197.080483] intel_atomic_commit+0x944/0xa70 [i915] [ 197.080490] ? refcount_dec_and_test+0x11/0x20 [ 197.080567] ? intel_atomic_commit_tail+0x3340/0x3340 [i915] [ 197.080597] ? drm_atomic_crtc_set_property+0x303/0x580 [drm] [ 197.080674] ? intel_atomic_commit_tail+0x3340/0x3340 [i915] [ 197.080704] drm_atomic_commit+0xd7/0xe0 [drm] [ 197.080722] drm_atomic_helper_crtc_set_property+0xec/0x130 [drm_kms_helper] [ 197.080749] drm_mode_crtc_set_obj_prop+0x7d/0xb0 [drm] [ 197.080775] drm_mode_obj_set_property_ioctl+0x50b/0x5d0 [drm] [ 197.080783] ? __might_fault+0x104/0x180 [ 197.080809] ? drm_mode_obj_find_prop_id+0x160/0x160 [drm] [ 197.080838] ? drm_mode_obj_find_prop_id+0x160/0x160 [drm] [ 197.080861] drm_ioctl_kernel+0x154/0x1a0 [drm] [ 197.080885] drm_ioctl+0x624/0x8f0 [drm] [ 197.080910] ? drm_mode_obj_find_prop_id+0x160/0x160 [drm] [ 197.080934] ? drm_getunique+0x210/0x210 [drm] [ 197.080943] ? __handle_mm_fault+0x1bd0/0x1ce0 [ 197.080949] ? lock_downgrade+0x610/0x610 [ 197.080957] ? __lru_cache_add+0x15a/0x180 [ 197.080967] do_vfs_ioctl+0xd92/0xe40 [ 197.080975] ? ioctl_preallocate+0x1b0/0x1b0 [ 197.080982] ? selinux_capable+0x20/0x20 [ 197.080991] ? __do_page_fault+0x7b7/0x9a0 [ 197.080997] ? lock_downgrade+0x5bb/0x610 [ 197.081007] ? security_file_ioctl+0x57/0x90 [ 197.081016] SyS_ioctl+0x4e/0x80 [ 197.081024] entry_SYSCALL_64_fastpath+0x18/0xad [ 197.081030] RIP: 0033:0x7f61f287a987 [ 197.081035] RSP: 002b:00007fff7d44d188 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 197.081043] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f61f287a987 [ 197.081048] RDX: 00007fff7d44d1c0 RSI: 00000000c01864ba RDI: 0000000000000003 [ 197.081053] RBP: 00007f61f2b3eb00 R08: 0000000000000059 R09: 0000000000000000 [ 197.081058] R10: 0000002ea5c4a290 R11: 0000000000000246 R12: 00007f61f2b3eb58 [ 197.081063] R13: 0000000000001010 R14: 00007f61f2b3eb58 R15: 0000000000002702 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101659 Signed-off-by: Maarten Lankhorst Reported-by: Martin Peres Cc: Martin Peres Fixes: 82cf435b3134 ("drm/i915: Implement color management on bdw/skl/bxt/kbl") Cc: Shashank Sharma Cc: Kiran S Kumar Cc: Kausal Malladi Cc: Lionel Landwerlin Cc: Matt Roper Cc: Daniel Vetter Cc: Jani Nikula Cc: intel-gfx@lists.freedesktop.org Link: https://patchwork.freedesktop.org/patch/msgid/20170724091431.24251-1-maarten.lankhorst@linux.intel.com Reviewed-by: Lionel Landwerlin (cherry picked from commit 09a92bc8773b4314e02b478e003fe5936ce85adb) Signed-off-by: Jani Nikula Signed-off-by: Greg Kroah-Hartman commit 4381e2c30008fa5ceb28957cb186a25871b09bf1 Author: Wladimir J. van der Laan Date: Tue Jul 25 14:33:36 2017 +0200 drm/etnaviv: Fix off-by-one error in reloc checking commit d6f756e09f01ea7a0efbbcef269a1e384a35d824 upstream. A relocation pointing to the last four bytes of a buffer can legitimately happen in the case of small vertex buffers. Signed-off-by: Wladimir J. van der Laan Reviewed-by: Philipp Zabel Reviewed-by: Christian Gmeiner Signed-off-by: Lucas Stach Signed-off-by: Greg Kroah-Hartman commit 00f3c2a253f752948c5cac9f3a554407ab43d9c2 Author: Weston Andros Adamson Date: Tue Aug 1 16:25:01 2017 -0400 nfs/flexfiles: fix leak of nfs4_ff_ds_version arrays commit 1feb26162bee7b2f110facfec71b5c7bdbc7d14d upstream. The client was freeing the nfs4_ff_layout_ds, but not the contained nfs4_ff_ds_version array. Signed-off-by: Weston Andros Adamson Signed-off-by: Anna Schumaker Signed-off-by: Greg Kroah-Hartman commit 0a205d8145c22d83e24ada019b599a33f7bf68ed Author: Haibo Chen Date: Tue Aug 8 18:54:01 2017 +0800 mmc: mmc: correct the logic for setting HS400ES signal voltage commit 92ddd95919466de5d34f3cb43635da9a7f9ab814 upstream. Change the default err value to -EINVAL, make sure the card only has type EXT_CSD_CARD_TYPE_HS400_1_8V also do the signal voltage setting when select hs400es mode. Fixes: commit 1720d3545b77 ("mmc: core: switch to 1V8 or 1V2 for hs400es mode") Signed-off-by: Haibo Chen Reviewed-by: Shawn Lin Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit 160c365b5879e6d4ea1c299b68c9702e8c783298 Author: Miquel Raynal Date: Wed Jul 5 08:51:09 2017 +0200 nand: fix wrong default oob layout for small pages using soft ecc commit f7f8c1756e9a5f1258a7cc6b663f8451b724900f upstream. When using soft ecc, if no ooblayout is given, the core automatically uses one of the nand_ooblayout_{sp,lp}*() functions to determine the layout inside the out of band data. Until kernel version 4.6, struct nand_ecclayout was used for that purpose. During the migration from 4.6 to 4.7, an error shown up in the small page layout, in the case oob section is only 8 bytes long. The layout was using three bytes (0, 1, 2) for ecc, two bytes (3, 4) as free bytes, one byte (5) for bad block marker and finally two bytes (6, 7) as free bytes, as shown there: [linux-4.6] drivers/mtd/nand/nand_base.c:52 static struct nand_ecclayout nand_oob_8 = { .eccbytes = 3, .eccpos = {0, 1, 2}, .oobfree = { {.offset = 3, .length = 2}, {.offset = 6, .length = 2} } }; This fixes the current implementation which is incoherent. It references bit 3 at the same time as an ecc byte and a free byte. Furthermore, it is clear with the previous implementation that there is only one ecc section with 8 bytes oob sections. We shall return -ERANGE in the nand_ooblayout_ecc_sp() function when asked for the second section. Signed-off-by: Miquel Raynal Fixes: 41b207a70d3a ("mtd: nand: implement the default mtd_ooblayout_ops") Signed-off-by: Boris Brezillon Signed-off-by: Greg Kroah-Hartman commit 227559e6233c9af382efcfd4c9ac87e090bb4853 Author: Mateusz Jurczyk Date: Wed Jun 7 12:26:49 2017 +0200 fuse: initialize the flock flag in fuse_file on allocation commit 68227c03cba84a24faf8a7277d2b1a03c8959c2c upstream. Before the patch, the flock flag could remain uninitialized for the lifespan of the fuse_file allocation. Unless set to true in fuse_file_flock(), it would remain in an indeterminate state until read in an if statement in fuse_release_common(). This could consequently lead to taking an unexpected branch in the code. The bug was discovered by a runtime instrumentation designed to detect use of uninitialized memory in the kernel. Signed-off-by: Mateusz Jurczyk Fixes: 37fb3a30b462 ("fuse: fix flock") Signed-off-by: Miklos Szeredi Signed-off-by: Greg Kroah-Hartman commit 1da30c23b63b3ed426880ac11179dbc2d2bbf368 Author: Nicholas Bellinger Date: Sun Aug 6 16:10:03 2017 -0700 target: Fix node_acl demo-mode + uncached dynamic shutdown regression commit 6f48655facfd7f7ccfe6d252ac0fe319ab02e4dd upstream. This patch fixes a generate_node_acls = 1 + cache_dynamic_acls = 0 regression, that was introduced by commit 01d4d673558985d9a118e1e05026633c3e2ade9b Author: Nicholas Bellinger Date: Wed Dec 7 12:55:54 2016 -0800 which originally had the proper list_del_init() usage, but was dropped during list review as it was thought unnecessary by HCH. However, list_del_init() usage is required during the special generate_node_acls = 1 + cache_dynamic_acls = 0 case when transport_free_session() does a list_del(&se_nacl->acl_list), followed by target_complete_nacl() doing the same thing. This was manifesting as a general protection fault as reported by Justin: kernel: general protection fault: 0000 [#1] SMP kernel: Modules linked in: kernel: CPU: 0 PID: 11047 Comm: iscsi_ttx Not tainted 4.13.0-rc2.x86_64.1+ #20 kernel: Hardware name: Intel Corporation S5500BC/S5500BC, BIOS S5500.86B.01.00.0064.050520141428 05/05/2014 kernel: task: ffff88026939e800 task.stack: ffffc90007884000 kernel: RIP: 0010:target_put_nacl+0x49/0xb0 kernel: RSP: 0018:ffffc90007887d70 EFLAGS: 00010246 kernel: RAX: dead000000000200 RBX: ffff8802556ca000 RCX: 0000000000000000 kernel: RDX: dead000000000100 RSI: 0000000000000246 RDI: ffff8802556ce028 kernel: RBP: ffffc90007887d88 R08: 0000000000000001 R09: 0000000000000000 kernel: R10: ffffc90007887df8 R11: ffffea0009986900 R12: ffff8802556ce020 kernel: R13: ffff8802556ce028 R14: ffff8802556ce028 R15: ffffffff88d85540 kernel: FS: 0000000000000000(0000) GS:ffff88027fc00000(0000) knlGS:0000000000000000 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 kernel: CR2: 00007fffe36f5f94 CR3: 0000000009209000 CR4: 00000000003406f0 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 kernel: Call Trace: kernel: transport_free_session+0x67/0x140 kernel: transport_deregister_session+0x7a/0xc0 kernel: iscsit_close_session+0x92/0x210 kernel: iscsit_close_connection+0x5f9/0x840 kernel: iscsit_take_action_for_connection_exit+0xfe/0x110 kernel: iscsi_target_tx_thread+0x140/0x1e0 kernel: ? wait_woken+0x90/0x90 kernel: kthread+0x124/0x160 kernel: ? iscsit_thread_get_cpumask+0x90/0x90 kernel: ? kthread_create_on_node+0x40/0x40 kernel: ret_from_fork+0x22/0x30 kernel: Code: 00 48 89 fb 4c 8b a7 48 01 00 00 74 68 4d 8d 6c 24 08 4c 89 ef e8 e8 28 43 00 48 8b 93 20 04 00 00 48 8b 83 28 04 00 00 4c 89 ef <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 83 20 kernel: RIP: target_put_nacl+0x49/0xb0 RSP: ffffc90007887d70 kernel: ---[ end trace f12821adbfd46fed ]--- To address this, go ahead and use proper list_del_list() for all cases of se_nacl->acl_list deletion. Reported-by: Justin Maggard Tested-by: Justin Maggard Cc: Justin Maggard Signed-off-by: Nicholas Bellinger Signed-off-by: Greg Kroah-Hartman commit b51a71635576ddd2f30597c148ca4b0d62cec288 Author: Nicholas Bellinger Date: Fri Aug 4 23:59:31 2017 -0700 iscsi-target: Fix iscsi_np reset hung task during parallel delete commit 978d13d60c34818a41fc35962602bdfa5c03f214 upstream. This patch fixes a bug associated with iscsit_reset_np_thread() that can occur during parallel configfs rmdir of a single iscsi_np used across multiple iscsi-target instances, that would result in hung task(s) similar to below where configfs rmdir process context was blocked indefinately waiting for iscsi_np->np_restart_comp to finish: [ 6726.112076] INFO: task dcp_proxy_node_:15550 blocked for more than 120 seconds. [ 6726.119440] Tainted: G W O 4.1.26-3321 #2 [ 6726.125045] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 6726.132927] dcp_proxy_node_ D ffff8803f202bc88 0 15550 1 0x00000000 [ 6726.140058] ffff8803f202bc88 ffff88085c64d960 ffff88083b3b1ad0 ffff88087fffeb08 [ 6726.147593] ffff8803f202c000 7fffffffffffffff ffff88083f459c28 ffff88083b3b1ad0 [ 6726.155132] ffff88035373c100 ffff8803f202bca8 ffffffff8168ced2 ffff8803f202bcb8 [ 6726.162667] Call Trace: [ 6726.165150] [] schedule+0x32/0x80 [ 6726.170156] [] schedule_timeout+0x214/0x290 [ 6726.176030] [] ? __send_signal+0x52/0x4a0 [ 6726.181728] [] wait_for_completion+0x96/0x100 [ 6726.187774] [] ? wake_up_state+0x10/0x10 [ 6726.193395] [] iscsit_reset_np_thread+0x62/0xe0 [iscsi_target_mod] [ 6726.201278] [] iscsit_tpg_disable_portal_group+0x96/0x190 [iscsi_target_mod] [ 6726.210033] [] lio_target_tpg_store_enable+0x4f/0xc0 [iscsi_target_mod] [ 6726.218351] [] configfs_write_file+0xaa/0x110 [ 6726.224392] [] vfs_write+0xa4/0x1b0 [ 6726.229576] [] SyS_write+0x41/0xb0 [ 6726.234659] [] system_call_fastpath+0x12/0x71 It would happen because each iscsit_reset_np_thread() sets state to ISCSI_NP_THREAD_RESET, sends SIGINT, and then blocks waiting for completion on iscsi_np->np_restart_comp. However, if iscsi_np was active processing a login request and more than a single iscsit_reset_np_thread() caller to the same iscsi_np was blocked on iscsi_np->np_restart_comp, iscsi_np kthread process context in __iscsi_target_login_thread() would flush pending signals and only perform a single completion of np->np_restart_comp before going back to sleep within transport specific iscsit_transport->iscsi_accept_np code. To address this bug, add a iscsi_np->np_reset_count and update __iscsi_target_login_thread() to keep completing np->np_restart_comp until ->np_reset_count has reached zero. Reported-by: Gary Guo Tested-by: Gary Guo Cc: Mike Christie Cc: Hannes Reinecke Signed-off-by: Nicholas Bellinger Signed-off-by: Greg Kroah-Hartman commit e6a0599b746427682df1e8d444eee9e0641b4c09 Author: Varun Prakash Date: Sun Jul 23 20:03:33 2017 +0530 iscsi-target: fix memory leak in iscsit_setup_text_cmd() commit ea8dc5b4cd2195ee582cae28afa4164c6dea1738 upstream. On receiving text request iscsi-target allocates buffer for payload in iscsit_handle_text_cmd() and assigns buffer pointer to cmd->text_in_ptr, this buffer is currently freed in iscsit_release_cmd(), if iscsi-target sets 'C' bit in text response then it will receive another text request from the initiator with ttt != 0xffffffff in this case iscsi-target will find cmd using itt and call iscsit_setup_text_cmd() which will set cmd->text_in_ptr to NULL without freeing previously allocated buffer. This patch fixes this issue by calling kfree(cmd->text_in_ptr) in iscsit_setup_text_cmd() before assigning NULL to it. For the first text request cmd->text_in_ptr is NULL as cmd is memset to 0 in iscsit_allocate_cmd(). Signed-off-by: Varun Prakash Signed-off-by: Nicholas Bellinger Signed-off-by: Greg Kroah-Hartman commit ced271b814e441ac5ce079427c14860f9534b57a Author: Boris Brezillon Date: Mon Jul 31 10:29:56 2017 +0200 mtd: nand: Fix timing setup for NANDs that do not support SET FEATURES commit a11bf5ed951f8900d244d09eb03a888b59c7fc82 upstream. Some ONFI NANDs do not support the SET/GET FEATURES commands, which, according to the spec, is perfectly valid. On these NANDs we can't set a specific timing mode using the "timing mode" feature, and we should assume the NAND does not require any setup to enter a specific timing mode. Signed-off-by: Boris Brezillon Fixes: d8e725dd8311 ("mtd: nand: automate NAND timings selection") Reported-by: Alexander Dahl Tested-by: Alexander Dahl Signed-off-by: Boris Brezillon Signed-off-by: Greg Kroah-Hartman commit a311810903c700ff7411be16489df09ad6f3faee Author: Max Filippov Date: Tue Aug 1 11:02:46 2017 -0700 xtensa: don't limit csum_partial export by CONFIG_NET commit 7f81e55c737a8fa82c71f290945d729a4902f8d2 upstream. csum_partial and csum_partial_copy_generic are defined unconditionally and are available even when CONFIG_NET is disabled. They are used not only by the network drivers, but also by scsi and media. Don't limit these functions export by CONFIG_NET. Signed-off-by: Max Filippov Signed-off-by: Greg Kroah-Hartman commit a3ab0f069f466acb7da4f960d8e1bc3cfdf1309e Author: Max Filippov Date: Tue Aug 1 11:15:15 2017 -0700 xtensa: mm/cache: add missing EXPORT_SYMBOLs commit bc652eb6a0d5cffaea7dc8e8ad488aab2a1bf1ed upstream. Functions clear_user_highpage, copy_user_highpage, flush_dcache_page, local_flush_cache_range and local_flush_cache_page may be used from modules. Export them. Signed-off-by: Max Filippov Signed-off-by: Greg Kroah-Hartman commit 03973c57e1a27de080ed06a3961f1d6afec27539 Author: Max Filippov Date: Fri Jul 28 17:42:59 2017 -0700 xtensa: fix cache aliasing handling code for WT cache commit 6d0f581d1768d3eaba15776e7dd1fdfec10cfe36 upstream. Currently building kernel for xtensa core with aliasing WT cache fails with the following messages: mm/memory.c:2152: undefined reference to `flush_dcache_page' mm/memory.c:2332: undefined reference to `local_flush_cache_page' mm/memory.c:1919: undefined reference to `local_flush_cache_range' mm/memory.c:4179: undefined reference to `copy_to_user_page' mm/memory.c:4183: undefined reference to `copy_from_user_page' This happens because implementation of these functions is only compiled when data cache is WB, which looks wrong: even when data cache doesn't need flushing it still needs invalidation. The functions like __flush_[invalidate_]dcache_* are correctly defined for both WB and WT caches (and even if they weren't that'd still be ok, just slower). Fix this by providing the same implementation of the above functions for both WB and WT cache. Signed-off-by: Max Filippov Signed-off-by: Greg Kroah-Hartman commit 0041042de554cbcaa73163747d7179be715923fc Author: Mel Gorman Date: Wed Aug 9 08:27:11 2017 +0100 futex: Remove unnecessary warning from get_futex_key commit 48fb6f4db940e92cfb16cd878cddd59ea6120d06 upstream. Commit 65d8fc777f6d ("futex: Remove requirement for lock_page() in get_futex_key()") removed an unnecessary lock_page() with the side-effect that page->mapping needed to be treated very carefully. Two defensive warnings were added in case any assumption was missed and the first warning assumed a correct application would not alter a mapping backing a futex key. Since merging, it has not triggered for any unexpected case but Mark Rutland reported the following bug triggering due to the first warning. kernel BUG at kernel/futex.c:679! Internal error: Oops - BUG: 0 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 3695 Comm: syz-executor1 Not tainted 4.13.0-rc3-00020-g307fec773ba3 #3 Hardware name: linux,dummy-virt (DT) task: ffff80001e271780 task.stack: ffff000010908000 PC is at get_futex_key+0x6a4/0xcf0 kernel/futex.c:679 LR is at get_futex_key+0x6a4/0xcf0 kernel/futex.c:679 pc : [] lr : [] pstate: 80000145 The fact that it's a bug instead of a warning was due to an unrelated arm64 problem, but the warning itself triggered because the underlying mapping changed. This is an application issue but from a kernel perspective it's a recoverable situation and the warning is unnecessary so this patch removes the warning. The warning may potentially be triggered with the following test program from Mark although it may be necessary to adjust NR_FUTEX_THREADS to be a value smaller than the number of CPUs in the system. #include #include #include #include #include #include #include #include #define NR_FUTEX_THREADS 16 pthread_t threads[NR_FUTEX_THREADS]; void *mem; #define MEM_PROT (PROT_READ | PROT_WRITE) #define MEM_SIZE 65536 static int futex_wrapper(int *uaddr, int op, int val, const struct timespec *timeout, int *uaddr2, int val3) { syscall(SYS_futex, uaddr, op, val, timeout, uaddr2, val3); } void *poll_futex(void *unused) { for (;;) { futex_wrapper(mem, FUTEX_CMP_REQUEUE_PI, 1, NULL, mem + 4, 1); } } int main(int argc, char *argv[]) { int i; mem = mmap(NULL, MEM_SIZE, MEM_PROT, MAP_SHARED | MAP_ANONYMOUS, -1, 0); printf("Mapping @ %p\n", mem); printf("Creating futex threads...\n"); for (i = 0; i < NR_FUTEX_THREADS; i++) pthread_create(&threads[i], NULL, poll_futex, NULL); printf("Flipping mapping...\n"); for (;;) { mmap(mem, MEM_SIZE, MEM_PROT, MAP_FIXED | MAP_SHARED | MAP_ANONYMOUS, -1, 0); } return 0; } Reported-and-tested-by: Mark Rutland Signed-off-by: Mel Gorman Acked-by: Peter Zijlstra (Intel) Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit e2286916ac078728949163d7cbd5d4875a57dbfb Author: Cong Wang Date: Thu Aug 10 15:24:24 2017 -0700 mm: fix list corruptions on shmem shrinklist commit d041353dc98a6339182cd6f628b4c8f111278cb3 upstream. We saw many list corruption warnings on shmem shrinklist: WARNING: CPU: 18 PID: 177 at lib/list_debug.c:59 __list_del_entry+0x9e/0xc0 list_del corruption. prev->next should be ffff9ae5694b82d8, but was ffff9ae5699ba960 Modules linked in: intel_rapl sb_edac edac_core x86_pkg_temp_thermal coretemp iTCO_wdt iTCO_vendor_support crct10dif_pclmul crc32_pclmul ghash_clmulni_intel raid0 dcdbas shpchp wmi hed i2c_i801 ioatdma lpc_ich i2c_smbus acpi_cpufreq tcp_diag inet_diag sch_fq_codel ipmi_si ipmi_devintf ipmi_msghandler igb ptp crc32c_intel pps_core i2c_algo_bit i2c_core dca ipv6 crc_ccitt CPU: 18 PID: 177 Comm: kswapd1 Not tainted 4.9.34-t3.el7.twitter.x86_64 #1 Hardware name: Dell Inc. PowerEdge C6220/0W6W6G, BIOS 2.2.3 11/07/2013 Call Trace: dump_stack+0x4d/0x66 __warn+0xcb/0xf0 warn_slowpath_fmt+0x4f/0x60 __list_del_entry+0x9e/0xc0 shmem_unused_huge_shrink+0xfa/0x2e0 shmem_unused_huge_scan+0x20/0x30 super_cache_scan+0x193/0x1a0 shrink_slab.part.41+0x1e3/0x3f0 shrink_slab+0x29/0x30 shrink_node+0xf9/0x2f0 kswapd+0x2d8/0x6c0 kthread+0xd7/0xf0 ret_from_fork+0x22/0x30 WARNING: CPU: 23 PID: 639 at lib/list_debug.c:33 __list_add+0x89/0xb0 list_add corruption. prev->next should be next (ffff9ae5699ba960), but was ffff9ae5694b82d8. (prev=ffff9ae5694b82d8). Modules linked in: intel_rapl sb_edac edac_core x86_pkg_temp_thermal coretemp iTCO_wdt iTCO_vendor_support crct10dif_pclmul crc32_pclmul ghash_clmulni_intel raid0 dcdbas shpchp wmi hed i2c_i801 ioatdma lpc_ich i2c_smbus acpi_cpufreq tcp_diag inet_diag sch_fq_codel ipmi_si ipmi_devintf ipmi_msghandler igb ptp crc32c_intel pps_core i2c_algo_bit i2c_core dca ipv6 crc_ccitt CPU: 23 PID: 639 Comm: systemd-udevd Tainted: G W 4.9.34-t3.el7.twitter.x86_64 #1 Hardware name: Dell Inc. PowerEdge C6220/0W6W6G, BIOS 2.2.3 11/07/2013 Call Trace: dump_stack+0x4d/0x66 __warn+0xcb/0xf0 warn_slowpath_fmt+0x4f/0x60 __list_add+0x89/0xb0 shmem_setattr+0x204/0x230 notify_change+0x2ef/0x440 do_truncate+0x5d/0x90 path_openat+0x331/0x1190 do_filp_open+0x7e/0xe0 do_sys_open+0x123/0x200 SyS_open+0x1e/0x20 do_syscall_64+0x61/0x170 entry_SYSCALL64_slow_path+0x25/0x25 The problem is that shmem_unused_huge_shrink() moves entries from the global sbinfo->shrinklist to its local lists and then releases the spinlock. However, a parallel shmem_setattr() could access one of these entries directly and add it back to the global shrinklist if it is removed, with the spinlock held. The logic itself looks solid since an entry could be either in a local list or the global list, otherwise it is removed from one of them by list_del_init(). So probably the race condition is that, one CPU is in the middle of INIT_LIST_HEAD() but the other CPU calls list_empty() which returns true too early then the following list_add_tail() sees a corrupted entry. list_empty_careful() is designed to fix this situation. [akpm@linux-foundation.org: add comments] Link: http://lkml.kernel.org/r/20170803054630.18775-1-xiyou.wangcong@gmail.com Fixes: 779750d20b93 ("shmem: split huge pages beyond i_size under memory pressure") Signed-off-by: Cong Wang Acked-by: Linus Torvalds Acked-by: Kirill A. Shutemov Cc: Hugh Dickins Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit b56cd77c1205fae6c92a519bd11cc26e78292bfe Author: Jonathan Toppins Date: Thu Aug 10 15:23:35 2017 -0700 mm: ratelimit PFNs busy info message commit 75dddef32514f7aa58930bde6a1263253bc3d4ba upstream. The RDMA subsystem can generate several thousand of these messages per second eventually leading to a kernel crash. Ratelimit these messages to prevent this crash. Doug said: "I've been carrying a version of this for several kernel versions. I don't remember when they started, but we have one (and only one) class of machines: Dell PE R730xd, that generate these errors. When it happens, without a rate limit, we get rcu timeouts and kernel oopses. With the rate limit, we just get a lot of annoying kernel messages but the machine continues on, recovers, and eventually the memory operations all succeed" And: "> Well... why are all these EBUSY's occurring? It sounds inefficient > (at least) but if it is expected, normal and unavoidable then > perhaps we should just remove that message altogether? I don't have an answer to that question. To be honest, I haven't looked real hard. We never had this at all, then it started out of the blue, but only on our Dell 730xd machines (and it hits all of them), but no other classes or brands of machines. And we have our 730xd machines loaded up with different brands and models of cards (for instance one dedicated to mlx4 hardware, one for qib, one for mlx5, an ocrdma/cxgb4 combo, etc), so the fact that it hit all of the machines meant it wasn't tied to any particular brand/model of RDMA hardware. To me, it always smelled of a hardware oddity specific to maybe the CPUs or mainboard chipsets in these machines, so given that I'm not an mm expert anyway, I never chased it down. A few other relevant details: it showed up somewhere around 4.8/4.9 or thereabouts. It never happened before, but the prinkt has been there since the 3.18 days, so possibly the test to trigger this message was changed, or something else in the allocator changed such that the situation started happening on these machines? And, like I said, it is specific to our 730xd machines (but they are all identical, so that could mean it's something like their specific ram configuration is causing the allocator to hit this on these machine but not on other machines in the cluster, I don't want to say it's necessarily the model of chipset or CPU, there are other bits of identicalness between these machines)" Link: http://lkml.kernel.org/r/499c0f6cc10d6eb829a67f2a4d75b4228a9b356e.1501695897.git.jtoppins@redhat.com Signed-off-by: Jonathan Toppins Reviewed-by: Doug Ledford Tested-by: Doug Ledford Cc: Michal Hocko Cc: Vlastimil Babka Cc: Mel Gorman Cc: Hillf Danton Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman