linux-kernel.vger.kernel.org archive mirror
* [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first
@ 2018-11-29  6:00 Sasha Levin
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 02/35] iommu/vt-d: Fix NULL pointer dereference in prq_event_thread() Sasha Levin
                   ` (33 more replies)
  0 siblings, 34 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:00 UTC (permalink / raw)
  To: stable, linux-kernel
  Cc: Sakari Ailus, Mauro Carvalho Chehab, Sasha Levin, linux-media

From: Sakari Ailus <sakari.ailus@linux.intel.com>

[ Upstream commit 30efae3d789cd0714ef795545a46749236e29558 ]

While there are issues related to object lifetime management, unregister the
media device first when the driver is being unbound. This is slightly
safer.

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/media/platform/omap3isp/isp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/media/platform/omap3isp/isp.c b/drivers/media/platform/omap3isp/isp.c
index 6e6e978263b0..c834fea5f9b0 100644
--- a/drivers/media/platform/omap3isp/isp.c
+++ b/drivers/media/platform/omap3isp/isp.c
@@ -1592,6 +1592,8 @@ static void isp_pm_complete(struct device *dev)
 
 static void isp_unregister_entities(struct isp_device *isp)
 {
+	media_device_unregister(&isp->media_dev);
+
 	omap3isp_csi2_unregister_entities(&isp->isp_csi2a);
 	omap3isp_ccp2_unregister_entities(&isp->isp_ccp2);
 	omap3isp_ccdc_unregister_entities(&isp->isp_ccdc);
@@ -1602,7 +1604,6 @@ static void isp_unregister_entities(struct isp_device *isp)
 	omap3isp_stat_unregister_entities(&isp->isp_hist);
 
 	v4l2_device_unregister(&isp->v4l2_dev);
-	media_device_unregister(&isp->media_dev);
 	media_device_cleanup(&isp->media_dev);
 }
 
-- 
2.17.1



* [PATCH AUTOSEL 4.14 02/35] iommu/vt-d: Fix NULL pointer dereference in prq_event_thread()
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
@ 2018-11-29  6:00 ` Sasha Levin
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 03/35] brcmutil: really fix decoding channel info for 160 MHz bandwidth Sasha Levin
                   ` (32 subsequent siblings)
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:00 UTC (permalink / raw)
  To: stable, linux-kernel
  Cc: Lu Baolu, Ashok Raj, Jacob Pan, Sohil Mehta, Joerg Roedel,
	Sasha Levin, iommu

From: Lu Baolu <baolu.lu@linux.intel.com>

[ Upstream commit 19ed3e2dd8549c1a34914e8dad01b64e7837645a ]

When handling a page request event without a PASID, go to the "no_pasid"
branch instead of "bad_req". Otherwise, a NULL pointer dereference
will happen there.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Fixes: a222a7f0bb6c9 'iommu/vt-d: Implement page request handling'
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/iommu/intel-svm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index d7def26ccf79..f5573bb9f450 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -589,7 +589,7 @@ static irqreturn_t prq_event_thread(int irq, void *d)
 			pr_err("%s: Page request without PASID: %08llx %08llx\n",
 			       iommu->name, ((unsigned long long *)req)[0],
 			       ((unsigned long long *)req)[1]);
-			goto bad_req;
+			goto no_pasid;
 		}
 
 		if (!svm || svm->pasid != req->pasid) {
-- 
2.17.1



* [PATCH AUTOSEL 4.14 03/35] brcmutil: really fix decoding channel info for 160 MHz bandwidth
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 02/35] iommu/vt-d: Fix NULL pointer dereference in prq_event_thread() Sasha Levin
@ 2018-11-29  6:00 ` Sasha Levin
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 04/35] iommu/ipmmu-vmsa: Fix crash on early domain free Sasha Levin
                   ` (31 subsequent siblings)
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:00 UTC (permalink / raw)
  To: stable, linux-kernel
  Cc: Rafał Miłecki, Kalle Valo, Sasha Levin, linux-wireless,
	brcm80211-dev-list.pdl, brcm80211-dev-list, netdev

From: Rafał Miłecki <rafal@milecki.pl>

[ Upstream commit 3401d42c7ea2d064d15c66698ff8eb96553179ce ]

The previous commit /adding/ support for 160 MHz chanspecs was incomplete.
It didn't set bandwidth info and didn't extract control channel info. As
a result, it was also using the uninitialized "sb" variable.

This change has been tested for two chanspecs found to be reported by
some devices/firmwares:
1) 60/160 (0xee32)
   Before: chnum:50 control_ch_num:36
    After: chnum:50 control_ch_num:60
2) 120/160 (0xed72)
   Before: chnum:114 control_ch_num:100
    After: chnum:114 control_ch_num:120

Fixes: 330994e8e8ec ("brcmfmac: fix for proper support of 160MHz bandwidth")
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
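As a rough illustration of the decoding this patch completes, the sketch
below extracts the sideband field from a 160 MHz chanspec and derives the
control channel from it. The mask and shift values, the chan160 struct and
decode_160() are assumptions made for this example (chosen so that the two
chanspecs quoted above decode as reported); they are not copied from the
driver.

```c
#include <stdint.h>

/* Assumed field layout, not copied from the driver headers. */
#define SKETCH_CH_MASK  0x00ffu  /* center channel number */
#define SKETCH_SB_MASK  0x0700u  /* sideband index, 0..7 */
#define SKETCH_SB_SHIFT 8

struct chan160 {
	int center;   /* center channel of the 160 MHz block */
	int sb;       /* which 20 MHz sub-channel carries control */
	int control;  /* resulting control channel number */
};

static struct chan160 decode_160(uint16_t chspec)
{
	struct chan160 ch;

	ch.center = chspec & SKETCH_CH_MASK;
	ch.sb = (chspec & SKETCH_SB_MASK) >> SKETCH_SB_SHIFT;
	/*
	 * Eight 20 MHz sub-channels sit 4 channel numbers apart around
	 * the center, so the offsets run -14, -10, ..., +10, +14.
	 */
	ch.control = ch.center + 4 * ch.sb - 14;
	return ch;
}
```

Under these assumptions, decode_160(0xee32) gives center 50 and control 60,
and decode_160(0xed72) gives 114 and 120, matching the "After" values in the
commit message. Forcing sb to 0, as an uninitialized field might happen to
be, reproduces the "Before" values 36 and 100.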
 drivers/net/wireless/broadcom/brcm80211/brcmutil/d11.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmutil/d11.c b/drivers/net/wireless/broadcom/brcm80211/brcmutil/d11.c
index e7584b842dce..eb5db94f5745 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmutil/d11.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmutil/d11.c
@@ -193,6 +193,9 @@ static void brcmu_d11ac_decchspec(struct brcmu_chan *ch)
 		}
 		break;
 	case BRCMU_CHSPEC_D11AC_BW_160:
+		ch->bw = BRCMU_CHAN_BW_160;
+		ch->sb = brcmu_maskget16(ch->chspec, BRCMU_CHSPEC_D11AC_SB_MASK,
+					 BRCMU_CHSPEC_D11AC_SB_SHIFT);
 		switch (ch->sb) {
 		case BRCMU_CHAN_SB_LLL:
 			ch->control_ch_num -= CH_70MHZ_APART;
-- 
2.17.1



* [PATCH AUTOSEL 4.14 04/35] iommu/ipmmu-vmsa: Fix crash on early domain free
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 02/35] iommu/vt-d: Fix NULL pointer dereference in prq_event_thread() Sasha Levin
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 03/35] brcmutil: really fix decoding channel info for 160 MHz bandwidth Sasha Levin
@ 2018-11-29  6:00 ` Sasha Levin
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 05/35] can: rcar_can: Fix erroneous registration Sasha Levin
                   ` (30 subsequent siblings)
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:00 UTC (permalink / raw)
  To: stable, linux-kernel; +Cc: Geert Uytterhoeven, Joerg Roedel, Sasha Levin, iommu

From: Geert Uytterhoeven <geert+renesas@glider.be>

[ Upstream commit e5b78f2e349eef5d4fca5dc1cf5a3b4b2cc27abd ]

If iommu_ops.add_device() fails, iommu_ops.domain_free() is still
called, leading to a crash, as the domain was only partially
initialized:

    ipmmu-vmsa e67b0000.mmu: Cannot accommodate DMA translation for IOMMU page tables
    sata_rcar ee300000.sata: Unable to initialize IPMMU context
    iommu: Failed to add device ee300000.sata to group 0: -22
    Unable to handle kernel NULL pointer dereference at virtual address 0000000000000038
    ...
    Call trace:
     ipmmu_domain_free+0x1c/0xa0
     iommu_group_release+0x48/0x68
     kobject_put+0x74/0xe8
     kobject_del.part.0+0x3c/0x50
     kobject_put+0x60/0xe8
     iommu_group_get_for_dev+0xa8/0x1f0
     ipmmu_add_device+0x1c/0x40
     of_iommu_configure+0x118/0x190

Fix this by checking if the domain's context already exists, before
trying to destroy it.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Fixes: d25a2a16f0889 ('iommu: Add driver for Renesas VMSA-compatible IPMMU')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
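The guard added by this patch can be pictured with a small userspace model.
The types and the function below are hypothetical stand-ins, not the driver's
structures, and the function returns a bool only so the behaviour is easy to
observe (the real function returns void).

```c
#include <stdbool.h>
#include <stddef.h>

struct sketch_mmu {
	int disabled_contexts;  /* counts simulated context teardowns */
};

struct sketch_domain {
	struct sketch_mmu *mmu;  /* NULL until context setup succeeds */
};

/* Returns true if a context was actually torn down. */
static bool sketch_destroy_context(struct sketch_domain *domain)
{
	if (!domain->mmu)
		return false;  /* never initialized: nothing to undo */

	domain->mmu->disabled_contexts++;
	return true;
}
```

A domain whose setup failed before "mmu" was assigned is skipped, while a
fully initialized domain is torn down as before.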
 drivers/iommu/ipmmu-vmsa.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index 5d0ba5f644c4..777aff1f549f 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -424,6 +424,9 @@ static int ipmmu_domain_init_context(struct ipmmu_vmsa_domain *domain)
 
 static void ipmmu_domain_destroy_context(struct ipmmu_vmsa_domain *domain)
 {
+	if (!domain->mmu)
+		return;
+
 	/*
 	 * Disable the context. Flush the TLB as required when modifying the
 	 * context registers.
-- 
2.17.1



* [PATCH AUTOSEL 4.14 05/35] can: rcar_can: Fix erroneous registration
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
                   ` (2 preceding siblings ...)
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 04/35] iommu/ipmmu-vmsa: Fix crash on early domain free Sasha Levin
@ 2018-11-29  6:00 ` Sasha Levin
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 06/35] test_firmware: fix error return getting clobbered Sasha Levin
                   ` (29 subsequent siblings)
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:00 UTC (permalink / raw)
  To: stable, linux-kernel
  Cc: Fabrizio Castro, Chris Paterson, Marc Kleine-Budde, Sasha Levin,
	linux-can, netdev

From: Fabrizio Castro <fabrizio.castro@bp.renesas.com>

[ Upstream commit 68c8d209cd4337da4fa04c672f0b62bb735969bc ]

Assigning 2 to "renesas,can-clock-select" tricks the driver into
registering the CAN interface, even though we don't want that.
This patch improves one of the checks to prevent that from happening.

Fixes: 862e2b6af9413b43 ("can: rcar_can: support all input clocks")
Signed-off-by: Fabrizio Castro <fabrizio.castro@bp.renesas.com>
Signed-off-by: Chris Paterson <Chris.Paterson2@renesas.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
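The difference between a range check and a bitmask check can be sketched in
plain C. The selector values (0, 1 and 3, with 2 unassigned) and the table
size of 4 are assumptions made for this example, and the extra range guard in
mask_check_accepts() is an addition of the sketch, not part of the patch.

```c
#include <stdbool.h>

#define BIT(n) (1UL << (n))

/* Assumed clock selector values; 2 is deliberately left unassigned. */
enum { CLKR_CLKP1 = 0, CLKR_CLKP2 = 1, CLKR_CLKEXT = 3 };

#define SUPPORTED_CLOCKS (BIT(CLKR_CLKP1) | BIT(CLKR_CLKP2) | BIT(CLKR_CLKEXT))
#define NUM_CLOCK_NAMES 4  /* size of a name table indexed 0..3 */

/* Old-style check: only rejects selectors past the end of the table. */
static bool range_check_accepts(unsigned int sel)
{
	return sel < NUM_CLOCK_NAMES;
}

/* New-style check: the selector must be one of the listed clocks. */
static bool mask_check_accepts(unsigned int sel)
{
	/* Guard added for this sketch so the shift stays well defined. */
	if (sel >= 8 * sizeof(unsigned long))
		return false;
	return (BIT(sel) & SUPPORTED_CLOCKS) != 0;
}
```

With these values, selector 2 passes the range check but fails the mask
check, which is the case the commit message describes.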
 drivers/net/can/rcar/rcar_can.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/can/rcar/rcar_can.c b/drivers/net/can/rcar/rcar_can.c
index 11662f479e76..771a46083739 100644
--- a/drivers/net/can/rcar/rcar_can.c
+++ b/drivers/net/can/rcar/rcar_can.c
@@ -24,6 +24,9 @@
 
 #define RCAR_CAN_DRV_NAME	"rcar_can"
 
+#define RCAR_SUPPORTED_CLOCKS	(BIT(CLKR_CLKP1) | BIT(CLKR_CLKP2) | \
+				 BIT(CLKR_CLKEXT))
+
 /* Mailbox configuration:
  * mailbox 60 - 63 - Rx FIFO mailboxes
  * mailbox 56 - 59 - Tx FIFO mailboxes
@@ -789,7 +792,7 @@ static int rcar_can_probe(struct platform_device *pdev)
 		goto fail_clk;
 	}
 
-	if (clock_select >= ARRAY_SIZE(clock_names)) {
+	if (!(BIT(clock_select) & RCAR_SUPPORTED_CLOCKS)) {
 		err = -EINVAL;
 		dev_err(&pdev->dev, "invalid CAN clock selected\n");
 		goto fail_clk;
-- 
2.17.1



* [PATCH AUTOSEL 4.14 06/35] test_firmware: fix error return getting clobbered
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
                   ` (3 preceding siblings ...)
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 05/35] can: rcar_can: Fix erroneous registration Sasha Levin
@ 2018-11-29  6:00 ` Sasha Levin
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 07/35] HID: input: Ignore battery reported by Symbol DS4308 Sasha Levin
                   ` (28 subsequent siblings)
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:00 UTC (permalink / raw)
  To: stable, linux-kernel; +Cc: Colin Ian King, Greg Kroah-Hartman, Sasha Levin

From: Colin Ian King <colin.king@canonical.com>

[ Upstream commit 8bb0a88600f0267cfcc245d34f8c4abe8c282713 ]

In the case where req->fw->size > PAGE_SIZE, the error return rc
is set to -EINVAL; however, this is then overwritten with
rc = req->fw->size because the error exit path via label 'out' is
not taken.  Fix this by adding the jump to the error exit
path 'out'.

Detected by CoverityScan, CID#1453465 ("Unused value")

Fixes: c92316bf8e94 ("test_firmware: add batched firmware tests")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
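The control flow being fixed can be reproduced in a few lines of userspace
C. This is a minimal sketch: sketch_read_firmware() and SKETCH_PAGE_SIZE are
hypothetical names, and the function only mirrors the shape of the error
path described above.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096  /* stand-in for PAGE_SIZE */

/* Copies size bytes into buf, or fails if they exceed one page. */
static long sketch_read_firmware(char *buf, const char *data, size_t size)
{
	long rc;

	if (size > SKETCH_PAGE_SIZE) {
		rc = -EINVAL;
		goto out;  /* without this jump, rc is overwritten below */
	}
	memcpy(buf, data, size);
	rc = (long)size;
out:
	return rc;
}
```

In this model the jump also keeps the oversized copy from running, so the
caller sees -EINVAL and the one-page buffer is left untouched.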
 lib/test_firmware.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/test_firmware.c b/lib/test_firmware.c
index e7008688769b..71d371f97138 100644
--- a/lib/test_firmware.c
+++ b/lib/test_firmware.c
@@ -838,6 +838,7 @@ static ssize_t read_firmware_show(struct device *dev,
 	if (req->fw->size > PAGE_SIZE) {
 		pr_err("Testing interface must use PAGE_SIZE firmware for now\n");
 		rc = -EINVAL;
+		goto out;
 	}
 	memcpy(buf, req->fw->data, req->fw->size);
 
-- 
2.17.1



* [PATCH AUTOSEL 4.14 07/35] HID: input: Ignore battery reported by Symbol DS4308
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
                   ` (4 preceding siblings ...)
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 06/35] test_firmware: fix error return getting clobbered Sasha Levin
@ 2018-11-29  6:00 ` Sasha Levin
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 08/35] batman-adv: Use explicit tvlv padding for ELP packets Sasha Levin
                   ` (27 subsequent siblings)
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:00 UTC (permalink / raw)
  To: stable, linux-kernel
  Cc: Benson Leung, Benjamin Tissoires, Sasha Levin, linux-input

From: Benson Leung <bleung@chromium.org>

[ Upstream commit 0fd791841a6d67af1155a9c3de54dea51220721e ]

The Motorola/Zebra Symbol DS4308-HD is a handheld USB barcode scanner
which does not have a battery, but reports one anyway that always has
capacity 2.

Let's apply the IGNORE quirk to prevent it from being treated like a
power supply, so that userspace doesn't conclude that this accessory
is almost out of power and warn the user that they need to charge
their wired barcode scanner.

Reported here: https://bugs.chromium.org/p/chromium/issues/detail?id=804720

Signed-off-by: Benson Leung <bleung@chromium.org>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/hid/hid-ids.h   | 1 +
 drivers/hid/hid-input.c | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h
index 3fc8c0d67592..87904d2adadb 100644
--- a/drivers/hid/hid-ids.h
+++ b/drivers/hid/hid-ids.h
@@ -1001,6 +1001,7 @@
 #define USB_VENDOR_ID_SYMBOL		0x05e0
 #define USB_DEVICE_ID_SYMBOL_SCANNER_1	0x0800
 #define USB_DEVICE_ID_SYMBOL_SCANNER_2	0x1300
+#define USB_DEVICE_ID_SYMBOL_SCANNER_3	0x1200
 
 #define USB_VENDOR_ID_SYNAPTICS		0x06cb
 #define USB_DEVICE_ID_SYNAPTICS_TP	0x0001
diff --git a/drivers/hid/hid-input.c b/drivers/hid/hid-input.c
index bb984cc9753b..d146a9b545ee 100644
--- a/drivers/hid/hid-input.c
+++ b/drivers/hid/hid-input.c
@@ -325,6 +325,9 @@ static const struct hid_device_id hid_battery_quirks[] = {
 	{ HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_ELECOM,
 		USB_DEVICE_ID_ELECOM_BM084),
 	  HID_BATTERY_QUIRK_IGNORE },
+	{ HID_USB_DEVICE(USB_VENDOR_ID_SYMBOL,
+		USB_DEVICE_ID_SYMBOL_SCANNER_3),
+	  HID_BATTERY_QUIRK_IGNORE },
 	{}
 };
 
-- 
2.17.1



* [PATCH AUTOSEL 4.14 08/35] batman-adv: Use explicit tvlv padding for ELP packets
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
                   ` (5 preceding siblings ...)
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 07/35] HID: input: Ignore battery reported by Symbol DS4308 Sasha Levin
@ 2018-11-29  6:00 ` Sasha Levin
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 09/35] batman-adv: Expand merged fragment buffer for full packet Sasha Levin
                   ` (26 subsequent siblings)
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:00 UTC (permalink / raw)
  To: stable, linux-kernel
  Cc: Sven Eckelmann, Simon Wunderlich, Sasha Levin, netdev

From: Sven Eckelmann <sven@narfation.org>

[ Upstream commit f4156f9656feac21f4de712fac94fae964c5d402 ]

The announcement messages of batman-adv COMPAT_VERSION 15 have the
possibility to announce additional information via a dynamic TVLV part.
This part is optional for the ELP packets and currently not parsed by the
Linux implementation. Still, out-of-tree versions use it to transport
things like neighbor hashes to optimize the rebroadcast behavior.

Since the ELP broadcast packets are smaller than the minimal ethernet
packet, they often have to be padded. This is often done (as specified in
RFC894) with octets of zero and thus works perfectly fine with the TVLV
part (making it zero length and thus empty). But not all ethernet
compatible hardware seems to follow this advice. To avoid ambiguous
situations when parsing the TVLV header, just force the 4 bytes (TVLV
length + padding) after the required ELP header to zero.

Fixes: d6f94d91f766 ("batman-adv: ELP - adding basic infrastructure")
Reported-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
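The effect of the explicit padding can be sketched with a plain byte
buffer. The 16-byte header length, the position of a 16-bit big-endian TVLV
length directly after the header, and both function names are assumptions
made for this example only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ELP_HLEN     16  /* assumed fixed ELP header length */
#define SKETCH_TVLV_PADDING 4   /* TVLV length + padding, as in the patch */

/* Zero the header and the 4 bytes after it; returns bytes used, 0 on error. */
static size_t sketch_elp_init(uint8_t *buf, size_t cap)
{
	size_t used = SKETCH_ELP_HLEN + SKETCH_TVLV_PADDING;

	if (cap < used)
		return 0;
	memset(buf, 0, used);
	return used;
}

/* Reads a 16-bit big-endian TVLV length placed right after the header. */
static uint16_t sketch_tvlv_len(const uint8_t *buf)
{
	return (uint16_t)((buf[SKETCH_ELP_HLEN] << 8) |
			  buf[SKETCH_ELP_HLEN + 1]);
}
```

If a frame buffer starts out filled with arbitrary bytes (standing in for
hardware that pads with non-zero octets), a parser reads a zero TVLV length
after sketch_elp_init() regardless of what follows the 20 initialized bytes.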
 net/batman-adv/bat_v_elp.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/batman-adv/bat_v_elp.c b/net/batman-adv/bat_v_elp.c
index e92dfedccc16..fbc132f4670e 100644
--- a/net/batman-adv/bat_v_elp.c
+++ b/net/batman-adv/bat_v_elp.c
@@ -338,19 +338,21 @@ static void batadv_v_elp_periodic_work(struct work_struct *work)
  */
 int batadv_v_elp_iface_enable(struct batadv_hard_iface *hard_iface)
 {
+	static const size_t tvlv_padding = sizeof(__be32);
 	struct batadv_elp_packet *elp_packet;
 	unsigned char *elp_buff;
 	u32 random_seqno;
 	size_t size;
 	int res = -ENOMEM;
 
-	size = ETH_HLEN + NET_IP_ALIGN + BATADV_ELP_HLEN;
+	size = ETH_HLEN + NET_IP_ALIGN + BATADV_ELP_HLEN + tvlv_padding;
 	hard_iface->bat_v.elp_skb = dev_alloc_skb(size);
 	if (!hard_iface->bat_v.elp_skb)
 		goto out;
 
 	skb_reserve(hard_iface->bat_v.elp_skb, ETH_HLEN + NET_IP_ALIGN);
-	elp_buff = skb_put_zero(hard_iface->bat_v.elp_skb, BATADV_ELP_HLEN);
+	elp_buff = skb_put_zero(hard_iface->bat_v.elp_skb,
+				BATADV_ELP_HLEN + tvlv_padding);
 	elp_packet = (struct batadv_elp_packet *)elp_buff;
 
 	elp_packet->packet_type = BATADV_ELP;
-- 
2.17.1



* [PATCH AUTOSEL 4.14 09/35] batman-adv: Expand merged fragment buffer for full packet
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
                   ` (6 preceding siblings ...)
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 08/35] batman-adv: Use explicit tvlv padding for ELP packets Sasha Levin
@ 2018-11-29  6:00 ` Sasha Levin
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 10/35] amd/iommu: Fix Guest Virtual APIC Log Tail Address Register Sasha Levin
                   ` (25 subsequent siblings)
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:00 UTC (permalink / raw)
  To: stable, linux-kernel
  Cc: Sven Eckelmann, Simon Wunderlich, Sasha Levin, netdev

From: Sven Eckelmann <sven@narfation.org>

[ Upstream commit d7d8bbb40a5b1f682ee6589e212934f4c6b8ad60 ]

The complete size ("total_size") of the fragmented packet is stored in the
fragment header and in the size of the fragment chain. When the fragments
are ready for merge, the skbuff's tail of the first fragment is expanded to
have enough room after the data pointer for at least total_size. This means
that it gets expanded by total_size - first_skb->len.

But this ignores the fact that, after expanding the buffer, the fragment
header is pulled from this buffer. Assuming that the tailroom of the
buffer was already 0, the buffer after the data pointer of the skbuff is
now only total_size - len(fragment_header) large. When the merge function
is then processing the remaining fragments, the code to copy the data over
to the merged skbuff will cause an skb_over_panic when it tries to actually
put enough data to fill the total_size bytes of the packet.

The size of the skb_pull must therefore also be taken into account when the
buffer's tailroom is expanded.

Fixes: 610bfc6bc99b ("batman-adv: Receive fragmented packets and merge")
Reported-by: Martin Weinelt <martin@darmstadt.freifunk.net>
Co-authored-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
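The size arithmetic from the commit message can be checked with a toy model
of the buffer. The sketch_skb struct and sketch_merge_fits() are hypothetical;
they track only a data length and a tailroom counter and assume, as the
message does, that the tailroom starts at 0.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an skb: bytes of data and free bytes after it. */
struct sketch_skb {
	size_t len;
	size_t tailroom;
};

/*
 * Returns true if the remaining fragments fit after the merge buffer
 * has been expanded and the fragment header pulled.
 */
static bool sketch_merge_fits(size_t hdr, size_t first_payload,
			      size_t rest_payload, bool count_hdr)
{
	struct sketch_skb skb = { hdr + first_payload, 0 };
	size_t total_size = first_payload + rest_payload;
	size_t size = count_hdr ? total_size + hdr : total_size;

	/* Expand the tail so that len + tailroom reaches size. */
	if (size > skb.len)
		skb.tailroom += size - skb.len;

	/* Pulling the header shrinks len but adds no tailroom. */
	skb.len -= hdr;

	return rest_payload <= skb.tailroom;
}
```

For example, with a 20-byte header and two 100-byte payloads, the old
computation leaves 80 bytes of tailroom for 100 bytes of remaining data,
while counting the header leaves exactly 100.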
 net/batman-adv/fragmentation.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/batman-adv/fragmentation.c b/net/batman-adv/fragmentation.c
index b6abd19ab23e..c6d37d22bd12 100644
--- a/net/batman-adv/fragmentation.c
+++ b/net/batman-adv/fragmentation.c
@@ -274,7 +274,7 @@ batadv_frag_merge_packets(struct hlist_head *chain)
 	kfree(entry);
 
 	packet = (struct batadv_frag_packet *)skb_out->data;
-	size = ntohs(packet->total_size);
+	size = ntohs(packet->total_size) + hdr_size;
 
 	/* Make room for the rest of the fragments. */
 	if (pskb_expand_head(skb_out, 0, size - skb_out->len, GFP_ATOMIC) < 0) {
-- 
2.17.1



* [PATCH AUTOSEL 4.14 10/35] amd/iommu: Fix Guest Virtual APIC Log Tail Address Register
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
                   ` (7 preceding siblings ...)
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 09/35] batman-adv: Expand merged fragment buffer for full packet Sasha Levin
@ 2018-11-29  6:00 ` Sasha Levin
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 11/35] bnx2x: Assign unique DMAE channel number for FW DMAE transactions Sasha Levin
                   ` (24 subsequent siblings)
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:00 UTC (permalink / raw)
  To: stable, linux-kernel
  Cc: Filippo Sironi, Wei Wang, Suravee Suthikulpanit, Joerg Roedel,
	Sasha Levin, iommu

From: Filippo Sironi <sironi@amazon.de>

[ Upstream commit ab99be4683d9db33b100497d463274ebd23bd67e ]

This register should have been programmed with the physical address
of the memory location containing the shadow tail pointer for
the guest virtual APIC log instead of the base address.

Fixes: 8bda0cfbdc1a  ('iommu/amd: Detect and initialize guest vAPIC log')
Signed-off-by: Filippo Sironi <sironi@amazon.de>
Signed-off-by: Wei Wang <wawei@amazon.de>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
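Besides switching the source address from the log base to the tail pointer,
the patch rewrites the address mask as BIT_ULL(52)-1. The sketch below covers
only that mask arithmetic; BIT_ULL is redefined locally and
sketch_tail_reg_value() is a hypothetical helper, not a kernel function.

```c
#include <stdint.h>

#define BIT_ULL(n) (1ULL << (n))

/* Keep physical address bits 51:3, as the patched expression does. */
static uint64_t sketch_tail_reg_value(uint64_t tail_phys)
{
	return (tail_phys & (BIT_ULL(52) - 1)) & ~7ULL;
}
```

The new spelling of the mask equals the old literal 0xFFFFFFFFFFFFFULL, so
the only functional change in the expression is which address is passed in.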
 drivers/iommu/amd_iommu_init.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 6fe2d0346073..b97984a5ddad 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -796,7 +796,8 @@ static int iommu_init_ga_log(struct amd_iommu *iommu)
 	entry = iommu_virt_to_phys(iommu->ga_log) | GA_LOG_SIZE_512;
 	memcpy_toio(iommu->mmio_base + MMIO_GA_LOG_BASE_OFFSET,
 		    &entry, sizeof(entry));
-	entry = (iommu_virt_to_phys(iommu->ga_log) & 0xFFFFFFFFFFFFFULL) & ~7ULL;
+	entry = (iommu_virt_to_phys(iommu->ga_log_tail) &
+		 (BIT_ULL(52)-1)) & ~7ULL;
 	memcpy_toio(iommu->mmio_base + MMIO_GA_LOG_TAIL_OFFSET,
 		    &entry, sizeof(entry));
 	writel(0x00, iommu->mmio_base + MMIO_GA_HEAD_OFFSET);
-- 
2.17.1



* [PATCH AUTOSEL 4.14 11/35] bnx2x: Assign unique DMAE channel number for FW DMAE transactions.
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
                   ` (8 preceding siblings ...)
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 10/35] amd/iommu: Fix Guest Virtual APIC Log Tail Address Register Sasha Levin
@ 2018-11-29  6:00 ` Sasha Levin
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 12/35] qed: Fix PTT leak in qed_drain() Sasha Levin
                   ` (23 subsequent siblings)
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:00 UTC (permalink / raw)
  To: stable, linux-kernel
  Cc: Sudarsana Reddy Kalluru, Sudarsana Reddy Kalluru,
	Michal Kalderon, David S . Miller, Sasha Levin, netdev

From: Sudarsana Reddy Kalluru <sudarsana.kalluru@cavium.com>

[ Upstream commit 77e461d14ed141253573eeeb4d34eccc51e38328 ]

The driver assigns DMAE channel 0 to the FW as part of the START_RAMROD
command. The FW uses this channel for DMAE operations (e.g., TIME_SYNC
implementation). The driver also uses the same channel 0 for DMAE
operations for some of the PFs (e.g., PF0 on Port0). This could lead to
concurrent access to the DMAE channel by the FW and the driver, which is
not legal. Hence a unique DMAE id needs to be assigned to the FW.
Currently the following DMAE channels are used by the clients:
  MFW - OCBB/OCSD functionality uses DMAE channel 14/15
  Driver 0-3 and 8-11 (for PF dmae operations)
         4 and 12 (for stats requests)
This patch assigns the unique dmae_id '13' to the FW.

Changes from previous version:
------------------------------
v2: Incorporated the review comments.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x.h    | 7 +++++++
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c | 1 +
 2 files changed, 8 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
index 828e2e56b75e..1b7f4342dab9 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
@@ -2187,6 +2187,13 @@ void bnx2x_igu_clear_sb_gen(struct bnx2x *bp, u8 func, u8 idu_sb_id,
 #define PMF_DMAE_C(bp)			(BP_PORT(bp) * MAX_DMAE_C_PER_PORT + \
 					 E1HVN_MAX)
 
+/* Following is the DMAE channel number allocation for the clients.
+ *   MFW: OCBB/OCSD implementations use DMAE channels 14/15 respectively.
+ *   Driver: 0-3 and 8-11 (for PF dmae operations)
+ *           4 and 12 (for stats requests)
+ */
+#define BNX2X_FW_DMAE_C                 13 /* Channel for FW DMAE operations */
+
 /* PCIE link and speed */
 #define PCICFG_LINK_WIDTH		0x1f00000
 #define PCICFG_LINK_WIDTH_SHIFT		20
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c
index 8baf9d3eb4b1..453bfd83a070 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c
@@ -6149,6 +6149,7 @@ static inline int bnx2x_func_send_start(struct bnx2x *bp,
 	rdata->sd_vlan_tag	= cpu_to_le16(start_params->sd_vlan_tag);
 	rdata->path_id		= BP_PATH(bp);
 	rdata->network_cos_mode	= start_params->network_cos_mode;
+	rdata->dmae_cmd_id	= BNX2X_FW_DMAE_C;
 
 	rdata->vxlan_dst_port	= cpu_to_le16(start_params->vxlan_dst_port);
 	rdata->geneve_dst_port	= cpu_to_le16(start_params->geneve_dst_port);
-- 
2.17.1



* [PATCH AUTOSEL 4.14 12/35] qed: Fix PTT leak in qed_drain()
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
                   ` (9 preceding siblings ...)
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 11/35] bnx2x: Assign unique DMAE channel number for FW DMAE transactions Sasha Levin
@ 2018-11-29  6:00 ` Sasha Levin
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 13/35] qed: Fix reading wrong value in loop condition Sasha Levin
                   ` (22 subsequent siblings)
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:00 UTC (permalink / raw)
  To: stable, linux-kernel
  Cc: Denis Bolotin, Michal Kalderon, David S . Miller, Sasha Levin, netdev

From: Denis Bolotin <denis.bolotin@cavium.com>

[ Upstream commit 9aaa4e8ba12972d674caeefbc5f88d83235dd697 ]

Release PTT before entering error flow.

Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com>
Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
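The leak and its fix follow a common acquire/release pattern, which can be
modelled with a counter. The sketch_pool struct and sketch_drain() are
hypothetical; the flag selects between releasing before the error check (the
patched order) and after it (the old order).

```c
#include <stdbool.h>

/* Toy resource pool: counts handles that are currently held. */
struct sketch_pool {
	int in_use;
};

/*
 * Acquire a handle, run an operation whose result is op_rc, and
 * release the handle either before or after checking op_rc.
 */
static int sketch_drain(struct sketch_pool *pool, int op_rc,
			bool release_before_check)
{
	pool->in_use++;  /* acquire */

	if (release_before_check) {
		pool->in_use--;  /* released on success and on error */
		return op_rc;
	}

	if (op_rc)
		return op_rc;  /* early return skips the release */
	pool->in_use--;
	return 0;
}
```

With the old order, a failing operation leaves one handle held in the pool;
with the patched order the count returns to zero on every path.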
 drivers/net/ethernet/qlogic/qed/qed_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_main.c b/drivers/net/ethernet/qlogic/qed/qed_main.c
index 954f7ce4cf28..ecc2d4296526 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_main.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_main.c
@@ -1561,9 +1561,9 @@ static int qed_drain(struct qed_dev *cdev)
 			return -EBUSY;
 		}
 		rc = qed_mcp_drain(hwfn, ptt);
+		qed_ptt_release(hwfn, ptt);
 		if (rc)
 			return rc;
-		qed_ptt_release(hwfn, ptt);
 	}
 
 	return 0;
-- 
2.17.1



* [PATCH AUTOSEL 4.14 13/35] qed: Fix reading wrong value in loop condition
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
                   ` (10 preceding siblings ...)
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 12/35] qed: Fix PTT leak in qed_drain() Sasha Levin
@ 2018-11-29  6:00 ` Sasha Levin
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 14/35] Revert "usb: gadget: ffs: Fix BUG when userland exits with submitted AIO transfers" Sasha Levin
                   ` (21 subsequent siblings)
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:00 UTC (permalink / raw)
  To: stable, linux-kernel
  Cc: Denis Bolotin, Michal Kalderon, David S . Miller, Sasha Levin, netdev

From: Denis Bolotin <denis.bolotin@cavium.com>

[ Upstream commit ed4eac20dcffdad47709422e0cb925981b056668 ]

The value of "sb_index" is written by the hardware. Reading its value and
writing it to "index" must finish before checking the loop condition.

Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com>
Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/qlogic/qed/qed_int.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_int.c b/drivers/net/ethernet/qlogic/qed/qed_int.c
index 719cdbfe1695..7746417130bd 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_int.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_int.c
@@ -992,6 +992,8 @@ static int qed_int_attentions(struct qed_hwfn *p_hwfn)
 	 */
 	do {
 		index = p_sb_attn->sb_index;
+		/* finish reading index before the loop condition */
+		dma_rmb();
 		attn_bits = le32_to_cpu(p_sb_attn->atten_bits);
 		attn_acks = le32_to_cpu(p_sb_attn->atten_ack);
 	} while (index != p_sb_attn->sb_index);
-- 
2.17.1



* [PATCH AUTOSEL 4.14 14/35] Revert "usb: gadget: ffs: Fix BUG when userland exits with submitted AIO transfers"
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
                   ` (11 preceding siblings ...)
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 13/35] qed: Fix reading wrong value in loop condition Sasha Levin
@ 2018-11-29  6:00 ` Sasha Levin
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 15/35] net/mlx4_core: Zero out lkey field in SW2HW_MPT fw command Sasha Levin
                   ` (20 subsequent siblings)
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:00 UTC (permalink / raw)
  To: stable, linux-kernel
  Cc: Shen Jing, Saranya Gopal, Felipe Balbi, Sasha Levin, linux-usb

From: Shen Jing <jingx.shen@intel.com>

[ Upstream commit a9c859033f6ec772f8e3228c343bb1321584ae0e ]

This reverts commit b4194da3f9087dd38d91b40f9bec42d59ce589a8
since it causes list corruption followed by kernel panic:

Workqueue: adb ffs_aio_cancel_worker
RIP: 0010:__list_add_valid+0x4d/0x70
Call Trace:
insert_work+0x47/0xb0
__queue_work+0xf6/0x400
queue_work_on+0x65/0x70
dwc3_gadget_giveback+0x44/0x50 [dwc3]
dwc3_gadget_ep_dequeue+0x83/0x2d0 [dwc3]
? finish_wait+0x80/0x80
usb_ep_dequeue+0x1e/0x90
process_one_work+0x18c/0x3b0
worker_thread+0x3c/0x390
? process_one_work+0x3b0/0x3b0
kthread+0x11e/0x140
? kthread_create_worker_on_cpu+0x70/0x70
ret_from_fork+0x3a/0x50

This issue is seen with warm reboot stability testing.

Signed-off-by: Shen Jing <jingx.shen@intel.com>
Signed-off-by: Saranya Gopal <saranya.gopal@intel.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/usb/gadget/function/f_fs.c | 26 ++++++++------------------
 1 file changed, 8 insertions(+), 18 deletions(-)

diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
index 17467545391b..52e6897fa35a 100644
--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -219,7 +219,6 @@ struct ffs_io_data {
 
 	struct mm_struct *mm;
 	struct work_struct work;
-	struct work_struct cancellation_work;
 
 	struct usb_ep *ep;
 	struct usb_request *req;
@@ -1074,31 +1073,22 @@ ffs_epfile_open(struct inode *inode, struct file *file)
 	return 0;
 }
 
-static void ffs_aio_cancel_worker(struct work_struct *work)
-{
-	struct ffs_io_data *io_data = container_of(work, struct ffs_io_data,
-						   cancellation_work);
-
-	ENTER();
-
-	usb_ep_dequeue(io_data->ep, io_data->req);
-}
-
 static int ffs_aio_cancel(struct kiocb *kiocb)
 {
 	struct ffs_io_data *io_data = kiocb->private;
-	struct ffs_data *ffs = io_data->ffs;
+	struct ffs_epfile *epfile = kiocb->ki_filp->private_data;
 	int value;
 
 	ENTER();
 
-	if (likely(io_data && io_data->ep && io_data->req)) {
-		INIT_WORK(&io_data->cancellation_work, ffs_aio_cancel_worker);
-		queue_work(ffs->io_completion_wq, &io_data->cancellation_work);
-		value = -EINPROGRESS;
-	} else {
+	spin_lock_irq(&epfile->ffs->eps_lock);
+
+	if (likely(io_data && io_data->ep && io_data->req))
+		value = usb_ep_dequeue(io_data->ep, io_data->req);
+	else
 		value = -EINVAL;
-	}
+
+	spin_unlock_irq(&epfile->ffs->eps_lock);
 
 	return value;
 }
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH AUTOSEL 4.14 15/35] net/mlx4_core: Zero out lkey field in SW2HW_MPT fw command
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
                   ` (12 preceding siblings ...)
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 14/35] Revert "usb: gadget: ffs: Fix BUG when userland exits with submitted AIO transfers" Sasha Levin
@ 2018-11-29  6:00 ` Sasha Levin
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 16/35] net/mlx4_core: Fix uninitialized variable compilation warning Sasha Levin
                   ` (19 subsequent siblings)
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:00 UTC (permalink / raw)
  To: stable, linux-kernel
  Cc: Jack Morgenstein, Tariq Toukan, David S . Miller, Sasha Levin,
	netdev, linux-rdma

From: Jack Morgenstein <jackm@dev.mellanox.co.il>

[ Upstream commit bd85fbc2038a1bbe84990b23ff69b6fc81a32b2c ]

When re-registering a user mr, the mpt information for the
existing mr when running SRIOV is obtained via the QUERY_MPT
fw command. The returned information includes the mpt's lkey.

This retrieved mpt information is used to move the mpt back
to hardware ownership in the rereg flow (via the SW2HW_MPT
fw command when running SRIOV).

The fw API spec states that for SW2HW_MPT, the lkey field
must be zero. Any ConnectX-3 PF driver which checks for strict spec
adherence will return failure for SW2HW_MPT if the lkey field is not
zero (although the fw in practice ignores this field for SW2HW_MPT).

Thus, in order to conform to the fw API spec, set the lkey field to zero
before invoking SW2HW_MPT when running SRIOV.

Fixes: e630664c8383 ("mlx4_core: Add helper functions to support MR re-registration")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/mellanox/mlx4/mr.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/mellanox/mlx4/mr.c b/drivers/net/ethernet/mellanox/mlx4/mr.c
index c7c0764991c9..20043f82c1d8 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mr.c
+++ b/drivers/net/ethernet/mellanox/mlx4/mr.c
@@ -363,6 +363,7 @@ int mlx4_mr_hw_write_mpt(struct mlx4_dev *dev, struct mlx4_mr *mmr,
 			container_of((void *)mpt_entry, struct mlx4_cmd_mailbox,
 				     buf);
 
+		(*mpt_entry)->lkey = 0;
 		err = mlx4_SW2HW_MPT(dev, mailbox, key);
 	}
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH AUTOSEL 4.14 16/35] net/mlx4_core: Fix uninitialized variable compilation warning
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
                   ` (13 preceding siblings ...)
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 15/35] net/mlx4_core: Zero out lkey field in SW2HW_MPT fw command Sasha Levin
@ 2018-11-29  6:00 ` Sasha Levin
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 17/35] net/mlx4: Fix UBSAN warning of signed integer overflow Sasha Levin
                   ` (18 subsequent siblings)
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:00 UTC (permalink / raw)
  To: stable, linux-kernel
  Cc: Tariq Toukan, David S . Miller, Sasha Levin, netdev, linux-rdma

From: Tariq Toukan <tariqt@mellanox.com>

[ Upstream commit 3ea7e7ea53c9f6ee41cb69a29c375fe9dd9a56a7 ]

Initialize the uid variable to zero to avoid the compilation warning.

Fixes: 7a89399ffad7 ("net/mlx4: Add mlx4_bitmap zone allocator")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/mellanox/mlx4/alloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/alloc.c b/drivers/net/ethernet/mellanox/mlx4/alloc.c
index 6dabd983e7e0..94f4dc4a77e9 100644
--- a/drivers/net/ethernet/mellanox/mlx4/alloc.c
+++ b/drivers/net/ethernet/mellanox/mlx4/alloc.c
@@ -337,7 +337,7 @@ void mlx4_zone_allocator_destroy(struct mlx4_zone_allocator *zone_alloc)
 static u32 __mlx4_alloc_from_zone(struct mlx4_zone_entry *zone, int count,
 				  int align, u32 skip_mask, u32 *puid)
 {
-	u32 uid;
+	u32 uid = 0;
 	u32 res;
 	struct mlx4_zone_allocator *zone_alloc = zone->allocator;
 	struct mlx4_zone_entry *curr_node;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH AUTOSEL 4.14 17/35] net/mlx4: Fix UBSAN warning of signed integer overflow
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
                   ` (14 preceding siblings ...)
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 16/35] net/mlx4_core: Fix uninitialized variable compilation warning Sasha Levin
@ 2018-11-29  6:00 ` Sasha Levin
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 18/35] gpio: mockup: fix indicated direction Sasha Levin
                   ` (17 subsequent siblings)
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:00 UTC (permalink / raw)
  To: stable, linux-kernel
  Cc: Aya Levin, Tariq Toukan, David S . Miller, Sasha Levin, netdev,
	linux-rdma

From: Aya Levin <ayal@mellanox.com>

[ Upstream commit a463146e67c848cbab5ce706d6528281b7cded08 ]

UBSAN: Undefined behavior in
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:626:29
signed integer overflow: 1802201963 + 1802201963 cannot be represented
in type 'int'

The union of res_reserved and res_port_rsvd[MLX4_MAX_PORTS] monitors
granting of reserved resources. The grant operation is calculated and
protected, thus both members of the union cannot be negative.  Change the
type of res_reserved and of res_port_rsvd[MLX4_MAX_PORTS] from signed
int to unsigned int, allowing large values.
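
The arithmetic in the UBSAN report can be reproduced in a small userspace
sketch. The helper names below are hypothetical and are not part of mlx4, and
the sketch assumes a 32-bit int: the sum of the two reported values does not
fit in int, but it does fit in unsigned int, where addition is well defined.

```c
#include <limits.h>

/* Return 1 if a + b would not fit in int, without performing the addition. */
int int_add_would_overflow(int a, int b)
{
	if (b > 0)
		return a > INT_MAX - b;
	return a < INT_MIN - b;
}

/* Unsigned addition is defined for all inputs (it wraps modulo 2^N). */
unsigned int add_reserved(unsigned int a, unsigned int b)
{
	return a + b;
}
```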

Fixes: 5a0d0a6161ae ("mlx4: Structures and init/teardown for VF resource quotas")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/mellanox/mlx4/mlx4.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4.h b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
index c68da1986e51..aaeb446bba62 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
@@ -541,8 +541,8 @@ struct slave_list {
 struct resource_allocator {
 	spinlock_t alloc_lock; /* protect quotas */
 	union {
-		int res_reserved;
-		int res_port_rsvd[MLX4_MAX_PORTS];
+		unsigned int res_reserved;
+		unsigned int res_port_rsvd[MLX4_MAX_PORTS];
 	};
 	union {
 		int res_free;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH AUTOSEL 4.14 18/35] gpio: mockup: fix indicated direction
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
                   ` (15 preceding siblings ...)
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 17/35] net/mlx4: Fix UBSAN warning of signed integer overflow Sasha Levin
@ 2018-11-29  6:00 ` Sasha Levin
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 19/35] mtd: rawnand: qcom: Namespace prefix some commands Sasha Levin
                   ` (16 subsequent siblings)
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:00 UTC (permalink / raw)
  To: stable, linux-kernel
  Cc: Bartosz Golaszewski, Linus Walleij, Sasha Levin, linux-gpio

From: Bartosz Golaszewski <brgl@bgdev.pl>

[ Upstream commit bff466bac59994cfcceabe4d0be5fdc1c20cd5b8 ]

Commit 3edfb7bd76bd ("gpiolib: Show correct direction from the
beginning") fixed an existing issue but broke libgpiod tests by
changing the default direction of dummy lines to output.

We don't break user-space so make gpio-mockup behave as before.

Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/gpio/gpio-mockup.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpio/gpio-mockup.c b/drivers/gpio/gpio-mockup.c
index 9532d86a82f7..d99c8d8da9a0 100644
--- a/drivers/gpio/gpio-mockup.c
+++ b/drivers/gpio/gpio-mockup.c
@@ -35,8 +35,8 @@
 #define GPIO_MOCKUP_MAX_RANGES	(GPIO_MOCKUP_MAX_GC * 2)
 
 enum {
-	GPIO_MOCKUP_DIR_OUT = 0,
-	GPIO_MOCKUP_DIR_IN = 1,
+	GPIO_MOCKUP_DIR_IN = 0,
+	GPIO_MOCKUP_DIR_OUT = 1,
 };
 
 /*
@@ -112,7 +112,7 @@ static int gpio_mockup_get_direction(struct gpio_chip *gc, unsigned int offset)
 {
 	struct gpio_mockup_chip *chip = gpiochip_get_data(gc);
 
-	return chip->lines[offset].dir;
+	return !chip->lines[offset].dir;
 }
 
 static int gpio_mockup_name_lines(struct device *dev,
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH AUTOSEL 4.14 19/35] mtd: rawnand: qcom: Namespace prefix some commands
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
                   ` (16 preceding siblings ...)
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 18/35] gpio: mockup: fix indicated direction Sasha Levin
@ 2018-11-29  6:00 ` Sasha Levin
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 20/35] exec: make de_thread() freezable Sasha Levin
                   ` (15 subsequent siblings)
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:00 UTC (permalink / raw)
  To: stable, linux-kernel
  Cc: Olof Johansson, Boris Brezillon, Sasha Levin, linux-mtd, linux-riscv

From: Olof Johansson <olof@lixom.net>

[ Upstream commit 33bf5519ae5dd356b182a94e3622f42860274a38 ]

PAGE_READ is used by RISC-V arch code included through mm headers,
and it makes sense to bring in a prefix on these in the driver.

drivers/mtd/nand/raw/qcom_nandc.c:153: warning: "PAGE_READ" redefined
 #define PAGE_READ   0x2
In file included from include/linux/memremap.h:7,
                 from include/linux/mm.h:27,
                 from include/linux/scatterlist.h:8,
                 from include/linux/dma-mapping.h:11,
                 from drivers/mtd/nand/raw/qcom_nandc.c:17:
arch/riscv/include/asm/pgtable.h:48: note: this is the location of the previous definition

Caught by riscv allmodconfig.

Signed-off-by: Olof Johansson <olof@lixom.net>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/mtd/nand/qcom_nandc.c | 32 ++++++++++++++++----------------
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/drivers/mtd/nand/qcom_nandc.c b/drivers/mtd/nand/qcom_nandc.c
index b49ca02b399d..09d5f7df6023 100644
--- a/drivers/mtd/nand/qcom_nandc.c
+++ b/drivers/mtd/nand/qcom_nandc.c
@@ -149,15 +149,15 @@
 #define	NAND_VERSION_MINOR_SHIFT	16
 
 /* NAND OP_CMDs */
-#define	PAGE_READ			0x2
-#define	PAGE_READ_WITH_ECC		0x3
-#define	PAGE_READ_WITH_ECC_SPARE	0x4
-#define	PROGRAM_PAGE			0x6
-#define	PAGE_PROGRAM_WITH_ECC		0x7
-#define	PROGRAM_PAGE_SPARE		0x9
-#define	BLOCK_ERASE			0xa
-#define	FETCH_ID			0xb
-#define	RESET_DEVICE			0xd
+#define	OP_PAGE_READ			0x2
+#define	OP_PAGE_READ_WITH_ECC		0x3
+#define	OP_PAGE_READ_WITH_ECC_SPARE	0x4
+#define	OP_PROGRAM_PAGE			0x6
+#define	OP_PAGE_PROGRAM_WITH_ECC	0x7
+#define	OP_PROGRAM_PAGE_SPARE		0x9
+#define	OP_BLOCK_ERASE			0xa
+#define	OP_FETCH_ID			0xb
+#define	OP_RESET_DEVICE			0xd
 
 /* Default Value for NAND_DEV_CMD_VLD */
 #define NAND_DEV_CMD_VLD_VAL		(READ_START_VLD | WRITE_START_VLD | \
@@ -629,11 +629,11 @@ static void update_rw_regs(struct qcom_nand_host *host, int num_cw, bool read)
 
 	if (read) {
 		if (host->use_ecc)
-			cmd = PAGE_READ_WITH_ECC | PAGE_ACC | LAST_PAGE;
+			cmd = OP_PAGE_READ_WITH_ECC | PAGE_ACC | LAST_PAGE;
 		else
-			cmd = PAGE_READ | PAGE_ACC | LAST_PAGE;
+			cmd = OP_PAGE_READ | PAGE_ACC | LAST_PAGE;
 	} else {
-			cmd = PROGRAM_PAGE | PAGE_ACC | LAST_PAGE;
+		cmd = OP_PROGRAM_PAGE | PAGE_ACC | LAST_PAGE;
 	}
 
 	if (host->use_ecc) {
@@ -1030,7 +1030,7 @@ static int nandc_param(struct qcom_nand_host *host)
 	 * in use. we configure the controller to perform a raw read of 512
 	 * bytes to read onfi params
 	 */
-	nandc_set_reg(nandc, NAND_FLASH_CMD, PAGE_READ | PAGE_ACC | LAST_PAGE);
+	nandc_set_reg(nandc, NAND_FLASH_CMD, OP_PAGE_READ | PAGE_ACC | LAST_PAGE);
 	nandc_set_reg(nandc, NAND_ADDR0, 0);
 	nandc_set_reg(nandc, NAND_ADDR1, 0);
 	nandc_set_reg(nandc, NAND_DEV0_CFG0, 0 << CW_PER_PAGE
@@ -1084,7 +1084,7 @@ static int erase_block(struct qcom_nand_host *host, int page_addr)
 	struct qcom_nand_controller *nandc = get_qcom_nand_controller(chip);
 
 	nandc_set_reg(nandc, NAND_FLASH_CMD,
-		      BLOCK_ERASE | PAGE_ACC | LAST_PAGE);
+		      OP_BLOCK_ERASE | PAGE_ACC | LAST_PAGE);
 	nandc_set_reg(nandc, NAND_ADDR0, page_addr);
 	nandc_set_reg(nandc, NAND_ADDR1, 0);
 	nandc_set_reg(nandc, NAND_DEV0_CFG0,
@@ -1115,7 +1115,7 @@ static int read_id(struct qcom_nand_host *host, int column)
 	if (column == -1)
 		return 0;
 
-	nandc_set_reg(nandc, NAND_FLASH_CMD, FETCH_ID);
+	nandc_set_reg(nandc, NAND_FLASH_CMD, OP_FETCH_ID);
 	nandc_set_reg(nandc, NAND_ADDR0, column);
 	nandc_set_reg(nandc, NAND_ADDR1, 0);
 	nandc_set_reg(nandc, NAND_FLASH_CHIP_SELECT,
@@ -1136,7 +1136,7 @@ static int reset(struct qcom_nand_host *host)
 	struct nand_chip *chip = &host->chip;
 	struct qcom_nand_controller *nandc = get_qcom_nand_controller(chip);
 
-	nandc_set_reg(nandc, NAND_FLASH_CMD, RESET_DEVICE);
+	nandc_set_reg(nandc, NAND_FLASH_CMD, OP_RESET_DEVICE);
 	nandc_set_reg(nandc, NAND_EXEC_CMD, 1);
 
 	write_reg_dma(nandc, NAND_FLASH_CMD, 1, NAND_BAM_NEXT_SGL);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH AUTOSEL 4.14 20/35] exec: make de_thread() freezable
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
                   ` (17 preceding siblings ...)
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 19/35] mtd: rawnand: qcom: Namespace prefix some commands Sasha Levin
@ 2018-11-29  6:00 ` Sasha Levin
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 21/35] HID: multitouch: Add pointstick support for Cirque Touchpad Sasha Levin
                   ` (14 subsequent siblings)
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:00 UTC (permalink / raw)
  To: stable, linux-kernel
  Cc: Chanho Min, Rafael J . Wysocki, Sasha Levin, linux-fsdevel

From: Chanho Min <chanho.min@lge.com>

[ Upstream commit c22397888f1eed98cd59f0a88f2a5f6925f80e15 ]

Suspend fails due to the exec family of functions blocking the freezer.
The cause is that de_thread() sleeps in TASK_UNINTERRUPTIBLE waiting for
all sub-threads to die, and we have a deadlock if one of them is frozen.
This can also occur with the schedule() waiting for the group thread leader
to exit if it is frozen.

On our machine, it causes a freeze timeout as shown below.

Freezing of tasks failed after 20.010 seconds (1 tasks refusing to freeze, wq_busy=0):
setcpushares-ls D ffffffc00008ed70     0  5817   1483 0x0040000d
 Call trace:
[<ffffffc00008ed70>] __switch_to+0x88/0xa0
[<ffffffc000d1c30c>] __schedule+0x1bc/0x720
[<ffffffc000d1ca90>] schedule+0x40/0xa8
[<ffffffc0001cd784>] flush_old_exec+0xdc/0x640
[<ffffffc000220360>] load_elf_binary+0x2a8/0x1090
[<ffffffc0001ccff4>] search_binary_handler+0x9c/0x240
[<ffffffc00021c584>] load_script+0x20c/0x228
[<ffffffc0001ccff4>] search_binary_handler+0x9c/0x240
[<ffffffc0001ce8e0>] do_execveat_common.isra.14+0x4f8/0x6e8
[<ffffffc0001cedd0>] compat_SyS_execve+0x38/0x48
[<ffffffc00008de30>] el0_svc_naked+0x24/0x28

To fix this, make de_thread() freezable. It looks safe and works fine.

Suggested-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Chanho Min <chanho.min@lge.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/exec.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 0da4d748b4e6..25c529f46aaa 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -62,6 +62,7 @@
 #include <linux/oom.h>
 #include <linux/compat.h>
 #include <linux/vmalloc.h>
+#include <linux/freezer.h>
 
 #include <linux/uaccess.h>
 #include <asm/mmu_context.h>
@@ -1079,7 +1080,7 @@ static int de_thread(struct task_struct *tsk)
 	while (sig->notify_count) {
 		__set_current_state(TASK_KILLABLE);
 		spin_unlock_irq(lock);
-		schedule();
+		freezable_schedule();
 		if (unlikely(__fatal_signal_pending(tsk)))
 			goto killed;
 		spin_lock_irq(lock);
@@ -1107,7 +1108,7 @@ static int de_thread(struct task_struct *tsk)
 			__set_current_state(TASK_KILLABLE);
 			write_unlock_irq(&tasklist_lock);
 			cgroup_threadgroup_change_end(tsk);
-			schedule();
+			freezable_schedule();
 			if (unlikely(__fatal_signal_pending(tsk)))
 				goto killed;
 		}
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH AUTOSEL 4.14 21/35] HID: multitouch: Add pointstick support for Cirque Touchpad
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
                   ` (18 preceding siblings ...)
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 20/35] exec: make de_thread() freezable Sasha Levin
@ 2018-11-29  6:00 ` Sasha Levin
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 22/35] mtd: spi-nor: Fix Cadence QSPI page fault kernel panic Sasha Levin
                   ` (13 subsequent siblings)
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:00 UTC (permalink / raw)
  To: stable, linux-kernel; +Cc: Kai-Heng Feng, Jiri Kosina, Sasha Levin, linux-input

From: Kai-Heng Feng <kai.heng.feng@canonical.com>

[ Upstream commit 12d43aacf9a74d0eb66fd0ea54ebeb79ca28940f ]

Cirque Touchpad/Pointstick combo is similar to Alps devices; it requires
MT_CLS_WIN_8_DUAL to expose its pointstick as a mouse.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/hid/hid-ids.h        | 3 +++
 drivers/hid/hid-multitouch.c | 6 ++++++
 2 files changed, 9 insertions(+)

diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h
index 87904d2adadb..fcc688df694c 100644
--- a/drivers/hid/hid-ids.h
+++ b/drivers/hid/hid-ids.h
@@ -266,6 +266,9 @@
 
 #define USB_VENDOR_ID_CIDC		0x1677
 
+#define I2C_VENDOR_ID_CIRQUE		0x0488
+#define I2C_PRODUCT_ID_CIRQUE_121F	0x121F
+
 #define USB_VENDOR_ID_CJTOUCH		0x24b8
 #define USB_DEVICE_ID_CJTOUCH_MULTI_TOUCH_0020	0x0020
 #define USB_DEVICE_ID_CJTOUCH_MULTI_TOUCH_0040	0x0040
diff --git a/drivers/hid/hid-multitouch.c b/drivers/hid/hid-multitouch.c
index c3b9bd5dba75..07d92d4a9f7c 100644
--- a/drivers/hid/hid-multitouch.c
+++ b/drivers/hid/hid-multitouch.c
@@ -1474,6 +1474,12 @@ static const struct hid_device_id mt_devices[] = {
 		MT_USB_DEVICE(USB_VENDOR_ID_CHUNGHWAT,
 			USB_DEVICE_ID_CHUNGHWAT_MULTITOUCH) },
 
+	/* Cirque devices */
+	{ .driver_data = MT_CLS_WIN_8_DUAL,
+		HID_DEVICE(BUS_I2C, HID_GROUP_MULTITOUCH_WIN_8,
+			I2C_VENDOR_ID_CIRQUE,
+			I2C_PRODUCT_ID_CIRQUE_121F) },
+
 	/* CJTouch panels */
 	{ .driver_data = MT_CLS_NSMU,
 		MT_USB_DEVICE(USB_VENDOR_ID_CJTOUCH,
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH AUTOSEL 4.14 22/35] mtd: spi-nor: Fix Cadence QSPI page fault kernel panic
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
                   ` (19 preceding siblings ...)
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 21/35] HID: multitouch: Add pointstick support for Cirque Touchpad Sasha Levin
@ 2018-11-29  6:00 ` Sasha Levin
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 23/35] qed: Fix bitmap_weight() check Sasha Levin
                   ` (12 subsequent siblings)
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:00 UTC (permalink / raw)
  To: stable, linux-kernel; +Cc: Thor Thayer, Boris Brezillon, Sasha Levin, linux-mtd

From: Thor Thayer <thor.thayer@linux.intel.com>

[ Upstream commit a6a66f80c85e8e20573ca03fabf32445954a88d5 ]

The current Cadence QSPI driver caused a kernel panic sporadically
when writing to QSPI. The problem was caused by writing more bytes
than needed because the QSPI operated on 4 bytes at a time.
<snip>
[   11.202044] Unable to handle kernel paging request at virtual address bffd3000
[   11.209254] pgd = e463054d
[   11.211948] [bffd3000] *pgd=2fffb811, *pte=00000000, *ppte=00000000
[   11.218202] Internal error: Oops: 7 [#1] SMP ARM
[   11.222797] Modules linked in:
[   11.225844] CPU: 1 PID: 1317 Comm: systemd-hwdb Not tainted 4.17.7-d0c45cd44a8f
[   11.235796] Hardware name: Altera SOCFPGA Arria10
[   11.240487] PC is at __raw_writesl+0x70/0xd4
[   11.244741] LR is at cqspi_write+0x1a0/0x2cc
</snip>
On a page boundary, limit the number of bytes copied from the tx buffer
to remain within the page.

This patch uses a temporary buffer to hold the 4 bytes to write and then
copies only the bytes required from the tx buffer.
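
The idea can be sketched in userspace as follows. The fake_fifo type and the
fifo_write32() and write_chunk() helpers are hypothetical stand-ins for the
32-bit device window and are not the driver's API; the point is only that the
trailing 1 to 3 bytes are copied into a padded temporary word, so nothing
past the end of the source buffer is read.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 32-bit wide device FIFO. */
struct fake_fifo {
	uint32_t words[64];
	size_t count;
};

void fifo_write32(struct fake_fifo *f, uint32_t w)
{
	if (f->count < 64)
		f->words[f->count++] = w;
}

/* Write len bytes as whole words, then one 0xFF-padded word for the tail. */
size_t write_chunk(struct fake_fifo *f, const uint8_t *txbuf, size_t len)
{
	size_t write_words = len / 4;
	size_t mod_bytes = len % 4;
	size_t i;

	for (i = 0; i < write_words; i++) {
		uint32_t w;

		memcpy(&w, txbuf, 4);
		fifo_write32(f, w);
		txbuf += 4;
	}
	if (mod_bytes) {
		uint32_t temp = 0xFFFFFFFFu;

		/* copy only the bytes that really exist in txbuf */
		memcpy(&temp, txbuf, mod_bytes);
		fifo_write32(f, temp);
	}
	/* number of bytes consumed from txbuf */
	return write_words * 4 + mod_bytes;
}
```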

Reported-by: Adrian Amborzewicz <adrian.ambrozewicz@intel.com>
Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/mtd/spi-nor/cadence-quadspi.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/mtd/spi-nor/cadence-quadspi.c b/drivers/mtd/spi-nor/cadence-quadspi.c
index 8d89204b90d2..f22dd34f4f83 100644
--- a/drivers/mtd/spi-nor/cadence-quadspi.c
+++ b/drivers/mtd/spi-nor/cadence-quadspi.c
@@ -625,9 +625,23 @@ static int cqspi_indirect_write_execute(struct spi_nor *nor,
 	       reg_base + CQSPI_REG_INDIRECTWR);
 
 	while (remaining > 0) {
+		size_t write_words, mod_bytes;
+
 		write_bytes = remaining > page_size ? page_size : remaining;
-		iowrite32_rep(cqspi->ahb_base, txbuf,
-			      DIV_ROUND_UP(write_bytes, 4));
+		write_words = write_bytes / 4;
+		mod_bytes = write_bytes % 4;
+		/* Write 4 bytes at a time then single bytes. */
+		if (write_words) {
+			iowrite32_rep(cqspi->ahb_base, txbuf, write_words);
+			txbuf += (write_words * 4);
+		}
+		if (mod_bytes) {
+			unsigned int temp = 0xFFFFFFFF;
+
+			memcpy(&temp, txbuf, mod_bytes);
+			iowrite32(temp, cqspi->ahb_base);
+			txbuf += mod_bytes;
+		}
 
 		ret = wait_for_completion_timeout(&cqspi->transfer_complete,
 						  msecs_to_jiffies
@@ -638,7 +652,6 @@ static int cqspi_indirect_write_execute(struct spi_nor *nor,
 			goto failwr;
 		}
 
-		txbuf += write_bytes;
 		remaining -= write_bytes;
 
 		if (remaining > 0)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH AUTOSEL 4.14 23/35] qed: Fix bitmap_weight() check
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
                   ` (20 preceding siblings ...)
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 22/35] mtd: spi-nor: Fix Cadence QSPI page fault kernel panic Sasha Levin
@ 2018-11-29  6:00 ` Sasha Levin
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 24/35] qed: Fix QM getters to always return a valid pq Sasha Levin
                   ` (11 subsequent siblings)
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:00 UTC (permalink / raw)
  To: stable, linux-kernel
  Cc: Denis Bolotin, Michal Kalderon, David S . Miller, Sasha Levin, netdev

From: Denis Bolotin <denis.bolotin@cavium.com>

[ Upstream commit 276d43f0ae963312c0cd0e2b9a85fd11ac65dfcc ]

Fix the condition which verifies that only one flag is set. The API
bitmap_weight() should receive size in bits instead of bytes.
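
A userspace sketch of why the unit matters. weight_low_bits() is a
hypothetical helper, not the kernel's bitmap_weight(); it counts set bits
among the lowest nbits bits only. Passing a byte count such as 4 inspects
just bits 0..3, so a second flag in a higher bit goes unnoticed.

```c
#include <limits.h>
#include <stddef.h>

/* Count the set bits among the lowest nbits bits of value. */
unsigned int weight_low_bits(unsigned long value, size_t nbits)
{
	size_t max = sizeof(value) * CHAR_BIT;
	unsigned int n = 0;
	size_t i;

	if (nbits > max)
		nbits = max;
	for (i = 0; i < nbits; i++)
		if (value & (1UL << i))
			n++;
	return n;
}
```

With flags in bits 2 and 9, a length of 4 (bytes mistaken for bits) sees one
flag, while a length of 32 sees both.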

Fixes: b5a9ee7cf3be ("qed: Revise QM cofiguration")
Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com>
Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/qlogic/qed/qed_dev.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_dev.c b/drivers/net/ethernet/qlogic/qed/qed_dev.c
index ef2374699726..a51cd1028ecb 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_dev.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_dev.c
@@ -440,8 +440,11 @@ static u16 *qed_init_qm_get_idx_from_flags(struct qed_hwfn *p_hwfn,
 	struct qed_qm_info *qm_info = &p_hwfn->qm_info;
 
 	/* Can't have multiple flags set here */
-	if (bitmap_weight((unsigned long *)&pq_flags, sizeof(pq_flags)) > 1)
+	if (bitmap_weight((unsigned long *)&pq_flags,
+			  sizeof(pq_flags) * BITS_PER_BYTE) > 1) {
+		DP_ERR(p_hwfn, "requested multiple pq flags 0x%x\n", pq_flags);
 		goto err;
+	}
 
 	switch (pq_flags) {
 	case PQ_FLAGS_RLS:
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH AUTOSEL 4.14 24/35] qed: Fix QM getters to always return a valid pq
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
                   ` (21 preceding siblings ...)
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 23/35] qed: Fix bitmap_weight() check Sasha Levin
@ 2018-11-29  6:00 ` Sasha Levin
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 25/35] iomap: sub-block dio needs to zeroout beyond EOF Sasha Levin
                   ` (10 subsequent siblings)
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:00 UTC (permalink / raw)
  To: stable, linux-kernel
  Cc: Denis Bolotin, Michal Kalderon, David S . Miller, Sasha Levin, netdev

From: Denis Bolotin <denis.bolotin@cavium.com>

[ Upstream commit eb62cca9bee842e5b23bd0ddfb1f271ca95e8759 ]

The getter callers don't know the valid Physical Queue (PQ) values.
This patch makes sure that a valid PQ will always be returned.

The patch consists of 3 fixes:

 - When qed_init_qm_get_idx_from_flags() receives a disabled flag, it
   returned PQ 0, which can potentially be another function's pq. Verify
   that flag is enabled, otherwise return default start_pq.

 - When qed_init_qm_get_idx_from_flags() receives an unknown flag, it
   returned NULL and could lead to a segmentation fault. Return default
   start_pq instead.

 - A modulo operation was added to MCOS/VFS PQ getters to make sure the
   PQ returned is in range of the required flag.
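
The range clamping from the last item can be sketched in isolation. The
pq_index() helper and its parameters are hypothetical and only mirror the
shape of the getters: fall back to the default start PQ when no queue of the
requested kind exists, otherwise wrap the requested offset into range.

```c
#include <stdint.h>

/*
 * first_pq:  first PQ of the requested kind
 * start_pq:  default PQ returned when no such PQs exist
 * count:     number of PQs of the requested kind
 * requested: caller-supplied offset (e.g. a tc or vf number)
 */
uint16_t pq_index(uint16_t first_pq, uint16_t start_pq,
		  uint16_t count, uint16_t requested)
{
	if (count == 0)
		return start_pq;

	/* the modulo keeps the offset within [0, count) */
	return (uint16_t)(first_pq + (requested % count));
}
```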

Fixes: b5a9ee7cf3be ("qed: Revise QM cofiguration")
Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com>
Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/qlogic/qed/qed_dev.c | 24 +++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_dev.c b/drivers/net/ethernet/qlogic/qed/qed_dev.c
index a51cd1028ecb..16953c4ebd71 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_dev.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_dev.c
@@ -446,6 +446,11 @@ static u16 *qed_init_qm_get_idx_from_flags(struct qed_hwfn *p_hwfn,
 		goto err;
 	}
 
+	if (!(qed_get_pq_flags(p_hwfn) & pq_flags)) {
+		DP_ERR(p_hwfn, "pq flag 0x%x is not set\n", pq_flags);
+		goto err;
+	}
+
 	switch (pq_flags) {
 	case PQ_FLAGS_RLS:
 		return &qm_info->first_rl_pq;
@@ -468,8 +473,7 @@ static u16 *qed_init_qm_get_idx_from_flags(struct qed_hwfn *p_hwfn,
 	}
 
 err:
-	DP_ERR(p_hwfn, "BAD pq flags %d\n", pq_flags);
-	return NULL;
+	return &qm_info->start_pq;
 }
 
 /* save pq index in qm info */
@@ -493,20 +497,32 @@ u16 qed_get_cm_pq_idx_mcos(struct qed_hwfn *p_hwfn, u8 tc)
 {
 	u8 max_tc = qed_init_qm_get_num_tcs(p_hwfn);
 
+	if (max_tc == 0) {
+		DP_ERR(p_hwfn, "pq with flag 0x%lx do not exist\n",
+		       PQ_FLAGS_MCOS);
+		return p_hwfn->qm_info.start_pq;
+	}
+
 	if (tc > max_tc)
 		DP_ERR(p_hwfn, "tc %d must be smaller than %d\n", tc, max_tc);
 
-	return qed_get_cm_pq_idx(p_hwfn, PQ_FLAGS_MCOS) + tc;
+	return qed_get_cm_pq_idx(p_hwfn, PQ_FLAGS_MCOS) + (tc % max_tc);
 }
 
 u16 qed_get_cm_pq_idx_vf(struct qed_hwfn *p_hwfn, u16 vf)
 {
 	u16 max_vf = qed_init_qm_get_num_vfs(p_hwfn);
 
+	if (max_vf == 0) {
+		DP_ERR(p_hwfn, "pq with flag 0x%lx do not exist\n",
+		       PQ_FLAGS_VFS);
+		return p_hwfn->qm_info.start_pq;
+	}
+
 	if (vf > max_vf)
 		DP_ERR(p_hwfn, "vf %d must be smaller than %d\n", vf, max_vf);
 
-	return qed_get_cm_pq_idx(p_hwfn, PQ_FLAGS_VFS) + vf;
+	return qed_get_cm_pq_idx(p_hwfn, PQ_FLAGS_VFS) + (vf % max_vf);
 }
 
 u16 qed_get_cm_pq_idx_rl(struct qed_hwfn *p_hwfn, u8 rl)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH AUTOSEL 4.14 25/35] iomap: sub-block dio needs to zeroout beyond EOF
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
                   ` (22 preceding siblings ...)
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 24/35] qed: Fix QM getters to always return a valid pq Sasha Levin
@ 2018-11-29  6:00 ` Sasha Levin
  2018-11-29 12:14   ` Dave Chinner
  2018-11-29  6:01 ` [PATCH AUTOSEL 4.14 26/35] net: faraday: ftmac100: remove netif_running(netdev) check before disabling interrupts Sasha Levin
                   ` (9 subsequent siblings)
  33 siblings, 1 reply; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:00 UTC (permalink / raw)
  To: stable, linux-kernel
  Cc: Dave Chinner, Darrick J . Wong, Sasha Levin, linux-fsdevel

From: Dave Chinner <dchinner@redhat.com>

[ Upstream commit b450672fb66b4a991a5b55ee24209ac7ae7690ce ]

If we are doing sub-block dio that extends EOF, we need to zero
the unused tail of the block to initialise the data in it. If we
do not zero the tail of the block, then an immediate mmap read of
the EOF block will expose stale data beyond EOF to userspace. Found
with fsx running sub-block DIO sizes vs MAPREAD/MAPWRITE operations.

Fix this by detecting if the end of the DIO write is beyond EOF
and zeroing the tail if necessary.
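
The tail computation can be illustrated in user space. This is only a
minimal sketch, not the kernel code: tail_to_zero() is a hypothetical
helper, and it assumes the block size is a power of two, as the mask in
the patch does.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of bytes between the end of a write
 * (end_pos) and the end of the filesystem block that contains it.
 * Assumes block_size is a power of two.
 */
static uint64_t tail_to_zero(uint64_t end_pos, uint64_t block_size)
{
	uint64_t pad = end_pos & (block_size - 1);

	/* A block-aligned end leaves no partial tail to zero. */
	return pad ? block_size - pad : 0;
}
```

For a 512-byte write at offset 0 into a 4096-byte block, the sketch
reports 3584 bytes of tail that would need zeroing.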

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/iomap.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/fs/iomap.c b/fs/iomap.c
index 8f7673a69273..407efdae3978 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -940,7 +940,14 @@ iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length,
 		dio->submit.cookie = submit_bio(bio);
 	} while (nr_pages);
 
-	if (need_zeroout) {
+	/*
+	 * We need to zeroout the tail of a sub-block write if the extent type
+	 * requires zeroing or the write extends beyond EOF. If we don't zero
+	 * the block tail in the latter case, we can expose stale data via mmap
+	 * reads of the EOF block.
+	 */
+	if (need_zeroout ||
+	    ((dio->flags & IOMAP_DIO_WRITE) && pos >= i_size_read(inode))) {
 		/* zero out from the end of the write to the end of the block */
 		pad = pos & (fs_block_size - 1);
 		if (pad)
-- 
2.17.1



* [PATCH AUTOSEL 4.14 26/35] net: faraday: ftmac100: remove netif_running(netdev) check before disabling interrupts
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
                   ` (23 preceding siblings ...)
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 25/35] iomap: sub-block dio needs to zeroout beyond EOF Sasha Levin
@ 2018-11-29  6:01 ` Sasha Levin
  2018-11-29  6:01 ` [PATCH AUTOSEL 4.14 27/35] iommu/vt-d: Use memunmap to free memremap Sasha Levin
                   ` (8 subsequent siblings)
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:01 UTC (permalink / raw)
  To: stable, linux-kernel; +Cc: Vincent Chen, David S . Miller, Sasha Levin, netdev

From: Vincent Chen <vincentc@andestech.com>

[ Upstream commit 426a593e641ebf0d9288f0a2fcab644a86820220 ]

In the original ftmac100_interrupt(), the interrupts are only disabled when
the condition "netif_running(netdev)" is true. However, this condition
causes a kernel hang in the following case. When the user requests to
disable the network device, the kernel clears the __LINK_STATE_START bit
in dev->state and then calls the driver's ndo_stop function. Network
device interrupts are not blocked during this process. If an interrupt
occurs between clearing __LINK_STATE_START and stopping the network device,
the kernel cannot disable the interrupts because of the
"netif_running(netdev)" condition in the ISR. Hence, the kernel hangs due
to the continuous interrupts from the network device.

In order to solve the above problem, the interrupts of the network device
should always be disabled in the ISR without being restricted by the
condition "netif_running(netdev)".
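
The difference between the two handlers can be modelled in user space.
This is a toy sketch only: struct toy_nic and both toy_isr_*() functions
are made-up names, and plain flags stand in for netif_running(), the
interrupt mask and NAPI scheduling.

```c
#include <stdbool.h>

/* Toy model of the device state; none of these names are from the driver. */
struct toy_nic {
	bool running;        /* stands in for netif_running() */
	bool irq_enabled;    /* device interrupt sources unmasked */
	bool poll_scheduled; /* stands in for napi_schedule() */
};

/* Old behaviour: interrupts are masked only while the interface is running. */
static void toy_isr_old(struct toy_nic *nic)
{
	if (nic->running) {
		nic->irq_enabled = false;
		nic->poll_scheduled = true;
	}
}

/* New behaviour: always mask, schedule polling only while running. */
static void toy_isr_new(struct toy_nic *nic)
{
	nic->irq_enabled = false;
	if (nic->running)
		nic->poll_scheduled = true;
}
```

With running == false, the old handler returns with the interrupt still
unmasked, so the same interrupt can fire again immediately; the new
handler masks it in every case.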

[V2]
Remove unnecessary curly braces.

Signed-off-by: Vincent Chen <vincentc@andestech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/faraday/ftmac100.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/faraday/ftmac100.c b/drivers/net/ethernet/faraday/ftmac100.c
index 66928a922824..415fd93e9930 100644
--- a/drivers/net/ethernet/faraday/ftmac100.c
+++ b/drivers/net/ethernet/faraday/ftmac100.c
@@ -870,11 +870,10 @@ static irqreturn_t ftmac100_interrupt(int irq, void *dev_id)
 	struct net_device *netdev = dev_id;
 	struct ftmac100 *priv = netdev_priv(netdev);
 
-	if (likely(netif_running(netdev))) {
-		/* Disable interrupts for polling */
-		ftmac100_disable_all_int(priv);
+	/* Disable interrupts for polling */
+	ftmac100_disable_all_int(priv);
+	if (likely(netif_running(netdev)))
 		napi_schedule(&priv->napi);
-	}
 
 	return IRQ_HANDLED;
 }
-- 
2.17.1



* [PATCH AUTOSEL 4.14 27/35] iommu/vt-d: Use memunmap to free memremap
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
                   ` (24 preceding siblings ...)
  2018-11-29  6:01 ` [PATCH AUTOSEL 4.14 26/35] net: faraday: ftmac100: remove netif_running(netdev) check before disabling interrupts Sasha Levin
@ 2018-11-29  6:01 ` Sasha Levin
  2018-11-29  6:01 ` [PATCH AUTOSEL 4.14 28/35] flexfiles: use per-mirror specified stateid for IO Sasha Levin
                   ` (7 subsequent siblings)
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:01 UTC (permalink / raw)
  To: stable, linux-kernel; +Cc: Pan Bian, Joerg Roedel, Sasha Levin, iommu

From: Pan Bian <bianpan2016@163.com>

[ Upstream commit 829383e183728dec7ed9150b949cd6de64127809 ]

memunmap() should be used to free the return of memremap(), not
iounmap().

Fixes: dfddb969edf0 ('iommu/vt-d: Switch from ioremap_cache to memremap')
Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/iommu/intel-iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index aaf3fed97477..e86c1c8ec7f6 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3086,7 +3086,7 @@ static int copy_context_table(struct intel_iommu *iommu,
 			}
 
 			if (old_ce)
-				iounmap(old_ce);
+				memunmap(old_ce);
 
 			ret = 0;
 			if (devfn < 0x80)
-- 
2.17.1



* [PATCH AUTOSEL 4.14 28/35] flexfiles: use per-mirror specified stateid for IO
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
                   ` (25 preceding siblings ...)
  2018-11-29  6:01 ` [PATCH AUTOSEL 4.14 27/35] iommu/vt-d: Use memunmap to free memremap Sasha Levin
@ 2018-11-29  6:01 ` Sasha Levin
  2018-11-29  6:01 ` [PATCH AUTOSEL 4.14 29/35] net: thunderx: set xdp_prog to NULL if bpf_prog_add fails Sasha Levin
                   ` (6 subsequent siblings)
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:01 UTC (permalink / raw)
  To: stable, linux-kernel
  Cc: Tigran Mkrtchyan, Rick Macklem, Trond Myklebust, Sasha Levin, linux-nfs

From: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>

[ Upstream commit bb21ce0ad227b69ec0f83279297ee44232105d96 ]

rfc8435 says:

  For tight coupling, ffds_stateid provides the stateid to be used by
  the client to access the file.

However, the current implementation replaces the per-mirror stateid with
the open or lock stateid.

Ensure that per-mirror stateid is used by ff_layout_write_prepare_v4 and
nfs4_ff_layout_prepare_ds.

Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
Signed-off-by: Rick Macklem <rmacklem@uoguelph.ca>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/nfs/flexfilelayout/flexfilelayout.c    | 21 +++++++++------------
 fs/nfs/flexfilelayout/flexfilelayout.h    |  4 ++++
 fs/nfs/flexfilelayout/flexfilelayoutdev.c | 19 +++++++++++++++++++
 3 files changed, 32 insertions(+), 12 deletions(-)

diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c b/fs/nfs/flexfilelayout/flexfilelayout.c
index b0fa83a60754..13612a848378 100644
--- a/fs/nfs/flexfilelayout/flexfilelayout.c
+++ b/fs/nfs/flexfilelayout/flexfilelayout.c
@@ -1365,12 +1365,7 @@ static void ff_layout_read_prepare_v4(struct rpc_task *task, void *data)
 				task))
 		return;
 
-	if (ff_layout_read_prepare_common(task, hdr))
-		return;
-
-	if (nfs4_set_rw_stateid(&hdr->args.stateid, hdr->args.context,
-			hdr->args.lock_context, FMODE_READ) == -EIO)
-		rpc_exit(task, -EIO); /* lost lock, terminate I/O */
+	ff_layout_read_prepare_common(task, hdr);
 }
 
 static void ff_layout_read_call_done(struct rpc_task *task, void *data)
@@ -1539,12 +1534,7 @@ static void ff_layout_write_prepare_v4(struct rpc_task *task, void *data)
 				task))
 		return;
 
-	if (ff_layout_write_prepare_common(task, hdr))
-		return;
-
-	if (nfs4_set_rw_stateid(&hdr->args.stateid, hdr->args.context,
-			hdr->args.lock_context, FMODE_WRITE) == -EIO)
-		rpc_exit(task, -EIO); /* lost lock, terminate I/O */
+	ff_layout_write_prepare_common(task, hdr);
 }
 
 static void ff_layout_write_call_done(struct rpc_task *task, void *data)
@@ -1734,6 +1724,10 @@ ff_layout_read_pagelist(struct nfs_pgio_header *hdr)
 	fh = nfs4_ff_layout_select_ds_fh(lseg, idx);
 	if (fh)
 		hdr->args.fh = fh;
+
+	if (!nfs4_ff_layout_select_ds_stateid(lseg, idx, &hdr->args.stateid))
+		goto out_failed;
+
 	/*
 	 * Note that if we ever decide to split across DSes,
 	 * then we may need to handle dense-like offsets.
@@ -1796,6 +1790,9 @@ ff_layout_write_pagelist(struct nfs_pgio_header *hdr, int sync)
 	if (fh)
 		hdr->args.fh = fh;
 
+	if (!nfs4_ff_layout_select_ds_stateid(lseg, idx, &hdr->args.stateid))
+		goto out_failed;
+
 	/*
 	 * Note that if we ever decide to split across DSes,
 	 * then we may need to handle dense-like offsets.
diff --git a/fs/nfs/flexfilelayout/flexfilelayout.h b/fs/nfs/flexfilelayout/flexfilelayout.h
index 679cb087ef3f..d6515f1584f3 100644
--- a/fs/nfs/flexfilelayout/flexfilelayout.h
+++ b/fs/nfs/flexfilelayout/flexfilelayout.h
@@ -214,6 +214,10 @@ unsigned int ff_layout_fetch_ds_ioerr(struct pnfs_layout_hdr *lo,
 		unsigned int maxnum);
 struct nfs_fh *
 nfs4_ff_layout_select_ds_fh(struct pnfs_layout_segment *lseg, u32 mirror_idx);
+int
+nfs4_ff_layout_select_ds_stateid(struct pnfs_layout_segment *lseg,
+				u32 mirror_idx,
+				nfs4_stateid *stateid);
 
 struct nfs4_pnfs_ds *
 nfs4_ff_layout_prepare_ds(struct pnfs_layout_segment *lseg, u32 ds_idx,
diff --git a/fs/nfs/flexfilelayout/flexfilelayoutdev.c b/fs/nfs/flexfilelayout/flexfilelayoutdev.c
index d62279d3fc5d..9f69e83810ca 100644
--- a/fs/nfs/flexfilelayout/flexfilelayoutdev.c
+++ b/fs/nfs/flexfilelayout/flexfilelayoutdev.c
@@ -369,6 +369,25 @@ nfs4_ff_layout_select_ds_fh(struct pnfs_layout_segment *lseg, u32 mirror_idx)
 	return fh;
 }
 
+int
+nfs4_ff_layout_select_ds_stateid(struct pnfs_layout_segment *lseg,
+				u32 mirror_idx,
+				nfs4_stateid *stateid)
+{
+	struct nfs4_ff_layout_mirror *mirror = FF_LAYOUT_COMP(lseg, mirror_idx);
+
+	if (!ff_layout_mirror_valid(lseg, mirror, false)) {
+		pr_err_ratelimited("NFS: %s: No data server for mirror offset index %d\n",
+			__func__, mirror_idx);
+		goto out;
+	}
+
+	nfs4_stateid_copy(stateid, &mirror->stateid);
+	return 1;
+out:
+	return 0;
+}
+
 /**
  * nfs4_ff_layout_prepare_ds - prepare a DS connection for an RPC call
  * @lseg: the layout segment we're operating on
-- 
2.17.1



* [PATCH AUTOSEL 4.14 29/35] net: thunderx: set xdp_prog to NULL if bpf_prog_add fails
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
                   ` (26 preceding siblings ...)
  2018-11-29  6:01 ` [PATCH AUTOSEL 4.14 28/35] flexfiles: use per-mirror specified stateid for IO Sasha Levin
@ 2018-11-29  6:01 ` Sasha Levin
  2018-11-29  6:01 ` [PATCH AUTOSEL 4.14 30/35] ibmvnic: Fix RX queue buffer cleanup Sasha Levin
                   ` (5 subsequent siblings)
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:01 UTC (permalink / raw)
  To: stable, linux-kernel
  Cc: Lorenzo Bianconi, David S . Miller, Sasha Levin, netdev

From: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>

[ Upstream commit 6d0f60b0f8588fd4380ea5df9601e12fddd55ce2 ]

Set the xdp_prog pointer to NULL if bpf_prog_add fails, since that routine
reports an error code instead of NULL on failure and the xdp_prog
pointer value is used in the driver to verify whether XDP is currently
enabled.
Moreover, report the error code to userspace if nicvf_xdp_setup fails.
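
Why a stale error pointer is a problem can be shown with a user-space
sketch. The toy_*() helpers below are simplified stand-ins for the
kernel's ERR_PTR()/PTR_ERR()/IS_ERR(), and toy_attach() is a hypothetical
function, not driver code.

```c
#include <stddef.h>
#include <stdint.h>
#include <errno.h>

/* Simplified user-space stand-ins for the kernel's ERR_PTR helpers. */
#define TOY_MAX_ERRNO 4095

static void *toy_err_ptr(long err) { return (void *)(intptr_t)err; }
static long toy_ptr_err(const void *p) { return (long)(intptr_t)p; }
static int toy_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
}

/*
 * Hypothetical attach step: 'result' is what the add routine returned.
 * On failure the stored pointer is reset to NULL so that later
 * "is XDP enabled?" checks of the form `if (*slot)` stay correct.
 */
static int toy_attach(void **slot, void *result)
{
	if (toy_is_err(result)) {
		*slot = NULL;
		return (int)toy_ptr_err(result);
	}
	*slot = result;
	return 0;
}
```

An encoded error pointer is non-NULL, so a plain NULL check would treat
it as an attached program unless the pointer is cleared as above.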

Fixes: 05c773f52b96 ("net: thunderx: Add basic XDP support")
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/cavium/thunder/nicvf_main.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index 2237ef8e4344..f13256af8031 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -1691,6 +1691,7 @@ static int nicvf_xdp_setup(struct nicvf *nic, struct bpf_prog *prog)
 	bool if_up = netif_running(nic->netdev);
 	struct bpf_prog *old_prog;
 	bool bpf_attached = false;
+	int ret = 0;
 
 	/* For now just support only the usual MTU sized frames */
 	if (prog && (dev->mtu > 1500)) {
@@ -1724,8 +1725,12 @@ static int nicvf_xdp_setup(struct nicvf *nic, struct bpf_prog *prog)
 	if (nic->xdp_prog) {
 		/* Attach BPF program */
 		nic->xdp_prog = bpf_prog_add(nic->xdp_prog, nic->rx_queues - 1);
-		if (!IS_ERR(nic->xdp_prog))
+		if (!IS_ERR(nic->xdp_prog)) {
 			bpf_attached = true;
+		} else {
+			ret = PTR_ERR(nic->xdp_prog);
+			nic->xdp_prog = NULL;
+		}
 	}
 
 	/* Calculate Tx queues needed for XDP and network stack */
@@ -1737,7 +1742,7 @@ static int nicvf_xdp_setup(struct nicvf *nic, struct bpf_prog *prog)
 		netif_trans_update(nic->netdev);
 	}
 
-	return 0;
+	return ret;
 }
 
 static int nicvf_xdp(struct net_device *netdev, struct netdev_xdp *xdp)
-- 
2.17.1



* [PATCH AUTOSEL 4.14 30/35] ibmvnic: Fix RX queue buffer cleanup
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
                   ` (27 preceding siblings ...)
  2018-11-29  6:01 ` [PATCH AUTOSEL 4.14 29/35] net: thunderx: set xdp_prog to NULL if bpf_prog_add fails Sasha Levin
@ 2018-11-29  6:01 ` Sasha Levin
  2018-11-29  6:01 ` [PATCH AUTOSEL 4.14 31/35] virtio-net: disable guest csum during XDP set Sasha Levin
                   ` (4 subsequent siblings)
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:01 UTC (permalink / raw)
  To: stable, linux-kernel
  Cc: Thomas Falcon, David S . Miller, Sasha Levin, linuxppc-dev, netdev

From: Thomas Falcon <tlfalcon@linux.ibm.com>

[ Upstream commit b7cdec3d699db2e5985ad39de0f25d3b6111928e ]

The wrong index is used when cleaning up RX buffer objects during release
of RX queues. Update to use the correct index counter.
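
The indexing pattern can be sketched in user space as follows. This is an
illustration under assumptions, not the driver code: struct toy_pool and
toy_release_pools() are made-up names, and malloc()/free() stand in for
skb allocation and dev_kfree_skb_any().

```c
#include <stdlib.h>

struct toy_pool {
	size_t size;
	void **buff;   /* one slot per RX buffer */
};

/* Release every buffer of every pool: i walks pools, j walks buffers. */
static void toy_release_pools(struct toy_pool *pools, size_t num_pools)
{
	for (size_t i = 0; i < num_pools; i++) {
		struct toy_pool *pool = &pools[i];

		for (size_t j = 0; j < pool->size; j++) {
			if (pool->buff[j]) {
				/* Free and clear the slot that was tested. */
				free(pool->buff[j]);
				pool->buff[j] = NULL;
			}
		}
	}
}
```

Using the outer index i inside the inner loop would test slot j but free
slot i, which leaks most buffers and can release the same one repeatedly.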

Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 5c7134ccc1fd..14c53ed5cca6 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -457,8 +457,8 @@ static void release_rx_pools(struct ibmvnic_adapter *adapter)
 
 		for (j = 0; j < rx_pool->size; j++) {
 			if (rx_pool->rx_buff[j].skb) {
-				dev_kfree_skb_any(rx_pool->rx_buff[i].skb);
-				rx_pool->rx_buff[i].skb = NULL;
+				dev_kfree_skb_any(rx_pool->rx_buff[j].skb);
+				rx_pool->rx_buff[j].skb = NULL;
 			}
 		}
 
-- 
2.17.1



* [PATCH AUTOSEL 4.14 31/35] virtio-net: disable guest csum during XDP set
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
                   ` (28 preceding siblings ...)
  2018-11-29  6:01 ` [PATCH AUTOSEL 4.14 30/35] ibmvnic: Fix RX queue buffer cleanup Sasha Levin
@ 2018-11-29  6:01 ` Sasha Levin
  2018-11-29  6:01 ` [PATCH AUTOSEL 4.14 32/35] virtio-net: fail XDP set if guest csum is negotiated Sasha Levin
                   ` (3 subsequent siblings)
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:01 UTC (permalink / raw)
  To: stable, linux-kernel
  Cc: Jason Wang, Jesper Dangaard Brouer, Pavel Popa, David Ahern,
	David S . Miller, Sasha Levin, virtualization, netdev

From: Jason Wang <jasowang@redhat.com>

[ Upstream commit e59ff2c49ae16e1d179de679aca81405829aee6c ]

We don't disable VIRTIO_NET_F_GUEST_CSUM if XDP was set. This means we
can receive partially checksummed packets with metadata kept in the
vnet_hdr. This may have several side effects:

- The metadata could be overwritten by header adjustment, so it might not
  be correct after XDP processing.
- There's no way to pass such metadata information through
  XDP_REDIRECT to another driver.
- XDP does not support checksum offload right now.

So simply disable guest csum if possible in the case of XDP.
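
As a rough user-space illustration, the set of offloads managed around
XDP can be viewed as a bitmask built from a list of feature bits. The
TOY_* names and toy_offload_mask() are hypothetical; the bit numbers
follow my reading of the virtio network device feature bits and should
be checked against the headers.

```c
#include <stdint.h>
#include <stddef.h>

/* Feature bit numbers, assumed to match the virtio network device spec. */
enum {
	TOY_F_GUEST_CSUM = 1,
	TOY_F_GUEST_TSO4 = 7,
	TOY_F_GUEST_TSO6 = 8,
	TOY_F_GUEST_ECN  = 9,
	TOY_F_GUEST_UFO  = 10,
};

/* Build a mask of the offloads that are switched off for XDP. */
static uint64_t toy_offload_mask(const unsigned int *bits, size_t n)
{
	uint64_t mask = 0;

	for (size_t i = 0; i < n; i++)
		mask |= 1ULL << bits[i];
	return mask;
}
```

Once GUEST_CSUM is part of the list, it is cleared and restored together
with the other guest offloads instead of being special-cased.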

Fixes: 3f93522ffab2d ("virtio-net: switch off offloads on demand if possible on XDP set")
Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Pavel Popa <pashinho1990@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/virtio_net.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index f528e9ac3413..2ffa7b290591 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -61,7 +61,8 @@ static const unsigned long guest_offloads[] = {
 	VIRTIO_NET_F_GUEST_TSO4,
 	VIRTIO_NET_F_GUEST_TSO6,
 	VIRTIO_NET_F_GUEST_ECN,
-	VIRTIO_NET_F_GUEST_UFO
+	VIRTIO_NET_F_GUEST_UFO,
+	VIRTIO_NET_F_GUEST_CSUM
 };
 
 struct virtnet_stats {
@@ -1939,9 +1940,6 @@ static int virtnet_clear_guest_offloads(struct virtnet_info *vi)
 	if (!vi->guest_offloads)
 		return 0;
 
-	if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_CSUM))
-		offloads = 1ULL << VIRTIO_NET_F_GUEST_CSUM;
-
 	return virtnet_set_guest_offloads(vi, offloads);
 }
 
@@ -1951,8 +1949,6 @@ static int virtnet_restore_guest_offloads(struct virtnet_info *vi)
 
 	if (!vi->guest_offloads)
 		return 0;
-	if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_CSUM))
-		offloads |= 1ULL << VIRTIO_NET_F_GUEST_CSUM;
 
 	return virtnet_set_guest_offloads(vi, offloads);
 }
-- 
2.17.1



* [PATCH AUTOSEL 4.14 32/35] virtio-net: fail XDP set if guest csum is negotiated
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
                   ` (29 preceding siblings ...)
  2018-11-29  6:01 ` [PATCH AUTOSEL 4.14 31/35] virtio-net: disable guest csum during XDP set Sasha Levin
@ 2018-11-29  6:01 ` Sasha Levin
  2018-11-29  6:01 ` [PATCH AUTOSEL 4.14 33/35] team: no need to do team_notify_peers or team_mcast_rejoin when disabling port Sasha Levin
                   ` (2 subsequent siblings)
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:01 UTC (permalink / raw)
  To: stable, linux-kernel
  Cc: Jason Wang, Jesper Dangaard Brouer, Pavel Popa, David Ahern,
	David S . Miller, Sasha Levin, virtualization, netdev

From: Jason Wang <jasowang@redhat.com>

[ Upstream commit 18ba58e1c234ea1a2d9835ac8c1735d965ce4640 ]

We don't support partially checksummed packets, since their metadata will be
lost or incorrect during XDP processing. So fail the XDP set if the
guest_csum feature is negotiated.

Fixes: f600b6905015 ("virtio_net: Add XDP support")
Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Pavel Popa <pashinho1990@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/virtio_net.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 2ffa7b290591..0e8e3be50332 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1966,8 +1966,9 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog,
 	    && (virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_TSO4) ||
 	        virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_TSO6) ||
 	        virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_ECN) ||
-		virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_UFO))) {
-		NL_SET_ERR_MSG_MOD(extack, "Can't set XDP while host is implementing LRO, disable LRO first");
+		virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_UFO) ||
+		virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_CSUM))) {
+		NL_SET_ERR_MSG_MOD(extack, "Can't set XDP while host is implementing LRO/CSUM, disable LRO/CSUM first");
 		return -EOPNOTSUPP;
 	}
 
-- 
2.17.1



* [PATCH AUTOSEL 4.14 33/35] team: no need to do team_notify_peers or team_mcast_rejoin when disabling port
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
                   ` (30 preceding siblings ...)
  2018-11-29  6:01 ` [PATCH AUTOSEL 4.14 32/35] virtio-net: fail XDP set if guest csum is negotiated Sasha Levin
@ 2018-11-29  6:01 ` Sasha Levin
  2018-11-29  6:01 ` [PATCH AUTOSEL 4.14 34/35] net: amd: add missing of_node_put() Sasha Levin
  2018-11-29  6:01 ` [PATCH AUTOSEL 4.14 35/35] net: thunderx: set tso_hdrs pointer to NULL in nicvf_free_snd_queue Sasha Levin
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:01 UTC (permalink / raw)
  To: stable, linux-kernel; +Cc: Hangbin Liu, David S . Miller, Sasha Levin, netdev

From: Hangbin Liu <liuhangbin@gmail.com>

[ Upstream commit 5ed9dc99107144f83b6c1bb52a69b58875baf540 ]

team_notify_peers() will send ARP and NA to notify peers. team_mcast_rejoin()
will send multicast join group messages to notify peers. We should do this when
enabling or changing to a new port. But it doesn't make sense to do it when a
port is disabled.

On the other hand, when we set mcast_rejoin_count to 2 and do a failover,
team_port_disable() will increase mcast_rejoin.count_pending to 2 and then
team_port_enable() will increase mcast_rejoin.count_pending to 4. We will end
up sending 4 mcast rejoin messages, which will confuse the user. The same
applies to notify_peers.count.

Fix it by deleting team_notify_peers() and team_mcast_rejoin() in
team_port_disable().
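
The double counting described above can be reproduced with a toy model.
This is a sketch only: struct toy_team and the toy_*() functions are
made-up names, and plain integers replace the atomics and delayed work
used by the driver.

```c
/* Toy counters; the real driver uses atomics and delayed work. */
struct toy_team {
	int rejoin_count;    /* configured mcast_rejoin_count */
	int count_pending;   /* rejoin messages still to send */
};

/* Each rejoin request adds the configured count to the pending total. */
static void toy_mcast_rejoin(struct toy_team *t)
{
	t->count_pending += t->rejoin_count;
}

/* Failover = disable the old port, then enable the new one. */
static int toy_failover(struct toy_team *t, int rejoin_on_disable)
{
	if (rejoin_on_disable)
		toy_mcast_rejoin(t);   /* behaviour before the fix */
	toy_mcast_rejoin(t);           /* port enable */
	return t->count_pending;
}
```

With rejoin_count set to 2, the model yields 4 pending messages when the
disable path also requests a rejoin and 2 when it does not.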

Reported-by: Liang Li <liali@redhat.com>
Fixes: fc423ff00df3a ("team: add peer notification")
Fixes: 492b200efdd20 ("team: add support for sending multicast rejoins")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/team/team.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
index 817451a1efd6..bd455a6cc82c 100644
--- a/drivers/net/team/team.c
+++ b/drivers/net/team/team.c
@@ -989,8 +989,6 @@ static void team_port_disable(struct team *team,
 	team->en_port_count--;
 	team_queue_override_port_del(team, port);
 	team_adjust_ops(team);
-	team_notify_peers(team);
-	team_mcast_rejoin(team);
 	team_lower_state_changed(port);
 }
 
-- 
2.17.1



* [PATCH AUTOSEL 4.14 34/35] net: amd: add missing of_node_put()
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
                   ` (31 preceding siblings ...)
  2018-11-29  6:01 ` [PATCH AUTOSEL 4.14 33/35] team: no need to do team_notify_peers or team_mcast_rejoin when disabling port Sasha Levin
@ 2018-11-29  6:01 ` Sasha Levin
  2018-11-29  6:01 ` [PATCH AUTOSEL 4.14 35/35] net: thunderx: set tso_hdrs pointer to NULL in nicvf_free_snd_queue Sasha Levin
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:01 UTC (permalink / raw)
  To: stable, linux-kernel; +Cc: Yangtao Li, David S . Miller, Sasha Levin, netdev

From: Yangtao Li <tiny.windzz@gmail.com>

[ Upstream commit c44c749d3b6fdfca39002e7e48e03fe9f9fe37a3 ]

of_find_node_by_path() acquires a reference to the node
returned by it and that reference needs to be dropped by its caller.
This place doesn't do that, so fix it.
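
The get/put pairing can be sketched with a toy reference count. None of
the names below come from the kernel: struct toy_node, toy_find() and
toy_put() only mimic the shape of of_find_node_by_path() and
of_node_put().

```c
#include <stddef.h>

/* Toy refcounted node; not the kernel's struct device_node. */
struct toy_node {
	int refcount;
	const char *link_test;   /* property value, or NULL if absent */
};

/* Lookup takes a reference that the caller must drop. */
static struct toy_node *toy_find(struct toy_node *n) { n->refcount++; return n; }
static void toy_put(struct toy_node *n) { n->refcount--; }

/* Every exit taken after a successful lookup drops the reference. */
static int toy_probe(struct toy_node *tree)
{
	struct toy_node *nd = toy_find(tree);
	int have_prop = 0;

	if (!nd->link_test)
		goto node_put;
	have_prop = 1;
node_put:
	toy_put(nd);
	return have_prop;
}
```

Routing the early exit through the node_put label keeps the count
balanced whether or not the property is present.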

Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/amd/sunlance.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/amd/sunlance.c b/drivers/net/ethernet/amd/sunlance.c
index 291ca5187f12..9845e07d40cd 100644
--- a/drivers/net/ethernet/amd/sunlance.c
+++ b/drivers/net/ethernet/amd/sunlance.c
@@ -1418,7 +1418,7 @@ static int sparc_lance_probe_one(struct platform_device *op,
 
 			prop = of_get_property(nd, "tpe-link-test?", NULL);
 			if (!prop)
-				goto no_link_test;
+				goto node_put;
 
 			if (strcmp(prop, "true")) {
 				printk(KERN_NOTICE "SunLance: warning: overriding option "
@@ -1427,6 +1427,8 @@ static int sparc_lance_probe_one(struct platform_device *op,
 				       "to ecd@skynet.be\n");
 				auxio_set_lte(AUXIO_LTE_ON);
 			}
+node_put:
+			of_node_put(nd);
 no_link_test:
 			lp->auto_select = 1;
 			lp->tpe = 0;
-- 
2.17.1



* [PATCH AUTOSEL 4.14 35/35] net: thunderx: set tso_hdrs pointer to NULL in nicvf_free_snd_queue
  2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
                   ` (32 preceding siblings ...)
  2018-11-29  6:01 ` [PATCH AUTOSEL 4.14 34/35] net: amd: add missing of_node_put() Sasha Levin
@ 2018-11-29  6:01 ` Sasha Levin
  33 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-29  6:01 UTC (permalink / raw)
  To: stable, linux-kernel
  Cc: Lorenzo Bianconi, David S . Miller, Sasha Levin, netdev

From: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>

[ Upstream commit ef2a7cf1d8831535b8991459567b385661eb4a36 ]

Reset snd_queue tso_hdrs pointer to NULL in nicvf_free_snd_queue routine
since it is used to check if tso dma descriptor queue has been previously
allocated. The issue can be triggered with the following reproducer:

$ip link set dev enP2p1s0v0 xdpdrv obj xdp_dummy.o
$ip link set dev enP2p1s0v0 xdpdrv off

[  341.467649] WARNING: CPU: 74 PID: 2158 at mm/vmalloc.c:1511 __vunmap+0x98/0xe0
[  341.515010] Hardware name: GIGABYTE H270-T70/MT70-HD0, BIOS T49 02/02/2018
[  341.521874] pstate: 60400005 (nZCv daif +PAN -UAO)
[  341.526654] pc : __vunmap+0x98/0xe0
[  341.530132] lr : __vunmap+0x98/0xe0
[  341.533609] sp : ffff00001c5db860
[  341.536913] x29: ffff00001c5db860 x28: 0000000000020000
[  341.542214] x27: ffff810feb5090b0 x26: ffff000017e57000
[  341.547515] x25: 0000000000000000 x24: 00000000fbd00000
[  341.552816] x23: 0000000000000000 x22: ffff810feb5090b0
[  341.558117] x21: 0000000000000000 x20: 0000000000000000
[  341.563418] x19: ffff000017e57000 x18: 0000000000000000
[  341.568719] x17: 0000000000000000 x16: 0000000000000000
[  341.574020] x15: 0000000000000010 x14: ffffffffffffffff
[  341.579321] x13: ffff00008985eb27 x12: ffff00000985eb2f
[  341.584622] x11: ffff0000096b3000 x10: ffff00001c5db510
[  341.589923] x9 : 00000000ffffffd0 x8 : ffff0000086868e8
[  341.595224] x7 : 3430303030303030 x6 : 00000000000006ef
[  341.600525] x5 : 00000000003fffff x4 : 0000000000000000
[  341.605825] x3 : 0000000000000000 x2 : ffffffffffffffff
[  341.611126] x1 : ffff0000096b3728 x0 : 0000000000000038
[  341.616428] Call trace:
[  341.618866]  __vunmap+0x98/0xe0
[  341.621997]  vunmap+0x3c/0x50
[  341.624961]  arch_dma_free+0x68/0xa0
[  341.628534]  dma_direct_free+0x50/0x80
[  341.632285]  nicvf_free_resources+0x160/0x2d8 [nicvf]
[  341.637327]  nicvf_config_data_transfer+0x174/0x5e8 [nicvf]
[  341.642890]  nicvf_stop+0x298/0x340 [nicvf]
[  341.647066]  __dev_close_many+0x9c/0x108
[  341.650977]  dev_close_many+0xa4/0x158
[  341.654720]  rollback_registered_many+0x140/0x530
[  341.659414]  rollback_registered+0x54/0x80
[  341.663499]  unregister_netdevice_queue+0x9c/0xe8
[  341.668192]  unregister_netdev+0x28/0x38
[  341.672106]  nicvf_remove+0xa4/0xa8 [nicvf]
[  341.676280]  nicvf_shutdown+0x20/0x30 [nicvf]
[  341.680630]  pci_device_shutdown+0x44/0x88
[  341.684720]  device_shutdown+0x144/0x250
[  341.688640]  kernel_restart_prepare+0x44/0x50
[  341.692986]  kernel_restart+0x20/0x68
[  341.696638]  __se_sys_reboot+0x210/0x238
[  341.700550]  __arm64_sys_reboot+0x24/0x30
[  341.704555]  el0_svc_handler+0x94/0x110
[  341.708382]  el0_svc+0x8/0xc
[  341.711252] ---[ end trace 3f4019c8439959c9 ]---
[  341.715874] page:ffff7e0003ef4000 count:0 mapcount:0 mapping:0000000000000000 index:0x4
[  341.723872] flags: 0x1fffe000000000()
[  341.727527] raw: 001fffe000000000 ffff7e0003f1a008 ffff7e0003ef4048 0000000000000000
[  341.735263] raw: 0000000000000004 0000000000000000 00000000ffffffff 0000000000000000
[  341.742994] page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0)

where xdp_dummy.c is a simple bpf program that forwards the incoming
frames to the network stack (available here:
https://github.com/altoor/xdp_walkthrough_examples/blob/master/sample_1/xdp_dummy.c)
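
The underlying pattern, where a pointer doubles as the "allocated" flag,
can be sketched in user space. struct toy_queue and toy_free_queue() are
hypothetical names, and free() stands in for dma_free_coherent().

```c
#include <stdlib.h>

struct toy_queue {
	void *tso_hdrs;   /* non-NULL means "currently allocated" */
	int frees;        /* how many times the buffer was released */
};

/* Safe to call repeatedly: the pointer doubles as the allocated flag. */
static void toy_free_queue(struct toy_queue *q)
{
	if (q->tso_hdrs) {
		free(q->tso_hdrs);
		q->frees++;
		/* Without this reset, a second call would free again. */
		q->tso_hdrs = NULL;
	}
}
```

Calling toy_free_queue() twice on the same queue releases the buffer
once, which is the property the reset to NULL is meant to restore.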

Fixes: 05c773f52b96 ("net: thunderx: Add basic XDP support")
Fixes: 4863dea3fab0 ("net: Adding support for Cavium ThunderX network controller")
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index a3d12dbde95b..09494e1c77c5 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -585,10 +585,12 @@ static void nicvf_free_snd_queue(struct nicvf *nic, struct snd_queue *sq)
 	if (!sq->dmem.base)
 		return;
 
-	if (sq->tso_hdrs)
+	if (sq->tso_hdrs) {
 		dma_free_coherent(&nic->pdev->dev,
 				  sq->dmem.q_len * TSO_HEADER_SIZE,
 				  sq->tso_hdrs, sq->tso_hdrs_phys);
+		sq->tso_hdrs = NULL;
+	}
 
 	/* Free pending skbs in the queue */
 	smp_rmb();
-- 
2.17.1



* Re: [PATCH AUTOSEL 4.14 25/35] iomap: sub-block dio needs to zeroout beyond EOF
  2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 25/35] iomap: sub-block dio needs to zeroout beyond EOF Sasha Levin
@ 2018-11-29 12:14   ` Dave Chinner
  2018-11-29 12:47     ` Greg KH
  0 siblings, 1 reply; 59+ messages in thread
From: Dave Chinner @ 2018-11-29 12:14 UTC (permalink / raw)
  To: Sasha Levin
  Cc: stable, linux-kernel, Dave Chinner, Darrick J . Wong, linux-fsdevel

On Thu, Nov 29, 2018 at 01:00:59AM -0500, Sasha Levin wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> [ Upstream commit b450672fb66b4a991a5b55ee24209ac7ae7690ce ]
> 
> If we are doing sub-block dio that extends EOF, we need to zero
> the unused tail of the block to initialise the data in it. If we
> do not zero the tail of the block, then an immediate mmap read of
> the EOF block will expose stale data beyond EOF to userspace. Found
> with fsx running sub-block DIO sizes vs MAPREAD/MAPWRITE operations.
> 
> Fix this by detecting if the end of the DIO write is beyond EOF
> and zeroing the tail if necessary.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> Signed-off-by: Sasha Levin <sashal@kernel.org>
> ---
>  fs/iomap.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/iomap.c b/fs/iomap.c
> index 8f7673a69273..407efdae3978 100644
> --- a/fs/iomap.c
> +++ b/fs/iomap.c
> @@ -940,7 +940,14 @@ iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length,
>  		dio->submit.cookie = submit_bio(bio);
>  	} while (nr_pages);
>  
> -	if (need_zeroout) {
> +	/*
> +	 * We need to zeroout the tail of a sub-block write if the extent type
> +	 * requires zeroing or the write extends beyond EOF. If we don't zero
> +	 * the block tail in the latter case, we can expose stale data via mmap
> +	 * reads of the EOF block.
> +	 */
> +	if (need_zeroout ||
> +	    ((dio->flags & IOMAP_DIO_WRITE) && pos >= i_size_read(inode))) {
>  		/* zero out from the end of the write to the end of the block */
>  		pad = pos & (fs_block_size - 1);
>  		if (pad)
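
For readers following the arithmetic in the quoted hunk, here is a rough
userspace sketch of the condition and of the pad calculation. It assumes, as
the quoted comment says, that the zeroed range runs from the end of the write
to the end of its block, and that the block size is a power of two. The
helper names tail_zero_len() and needs_tail_zero() are hypothetical and do
not exist in fs/iomap.c.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes between pos (the end of the write) and the end of the block that
 * contains it. Returns 0 when pos is already block aligned.
 */
static uint64_t tail_zero_len(uint64_t pos, uint64_t block_size)
{
	/* The mask below only works for power-of-two block sizes. */
	assert(block_size != 0 && (block_size & (block_size - 1)) == 0);

	uint64_t pad = pos & (block_size - 1);

	return pad ? block_size - pad : 0;
}

/*
 * Mirrors the quoted condition: zero the tail if the extent type needs it,
 * or if this is a write that ends at or beyond the current file size.
 */
static int needs_tail_zero(int need_zeroout, int is_write,
			   uint64_t pos, uint64_t i_size)
{
	return need_zeroout || (is_write && pos >= i_size);
}
```

With 4096-byte blocks, a write ending at offset 4608 leaves 3584 bytes of
block tail to zero, while a write ending exactly at 8192 leaves none.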

How do you propose to validate that this doesn't introduce new data
corruptions in isolation? I've spent the last 4 weeks of my life and
about 15 billion fsx ops chasing and validating the bug corruption
fixes we've pushed recently into the 4.19 and 4.20 codebase.

Cherry picking only one of the 50-odd patches we've committed into
late 4.19 and 4.20 kernels to fix the problems we've found really
seems like asking for trouble. If you're going to back port random
data corruption fixes, then you need to spend a *lot* of time
validating that it doesn't make things worse than they already
are...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH AUTOSEL 4.14 25/35] iomap: sub-block dio needs to zeroout beyond EOF
  2018-11-29 12:14   ` Dave Chinner
@ 2018-11-29 12:47     ` Greg KH
  2018-11-29 22:40       ` Dave Chinner
  0 siblings, 1 reply; 59+ messages in thread
From: Greg KH @ 2018-11-29 12:47 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Sasha Levin, stable, linux-kernel, Dave Chinner,
	Darrick J . Wong, linux-fsdevel

On Thu, Nov 29, 2018 at 11:14:59PM +1100, Dave Chinner wrote:
> 
> Cherry picking only one of the 50-odd patches we've committed into
> late 4.19 and 4.20 kernels to fix the problems we've found really
> seems like asking for trouble. If you're going to back port random
> data corruption fixes, then you need to spend a *lot* of time
> validating that it doesn't make things worse than they already
> are...

Any reason why we can't take the 50-odd patches in their entirety?  It
sounds like 4.19 isn't fully fixed, but 4.20-rc1 is?  If so, what do you
recommend we do to make 4.19 work properly?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH AUTOSEL 4.14 25/35] iomap: sub-block dio needs to zeroout beyond EOF
  2018-11-29 12:47     ` Greg KH
@ 2018-11-29 22:40       ` Dave Chinner
  2018-11-30  8:22         ` Greg KH
  0 siblings, 1 reply; 59+ messages in thread
From: Dave Chinner @ 2018-11-29 22:40 UTC (permalink / raw)
  To: Greg KH
  Cc: Sasha Levin, stable, linux-kernel, Dave Chinner,
	Darrick J . Wong, linux-fsdevel

On Thu, Nov 29, 2018 at 01:47:56PM +0100, Greg KH wrote:
> On Thu, Nov 29, 2018 at 11:14:59PM +1100, Dave Chinner wrote:
> > 
> > Cherry picking only one of the 50-odd patches we've committed into
> > late 4.19 and 4.20 kernels to fix the problems we've found really
> > seems like asking for trouble. If you're going to back port random
> > data corruption fixes, then you need to spend a *lot* of time
> > validating that it doesn't make things worse than they already
> > are...
> 
> Any reason why we can't take the 50-odd patches in their entirety?  It
> sounds like 4.19 isn't fully fixed, but 4.20-rc1 is?  If so, what do you
> recommend we do to make 4.19 working properly?

You could pull all the fixes, but then you have a QA problem.
Basically, we have multiple badly broken syscalls (FICLONERANGE,
FIDEDUPERANGE and copy_file_range), and even 4.20-rc4 isn't fully
fixed.

There were ~5 critical dedupe/clone data corruption fixes for XFS
that went into 4.19-rc8.

There were ~30 patches that went into 4.20-rc1 that fixed the
FICLONERANGE/FIDEDUPERANGE ioctls. That completely reworks the
entire VFS infrastructure for those calls, and touches several
filesystems as well. It fixes problems with setuid files, swap
files, modifying immutable files, failure to enforce rlimit and
max file size constraints, behaviour that didn't match man page
descriptions, etc.

There were another ~10 patches that went into 4.20-rc4 that fixed
yet more data corruption and API problems that we found when we
enhanced fsx to use the above syscalls.

And I have another ~10 patches that I'm working on right now to fix
the copy_file_range() implementation - it has all the same problems
I listed above for FICLONERANGE/FIDEDUPERANGE and some other unique
ones. I'm currently writing error condition tests for fstests so
that we at least have some coverage of the conditions
copy_file_range() is supposed to catch and fail. This might all make
a late 4.20-rcX, but it's looking more like 4.21 at this point.

As to testing this stuff, I've spent several weeks now on this and
so has Darrick. Between us we've done a huge amount of QA needed to
verify that the problems are fixed and it is still ongoing. From
#xfs a couple of days ago:

[28/11/18 16:59] * djwong hits 6 billion fsxops...
[28/11/18 17:07] <dchinner_> djwong: I've got about 3.75 billion ops running on a machine here....
[28/11/18 17:20] <djwong> note that's 1 billion fsxops x 6 machines
[28/11/18 17:21] <djwong> [xfsv4, xfsv5, xfsv5 w/ 1k blocks] * [directio fsx, buffered fsx]
[28/11/18 17:21] <dchinner_> Oh, I've got 3.75B x 4 instances on one filesystem :P
[28/11/18 17:22] <dchinner_> [direct io, buffered] x [small op lengths, large op lengths]

And this morning:

[30/11/18 08:53] <djwong> 7 billion fsxops...

I stopped my tests at 5 billion ops yesterday (i.e. 20 billion ops
aggregate) to focus on testing the copy_file_range() changes, but
Darrick's tests are still ongoing and have passed 40 billion ops in
aggregate over the past few days.

The reason we are running these so long is that we've seen fsx data
corruption failures after 12+ hours of runtime and hundreds of
millions of ops. Hence the testing for backported fixes will need to
replicate these test runs across multiple configurations for
multiple days before we have any confidence that we've actually
fixed the data corruptions and not introduced any new ones.

If you pull only a small subset of the fixes, the fsx will still
fail and we have no real way of actually verifying that there have
been no regressions introduced by the backport.  IOWs, there's a
/massive/ amount of QA needed for ensuring that these backports work
correctly.

Right now the XFS developers don't have the time or resources
available to validate stable backports are correct and regression
free because we are focussed on ensuring the upstream fixes we've
already made (and are still writing) are solid and reliable.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH AUTOSEL 4.14 25/35] iomap: sub-block dio needs to zeroout beyond EOF
  2018-11-29 22:40       ` Dave Chinner
@ 2018-11-30  8:22         ` Greg KH
  2018-11-30 10:14           ` Sasha Levin
  2018-11-30 21:45           ` Dave Chinner
  0 siblings, 2 replies; 59+ messages in thread
From: Greg KH @ 2018-11-30  8:22 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Sasha Levin, stable, linux-kernel, Dave Chinner,
	Darrick J . Wong, linux-fsdevel

On Fri, Nov 30, 2018 at 09:40:19AM +1100, Dave Chinner wrote:
> On Thu, Nov 29, 2018 at 01:47:56PM +0100, Greg KH wrote:
> > On Thu, Nov 29, 2018 at 11:14:59PM +1100, Dave Chinner wrote:
> > > 
> > > Cherry picking only one of the 50-odd patches we've committed into
> > > late 4.19 and 4.20 kernels to fix the problems we've found really
> > > seems like asking for trouble. If you're going to back port random
> > > data corruption fixes, then you need to spend a *lot* of time
> > > validating that it doesn't make things worse than they already
> > > are...
> > 
> > Any reason why we can't take the 50-odd patches in their entirety?  It
> > sounds like 4.19 isn't fully fixed, but 4.20-rc1 is?  If so, what do you
> > recommend we do to make 4.19 working properly?
> 
> You could pull all the fixes, but then you have a QA problem.
> Basically, we have multiple badly broken syscalls (FICLONERANGE,
> FIDEDUPERANGE and copy_file_range), and even 4.20-rc4 isn't fully
> fixed.
> 
> There were ~5 critical dedupe/clone data corruption fixes for XFS
> went into 4.19-rc8.

Have any of those been tagged for stable?

> There were ~30 patches that went into 4.20-rc1 that fixed the
> FICLONERANGE/FIDEDUPERANGE ioctls. That completely reworks the
> entire VFS infrastructure for those calls, and touches several
> filesystems as well. It fixes problems with setuid files, swap
> files, modifying immutable files, failure to enforce rlimit and
> max file size constraints, behaviour that didn't match man page
> descriptions, etc.
> 
> There were another ~10 patches that went into 4.20-rc4 that fixed
> yet more data corruption and API problems that we found when we
> enhanced fsx to use the above syscalls.
> 
> And I have another ~10 patches that I'm working on right now to fix
> the copy_file_range() implementation - it has all the same problems
> I listed above for FICLONERANGE/FIDEDUPERANGE and some other unique
> ones. I'm currently writing error condition tests for fstests so
> that we at least have some coverage of the conditions
> copy_file_range() is supposed to catch and fail. This might all make
> a late 4.20-rcX, but it's looking more like 4.21 at this point.
> 
> As to testing this stuff, I've spend several weeks now on this and
> so has Darrick. Between us we've done a huge amount of QA needed to
> verify that the problems are fixed and it is still ongoing. From
> #xfs a couple of days ago:
> 
> [28/11/18 16:59] * djwong hits 6 billion fsxops...
> [28/11/18 17:07] <dchinner_> djwong: I've got about 3.75 billion ops running on a machine here....
> [28/11/18 17:20] <djwong> note that's 1 billion fsxops x 6 machines
> [28/11/18 17:21] <djwong> [xfsv4, xfsv5, xfsv5 w/ 1k blocks] * [directio fsx, buffered fsx]
> [28/11/18 17:21] <dchinner_> Oh, I've got 3.75B x 4 instances on one filesystem :P
> [28/11/18 17:22] <dchinner_> [direct io, buffered] x [small op lengths, large op lengths]
> 
> And this morning:
> 
> [30/11/18 08:53] <djwong> 7 billion fsxops...
> 
> I stopped my tests at 5 billion ops yesterday (i.e. 20 billion ops
> aggregate) to focus on testing the copy_file_range() changes, but
> Darrick's tests are still ongoing and have passed 40 billion ops in
> aggregate over the past few days.
> 
> The reason we are running these so long is that we've seen fsx data
> corruption failures after 12+ hours of runtime and hundreds of
> millions of ops. Hence the testing for backported fixes will need to
> replicate these test runs across multiple configurations for
> multiple days before we have any confidence that we've actually
> fixed the data corruptions and not introduced any new ones.
> 
> If you pull only a small subset of the fixes, the fsx will still
> fail and we have no real way of actually verifying that there have
> been no regression introduced by the backport.  IOWs, there's a
> /massive/ amount of QA needed for ensuring that these backports work
> correctly.
> 
> Right now the XFS developers don't have the time or resources
> available to validate stable backports are correct and regression
> free because we are focussed on ensuring the upstream fixes we've
> already made (and are still writing) are solid and reliable.

Ok, that's fine, so users of XFS should wait until the 4.20 release
before relying on it?  :)

I understand your reluctance to want to backport anything, but it really
feels like you are not even allowing for fixes that are "obviously
right" to be backported either, even after they pass testing.  Which
isn't ok for your users.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH AUTOSEL 4.14 25/35] iomap: sub-block dio needs to zeroout beyond EOF
  2018-11-30  8:22         ` Greg KH
@ 2018-11-30 10:14           ` Sasha Levin
  2018-11-30 20:35             ` Darrick J. Wong
  2018-11-30 21:50             ` Dave Chinner
  2018-11-30 21:45           ` Dave Chinner
  1 sibling, 2 replies; 59+ messages in thread
From: Sasha Levin @ 2018-11-30 10:14 UTC (permalink / raw)
  To: Greg KH
  Cc: Dave Chinner, stable, linux-kernel, Dave Chinner,
	Darrick J . Wong, linux-fsdevel

On Fri, Nov 30, 2018 at 09:22:03AM +0100, Greg KH wrote:
>On Fri, Nov 30, 2018 at 09:40:19AM +1100, Dave Chinner wrote:
>> I stopped my tests at 5 billion ops yesterday (i.e. 20 billion ops
>> aggregate) to focus on testing the copy_file_range() changes, but
>> Darrick's tests are still ongoing and have passed 40 billion ops in
>> aggregate over the past few days.
>>
>> The reason we are running these so long is that we've seen fsx data
>> corruption failures after 12+ hours of runtime and hundreds of
>> millions of ops. Hence the testing for backported fixes will need to
>> replicate these test runs across multiple configurations for
>> multiple days before we have any confidence that we've actually
>> fixed the data corruptions and not introduced any new ones.
>>
>> If you pull only a small subset of the fixes, the fsx will still
>> fail and we have no real way of actually verifying that there have
>> been no regression introduced by the backport.  IOWs, there's a
>> /massive/ amount of QA needed for ensuring that these backports work
>> correctly.
>>
>> Right now the XFS developers don't have the time or resources
>> available to validate stable backports are correct and regression
>> free because we are focussed on ensuring the upstream fixes we've
>> already made (and are still writing) are solid and reliable.
>
>Ok, that's fine, so users of XFS should wait until the 4.20 release
>before relying on it?  :)

It's getting to the point that with the amount of known issues with XFS
on LTS kernels it makes sense to mark it as CONFIG_BROKEN.

>I understand your reluctance to want to backport anything, but it really
>feels like you are not even allowing for fixes that are "obviously
>right" to be backported either, even after they pass testing.  Which
>isn't ok for your users.

Do the XFS maintainers expect users to always use the latest upstream
kernel?

--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH AUTOSEL 4.14 25/35] iomap: sub-block dio needs to zeroout beyond EOF
  2018-11-30 10:14           ` Sasha Levin
@ 2018-11-30 20:35             ` Darrick J. Wong
  2018-11-30 21:50             ` Dave Chinner
  1 sibling, 0 replies; 59+ messages in thread
From: Darrick J. Wong @ 2018-11-30 20:35 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Greg KH, Dave Chinner, stable, linux-kernel, Dave Chinner,
	linux-fsdevel, xfs

On Fri, Nov 30, 2018 at 05:14:41AM -0500, Sasha Levin wrote:
> On Fri, Nov 30, 2018 at 09:22:03AM +0100, Greg KH wrote:
> > On Fri, Nov 30, 2018 at 09:40:19AM +1100, Dave Chinner wrote:
> > > I stopped my tests at 5 billion ops yesterday (i.e. 20 billion ops
> > > aggregate) to focus on testing the copy_file_range() changes, but
> > > Darrick's tests are still ongoing and have passed 40 billion ops in
> > > aggregate over the past few days.
> > > 
> > > The reason we are running these so long is that we've seen fsx data
> > > corruption failures after 12+ hours of runtime and hundreds of
> > > millions of ops. Hence the testing for backported fixes will need to
> > > replicate these test runs across multiple configurations for
> > > multiple days before we have any confidence that we've actually
> > > fixed the data corruptions and not introduced any new ones.
> > > 
> > > If you pull only a small subset of the fixes, the fsx will still
> > > fail and we have no real way of actually verifying that there have
> > > been no regression introduced by the backport.  IOWs, there's a
> > > /massive/ amount of QA needed for ensuring that these backports work
> > > correctly.
> > > 
> > > Right now the XFS developers don't have the time or resources
> > > available to validate stable backports are correct and regression
> > > free because we are focussed on ensuring the upstream fixes we've
> > > already made (and are still writing) are solid and reliable.

I feel the need to contribute my own interpretation of what's been going
on the last four months:

What you're seeing is not the usual level of reluctance to backport
fixes to LTS kernels, it's our own frustrations at the kernel
community's systemic inability to QA new fs features properly.

Four months ago (prior to 4.19) Zorro started digging into periodic test
failures with shared/010, which resulted in some fixes to the btrfs
dedupe and clone range ioctl implementations.  He then saw the same
failures on XFS.

Dave and I stared at the btrfs patches for a while, then started looking
at the xfs counterparts, and realized that nobody had ever added those
commands to the fstests stressor programs, nor had anyone ever encoded
into a test the side effects of a file remap (mtime update, removal of
suid).  Nor were there any tests to ensure that these ioctls couldn't be
abused to violate system security and stability constraints.

That's why I refactored a whole ton of vfs file remap code for 4.20, and
(with the help of Dave and Brian and others) worked on fixing all the
problems where fsx and fsstress demonstrate file corruption problems.

Then we started asking the same questions of the copy_file_range system
call, and discovered that yes, we have all of the same problems.  We
also discovered several failure cases that aren't mentioned in any
documentation, which has complicated the generation of automatable
tests.  Worse yet, the stressor programs fell over even sooner with the
fallback splice implementation.

TLDR: New features show up in the vfs without a lot of design
documentation, incomplete userspace interface manuals, and not much
beyond trivial testing.

So the problem I'm facing here is that the XFS team are singlehandedly
trying to pay off years of accumulated technical debt in the vfs.  We
definitely had a role in adding to that debt, so we're fixing it.

Dave is now refactoring the copy_file_range backend to implement all the
necessary security and stability checks, and I'm still QAing all the
stuff we've added to 4.20.

We're not finished, where "finished" means that we can get /one/ kernel
tree to go ~100 billion fsxops without burping up failures, and we've
written fstests to check that said kernel can handle correctly all the
weird side cases.

Until all those fstests go upstream, I don't want to spread out into
backporting and testing LTS kernels, even with test automation.  By the
time we're done with all our upstream work you ought to be able to
autosel backport the whole mess into the LTS kernels /and/ fstests will
be able to tell you if the autosel has succeeded without causing any
obvious regressions.

> > Ok, that's fine, so users of XFS should wait until the 4.20 release
> > before relying on it?  :)

At the rate we're going, we're not going to finish until 4.21, but yes,
let's wait until 4.20 is closer to release to start in on porting all of
its fixes to 4.14/4.19.

> It's getting to the point that with the amount of known issues with XFS
> on LTS kernels it makes sense to mark it as CONFIG_BROKEN.

These aren't all issues specific to XFS; some plague every fs in subtle
weird ways that only show up with extreme testing.  We need the extreme
testing to flush out as many bugs as we can before enabling the feature
by default.  XFS reflink is not enabled by default and due to all this
is not likely to get it any time soon.

(That copy_file_range syscall should have been rigorously tested before
it was turned on in the kernel...)

> > I understand your reluctance to want to backport anything, but it really
> > feels like you are not even allowing for fixes that are "obviously
> > right" to be backported either, even after they pass testing.  Which
> > isn't ok for your users.
> 
> Do the XFS maintainers expect users to always use the latest upstream
> kernel?

For features that are EXPERIMENTAL or aren't enabled by default, yes,
they should be.

--D

> 
> --
> Thanks,
> Sasha

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH AUTOSEL 4.14 25/35] iomap: sub-block dio needs to zeroout beyond EOF
  2018-11-30  8:22         ` Greg KH
  2018-11-30 10:14           ` Sasha Levin
@ 2018-11-30 21:45           ` Dave Chinner
  2018-12-02 20:11             ` Greg KH
  1 sibling, 1 reply; 59+ messages in thread
From: Dave Chinner @ 2018-11-30 21:45 UTC (permalink / raw)
  To: Greg KH
  Cc: Sasha Levin, stable, linux-kernel, Dave Chinner,
	Darrick J . Wong, linux-fsdevel

On Fri, Nov 30, 2018 at 09:22:03AM +0100, Greg KH wrote:
> On Fri, Nov 30, 2018 at 09:40:19AM +1100, Dave Chinner wrote:
> > On Thu, Nov 29, 2018 at 01:47:56PM +0100, Greg KH wrote:
> > > On Thu, Nov 29, 2018 at 11:14:59PM +1100, Dave Chinner wrote:
> > > > 
> > > > Cherry picking only one of the 50-odd patches we've committed into
> > > > late 4.19 and 4.20 kernels to fix the problems we've found really
> > > > seems like asking for trouble. If you're going to back port random
> > > > data corruption fixes, then you need to spend a *lot* of time
> > > > validating that it doesn't make things worse than they already
> > > > are...
> > > 
> > > Any reason why we can't take the 50-odd patches in their entirety?  It
> > > sounds like 4.19 isn't fully fixed, but 4.20-rc1 is?  If so, what do you
> > > recommend we do to make 4.19 working properly?
> > 
> > You could pull all the fixes, but then you have a QA problem.
> > Basically, we have multiple badly broken syscalls (FICLONERANGE,
> > FIDEDUPERANGE and copy_file_range), and even 4.20-rc4 isn't fully
> > fixed.
> > 
> > There were ~5 critical dedupe/clone data corruption fixes for XFS
> > went into 4.19-rc8.
> 
> Have any of those been tagged for stable?

None, because I have no confidence that the stable process will do
the necessary QA to validate that such a significant backport is
regression and data corruption free.  The backport needs to be done
as a complete series when we've finished the upstream work because
we can't test isolated patches adequately because fsx will fall over
due to all the unfixed problems and not exercise the fixes that were
backported.

Further, we just had a regression reported in one of the commits that
the autosel bot has selected for automatic backports. It has been
uncovered by overlay which appears to do some unique things with
the piece of crap that is do_splice_direct(). And Darrick just
commented on #xfs that he's just noticed more bugs with FICLONERANGE
and overlay.

IOWs, we're still finding broken stuff in this code and we are
fixing it as fast as we can - we're still putting out fires. We most
certainly don't need the added pressure of having you guys create
more spot fires by breaking stable kernels with largely untested
partial backports and having users exposed to whacky new data
corruption issues.

So, no, it isn't tagged for stable kernels because "commit into
mainline" != "this should be backported immediately". Backports of
these fixes are going to be done largely as a function of
time and resources, of which we have zero available right now. Doing
backports right now is premature and ill-advised because we haven't
finished finding and fixing all the bugs and regressions in this
code.

> > Right now the XFS developers don't have the time or resources
> > available to validate stable backports are correct and regression
> > free because we are focussed on ensuring the upstream fixes we've
> > already made (and are still writing) are solid and reliable.
> 
> Ok, that's fine, so users of XFS should wait until the 4.20 release
> before relying on it?  :)

Ok, Greg, that's *out of line*.

I should throw the CoC at you because I find that comment offensive,
condescending, belittling, denigrating and insulting.  Your smug and
superior "I know what is right for you" attitude is completely
inappropriate, and a little smiley face does not make it acceptable.

If you think your comment is funny, you've badly misjudged how much
effort I've put into this (100-hour weeks for over a month now), how
close I'm flying to burn out (again!), and how pissed off I am about
this whole scenario.

We ended up here because we *trusted* that other people had
implemented and tested their APIs and code properly before it got
merged. We've been severely burnt, and we've been left to clean up
the mess made by other people by ourselves.

Instead of thanks, what we get is a "we know better" attitude
and jokes implying our work is crap and we don't care about our
users. That's just plain *insulting*.  If anyone is looking for a
demonstration of everything that is wrong with the Linux kernel
development culture, then they don't need to look any further.

> I understand your reluctance to want to backport anything, but it really
> feels like you are not even allowing for fixes that are "obviously
> right" to be backported either, even after they pass testing.  Which
> isn't ok for your users.

It's worse for our users if we introduce regressions into stable
kernels, which is exactly what this "obviously right" auto-backport
would have done.

-Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH AUTOSEL 4.14 25/35] iomap: sub-block dio needs to zeroout beyond EOF
  2018-11-30 10:14           ` Sasha Levin
  2018-11-30 20:35             ` Darrick J. Wong
@ 2018-11-30 21:50             ` Dave Chinner
  2018-12-01  7:49               ` Sasha Levin
  1 sibling, 1 reply; 59+ messages in thread
From: Dave Chinner @ 2018-11-30 21:50 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Greg KH, stable, linux-kernel, Dave Chinner, Darrick J . Wong,
	linux-fsdevel

On Fri, Nov 30, 2018 at 05:14:41AM -0500, Sasha Levin wrote:
> On Fri, Nov 30, 2018 at 09:22:03AM +0100, Greg KH wrote:
> >On Fri, Nov 30, 2018 at 09:40:19AM +1100, Dave Chinner wrote:
> >>I stopped my tests at 5 billion ops yesterday (i.e. 20 billion ops
> >>aggregate) to focus on testing the copy_file_range() changes, but
> >>Darrick's tests are still ongoing and have passed 40 billion ops in
> >>aggregate over the past few days.
> >>
> >>The reason we are running these so long is that we've seen fsx data
> >>corruption failures after 12+ hours of runtime and hundreds of
> >>millions of ops. Hence the testing for backported fixes will need to
> >>replicate these test runs across multiple configurations for
> >>multiple days before we have any confidence that we've actually
> >>fixed the data corruptions and not introduced any new ones.
> >>
> >>If you pull only a small subset of the fixes, the fsx will still
> >>fail and we have no real way of actually verifying that there have
> >>been no regression introduced by the backport.  IOWs, there's a
> >>/massive/ amount of QA needed for ensuring that these backports work
> >>correctly.
> >>
> >>Right now the XFS developers don't have the time or resources
> >>available to validate stable backports are correct and regression
> >>free because we are focussed on ensuring the upstream fixes we've
> >>already made (and are still writing) are solid and reliable.
> >
> >Ok, that's fine, so users of XFS should wait until the 4.20 release
> >before relying on it?  :)
> 
> It's getting to the point that with the amount of known issues with XFS
> on LTS kernels it makes sense to mark it as CONFIG_BROKEN.

Really? Where are the bug reports?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH AUTOSEL 4.14 25/35] iomap: sub-block dio needs to zeroout beyond EOF
  2018-11-30 21:50             ` Dave Chinner
@ 2018-12-01  7:49               ` Sasha Levin
  2018-12-01  9:09                 ` XFS patches for stable Amir Goldstein
  2018-12-02 23:23                 ` [PATCH AUTOSEL 4.14 25/35] iomap: sub-block dio needs to zeroout beyond EOF Dave Chinner
  0 siblings, 2 replies; 59+ messages in thread
From: Sasha Levin @ 2018-12-01  7:49 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Greg KH, stable, linux-kernel, Dave Chinner, Darrick J . Wong,
	linux-fsdevel

On Sat, Dec 01, 2018 at 08:50:05AM +1100, Dave Chinner wrote:
>On Fri, Nov 30, 2018 at 05:14:41AM -0500, Sasha Levin wrote:
>> On Fri, Nov 30, 2018 at 09:22:03AM +0100, Greg KH wrote:
>> >On Fri, Nov 30, 2018 at 09:40:19AM +1100, Dave Chinner wrote:
>> >>I stopped my tests at 5 billion ops yesterday (i.e. 20 billion ops
>> >>aggregate) to focus on testing the copy_file_range() changes, but
>> >>Darrick's tests are still ongoing and have passed 40 billion ops in
>> >>aggregate over the past few days.
>> >>
>> >>The reason we are running these so long is that we've seen fsx data
>> >>corruption failures after 12+ hours of runtime and hundreds of
>> >>millions of ops. Hence the testing for backported fixes will need to
>> >>replicate these test runs across multiple configurations for
>> >>multiple days before we have any confidence that we've actually
>> >>fixed the data corruptions and not introduced any new ones.
>> >>
>> >>If you pull only a small subset of the fixes, the fsx will still
>> >>fail and we have no real way of actually verifying that there have
>> >>been no regression introduced by the backport.  IOWs, there's a
>> >>/massive/ amount of QA needed for ensuring that these backports work
>> >>correctly.
>> >>
>> >>Right now the XFS developers don't have the time or resources
>> >>available to validate stable backports are correct and regression
>> >>free because we are focussed on ensuring the upstream fixes we've
>> >>already made (and are still writing) are solid and reliable.
>> >
>> >Ok, that's fine, so users of XFS should wait until the 4.20 release
>> >before relying on it?  :)
>>
>> It's getting to the point that with the amount of known issues with XFS
>> on LTS kernels it makes sense to mark it as CONFIG_BROKEN.
>
>Really? Where are the bug reports?

In 'git log'! You report these every time you fix something in upstream
xfs but don't backport it to stable trees:

$ git log --oneline v4.18-rc1..v4.18 fs/xfs
d4a34e165557 xfs: properly handle free inodes in extent hint validators
9991274fddb9 xfs: Initialize variables in xfs_alloc_get_rec before using them
d8cb5e423789 xfs: fix fdblocks accounting w/ RMAPBT per-AG reservation
e53c4b598372 xfs: ensure post-EOF zeroing happens after zeroing part of a file
a3a374bf1889 xfs: fix off-by-one error in xfs_rtalloc_query_range
232d0a24b0fc xfs: fix uninitialized field in rtbitmap fsmap backend
5bd88d153998 xfs: recheck reflink state after grabbing ILOCK_SHARED for a write
f62cb48e4319 xfs: don't allow insert-range to shift extents past the maximum offset
aafe12cee0b1 xfs: don't trip over negative free space in xfs_reserve_blocks
10ee25268e1f xfs: allow empty transactions while frozen
e53946dbd31a xfs: xfs_iflush_abort() can be called twice on cluster writeback failure
23fcb3340d03 xfs: More robust inode extent count validation
e2ac836307e3 xfs: simplify xfs_bmap_punch_delalloc_range

Since I'm assuming that at least some of them are based on actual issues
users hit, and some of those apply to stable kernels, why would users
want to use an XFS version which is knowingly buggy?

--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: XFS patches for stable
  2018-12-01  7:49               ` Sasha Levin
@ 2018-12-01  9:09                 ` Amir Goldstein
  2018-12-02 15:25                   ` Sasha Levin
  2018-12-02 23:23                 ` [PATCH AUTOSEL 4.14 25/35] iomap: sub-block dio needs to zeroout beyond EOF Dave Chinner
  1 sibling, 1 reply; 59+ messages in thread
From: Amir Goldstein @ 2018-12-01  9:09 UTC (permalink / raw)
  To: sashal
  Cc: Dave Chinner, Greg KH, stable, linux-kernel, Dave Chinner,
	Darrick J. Wong, linux-fsdevel, linux-xfs, Luis R. Chamberlain

> >> It's getting to the point that with the amount of known issues with XFS
> >> on LTS kernels it makes sense to mark it as CONFIG_BROKEN.
> >
> >Really? Where are the bug reports?
>
> In 'git log'! You report these every time you fix something in upstream
> xfs but don't backport it to stable trees:
>
> $ git log --oneline v4.18-rc1..v4.18 fs/xfs
> d4a34e165557 xfs: properly handle free inodes in extent hint validators
> 9991274fddb9 xfs: Initialize variables in xfs_alloc_get_rec before using them
> d8cb5e423789 xfs: fix fdblocks accounting w/ RMAPBT per-AG reservation
> e53c4b598372 xfs: ensure post-EOF zeroing happens after zeroing part of a file
> a3a374bf1889 xfs: fix off-by-one error in xfs_rtalloc_query_range
> 232d0a24b0fc xfs: fix uninitialized field in rtbitmap fsmap backend
> 5bd88d153998 xfs: recheck reflink state after grabbing ILOCK_SHARED for a write
> f62cb48e4319 xfs: don't allow insert-range to shift extents past the maximum offset
> aafe12cee0b1 xfs: don't trip over negative free space in xfs_reserve_blocks
> 10ee25268e1f xfs: allow empty transactions while frozen
> e53946dbd31a xfs: xfs_iflush_abort() can be called twice on cluster writeback failure
> 23fcb3340d03 xfs: More robust inode extent count validation
> e2ac836307e3 xfs: simplify xfs_bmap_punch_delalloc_range
>
> Since I'm assuming that at least some of them are based on actual issues
> users hit, and some of those apply to stable kernels, why would users
> want to use an XFS version which is knowingly buggy?
>

Sasha,

There is one more point to consider.
Until v4.16, reflink and rmapbt features were experimental:
76883f7988e6 xfs: remove experimental tag for reverse mapping
1e369b0e199b xfs: remove experimental tag for reflinks

And MANY of the bug fixes flowing in through XFS tree to master
are related to those new XFS features and also to vfs functionality
that depends on them (e.g. clone/dedupe), so there MAY be no
bug reports at all for XFS in stable trees.

IMO users should NOT be expecting XFS to be stable with those
features enabled (they are still disabled by default)
when running on stable kernels below v4.16.
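
As a rough illustration of that cut-off, here is a minimal Python sketch.
It takes the v4.16 boundary from the two commits cited above at face value;
the function names are hypothetical and nothing here queries the
filesystem's actual feature flags.

```python
import re

# Assumption taken from the text above: the experimental tags for
# reflink and reverse mapping were dropped in v4.16.
FIRST_NON_EXPERIMENTAL = (4, 16)


def parse_release(release):
    """Extract (major, minor) from a release string such as '4.14.85'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def features_experimental(release):
    """True if reflink/rmapbt would still count as experimental here."""
    return parse_release(release) < FIRST_NON_EXPERIMENTAL
```

On a running system the release string could come from
platform.release(); whether reflink or rmapbt is actually enabled on a
given filesystem has to be checked separately.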

Allow me to act as a self-appointed mediator here and say:
There is obviously some bad blood between xfs developers and stable
tree maintainers.
The conflicts are caused by long standing frustration on both sides.
We would all be better off looking forward to how we can improve the
situation instead of dwelling on past mistakes.
This issue was on the agenda at the XFS team meeting at the last LSF/MM.
The path towards compliance has been laid out by xfs maintainers.
Luis, Sasha and myself have been working to improve the filesystem
test coverage for stable tree candidate patches.
We still have some way to go.

The stable candidate patches that triggered the recent flames
were outside of the fs/xfs subsystem, which AUTOSEL already knows
to stay away from, so nobody had any intention to stir things up.

At the end of the day, most xfs developers work for companies that
ship enterprise distros and need to maintain stable trees, so I would
hope that it is in the best interest of everyone involved to cooperate
on the goal of a better stable-xfs ecosystem.

On my part, I would be happy if AUTOSEL could point me at
candidate patch *series* for review instead of single patches.
For that matter, it sure wouldn't hurt if an xfs developer sending
out a patch series would cc:stable on the cover letter. If a developer
would also be kind enough to add some backporting hints to the cover
letter text, that would be very helpful indeed.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: XFS patches for stable
  2018-12-01  9:09                 ` XFS patches for stable Amir Goldstein
@ 2018-12-02 15:25                   ` Sasha Levin
  2018-12-02 16:10                     ` Christoph Hellwig
  0 siblings, 1 reply; 59+ messages in thread
From: Sasha Levin @ 2018-12-02 15:25 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Dave Chinner, Greg KH, stable, linux-kernel, Dave Chinner,
	Darrick J. Wong, linux-fsdevel, linux-xfs, Luis R. Chamberlain

On Sat, Dec 01, 2018 at 11:09:05AM +0200, Amir Goldstein wrote:
>> >> It's getting to the point that with the amount of known issues with XFS
>> >> on LTS kernels it makes sense to mark it as CONFIG_BROKEN.
>> >
>> >Really? Where are the bug reports?
>>
>> In 'git log'! You report these every time you fix something in upstream
>> xfs but don't backport it to stable trees:
>>
>> $ git log --oneline v4.18-rc1..v4.18 fs/xfs
>> d4a34e165557 xfs: properly handle free inodes in extent hint validators
>> 9991274fddb9 xfs: Initialize variables in xfs_alloc_get_rec before using them
>> d8cb5e423789 xfs: fix fdblocks accounting w/ RMAPBT per-AG reservation
>> e53c4b598372 xfs: ensure post-EOF zeroing happens after zeroing part of a file
>> a3a374bf1889 xfs: fix off-by-one error in xfs_rtalloc_query_range
>> 232d0a24b0fc xfs: fix uninitialized field in rtbitmap fsmap backend
>> 5bd88d153998 xfs: recheck reflink state after grabbing ILOCK_SHARED for a write
>> f62cb48e4319 xfs: don't allow insert-range to shift extents past the maximum offset
>> aafe12cee0b1 xfs: don't trip over negative free space in xfs_reserve_blocks
>> 10ee25268e1f xfs: allow empty transactions while frozen
>> e53946dbd31a xfs: xfs_iflush_abort() can be called twice on cluster writeback failure
>> 23fcb3340d03 xfs: More robust inode extent count validation
>> e2ac836307e3 xfs: simplify xfs_bmap_punch_delalloc_range
>>
>> Since I'm assuming that at least some of them are based on actual issues
>> users hit, and some of those apply to stable kernels, why would users
>> want to use an XFS version which is knowingly buggy?
>>
>
>Sasha,
>
>There is one more point to consider.
>Until v4.16, reflink and rmapbt features were experimental:
>76883f7988e6 xfs: remove experimental tag for reverse mapping
>1e369b0e199b xfs: remove experimental tag for reflinks
>
>And MANY of the bug fixes flowing in through XFS tree to master
>are related to those new XFS features and also to vfs functionality
>that depends on them (e.g. clone/dedupe), so there MAY be no
>bug reports at all for XFS in stable trees.
>
>IMO users should NOT be expecting XFS to be stable with those
>features enabled (they are still disabled by default)
>when running on stable kernels below v4.16.
>
>Allow me to act as a self-appointed mediator here and say:
>There is obviously some bad blood between xfs developers and stable
>tree maintainers.
>The conflicts are caused by long standing frustration on both sides.
>We would all be better off with looking forward on how to improve the
>situation instead dwelling on past mistakes.
>This issue was on the agenda at the XFS team meeting on last LSF/MM.
>The path towards compliance has been laid out by xfs maintainers.
>Luis, Sasha and myself have been working to improve the filesystem
>test coverage for stable tree candidate patches.
>We have still some way to go.
>
>The stable candidate patches that triggered the recent flames
>was outside of the fs/xfs subsystem, which AUTOSEL already know
>to stay away from, so nobody had any intention to stir things up.
>
>At the end of the day, most xfs developers work for companies that
>ship enterprise distros and need to maintain stable trees, so I would
>hope that it is in the best interest of everyone involved to cooperate
>on the goal of better stable-xfs ecosystem.
>
>On my part, I would be happy if AUTOSEL could point me at
>candidate patch *series* for review instead of single patches.

I'm afraid it's not smart enough to do that :(

I can grab an entire series if it selects a single patch in a series,
but from my experience it's usually the wrong thing to do.

>For that matter, it sure wouldn't hurt if an xfs developer sending
>out a patch series would cc:stable on the cover letter and if a developer
>would be kind enough to add some backporting hints to the cover letter
>text that would be very helpful indeed.

Given that we have folks (Luis, Amir, etc) working on it already, maybe
a step in the right direction would be having the XFS folks tag fixes
some other way ("#wants-a-backport"?) where this would give a hint that
this should be backported after sufficient testing?

We won't pick these commits to stable ourselves, but only after the XFS
maintainers are satisfied that the commit was sufficiently tested on LTS
trees?
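
The flow suggested above could look roughly like this minimal Python
sketch. The "#wants-a-backport" marker is only the placeholder floated in
this mail, not an existing kernel convention, and the function names are
hypothetical.

```python
import re

# Hypothetical marker from the suggestion above; not an established tag.
WANTS_BACKPORT = re.compile(r"#wants-a-backport\b")


def wants_backport(commit_message):
    """Return True if the commit message carries the hypothetical tag."""
    return bool(WANTS_BACKPORT.search(commit_message))


def queue_for_testing(commits):
    """Split (sha, message) pairs into tagged and untagged commit ids.

    Tagged commits are only queued for LTS testing; this sketch never
    applies anything to a stable tree.
    """
    tagged, untagged = [], []
    for sha, message in commits:
        (tagged if wants_backport(message) else untagged).append(sha)
    return tagged, untagged
```

The tagged list would then be a hint for whoever runs the LTS testing,
with the final decision left to the XFS maintainers.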

--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: XFS patches for stable
  2018-12-02 15:25                   ` Sasha Levin
@ 2018-12-02 16:10                     ` Christoph Hellwig
  2018-12-02 20:08                       ` Greg KH
  0 siblings, 1 reply; 59+ messages in thread
From: Christoph Hellwig @ 2018-12-02 16:10 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Amir Goldstein, Dave Chinner, Greg KH, stable, linux-kernel,
	Dave Chinner, Darrick J. Wong, linux-fsdevel, linux-xfs,
	Luis R. Chamberlain

As someone who has done xfs stable backports for a while I really don't
think the autoselection is helpful at all.  Someone who is vaguely
familiar with the code needs to manually select the commits and QA them,
which takes a fair amount of time, but just needs some manual help if it
should work ok.

I think we are about ready to have a new xfs stable maintainer lined up
if everything works well fortunately.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: XFS patches for stable
  2018-12-02 16:10                     ` Christoph Hellwig
@ 2018-12-02 20:08                       ` Greg KH
  2018-12-03 14:41                         ` Richard Weinberger
  0 siblings, 1 reply; 59+ messages in thread
From: Greg KH @ 2018-12-02 20:08 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Sasha Levin, Amir Goldstein, Dave Chinner, stable, linux-kernel,
	Dave Chinner, Darrick J. Wong, linux-fsdevel, linux-xfs,
	Luis R. Chamberlain

On Sun, Dec 02, 2018 at 08:10:16AM -0800, Christoph Hellwig wrote:
> As someone who has done xfs stable backports for a while I really don't
> think the autoselection is helpful at all.

Autoselection for xfs patches has been turned off for a while. What
triggered this email thread was a backported core vfs patch; it was
not obvious that the xfs developers had created it to fix a problem
they had found.

> Someone who is vaguely familiar with the code needs to manually select
> the commits and QA them, which takes a fair amount of time, but just
> needs some manual help if it should work ok.
> 
> I think we are about ready to have a new xfs stable maintainer lined up
> if everything works well fortunately.

That would be wonderful news.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH AUTOSEL 4.14 25/35] iomap: sub-block dio needs to zeroout beyond EOF
  2018-11-30 21:45           ` Dave Chinner
@ 2018-12-02 20:11             ` Greg KH
  0 siblings, 0 replies; 59+ messages in thread
From: Greg KH @ 2018-12-02 20:11 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Sasha Levin, stable, linux-kernel, Dave Chinner,
	Darrick J . Wong, linux-fsdevel

On Sat, Dec 01, 2018 at 08:45:48AM +1100, Dave Chinner wrote:
> > > Right now the XFS developers don't have the time or resources
> > > available to validate stable backports are correct and regression
> > > fre because we are focussed on ensuring the upstream fixes we've
> > > already made (and are still writing) are solid and reliable.
> > 
> > Ok, that's fine, so users of XFS should wait until the 4.20 release
> > before relying on it?  :)
> 
> Ok, Greg, that's *out of line*.

Sorry, I did not mean it that way at all, I apologize.

I do appreciate all the work you do on your subsystem, I was not
criticizing that at all.  I was just trying to make a bad joke that it
felt like no xfs patches should ever be accepted into stable kernels
because more are always being fixed, so the treadmill wouldn't stop.

It's like asking a processor developer "what chip to buy" and they
always say "the next one is going to be great!" because that is what
they are working on at the moment, yet you need to buy something today
to get your work done.  That's all, no harm meant at all, sorry if it
came across the wrong way.

greg k-h

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH AUTOSEL 4.14 25/35] iomap: sub-block dio needs to zeroout beyond EOF
  2018-12-01  7:49               ` Sasha Levin
  2018-12-01  9:09                 ` XFS patches for stable Amir Goldstein
@ 2018-12-02 23:23                 ` Dave Chinner
  2018-12-03  7:11                   ` Amir Goldstein
  2018-12-03  9:22                   ` Sasha Levin
  1 sibling, 2 replies; 59+ messages in thread
From: Dave Chinner @ 2018-12-02 23:23 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Greg KH, stable, linux-kernel, Dave Chinner, Darrick J . Wong,
	linux-fsdevel

On Sat, Dec 01, 2018 at 02:49:09AM -0500, Sasha Levin wrote:
> On Sat, Dec 01, 2018 at 08:50:05AM +1100, Dave Chinner wrote:
> >On Fri, Nov 30, 2018 at 05:14:41AM -0500, Sasha Levin wrote:
> >>On Fri, Nov 30, 2018 at 09:22:03AM +0100, Greg KH wrote:
> >>>On Fri, Nov 30, 2018 at 09:40:19AM +1100, Dave Chinner wrote:
> >>>>I stopped my tests at 5 billion ops yesterday (i.e. 20 billion ops
> >>>>aggregate) to focus on testing the copy_file_range() changes, but
> >>>>Darrick's tests are still ongoing and have passed 40 billion ops in
> >>>>aggregate over the past few days.
> >>>>
> >>>>The reason we are running these so long is that we've seen fsx data
> >>>>corruption failures after 12+ hours of runtime and hundreds of
> >>>>millions of ops. Hence the testing for backported fixes will need to
> >>>>replicate these test runs across multiple configurations for
> >>>>multiple days before we have any confidence that we've actually
> >>>>fixed the data corruptions and not introduced any new ones.
> >>>>
> >>>>If you pull only a small subset of the fixes, the fsx will still
> >>>>fail and we have no real way of actually verifying that there have
> >>>>been no regression introduced by the backport.  IOWs, there's a
> >>>>/massive/ amount of QA needed for ensuring that these backports work
> >>>>correctly.
> >>>>
> >>>>Right now the XFS developers don't have the time or resources
> >>>>available to validate stable backports are correct and regression
> >>>>fre because we are focussed on ensuring the upstream fixes we've
> >>>>already made (and are still writing) are solid and reliable.
> >>>
> >>>Ok, that's fine, so users of XFS should wait until the 4.20 release
> >>>before relying on it?  :)
> >>
> >>It's getting to the point that with the amount of known issues with XFS
> >>on LTS kernels it makes sense to mark it as CONFIG_BROKEN.
> >
> >Really? Where are the bug reports?
> 
> In 'git log'! You report these every time you fix something in upstream
> xfs but don't backport it to stable trees:

That is so wrong on so many levels I don't really know where to
begin. I guess doing a *basic risk analysis* demonstrating that none
of those fixes are backport candidates is a good start:

> $ git log --oneline v4.18-rc1..v4.18 fs/xfs
> d4a34e165557 xfs: properly handle free inodes in extent hint validators

Found by QA with generic/229 on a non-standard config. Not user
reported, unlikely to ever be seen by users.

> 9991274fddb9 xfs: Initialize variables in xfs_alloc_get_rec before using them

Cleaning up coverity reported issues to do with corruption log
messages. No visible symptoms, not user reported.

> d8cb5e423789 xfs: fix fdblocks accounting w/ RMAPBT per-AG reservation

Minor free space accounting issue, not user reported, doesn't affect
normal operation.

> e53c4b598372 xfs: ensure post-EOF zeroing happens after zeroing part of a file

Found with fsx via generic/127. Not user reported, doesn't affect
userspace operation at all.

> a3a374bf1889 xfs: fix off-by-one error in xfs_rtalloc_query_range

Regression fix for code introduced in 4.18-rc1. Not user reported
because the code has never been released.

> 232d0a24b0fc xfs: fix uninitialized field in rtbitmap fsmap backend

Coverity warning fix, not user reported, no user impact.

> 5bd88d153998 xfs: recheck reflink state after grabbing ILOCK_SHARED for a write

Fixes warning from generic/166, not user reported. Could affect
users mixing direct IO with reflink, but we expect people using
new functionality like reflink to be tracking TOT fairly closely
anyway.

> f62cb48e4319 xfs: don't allow insert-range to shift extents past the maximum offset

Found by QA w/ generic/465. Not user reported, only affects files in
the exabyte range so not a real world problem....

> aafe12cee0b1 xfs: don't trip over negative free space in xfs_reserve_blocks

Found during ENOSPC stress tests that depleted the reserve pool.
Not user reported, unlikely to ever be hit by users.

> 10ee25268e1f xfs: allow empty transactions while frozen

Removes a spurious warning when running GETFSMAP ioctl on a frozen
filesystem. Not user reported, highly unlikely any user will ever
hit this as nothing but XFS utilities use GETFSMAP at the moment.

> e53946dbd31a xfs: xfs_iflush_abort() can be called twice on cluster writeback failure

Bug in corrupted filesystem handling, been there for ~15 years IIRC.
Not user reported - found by one of our shutdown stress tests
on a debug kernel (generic/388, IIRC). Highly unlikely to show up in
the real world given how long the bug has been there.

> 23fcb3340d03 xfs: More robust inode extent count validation

Found by filesystem image fuzzing (i.e. intentional filesystem
corruption). Not user reported, and the filesystem corruption that
triggered this problem is so artificial there is really no chance of
it ever occurring in the real world.

> e2ac836307e3 xfs: simplify xfs_bmap_punch_delalloc_range

Cleanup and simplification. Not a bug fix, not user reported, not a
backport candidate.

IOWs, there isn't a single commit in this list that is user
reported, nor anything that I'd consider a stable kernel backport
candidate because none of them affect normal user workloads. i.e.
they've all been found by tools designed to break filesystems and
exercise rarely travelled error paths.
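
The per-commit notes above can be tabulated in a few lines of Python. This
is only a sketch: the category labels are informal shorthand for how each
problem was found, "not stated" means the note does not say, and the
helper names are hypothetical.

```python
from collections import Counter

# (abbreviated commit id, how the problem was found, user reported?)
# Transcribed from the notes above; the labels are informal shorthand.
TRIAGE = [
    ("d4a34e165557", "fstests", False),      # generic/229, non-standard config
    ("9991274fddb9", "coverity", False),
    ("d8cb5e423789", "not stated", False),   # minor free space accounting
    ("e53c4b598372", "fstests", False),      # fsx via generic/127
    ("a3a374bf1889", "not stated", False),   # regression in 4.18-rc1 code
    ("232d0a24b0fc", "coverity", False),
    ("5bd88d153998", "fstests", False),      # generic/166
    ("f62cb48e4319", "fstests", False),      # generic/465
    ("aafe12cee0b1", "stress test", False),  # ENOSPC stress tests
    ("10ee25268e1f", "not stated", False),   # spurious GETFSMAP warning
    ("e53946dbd31a", "stress test", False),  # shutdown stress test
    ("23fcb3340d03", "fuzzing", False),      # filesystem image fuzzing
    ("e2ac836307e3", "cleanup", False),      # not a bug fix
]


def user_reported(triage):
    """Commit ids whose note says a user reported the problem."""
    return [sha for sha, _source, reported in triage if reported]


def count_by_source(triage):
    """Number of commits per discovery category."""
    return Counter(source for _sha, source, _reported in triage)


if __name__ == "__main__":
    print("user reported:", user_reported(TRIAGE))
    for source, count in count_by_source(TRIAGE).most_common():
        print(f"{count:2d}  {source}")
```

The table records only what the notes say; it does not replace the
analysis of each commit, it just makes the outcome easy to scan.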

> Since I'm assuming that at least some of them are based on actual issues
> users hit, and some of those apply to stable kernels, why would users
> want to use an XFS version which is knowingly buggy?

Your assumption is not only incorrect, it is fundamentally flawed.
A list of commits containing bug fixes is not a list of bug reports
from users.

IOWs, backporting them only increases the risk of regressions for
users, it doesn't reduce the risk of users hitting problems or fix
any problems that users are at risk of actually hitting. IOWs, all
of these changes fall on the wrong side of the risk-benefit analysis
equation.

Risk/benefit analysis is fundamental to software engineering
processes.  Running "git log" is not a risk analysis - it just
provides a list of things that you need to perform an analysis on.
Risk analysis takes time and effort, and to imply that it is not
necessary and we should just backport everything makes the incorrect
assumption that backporting carries no risk at all.

It seems to me that the stable kernel process measures itself on how
many commits and how fast they are backported from mainline kernels,
and the entire focus of improvement is on backporting /more/ commits
/faster/. i.e.  it's all about the speed and quantity of code being
moved back to the "stable" kernels. What it should be doing is
identifying and addressing bugs or flaws that put users at risk or
that users are reporting.

Further, the speed at which backports occur (i.e. within a day or 3
of upstream commit) means that the code being backported hasn't had
time to reach a wide testing audience and have regressions shaken
out of it. The whole purpose of having progressively stricter -rcX
upstream kernel releases is to allow the new code to stabilise and
shake out unforeseen regressions before it gets to users. The stable
process is actually releasing upstream code to users before they can
even get it in a released upstream kernel (i.e. a .0 kernel, not a
-rcX).

IOWs, pulling code back to stable kernels before it's had a chance
to stabilise and be more widely tested in the upstream kernel is
entirely the wrong thing to be doing. Speed here does not improve
stability, it just increases the risk of regressions and unforeseen
bugs being introduced into the stable tree. And that's made worse by
the fact that the -rcX process and widespread upstream testing that
goes along with it are not there to catch those bugs and regressions. And that's
made even worse by the fact that subsystems don't have control over
what is backported anymore, so they may not even be aware that a fix
for a fix needs to be sent back to stable kernels.

This is the issue here - the "stable kernel" criteria is not about
stability - it's being optimised to shovel as much change as
possible with /as little effort as possible/ back into older code
bases. That's not a recipe for stability, especially considering the
relative lack of QA the stable kernels get.

IMO, the whole set of linux kernel processes are being optimised
around the wrong metrics - we count new features, the number of
commits per release and the quantity of code that gets changed. We
then optimise our processes to increase these metrics. IOWs, we're
optimising for speed and rapid change, not quality, reliability and
stability.

We are not measuring code quality improvements, how effective our
code review is, we do not do post-mortem analysis of major failures
and we most certainly don't change processes to avoid those problems
in future, etc. And worst of all is that people who want better
processes to improve code quality, testing, etc get shouted at
because it may slow down the rate at which we change code. i.e. only
"speed and quantity" seems to matter to the core upstream kernel
development community.

As Darrick said, what we are seeing here is a result of "[...] the
kernel community's systemic inability to QA new fs features
properly." I'm taking that one step further - what we are seeing
here is the kernel community's systemic inability to address
fundamental engineering process deficiencies because "speed and
quantity" are considered more important than the quality of the
product being produced.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH AUTOSEL 4.14 25/35] iomap: sub-block dio needs to zeroout beyond EOF
  2018-12-02 23:23                 ` [PATCH AUTOSEL 4.14 25/35] iomap: sub-block dio needs to zeroout beyond EOF Dave Chinner
@ 2018-12-03  7:11                   ` Amir Goldstein
  2018-12-03  9:22                   ` Sasha Levin
  1 sibling, 0 replies; 59+ messages in thread
From: Amir Goldstein @ 2018-12-03  7:11 UTC (permalink / raw)
  To: Dave Chinner
  Cc: sashal, Greg KH, stable, linux-kernel, Dave Chinner,
	Darrick J. Wong, linux-fsdevel, Luis R. Chamberlain

On Mon, Dec 3, 2018 at 1:23 AM Dave Chinner <david@fromorbit.com> wrote:
>
> On Sat, Dec 01, 2018 at 02:49:09AM -0500, Sasha Levin wrote:
> > On Sat, Dec 01, 2018 at 08:50:05AM +1100, Dave Chinner wrote:
> > >On Fri, Nov 30, 2018 at 05:14:41AM -0500, Sasha Levin wrote:
> > >>On Fri, Nov 30, 2018 at 09:22:03AM +0100, Greg KH wrote:
> > >>>On Fri, Nov 30, 2018 at 09:40:19AM +1100, Dave Chinner wrote:
> > >>>>I stopped my tests at 5 billion ops yesterday (i.e. 20 billion ops
> > >>>>aggregate) to focus on testing the copy_file_range() changes, but
> > >>>>Darrick's tests are still ongoing and have passed 40 billion ops in
> > >>>>aggregate over the past few days.
> > >>>>
> > >>>>The reason we are running these so long is that we've seen fsx data
> > >>>>corruption failures after 12+ hours of runtime and hundreds of
> > >>>>millions of ops. Hence the testing for backported fixes will need to
> > >>>>replicate these test runs across multiple configurations for
> > >>>>multiple days before we have any confidence that we've actually
> > >>>>fixed the data corruptions and not introduced any new ones.
> > >>>>
> > >>>>If you pull only a small subset of the fixes, the fsx will still
> > >>>>fail and we have no real way of actually verifying that there have
> > >>>>been no regression introduced by the backport.  IOWs, there's a
> > >>>>/massive/ amount of QA needed for ensuring that these backports work
> > >>>>correctly.
> > >>>>
> > >>>>Right now the XFS developers don't have the time or resources
> > >>>>available to validate stable backports are correct and regression
> > >>>>fre because we are focussed on ensuring the upstream fixes we've
> > >>>>already made (and are still writing) are solid and reliable.
> > >>>
> > >>>Ok, that's fine, so users of XFS should wait until the 4.20 release
> > >>>before relying on it?  :)
> > >>
> > >>It's getting to the point that with the amount of known issues with XFS
> > >>on LTS kernels it makes sense to mark it as CONFIG_BROKEN.
> > >
> > >Really? Where are the bug reports?
> >
> > In 'git log'! You report these every time you fix something in upstream
> > xfs but don't backport it to stable trees:
>
> That is so wrong on so many levels I don't really know where to
> begin. I guess doing a *basic risk analysis* demonstrating that none
> of those fixes are backport candidates is a good start:
>
> > $ git log --oneline v4.18-rc1..v4.18 fs/xfs
> > d4a34e165557 xfs: properly handle free inodes in extent hint validators
>
> Found by QA with generic/229 on a non-standard config. Not user
> reported, unlikely to ever be seen by users.
>
> > 9991274fddb9 xfs: Initialize variables in xfs_alloc_get_rec before using them
>
> Cleaning up coverity reported issues to do with corruption log
> messages. No visible symptoms, Not user reported.
>
> > d8cb5e423789 xfs: fix fdblocks accounting w/ RMAPBT per-AG reservation
>
> Minor free space accounting issue, not user reported, doesn't affect
> normal operation.
>
> > e53c4b598372 xfs: ensure post-EOF zeroing happens after zeroing part of a file
>
> Found with fsx via generic/127. Not user reported, doesn't affect
> userspace operation at all.
>
> > a3a374bf1889 xfs: fix off-by-one error in xfs_rtalloc_query_range
>
> Regression fix for code introduced in 4.18-rc1. Not user reported
> because the code has never been released.
>
> > 232d0a24b0fc xfs: fix uninitialized field in rtbitmap fsmap backend
>
> Coverity warning fix, not user reported, not user impact.
>
> > 5bd88d153998 xfs: recheck reflink state after grabbing ILOCK_SHARED for a write
>
> Fixes warning from generic/166, not user reported. Could affect
> users mixing direct IO with reflink, but we expect people using
> new functionality like reflink to be tracking TOT fairly closely
> anyway.
>
> > f62cb48e4319 xfs: don't allow insert-range to shift extents past the maximum offset
>
> Found by QA w/ generic/465. Not user reported, only affects files in
> the exabyte range so not a real world problem....
>
> > aafe12cee0b1 xfs: don't trip over negative free space in xfs_reserve_blocks
>
> Found during ENOSPC stress tests that depeleted the reserve pool.
> Not user reported, unlikely to ever be hit by users.
>
> > 10ee25268e1f xfs: allow empty transactions while frozen
>
> Removes a spurious warning when running GETFSMAP ioctl on a frozen
> filesystem. Not user reported, highly unlikely any user will ever
> hit this as nothing but XFs utilities use GETFSMAP at the moment.
>
> > e53946dbd31a xfs: xfs_iflush_abort() can be called twice on cluster writeback failure
>
> Bug in corrupted filesystem handling, been there for ~15 years IIRC.
> Not user reported - found by one of our shutdown stress tests
> on a debug kernel (generic/388, IIRC). Highly unlikely to show up in
> the real world given how long the bug has been there.
>
> > 23fcb3340d03 xfs: More robust inode extent count validation
>
> Found by filesystem image fuzzing (i.e. intentional filesystem
> corruption). Not user reported, and the filesystem corruption that
> triggered this problem is so artificial there is really no chance of
> it ever occurring in the real world.
>
> > e2ac836307e3 xfs: simplify xfs_bmap_punch_delalloc_range
>
> Cleanup and simplification. Not a bug fix, not user reported, not a
> backport candidate.
>
> IOWs, there isn't a single commit in this list that is user
> reported, nor anything that I'd consider a stable kernel backport
> candidate because none of them affect normal user workloads. i.e.
> they've all be found by tools designed to break filesystems and
> exercise rarely travelled error paths.
>
> > Since I'm assuming that at least some of them are based on actual issues
> > users hit, and some of those apply to stable kernels, why would users
> > want to use an XFS version which is knowingly buggy?
>
> Your assumption is not only incorrect, it is fundamentally flawed.
> A list of commits containing bug fixes is not a list of bug reports
> from users.
>

Up to here, we are in complete agreement.
Thank you for the effort you've put into showing how much effort it takes
to properly review candidate patches.

Further down, although I can tell that your harsh response is due to Sasha's
provoking suggestion to set CONFIG_BROKEN, IMO, your response can be
perceived as sliding a bit into the territory of telling someone else how to do
their job.

It is one thing to advocate that well tested distro kernels are a better
choice for end users. Greg has always advocated the same.
It is another thing to suggest that the kernel.org stable trees have no value
because they are not being maintained with the same standards as the distro
stable kernels.

The time and place for XFS maintainers to make a judgement about stable
trees is whether or not they are willing to look at bug reports reproduced on
kernel.org stable tree kernels.

Whether or not kernel.org stable trees are useful is a risk/benefit analysis
that each and every downstream user should be doing themselves. And it is the
responsibility of the stable tree maintainer to make the choices that affect
their downstream users.

In my personal opinion, as a downstream kernel.org stable tree user,
there will be great value in the community maintained stable trees, if and
when filesystem test suites are run regularly on stable tree candidates.

Whether or not those stable trees include "minor" bug fixes, such as the ones
that you listed above, should not be the concern of the XFS maintainers; it
should be the concern of downstream users making their own risk/benefit analysis.

I am very much aware of the paradigm that fewer changes == less risk, which
is the cornerstone of maintaining a stable/maint branch.
But at the same time, you seem to be ignoring the fact that people often make
mistakes when cherry-picking too selectively, because some patches in
the series that look like meaningless refactoring or ones that fix "minor" bugs
may actually be required for a later bug fix, and that is not always evident from
reading the commit messages. So there is more to the risk/benefit analysis
than what you present.

There is no replacement for good test coverage. The XFS subsystem excels
in that department, which makes the validation of stable XFS tree candidates
with xfstests very valuable.

There is no replacement for human review of stable tree patch candidates.
*HOWEVER*! The purpose of this review should be to point out backporting
bugs - it should not be to tell the stable tree maintainer which bugs are
stable tree eligible and which bugs are not.

Please join me in an early New Year's resolution: We shall all strive to make
4.19.y LTS kernel more reliable than previous LTS kernels w.r.t filesystems
in general and XFS in particular.

Cheers to that,
Amir.

> IOWs, backporting them only increases the risk of regressions for
> users, it doesn't reduce the risk of users hitting problems or fix
> any problems that users are at risk of actually hitting. IOWs, all
> of these changes fall on the wrong side of the risk-benefit analysis
> equation.
>
> Risk/benefit analysis is fundamental to software engineering
> processes.  Running "git log" is not a risk analysis - it just
> provides a list of things that you need to perform an analysis on.
> Risk analysis takes time and effort, and to imply that it is not
> necessary and we should just backport everything makes the incorrect
> assumption that backporting carries no risk at all.
>
> It seems to me that the stable kernel process measures itself on how
> many commits and how fast they are backported from mainline kernels,
> and the entire focus of improvement is on backporting /more/ commits
> /faster/. i.e.  it's all about the speed and quantity of code being
> moved back to the "stable" kernels. What it should be doing is
> identifying and addressing bugs or flaws that put users at risk or
> that users are reporting.
>
> Further, the speed at which backports occur (i.e. within a day or 3
> of upstream commit) means that the code being backported hasn't had
> time to reach a wide testing audience and have regressions shaken
> out of it. The whole purpose of having progressively stricter -rcX
> upstream kernel releases is to allow the new code to stabilise and
> shake out unforeseen regressions before it gets to users. The stable
> process is actually releasing upstream code to users before they can
> even get it in a released upstream kernel (i.e. a .0 kernel, not a
> -rcX).
>
> IOWs, pulling code back to stable kernels before it's had a chance
> to stabilise and be more widely tested in the upstream kernel is
> entirely the wrong thing to be doing. Speed here does not improve
> stability, it just increases the risk of regressions and unforeseen
> bugs being introduced into the stable tree. And that's made worse by
> the fact that the -rcX process and widespread upstream testing that
> goes along with it* to catch those bugs and regressions. And that's
> made even worse by the fact that subsystems don't have control over
> what is backported anymore, so they may not even be aware that a fix
> for a fix needs to be sent back to stable kernels.
>
> This is the issue here - the "stable kernel" criteria is not about
> stability - it's being optimised to shovel as much change as
> possible with /as little effort as possible/ back into older code
> bases. That's not a recipe for stability, especially considering the
> relative lack of QA the stable kernels get.
>
> IMO, the whole set of linux kernel processes are being optimised
> around the wrong metrics - we count new features, the number of
> commits per release and the quantity of code that gets changed. We
> then optimise our processes to increase these metrics. IOWs, we're
> optimising for speed and rapid change, not quality, reliability and
> stability.
>
> We are not measuring code quality improvements, how effective our
> code review is, we do not do post-mortem analysis of major failures
> and we most certainly don't change processes to avoid those problems
> in future, etc. And worst of all is that people who want better
> processes to improve code quality, testing, etc get shouted at
> because it may slow down the rate at which we change code. i.e. only
> "speed and quantity" seems to matter to the core upstream kernel
> development community.
>
> As Darrick said, what we are seeing here is a result of "[...] the
> kernel community's systemic inability to QA new fs features
> properly." I'm taking that one step further - what we are seeing
> here is the kernel community's systemic inability to address
> fundamental engineering process deficiencies because "speed and
> quantity" are considered more important than the quality of the
> product being produced.
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH AUTOSEL 4.14 25/35] iomap: sub-block dio needs to zeroout beyond EOF
  2018-12-02 23:23                 ` [PATCH AUTOSEL 4.14 25/35] iomap: sub-block dio needs to zeroout beyond EOF Dave Chinner
  2018-12-03  7:11                   ` Amir Goldstein
@ 2018-12-03  9:22                   ` Sasha Levin
  2018-12-03 21:23                     ` Thomas Backlund
  1 sibling, 1 reply; 59+ messages in thread
From: Sasha Levin @ 2018-12-03  9:22 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Greg KH, stable, linux-kernel, Dave Chinner, Darrick J . Wong,
	linux-fsdevel

On Mon, Dec 03, 2018 at 10:23:03AM +1100, Dave Chinner wrote:
>On Sat, Dec 01, 2018 at 02:49:09AM -0500, Sasha Levin wrote:
>> In 'git log'! You report these every time you fix something in upstream
>> xfs but don't backport it to stable trees:
>
>That is so wrong on so many levels I don't really know where to
>begin. I guess doing a *basic risk analysis* demonstrating that none
>of those fixes are backport candidates is a good start:
>
>> $ git log --oneline v4.18-rc1..v4.18 fs/xfs
>> d4a34e165557 xfs: properly handle free inodes in extent hint validators
>
>Found by QA with generic/229 on a non-standard config. Not user
>reported, unlikely to ever be seen by users.
>
>> 9991274fddb9 xfs: Initialize variables in xfs_alloc_get_rec before using them
>
>Cleaning up coverity reported issues to do with corruption log
>messages. No visible symptoms, Not user reported.
>
>> d8cb5e423789 xfs: fix fdblocks accounting w/ RMAPBT per-AG reservation
>
>Minor free space accounting issue, not user reported, doesn't affect
>normal operation.
>
>> e53c4b598372 xfs: ensure post-EOF zeroing happens after zeroing part of a file
>
>Found with fsx via generic/127. Not user reported, doesn't affect
>userspace operation at all.
>
>> a3a374bf1889 xfs: fix off-by-one error in xfs_rtalloc_query_range
>
>Regression fix for code introduced in 4.18-rc1. Not user reported
>because the code has never been released.
>
>> 232d0a24b0fc xfs: fix uninitialized field in rtbitmap fsmap backend
>
>Coverity warning fix, not user reported, no user impact.
>
>> 5bd88d153998 xfs: recheck reflink state after grabbing ILOCK_SHARED for a write
>
>Fixes warning from generic/166, not user reported. Could affect
>users mixing direct IO with reflink, but we expect people using
>new functionality like reflink to be tracking TOT fairly closely
>anyway.
>
>> f62cb48e4319 xfs: don't allow insert-range to shift extents past the maximum offset
>
>Found by QA w/ generic/465. Not user reported, only affects files in
>the exabyte range so not a real world problem....
>
>> aafe12cee0b1 xfs: don't trip over negative free space in xfs_reserve_blocks
>
>Found during ENOSPC stress tests that depleted the reserve pool.
>Not user reported, unlikely to ever be hit by users.
>
>> 10ee25268e1f xfs: allow empty transactions while frozen
>
>Removes a spurious warning when running GETFSMAP ioctl on a frozen
>filesystem. Not user reported, highly unlikely any user will ever
>hit this as nothing but XFS utilities use GETFSMAP at the moment.
>
>> e53946dbd31a xfs: xfs_iflush_abort() can be called twice on cluster writeback failure
>
>Bug in corrupted filesystem handling, been there for ~15 years IIRC.
>Not user reported - found by one of our shutdown stress tests
>on a debug kernel (generic/388, IIRC). Highly unlikely to show up in
>the real world given how long the bug has been there.
>
>> 23fcb3340d03 xfs: More robust inode extent count validation
>
>Found by filesystem image fuzzing (i.e. intentional filesystem
>corruption). Not user reported, and the filesystem corruption that
>triggered this problem is so artificial there is really no chance of
>it ever occurring in the real world.
>
>> e2ac836307e3 xfs: simplify xfs_bmap_punch_delalloc_range
>
>Cleanup and simplification. Not a bug fix, not user reported, not a
>backport candidate.
>
>IOWs, there isn't a single commit in this list that is user
>reported, nor anything that I'd consider a stable kernel backport
>candidate because none of them affect normal user workloads. i.e.
>they've all been found by tools designed to break filesystems and
>exercise rarely travelled error paths.

I think that part of our disagreement is the whole "user reported"
criterion. Looking at myself as an example, unless I experience an
obvious corruption I can reproduce, I am most likely to just ignore it
and recreate the filesystem.

This is even more true for "enterprisey" workloads where data may be
replicated across multiple filesystems, and if one of these fails then
it's just silently discarded and replaced.

User reports are hard to come by, not just for XFS but pretty much
anywhere else in the kernel. Our debugging/reporting story is almost as
bad as our QA ;)

A few times above you used the word "unlikely" to indicate that a bug
will never really be hit by real users. I strongly disagree with using
this guess to decide if we're going to backport anything or not. Every
time I meet with the FB folks I keep hearing how they end up hitting
"once in a lifetime" bugs over and over on their infrastructure.

Do we agree that the ideal solution would be backporting every fix, and
having a solid QA system to validate it? Obviously it's not going to
happen in the next year or two, but if we agree on the end goal then
there's no point in this continued arguing about the steps in between :)

>> Since I'm assuming that at least some of them are based on actual issues
>> users hit, and some of those apply to stable kernels, why would users
>> want to use an XFS version which is knowingly buggy?
>
>Your assumption is not only incorrect, it is fundamentally flawed.
>A list of commits containing bug fixes is not a list of bug reports
>from users.
>
>IOWs, backporting them only increases the risk of regressions for
>users, it doesn't reduce the risk of users hitting problems or fix
>any problems that users are at risk of actually hitting. IOWs, all
>of these changes fall on the wrong side of the risk-benefit analysis
>equation.
>
>Risk/benefit analysis is fundamental to software engineering
>processes.  Running "git log" is not a risk analysis - it just
>provides a list of things that you need to perform an analysis on.
>Risk analysis takes time and effort, and to imply that it is not
>necessary and we should just backport everything makes the incorrect
>assumption that backporting carries no risk at all.
>
>It seems to me that the stable kernel process measures itself on how
>many commits and how fast they are backported from mainline kernels,
>and the entire focus of improvement is on backporting /more/ commits
>/faster/. i.e.  it's all about the speed and quantity of code being
>moved back to the "stable" kernels. What it should be doing is
>identifying and addressing bugs or flaws that put users at risk or
>that users are reporting.
>
>Further, the speed at which backports occur (i.e. within a day or 3
>of upstream commit) means that the code being backported hasn't had
>time to reach a wide testing audience and have regressions shaken
>out of it. The whole purpose of having progressively stricter -rcX
>upstream kernel releases is to allow the new code to stabilise and
>shake out unforeseen regressions before it gets to users. The stable
>process is actually releasing upstream code to users before they can
>even get it in a released upstream kernel (i.e. a .0 kernel, not a
>-rcX).

One of the concerns I have about stable trees, which we both share here,
is that no one really uses Linus's tree: it's used as an integration
tree, but very few people actually test their workloads on it. Most
testing ends up happening, sadly enough, on stable trees. I see it as an
issue with our process that I don't know how to solve.

>IOWs, pulling code back to stable kernels before it's had a chance
>to stabilise and be more widely tested in the upstream kernel is
>entirely the wrong thing to be doing. Speed here does not improve
>stability, it just increases the risk of regressions and unforeseen
>bugs being introduced into the stable tree. And that's made worse by
>the fact that the -rcX process and widespread upstream testing that
>goes along with it* to catch those bugs and regressions. And that's
>made even worse by the fact that subsystems don't have control over
>what is backported anymore, so they may not even be aware that a fix
>for a fix needs to be sent back to stable kernels.
>
>This is the issue here - the "stable kernel" criteria is not about
>stability - it's being optimised to shovel as much change as
>possible with /as little effort as possible/ back into older code
>bases. That's not a recipe for stability, especially considering the
>relative lack of QA the stable kernels get.
>
>IMO, the whole set of linux kernel processes are being optimised
>around the wrong metrics - we count new features, the number of
>commits per release and the quantity of code that gets changed. We
>then optimise our processes to increase these metrics. IOWs, we're
>optimising for speed and rapid change, not quality, reliability and
>stability.
>
>We are not measuring code quality improvements, how effective our
>code review is, we do not do post-mortem analysis of major failures
>and we most certainly don't change processes to avoid those problems
>in future, etc. And worst of all is that people who want better
>processes to improve code quality, testing, etc get shouted at
>because it may slow down the rate at which we change code. i.e. only
>"speed and quantity" seems to matter to the core upstream kernel
>development community.
>
>As Darrick said, what we are seeing here is a result of "[...] the
>kernel community's systemic inability to QA new fs features
>properly." I'm taking that one step further - what we are seeing
>here is the kernel community's systemic inability to address
>fundamental engineering process deficiencies because "speed and
>quantity" are considered more important than the quality of the
>product being produced.

This is a case where theory collides with the real world. Yes, our QA is
lacking, but we don't have the option of not doing the current process.
If we stop backporting until a future date when our QA problem is
solved we'll end up with what we had before: users stuck on ancient
kernels without a way to upgrade.

With the current model we're aware that bugs sneak through, but we try
to deal with it by both improving our QA, and encouraging users to do
their own extensive QA. If we encourage users to update frequently we
can keep improving our process and the quality of kernels will keep
getting better.

We simply can't go back to the "enterprise distro" days.

--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: XFS patches for stable
  2018-12-02 20:08                       ` Greg KH
@ 2018-12-03 14:41                         ` Richard Weinberger
  2018-12-03 16:56                           ` Sasha Levin
  0 siblings, 1 reply; 59+ messages in thread
From: Richard Weinberger @ 2018-12-03 14:41 UTC (permalink / raw)
  To: Greg KH
  Cc: Christoph Hellwig, sashal, amir73il, Dave Chinner, stable, LKML,
	dchinner, darrick.wong, linux-fsdevel, linux-xfs, mcgrof,
	linux-mtd, boris.brezillon

On Sun, Dec 2, 2018 at 9:09 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Sun, Dec 02, 2018 at 08:10:16AM -0800, Christoph Hellwig wrote:
> > As someone who has done xfs stable backports for a while I really don't
> > think the autoselection is helpful at all.
>
> autoselection for xfs patches has been turned off for a while, what
> triggered this email thread was a core vfs patch that was backported
> that was not obvious it was created by the xfs developers due to a
> problem they had found.

Sorry for hijacking this thread.
Can you please also disable autoselection for MTD, UBI and UBIFS?

fs/ubifs/
drivers/mtd/
include/linux/mtd/
include/uapi/mtd/

-- 
Thanks,
//richard

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: XFS patches for stable
  2018-12-03 14:41                         ` Richard Weinberger
@ 2018-12-03 16:56                           ` Sasha Levin
  0 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-12-03 16:56 UTC (permalink / raw)
  To: Richard Weinberger
  Cc: Greg KH, Christoph Hellwig, amir73il, Dave Chinner, stable, LKML,
	dchinner, darrick.wong, linux-fsdevel, linux-xfs, mcgrof,
	linux-mtd, boris.brezillon

On Mon, Dec 03, 2018 at 03:41:27PM +0100, Richard Weinberger wrote:
>On Sun, Dec 2, 2018 at 9:09 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>>
>> On Sun, Dec 02, 2018 at 08:10:16AM -0800, Christoph Hellwig wrote:
>> > As someone who has done xfs stable backports for a while I really don't
>> > think the autoselection is helpful at all.
>>
>> autoselection for xfs patches has been turned off for a while, what
>> triggered this email thread was a core vfs patch that was backported
>> that was not obvious it was created by the xfs developers due to a
>> problem they had found.
>
>Sorry for hijacking this thread.
>Can you please also disable autoselection for MTD, UBI and UBIFS?
>
>fs/ubifs/
>drivers/mtd/
>include/linux/mtd/
>include/uapi/mtd/

Sure, done!

--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH AUTOSEL 4.14 25/35] iomap: sub-block dio needs to zeroout beyond EOF
  2018-12-03  9:22                   ` Sasha Levin
@ 2018-12-03 21:23                     ` Thomas Backlund
  2018-12-04  7:28                       ` Greg KH
                                         ` (2 more replies)
  0 siblings, 3 replies; 59+ messages in thread
From: Thomas Backlund @ 2018-12-03 21:23 UTC (permalink / raw)
  To: Sasha Levin, Dave Chinner
  Cc: Greg KH, stable, linux-kernel, Dave Chinner, Darrick J . Wong,
	linux-fsdevel

Den 2018-12-03 kl. 11:22, skrev Sasha Levin:

> 
> This is a case where theory collides with the real world. Yes, our QA is
> lacking, but we don't have the option of not doing the current process.
> If we stop backporting until a future date when our QA problem is
> solved we'll end up with what we had before: users stuck on ancient
> kernels without a way to upgrade.
> 

Sorry, but you seem to be living in a different "real world"...

People stay on "ancient kernels" that "just works" instead of updating
to a newer one that "hopefully/maybe/... works"


> With the current model we're aware that bugs sneak through, but we try
> to deal with it by both improving our QA, and encouraging users to do
> their own extensive QA. If we encourage users to update frequently we
> can keep improving our process and the quality of kernels will keep
> getting better.

And here you want to turn/force users into QA ... good luck with that.

In reality they won't "update frequently", instead they will stop
updating when they have something that works... and start ignoring
updates as they expect something "to break as usual" as they actually
need to get some real work done too...


> 
> We simply can't go back to the "enterprise distro" days.
> 

Maybe so, but we should at least get back to having "stable" or
"longterm" actually mean something again...

Or what does it say when distros start thinking about ignoring
(and some already do) stable/longterm trees because there are
_way_ too many questionable changes coming through, even overriding
maintainers to the point where they basically state "we don't care
about monitoring stable trees anymore, as they add whatever they want
anyway"...

And pretending that every fix is important enough to backport,
and saying if you don't take everything you have an "unsecure" kernel
won't help, as reality has shown from time to time that backports
can/will open up a new issue instead for no good reason.

Which for distros starts to mean that switching back to selectively taking
fixes for _known_ security issues is considered a way better choice.

End result, no-one cares about -stable trees -> no-one uses them -> a
lot of wasted work for nothing...

--
Thomas



^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH AUTOSEL 4.14 25/35] iomap: sub-block dio needs to zeroout beyond EOF
  2018-12-03 21:23                     ` Thomas Backlund
@ 2018-12-04  7:28                       ` Greg KH
  2018-12-04  8:12                       ` Sasha Levin
  2018-12-28  8:06                       ` Pavel Machek
  2 siblings, 0 replies; 59+ messages in thread
From: Greg KH @ 2018-12-04  7:28 UTC (permalink / raw)
  To: Thomas Backlund
  Cc: Sasha Levin, Dave Chinner, stable, linux-kernel, Dave Chinner,
	Darrick J . Wong, linux-fsdevel

On Mon, Dec 03, 2018 at 11:22:46PM +0159, Thomas Backlund wrote:
> Den 2018-12-03 kl. 11:22, skrev Sasha Levin:
> 
> > 
> > This is a case where theory collides with the real world. Yes, our QA is
> > lacking, but we don't have the option of not doing the current process.
> > If we stop backporting until a future date when our QA problem is
> > solved we'll end up with what we had before: users stuck on ancient
> > kernels without a way to upgrade.
> > 
> 
> Sorry, but you seem to be living in a different "real world"...
> 
> People stay on "ancient kernels" that "just works" instead of updating
> to a newer one that "hopefully/maybe/... works"

That's not good as those "ancient kernels" really just are "kernels with
lots of known security bugs".

It's your systems, I can't tell you what to do, but I will tell you that
running older, unfixed kernels, is a known liability.

Good luck!

greg k-h

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH AUTOSEL 4.14 25/35] iomap: sub-block dio needs to zeroout beyond EOF
  2018-12-03 21:23                     ` Thomas Backlund
  2018-12-04  7:28                       ` Greg KH
@ 2018-12-04  8:12                       ` Sasha Levin
  2018-12-28  8:06                       ` Pavel Machek
  2 siblings, 0 replies; 59+ messages in thread
From: Sasha Levin @ 2018-12-04  8:12 UTC (permalink / raw)
  To: Thomas Backlund
  Cc: Dave Chinner, Greg KH, stable, linux-kernel, Dave Chinner,
	Darrick J . Wong, linux-fsdevel

On Mon, Dec 03, 2018 at 11:22:46PM +0159, Thomas Backlund wrote:
>Den 2018-12-03 kl. 11:22, skrev Sasha Levin:
>
>>
>> This is a case where theory collides with the real world. Yes, our QA is
>> lacking, but we don't have the option of not doing the current process.
>> If we stop backporting until a future date when our QA problem is
>> solved we'll end up with what we had before: users stuck on ancient
>> kernels without a way to upgrade.
>>
>
>Sorry, but you seem to be living in a different "real world"...
>
>People stay on "ancient kernels" that "just works" instead of updating
>to a newer one that "hopefully/maybe/... works"

If users are stuck at older kernels and refuse to update then there's
not much I can do about it. They are knowingly staying on kernels with
known issues and will end up paying a much bigger price later to update.

>> With the current model we're aware that bugs sneak through, but we try
>> to deal with it by both improving our QA, and encouraging users to do
>> their own extensive QA. If we encourage users to update frequently we
>> can keep improving our process and the quality of kernels will keep
>> getting better.
>
>And here you want to turn/force users into QA ... good luck with that.

Yes, users are expected to test their workloads with new kernels - I'm
not sure why this is a surprise to anyone. Isn't it true for every other
piece of software?

I invite you to read Jon's great summary on LWN of a related session
that happened during the maintainer's summit:
https://lwn.net/Articles/769253/ . The conclusion reached was very
similar.

>In reality they won't "update frequently", instead they will stop
>updating when they have something that works... and start ignoring
>updates as they expect something "to break as usual" as they actually
>need to get some real work done too...

Again, this model was proven to be bad in the past, and if users keep
following it then they're knowingly shooting themselves in the foot.

>
>>
>> We simply can't go back to the "enterprise distro" days.
>>
>
>Maybe so, but we should at least get back to having "stable" or
>"longterm" actually mean something again...
>
>Or what does it say when distros start thinking about ignoring
>(and some already do) stable/longterm trees because there are
>_way_ too many questionable changes coming through, even overriding
>maintainers to the point where they basically state "we don't care
>about monitoring stable trees anymore, as they add whatever they want
>anyway"...

I'm assuming you mean "enterprise distros" here, as most of the
community distros I'm aware of are tracking stable trees.

Enterprise distros are a mix of everything: on one hand they would
refuse most stable patches because they don't have any demand from
customers to fix those bugs, but on the other hand they will update
drivers and subsystems as a whole to create these frankenstein kernels
that are very difficult to support.

When your kernel is driven by paying customer demands it's difficult to
argue for the technical merits of your process.

>And pretending that every fix is important enough to backport,
>and saying if you don't take everything you have an "unsecure" kernel
>won't help, as reality has shown from time to time that backports
>can/will open up a new issue instead for no good reason.
>
>Which for distros starts to mean that switching back to selectively taking
>fixes for _known_ security issues is considered a way better choice.

That was my exact thinking 2 years ago (see my stable-security project:
https://lwn.net/Articles/683335/). I even had a back-and-forth with Greg
on LKML when I was trying to argue your point: "Let's only take security
fixes because no one cares about the other crap".

If you're interested, I'd be happy to explain further why this was a
complete flop.

--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH AUTOSEL 4.14 25/35] iomap: sub-block dio needs to zeroout beyond EOF
  2018-12-03 21:23                     ` Thomas Backlund
  2018-12-04  7:28                       ` Greg KH
  2018-12-04  8:12                       ` Sasha Levin
@ 2018-12-28  8:06                       ` Pavel Machek
  2018-12-29 23:35                         ` Dave Chinner
  2 siblings, 1 reply; 59+ messages in thread
From: Pavel Machek @ 2018-12-28  8:06 UTC (permalink / raw)
  To: Thomas Backlund
  Cc: Sasha Levin, Dave Chinner, Greg KH, stable, linux-kernel,
	Dave Chinner, Darrick J . Wong, linux-fsdevel

[-- Attachment #1: Type: text/plain, Size: 1129 bytes --]

On Mon 2018-12-03 23:22:46, Thomas Backlund wrote:
> Den 2018-12-03 kl. 11:22, skrev Sasha Levin:
> 
> > 
> > This is a case where theory collides with the real world. Yes, our QA is
> > lacking, but we don't have the option of not doing the current process.
> > If we stop backporting until a future date when our QA problem is
> > solved we'll end up with what we had before: users stuck on ancient
> > kernels without a way to upgrade.
> > 
> 
> Sorry, but you seem to be living in a different "real world"...

I have to agree here :-(.

> People stay on "ancient kernels" that "just works" instead of updating
> to a newer one that "hopefully/maybe/... works"

Stable has rules the community agreed on; unfortunately the stable team
simply ignores those and has decided to do "whatever they please".

The process went from "serious bugs that bother people only" to "hey, this
looks like a bugfix, let's put it into the tree and see what it breaks"...

:-(.
								Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH AUTOSEL 4.14 25/35] iomap: sub-block dio needs to zeroout beyond EOF
  2018-12-28  8:06                       ` Pavel Machek
@ 2018-12-29 23:35                         ` Dave Chinner
  0 siblings, 0 replies; 59+ messages in thread
From: Dave Chinner @ 2018-12-29 23:35 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Thomas Backlund, Sasha Levin, Greg KH, stable, linux-kernel,
	Dave Chinner, Darrick J . Wong, linux-fsdevel

On Fri, Dec 28, 2018 at 09:06:24AM +0100, Pavel Machek wrote:
> On Mon 2018-12-03 23:22:46, Thomas Backlund wrote:
> > Den 2018-12-03 kl. 11:22, skrev Sasha Levin:
> > 
> > > 
> > > This is a case where theory collides with the real world. Yes, our QA is
> > > lacking, but we don't have the option of not doing the current process.
> > > If we stop backporting until a future date when our QA problem is
> > > solved we'll end up with what we had before: users stuck on ancient
> > > kernels without a way to upgrade.
> > > 
> > 
> > Sorry, but you seem to be living in a different "real world"...
> 
> I have to agree here :-(.
> 
> > People stay on "ancient kernels" that "just works" instead of updating
> > to a newer one that "hopefully/maybe/... works"
> 
> Stable has rules the community agreed on; unfortunately the stable team
> simply ignores those and has decided to do "whatever they please".
> 
> The process went from "serious bugs that bother people only" to "hey, this
> looks like a bugfix, let's put it into the tree and see what it breaks"...

Resulting in us having to tell users not to use stable kernels
because they can contain broken commits from upstream that did not
go through maintainer tree and test cycles.

https://marc.info/?l=linux-xfs&m=154544499507105&w=2

In this case, the broken commit to the fs/iomap.c code was merged
upstream through the akpm tree, rather than the XFS tree and test
process as previous changes to this code had been staged.

It was then backported so fast and released so quickly that it
hadn't got back into the XFS upstream tree test cycles until
after it had already been committed to at least one stable kernel.  We'd
only just registered and confirmed a regression in post -rc7
upstream trees when the stable kernel containing the commit was
released. It took us another couple of days to isolate the failing
configuration and bisect it down to the commit.

Only when I got "formlettered" for cc'ing the stable kernel list on
the revert patch (because I wanted to make sure the stable kernel
maintainers knew it was being reverted and so it wouldn't be
backported) did I learn it had already been "auto-backported" and
released in a stable kernel in under a week. Essentially, the
"auto-backport" completely short-circuited the upstream QA
process.....

IOWs, if you were looking for a case study to demonstrate the
failings of the current stable process, this is it.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


end of thread, other threads:[~2018-12-29 23:35 UTC | newest]

Thread overview: 59+ messages
2018-11-29  6:00 [PATCH AUTOSEL 4.14 01/35] media: omap3isp: Unregister media device as first Sasha Levin
2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 02/35] iommu/vt-d: Fix NULL pointer dereference in prq_event_thread() Sasha Levin
2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 03/35] brcmutil: really fix decoding channel info for 160 MHz bandwidth Sasha Levin
2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 04/35] iommu/ipmmu-vmsa: Fix crash on early domain free Sasha Levin
2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 05/35] can: rcar_can: Fix erroneous registration Sasha Levin
2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 06/35] test_firmware: fix error return getting clobbered Sasha Levin
2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 07/35] HID: input: Ignore battery reported by Symbol DS4308 Sasha Levin
2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 08/35] batman-adv: Use explicit tvlv padding for ELP packets Sasha Levin
2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 09/35] batman-adv: Expand merged fragment buffer for full packet Sasha Levin
2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 10/35] amd/iommu: Fix Guest Virtual APIC Log Tail Address Register Sasha Levin
2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 11/35] bnx2x: Assign unique DMAE channel number for FW DMAE transactions Sasha Levin
2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 12/35] qed: Fix PTT leak in qed_drain() Sasha Levin
2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 13/35] qed: Fix reading wrong value in loop condition Sasha Levin
2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 14/35] Revert "usb: gadget: ffs: Fix BUG when userland exits with submitted AIO transfers" Sasha Levin
2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 15/35] net/mlx4_core: Zero out lkey field in SW2HW_MPT fw command Sasha Levin
2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 16/35] net/mlx4_core: Fix uninitialized variable compilation warning Sasha Levin
2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 17/35] net/mlx4: Fix UBSAN warning of signed integer overflow Sasha Levin
2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 18/35] gpio: mockup: fix indicated direction Sasha Levin
2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 19/35] mtd: rawnand: qcom: Namespace prefix some commands Sasha Levin
2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 20/35] exec: make de_thread() freezable Sasha Levin
2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 21/35] HID: multitouch: Add pointstick support for Cirque Touchpad Sasha Levin
2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 22/35] mtd: spi-nor: Fix Cadence QSPI page fault kernel panic Sasha Levin
2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 23/35] qed: Fix bitmap_weight() check Sasha Levin
2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 24/35] qed: Fix QM getters to always return a valid pq Sasha Levin
2018-11-29  6:00 ` [PATCH AUTOSEL 4.14 25/35] iomap: sub-block dio needs to zeroout beyond EOF Sasha Levin
2018-11-29 12:14   ` Dave Chinner
2018-11-29 12:47     ` Greg KH
2018-11-29 22:40       ` Dave Chinner
2018-11-30  8:22         ` Greg KH
2018-11-30 10:14           ` Sasha Levin
2018-11-30 20:35             ` Darrick J. Wong
2018-11-30 21:50             ` Dave Chinner
2018-12-01  7:49               ` Sasha Levin
2018-12-01  9:09                 ` XFS patches for stable Amir Goldstein
2018-12-02 15:25                   ` Sasha Levin
2018-12-02 16:10                     ` Christoph Hellwig
2018-12-02 20:08                       ` Greg KH
2018-12-03 14:41                         ` Richard Weinberger
2018-12-03 16:56                           ` Sasha Levin
2018-12-02 23:23                 ` [PATCH AUTOSEL 4.14 25/35] iomap: sub-block dio needs to zeroout beyond EOF Dave Chinner
2018-12-03  7:11                   ` Amir Goldstein
2018-12-03  9:22                   ` Sasha Levin
2018-12-03 21:23                     ` Thomas Backlund
2018-12-04  7:28                       ` Greg KH
2018-12-04  8:12                       ` Sasha Levin
2018-12-28  8:06                       ` Pavel Machek
2018-12-29 23:35                         ` Dave Chinner
2018-11-30 21:45           ` Dave Chinner
2018-12-02 20:11             ` Greg KH
2018-11-29  6:01 ` [PATCH AUTOSEL 4.14 26/35] net: faraday: ftmac100: remove netif_running(netdev) check before disabling interrupts Sasha Levin
2018-11-29  6:01 ` [PATCH AUTOSEL 4.14 27/35] iommu/vt-d: Use memunmap to free memremap Sasha Levin
2018-11-29  6:01 ` [PATCH AUTOSEL 4.14 28/35] flexfiles: use per-mirror specified stateid for IO Sasha Levin
2018-11-29  6:01 ` [PATCH AUTOSEL 4.14 29/35] net: thunderx: set xdp_prog to NULL if bpf_prog_add fails Sasha Levin
2018-11-29  6:01 ` [PATCH AUTOSEL 4.14 30/35] ibmvnic: Fix RX queue buffer cleanup Sasha Levin
2018-11-29  6:01 ` [PATCH AUTOSEL 4.14 31/35] virtio-net: disable guest csum during XDP set Sasha Levin
2018-11-29  6:01 ` [PATCH AUTOSEL 4.14 32/35] virtio-net: fail XDP set if guest csum is negotiated Sasha Levin
2018-11-29  6:01 ` [PATCH AUTOSEL 4.14 33/35] team: no need to do team_notify_peers or team_mcast_rejoin when disabling port Sasha Levin
2018-11-29  6:01 ` [PATCH AUTOSEL 4.14 34/35] net: amd: add missing of_node_put() Sasha Levin
2018-11-29  6:01 ` [PATCH AUTOSEL 4.14 35/35] net: thunderx: set tso_hdrs pointer to NULL in nicvf_free_snd_queue Sasha Levin
