* [PATCH 3.10.y 01/12] PCI/ACPI: Fix _OSC ordering to allow PCIe hotplug use when available
@ 2016-07-15 18:26 Charles (Chas) Williams
  2016-07-15 18:26 ` [PATCH 3.10.y 02/12] udp: properly support MSG_PEEK with truncated buffers Charles (Chas) Williams
                   ` (11 more replies)
  0 siblings, 12 replies; 18+ messages in thread
From: Charles (Chas) Williams @ 2016-07-15 18:26 UTC (permalink / raw)
  To: stable
  Cc: Neil Horman, Bjorn Helgaas, Len Brown, Rafael J. Wysocki,
	Charles (Chas) Williams

From: Neil Horman <nhorman@tuxdriver.com>

commit 3dc48af310709b85d07c8b0d3aa8f1ead02829d3 upstream.

This fixes the problem of acpiphp claiming slots that should be managed
by pciehp, which may keep ExpressCard slots from working.

The acpiphp driver claims PCIe slots unless the BIOS has granted us
control of PCIe native hotplug via _OSC.  Prior to v3.10, the acpiphp
.add method (add_bridge()) was always called *after* we had requested
native hotplug control with _OSC.

But after 3b63aaa70e ("PCI: acpiphp: Do not use ACPI PCI subdriver
mechanism"), which appeared in v3.10, acpiphp initialization is done
during the bus scan via the pcibios_add_bus() hook, and this happens
*before* we request native hotplug control.

Therefore, acpiphp doesn't know yet whether the BIOS will grant control,
and it claims slots that we should be handling with native hotplug.

This patch requests native hotplug control earlier, so we know whether
the BIOS granted it to us before we initialize acpiphp.

To avoid reintroducing the ASPM issue fixed by b8178f130e ('Revert
"PCI/ACPI: Request _OSC control before scanning PCI root bus"'), we run
_OSC earlier but defer the actual ASPM calls until after the bus scan is
complete.

Tested successfully by myself.

[bhelgaas: changelog, mark for stable]
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=60736
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
CC: stable@vger.kernel.org	# v3.10+
CC: Len Brown <lenb@kernel.org>
CC: "Rafael J. Wysocki" <rjw@sisk.pl>
[ciwillia@brocade.com: backported to 3.10: adjusted context]
Signed-off-by: Charles (Chas) Williams <ciwillia@brocade.com>
---
 drivers/acpi/pci_root.c | 67 ++++++++++++++++++++++++++++---------------------
 1 file changed, 38 insertions(+), 29 deletions(-)

diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
index a02a91c..c5e3dd9 100644
--- a/drivers/acpi/pci_root.c
+++ b/drivers/acpi/pci_root.c
@@ -385,6 +385,7 @@ static int acpi_pci_root_add(struct acpi_device *device,
 	int result;
 	struct acpi_pci_root *root;
 	u32 flags, base_flags;
+	bool no_aspm = false, clear_aspm = false;
 
 	root = kzalloc(sizeof(struct acpi_pci_root), GFP_KERNEL);
 	if (!root)
@@ -445,31 +446,10 @@ static int acpi_pci_root_add(struct acpi_device *device,
 	flags = base_flags = OSC_PCI_SEGMENT_GROUPS_SUPPORT;
 	acpi_pci_osc_support(root, flags);
 
-	/*
-	 * TBD: Need PCI interface for enumeration/configuration of roots.
-	 */
-
 	mutex_lock(&acpi_pci_root_lock);
 	list_add_tail(&root->node, &acpi_pci_roots);
 	mutex_unlock(&acpi_pci_root_lock);
 
-	/*
-	 * Scan the Root Bridge
-	 * --------------------
-	 * Must do this prior to any attempt to bind the root device, as the
-	 * PCI namespace does not get created until this call is made (and
-	 * thus the root bridge's pci_dev does not exist).
-	 */
-	root->bus = pci_acpi_scan_root(root);
-	if (!root->bus) {
-		printk(KERN_ERR PREFIX
-			    "Bus %04x:%02x not present in PCI namespace\n",
-			    root->segment, (unsigned int)root->secondary.start);
-		result = -ENODEV;
-		goto out_del_root;
-	}
-
-	/* Indicate support for various _OSC capabilities. */
 	if (pci_ext_cfg_avail())
 		flags |= OSC_EXT_PCI_CONFIG_SUPPORT;
 	if (pcie_aspm_support_enabled()) {
@@ -483,7 +463,7 @@ static int acpi_pci_root_add(struct acpi_device *device,
 		if (ACPI_FAILURE(status)) {
 			dev_info(&device->dev, "ACPI _OSC support "
 				"notification failed, disabling PCIe ASPM\n");
-			pcie_no_aspm();
+			no_aspm = true;
 			flags = base_flags;
 		}
 	}
@@ -515,7 +495,7 @@ static int acpi_pci_root_add(struct acpi_device *device,
 				 * We have ASPM control, but the FADT indicates
 				 * that it's unsupported. Clear it.
 				 */
-				pcie_clear_aspm(root->bus);
+				clear_aspm = true;
 			}
 		} else {
 			dev_info(&device->dev,
@@ -524,7 +504,14 @@ static int acpi_pci_root_add(struct acpi_device *device,
 				acpi_format_exception(status), flags);
 			pr_info("ACPI _OSC control for PCIe not granted, "
 				"disabling ASPM\n");
-			pcie_no_aspm();
+			/*
+			 * We want to disable ASPM here, but aspm_disabled
+			 * needs to remain in its state from boot so that we
+			 * properly handle PCIe 1.1 devices.  So we set this
+			 * flag here, to defer the action until after the ACPI
+			 * root scan.
+			 */
+			no_aspm = true;
 		}
 	} else {
 		dev_info(&device->dev,
@@ -532,6 +519,33 @@ static int acpi_pci_root_add(struct acpi_device *device,
 			 "(_OSC support mask: 0x%02x)\n", flags);
 	}
 
+	/*
+	 * TBD: Need PCI interface for enumeration/configuration of roots.
+	 */
+
+	/*
+	 * Scan the Root Bridge
+	 * --------------------
+	 * Must do this prior to any attempt to bind the root device, as the
+	 * PCI namespace does not get created until this call is made (and
+	 * thus the root bridge's pci_dev does not exist).
+	 */
+	root->bus = pci_acpi_scan_root(root);
+	if (!root->bus) {
+		dev_err(&device->dev,
+			"Bus %04x:%02x not present in PCI namespace\n",
+			root->segment, (unsigned int)root->secondary.start);
+		result = -ENODEV;
+		goto end;
+	}
+
+	if (clear_aspm) {
+		dev_info(&device->dev, "Disabling ASPM (FADT indicates it is unsupported)\n");
+		pcie_clear_aspm(root->bus);
+	}
+	if (no_aspm)
+		pcie_no_aspm();
+
 	pci_acpi_add_bus_pm_notifier(device, root->bus);
 	if (device->wakeup.flags.run_wake)
 		device_set_run_wake(root->bus->bridge, true);
@@ -548,11 +562,6 @@ static int acpi_pci_root_add(struct acpi_device *device,
 	pci_bus_add_devices(root->bus);
 	return 1;
 
-out_del_root:
-	mutex_lock(&acpi_pci_root_lock);
-	list_del(&root->node);
-	mutex_unlock(&acpi_pci_root_lock);
-
 end:
 	kfree(root);
 	return result;
-- 
2.5.5



* [PATCH 3.10.y 02/12] udp: properly support MSG_PEEK with truncated buffers
  2016-07-15 18:26 [PATCH 3.10.y 01/12] PCI/ACPI: Fix _OSC ordering to allow PCIe hotplug use when available Charles (Chas) Williams
@ 2016-07-15 18:26 ` Charles (Chas) Williams
  2016-07-15 18:26 ` [PATCH 3.10.y 03/12] USB: fix invalid memory access in hub_activate() Charles (Chas) Williams
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: Charles (Chas) Williams @ 2016-07-15 18:26 UTC (permalink / raw)
  To: stable
  Cc: Eric Dumazet, David S. Miller, Luis Henriques, Charles (Chas) Williams

From: Eric Dumazet <edumazet@google.com>

commit 197c949e7798fbf28cfadc69d9ca0c2abbf93191 upstream.

The backport of upstream commit 89c22d8c3b27 ("net: Fix skb csum races
when peeking") into stable kernels exposed a bug in the UDP stack's
MSG_PEEK support when the user provides a buffer smaller than the skb
payload.

In this case,
skb_copy_and_csum_datagram_iovec(skb, sizeof(struct udphdr),
                                 msg->msg_iov);
returns -EFAULT.

This bug does not happen in upstream kernels, where Al Viro replaced
this call with:
skb_copy_and_csum_datagram_msg(skb, sizeof(struct udphdr), msg);
That variant is safe with short buffers.

For the time being, instead of reverting Herbert Xu's patch and adding
back the invalid skb->ip_summed changes, simply store the result of
udp_lib_checksum_complete() so that we avoid computing the checksum a
second time and avoid the problematic
skb_copy_and_csum_datagram_iovec() call.

This patch can be applied to recent kernels, since it avoids double
checksumming, and then backported to stable kernels as a bug fix.
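
As a userspace illustration of the trigger (a hedged sketch, not taken
from the original report): peek a queued UDP datagram with a buffer
smaller than its payload.  On affected stable kernels the peek could
fail with -EFAULT instead of returning truncated data.  Note that the
bug only bites when the receive path still has to complete the
checksum, so a pure-loopback run may not reproduce it; the port number
below is arbitrary.

	#include <stdio.h>
	#include <sys/types.h>
	#include <sys/socket.h>
	#include <netinet/in.h>
	#include <unistd.h>

	int main(void)
	{
		int s = socket(AF_INET, SOCK_DGRAM, 0);
		struct sockaddr_in a = {
			.sin_family = AF_INET,
			.sin_port = htons(5000),
			.sin_addr.s_addr = htonl(INADDR_LOOPBACK),
		};
		char small[4], full[64];
		ssize_t n;

		bind(s, (struct sockaddr *)&a, sizeof(a));
		sendto(s, "0123456789", 10, 0,
		       (struct sockaddr *)&a, sizeof(a));

		/* Peek with a 4-byte buffer: the 10-byte payload is
		 * truncated to fit. */
		n = recv(s, small, sizeof(small), MSG_PEEK);
		printf("peek returned %zd\n", n);

		/* MSG_PEEK leaves the datagram queued, so a full-sized
		 * read still sees all 10 bytes. */
		n = recv(s, full, sizeof(full), 0);
		printf("read returned %zd\n", n);
		close(s);
		return 0;
	}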

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
[ luis: backported to 3.16: adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Signed-off-by: Charles (Chas) Williams <ciwillia@brocade.com>
---
 net/ipv4/udp.c | 6 ++++--
 net/ipv6/udp.c | 6 ++++--
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 63b536b..68174e4 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1208,6 +1208,7 @@ int udp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 	int peeked, off = 0;
 	int err;
 	int is_udplite = IS_UDPLITE(sk);
+	bool checksum_valid = false;
 	bool slow;
 
 	if (flags & MSG_ERRQUEUE)
@@ -1233,11 +1234,12 @@ try_again:
 	 */
 
 	if (copied < ulen || UDP_SKB_CB(skb)->partial_cov) {
-		if (udp_lib_checksum_complete(skb))
+		checksum_valid = !udp_lib_checksum_complete(skb);
+		if (!checksum_valid)
 			goto csum_copy_err;
 	}
 
-	if (skb_csum_unnecessary(skb))
+	if (checksum_valid || skb_csum_unnecessary(skb))
 		err = skb_copy_datagram_iovec(skb, sizeof(struct udphdr),
 					      msg->msg_iov, copied);
 	else {
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 3046d02..d234e6f 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -370,6 +370,7 @@ int udpv6_recvmsg(struct kiocb *iocb, struct sock *sk,
 	int peeked, off = 0;
 	int err;
 	int is_udplite = IS_UDPLITE(sk);
+	bool checksum_valid = false;
 	int is_udp4;
 	bool slow;
 
@@ -401,11 +402,12 @@ try_again:
 	 */
 
 	if (copied < ulen || UDP_SKB_CB(skb)->partial_cov) {
-		if (udp_lib_checksum_complete(skb))
+		checksum_valid = !udp_lib_checksum_complete(skb);
+		if (!checksum_valid)
 			goto csum_copy_err;
 	}
 
-	if (skb_csum_unnecessary(skb))
+	if (checksum_valid || skb_csum_unnecessary(skb))
 		err = skb_copy_datagram_iovec(skb, sizeof(struct udphdr),
 					      msg->msg_iov, copied);
 	else {
-- 
2.5.5



* [PATCH 3.10.y 03/12] USB: fix invalid memory access in hub_activate()
  2016-07-15 18:26 [PATCH 3.10.y 01/12] PCI/ACPI: Fix _OSC ordering to allow PCIe hotplug use when available Charles (Chas) Williams
  2016-07-15 18:26 ` [PATCH 3.10.y 02/12] udp: properly support MSG_PEEK with truncated buffers Charles (Chas) Williams
@ 2016-07-15 18:26 ` Charles (Chas) Williams
  2016-07-15 18:26   ` Charles (Chas) Williams
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: Charles (Chas) Williams @ 2016-07-15 18:26 UTC (permalink / raw)
  To: stable
  Cc: Alan Stern, Greg Kroah-Hartman, Luis Henriques, Charles (Chas) Williams

From: Alan Stern <stern@rowland.harvard.edu>

commit e50293ef9775c5f1cf3fcc093037dd6a8c5684ea upstream.

Commit 8520f38099cc ("USB: change hub initialization sleeps to
delayed_work") changed the hub_activate() routine to make part of it
run in a workqueue.  However, the commit failed to take a reference to
the usb_hub structure or to lock the hub interface while doing so.  As
a result, if a hub is plugged in and quickly unplugged before the work
routine can run, the routine will try to access memory that has been
deallocated.  Or, if the hub is unplugged while the routine is
running, the memory may be deallocated while it is in active use.

This patch fixes the problem by taking a reference to the usb_hub at
the start of hub_activate() and releasing it at the end (when the work
is finished), and by locking the hub interface while the work routine
is running.  It also adds a check at the start of the routine to see
if the hub has already been disconnected, in which case nothing should
be done.

CVE-2015-8816

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Alexandru Cornea <alexandru.cornea@intel.com>
Tested-by: Alexandru Cornea <alexandru.cornea@intel.com>
Fixes: 8520f38099cc ("USB: change hub initialization sleeps to delayed_work")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ luis: backported to 3.16:
  - Added forward declaration of hub_release() which mainline had with commit
    32a6958998c5 ("usb: hub: convert khubd into workqueue") ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Signed-off-by: Charles (Chas) Williams <ciwillia@brocade.com>
---
 drivers/usb/core/hub.c | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index 8eb2de6..4e5156d 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -113,6 +113,7 @@ EXPORT_SYMBOL_GPL(ehci_cf_port_reset_rwsem);
 #define HUB_DEBOUNCE_STEP	  25
 #define HUB_DEBOUNCE_STABLE	 100
 
+static void hub_release(struct kref *kref);
 static int usb_reset_and_verify_device(struct usb_device *udev);
 
 static inline char *portspeed(struct usb_hub *hub, int portstatus)
@@ -1024,10 +1025,20 @@ static void hub_activate(struct usb_hub *hub, enum hub_activation_type type)
 	unsigned delay;
 
 	/* Continue a partial initialization */
-	if (type == HUB_INIT2)
-		goto init2;
-	if (type == HUB_INIT3)
+	if (type == HUB_INIT2 || type == HUB_INIT3) {
+		device_lock(hub->intfdev);
+
+		/* Was the hub disconnected while we were waiting? */
+		if (hub->disconnected) {
+			device_unlock(hub->intfdev);
+			kref_put(&hub->kref, hub_release);
+			return;
+		}
+		if (type == HUB_INIT2)
+			goto init2;
 		goto init3;
+	}
+	kref_get(&hub->kref);
 
 	/* The superspeed hub except for root hub has to use Hub Depth
 	 * value as an offset into the route string to locate the bits
@@ -1224,6 +1235,7 @@ static void hub_activate(struct usb_hub *hub, enum hub_activation_type type)
 			PREPARE_DELAYED_WORK(&hub->init_work, hub_init_func3);
 			schedule_delayed_work(&hub->init_work,
 					msecs_to_jiffies(delay));
+			device_unlock(hub->intfdev);
 			return;		/* Continues at init3: below */
 		} else {
 			msleep(delay);
@@ -1244,6 +1256,11 @@ static void hub_activate(struct usb_hub *hub, enum hub_activation_type type)
 	/* Allow autosuspend if it was suppressed */
 	if (type <= HUB_INIT3)
 		usb_autopm_put_interface_async(to_usb_interface(hub->intfdev));
+
+	if (type == HUB_INIT2 || type == HUB_INIT3)
+		device_unlock(hub->intfdev);
+
+	kref_put(&hub->kref, hub_release);
 }
 
 /* Implement the continuations for the delays above */
-- 
2.5.5



* [PATCH 3.10.y 04/12] x86/mm: Add barriers and document switch_mm()-vs-flush synchronization
  2016-07-15 18:26 [PATCH 3.10.y 01/12] PCI/ACPI: Fix _OSC ordering to allow PCIe hotplug use when available Charles (Chas) Williams
@ 2016-07-15 18:26   ` Charles (Chas) Williams
  2016-07-15 18:26 ` [PATCH 3.10.y 03/12] USB: fix invalid memory access in hub_activate() Charles (Chas) Williams
                     ` (10 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: Charles (Chas) Williams @ 2016-07-15 18:26 UTC (permalink / raw)
  To: stable
  Cc: Andy Lutomirski, Andrew Morton, Andy Lutomirski, Borislav Petkov,
	Brian Gerst, Dave Hansen, Denys Vlasenko, H. Peter Anvin,
	Linus Torvalds, Peter Zijlstra, Rik van Riel, Thomas Gleixner,
	linux-mm, Ingo Molnar, Luis Henriques, Charles (Chas) Williams

From: Andy Lutomirski <luto@kernel.org>

commit 71b3c126e61177eb693423f2e18a1914205b165e upstream.

When switch_mm() activates a new PGD, it also sets a bit that
tells other CPUs that the PGD is in use so that TLB flush IPIs
will be sent.  In order for that to work correctly, the bit
needs to be visible prior to loading the PGD and therefore
starting to fill the local TLB.

Document all the barriers that make this work correctly and add
a couple that were missing.
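
As a userspace illustration of the store-buffering hazard described
above (a sketch using C11 atomics and pthreads, not kernel code; the
function names are invented for the sketch): each side performs a store
followed by a load, and unless both sides execute a full barrier in
between, both loads may observe zero, the analogue of the IPI never
being sent while a stale TLB entry survives.

	#include <stdatomic.h>
	#include <pthread.h>
	#include <stdio.h>

	/* Stand-ins for the PTE write and the mm_cpumask bit. */
	static atomic_int pte, mask_bit;
	static int r0, r1;

	static void *flusher(void *arg)		/* "CPU 0" */
	{
		atomic_store_explicit(&pte, 1, memory_order_relaxed);
		atomic_thread_fence(memory_order_seq_cst); /* full barrier */
		r0 = atomic_load_explicit(&mask_bit, memory_order_relaxed);
		return NULL;	/* r0 == 0 means "no IPI sent" */
	}

	static void *switcher(void *arg)	/* "CPU 1" */
	{
		atomic_store_explicit(&mask_bit, 1, memory_order_relaxed);
		atomic_thread_fence(memory_order_seq_cst); /* load_cr3 here */
		r1 = atomic_load_explicit(&pte, memory_order_relaxed);
		return NULL;	/* r1 == 0 means "TLB filled from old PTE" */
	}

	int main(void)
	{
		pthread_t a, b;

		pthread_create(&a, NULL, flusher, NULL);
		pthread_create(&b, NULL, switcher, NULL);
		pthread_join(a, NULL);
		pthread_join(b, NULL);
		/* With both fences present, r0 == 0 && r1 == 0 is
		 * forbidden; drop either fence and that outcome becomes
		 * possible. */
		printf("r0=%d r1=%d\n", r0, r1);
		return 0;
	}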

CVE-2016-2069

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
[ luis: backported to 3.16:
  - dropped N/A comment in flush_tlb_mm_range()
  - adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
[ciwillia@brocade.com: backported to 3.10: adjusted context]
Signed-off-by: Charles (Chas) Williams <ciwillia@brocade.com>
---
 arch/x86/include/asm/mmu_context.h | 32 +++++++++++++++++++++++++++++++-
 arch/x86/mm/tlb.c                  | 24 +++++++++++++++++++++---
 2 files changed, 52 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index be12c53..c0d2f6b 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -42,7 +42,32 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
 #endif
 		cpumask_set_cpu(cpu, mm_cpumask(next));
 
-		/* Re-load page tables */
+		/*
+		 * Re-load page tables.
+		 *
+		 * This logic has an ordering constraint:
+		 *
+		 *  CPU 0: Write to a PTE for 'next'
+		 *  CPU 0: load bit 1 in mm_cpumask.  if nonzero, send IPI.
+		 *  CPU 1: set bit 1 in next's mm_cpumask
+		 *  CPU 1: load from the PTE that CPU 0 writes (implicit)
+		 *
+		 * We need to prevent an outcome in which CPU 1 observes
+		 * the new PTE value and CPU 0 observes bit 1 clear in
+		 * mm_cpumask.  (If that occurs, then the IPI will never
+		 * be sent, and CPU 0's TLB will contain a stale entry.)
+		 *
+		 * The bad outcome can occur if either CPU's load is
+		 * reordered before that CPU's store, so both CPUs must
+		 * execute full barriers to prevent this from happening.
+		 *
+		 * Thus, switch_mm needs a full barrier between the
+		 * store to mm_cpumask and any operation that could load
+		 * from next->pgd.  This barrier synchronizes with
+		 * remote TLB flushers.  Fortunately, load_cr3 is
+		 * serializing and thus acts as a full barrier.
+		 *
+		 */
 		load_cr3(next->pgd);
 
 		/* Stop flush ipis for the previous mm */
@@ -65,10 +90,15 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
 			 * schedule, protecting us from simultaneous changes.
 			 */
 			cpumask_set_cpu(cpu, mm_cpumask(next));
+
 			/*
 			 * We were in lazy tlb mode and leave_mm disabled
 			 * tlb flush IPI delivery. We must reload CR3
 			 * to make sure to use no freed page tables.
+			 *
+			 * As above, this is a barrier that forces
+			 * TLB repopulation to be ordered after the
+			 * store to mm_cpumask.
 			 */
 			load_cr3(next->pgd);
 			load_LDT_nolock(&next->context);
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 282375f..c26b610 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -149,7 +149,9 @@ void flush_tlb_current_task(void)
 
 	preempt_disable();
 
+	/* This is an implicit full barrier that synchronizes with switch_mm. */
 	local_flush_tlb();
+
 	if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids)
 		flush_tlb_others(mm_cpumask(mm), mm, 0UL, TLB_FLUSH_ALL);
 	preempt_enable();
@@ -188,11 +190,19 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
 	unsigned act_entries, tlb_entries = 0;
 
 	preempt_disable();
-	if (current->active_mm != mm)
+	if (current->active_mm != mm) {
+		/* Synchronize with switch_mm. */
+		smp_mb();
+
 		goto flush_all;
+	}
 
 	if (!current->mm) {
 		leave_mm(smp_processor_id());
+
+		/* Synchronize with switch_mm. */
+		smp_mb();
+
 		goto flush_all;
 	}
 
@@ -242,10 +252,18 @@ void flush_tlb_page(struct vm_area_struct *vma, unsigned long start)
 	preempt_disable();
 
 	if (current->active_mm == mm) {
-		if (current->mm)
+		if (current->mm) {
+			/*
+			 * Implicit full barrier (INVLPG) that synchronizes
+			 * with switch_mm.
+			 */
 			__flush_tlb_one(start);
-		else
+		} else {
 			leave_mm(smp_processor_id());
+
+			/* Synchronize with switch_mm. */
+			smp_mb();
+		}
 	}
 
 	if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids)
-- 
2.5.5


* [PATCH 3.10.y 05/12] pipe: limit the per-user amount of pages allocated in pipes
  2016-07-15 18:26 [PATCH 3.10.y 01/12] PCI/ACPI: Fix _OSC ordering to allow PCIe hotplug use when available Charles (Chas) Williams
                   ` (2 preceding siblings ...)
  2016-07-15 18:26   ` Charles (Chas) Williams
@ 2016-07-15 18:26 ` Charles (Chas) Williams
  2016-07-15 19:08   ` Charles (Chas) Williams
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: Charles (Chas) Williams @ 2016-07-15 18:26 UTC (permalink / raw)
  To: stable; +Cc: Willy Tarreau, Al Viro, Luis Henriques, Chas Williams

From: Willy Tarreau <w@1wt.eu>

commit 759c01142a5d0f364a462346168a56de28a80f52 upstream.

On not-so-small systems, it is possible for a single process to cause an
OOM condition by filling large pipes with data that is never read. A
typical process filling 4000 pipes with 1 MB of data will use 4 GB of
memory. On small systems it may be tricky to set the pipe max size to
prevent this from happening.

This patch makes it possible to enforce a per-user soft limit above
which new pipes will be limited to a single page, effectively limiting
them to 4 kB each, as well as a hard limit above which no new pipes may
be created for this user. This has the effect of protecting the system
against memory abuse without hurting other users, and still allowing
pipes to work correctly though with less data at once.

The limits are controlled by two new sysctls: pipe-user-pages-soft and
pipe-user-pages-hard. Both may be disabled by setting them to zero. The
default soft limit allows the default number of FDs per process (1024)
to create pipes of the default size (64kB), thus reaching a limit of 64MB
before starting to create only smaller pipes. With 256 processes limited
to 1024 FDs each, this results in 1024*64kB + (256*1024 - 1024) * 4kB =
1084 MB of memory allocated for a user. The hard limit is disabled by
default to avoid breaking existing applications that make intensive use
of pipes (e.g. for splicing).
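
How the limits surface to an unprivileged program (a minimal sketch
under the semantics described above, not part of the patch): once the
user's accounted pipe pages exceed a limit, growing a pipe via
fcntl(F_SETPIPE_SZ) fails with EPERM.

	#define _GNU_SOURCE
	#include <fcntl.h>
	#include <stdio.h>
	#include <unistd.h>

	int main(void)
	{
		int fds[2];

		if (pipe(fds) < 0)
			return 1;

		/* Ask for a 1 MB pipe.  Below the limits this resizes the
		 * buffer ring; over them it fails with errno == EPERM for
		 * callers without CAP_SYS_RESOURCE/CAP_SYS_ADMIN. */
		if (fcntl(fds[0], F_SETPIPE_SZ, 1048576) < 0)
			perror("F_SETPIPE_SZ");
		else
			printf("pipe grown to %d bytes\n",
			       fcntl(fds[0], F_GETPIPE_SZ));

		close(fds[0]);
		close(fds[1]);
		return 0;
	}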

CVE-2016-2847

Reported-by: socketpair@gmail.com
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Mitigates: CVE-2013-4312 (Linux 2.0+)
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Signed-off-by: Chas Williams <3chas3@gmail.com>
---
 Documentation/sysctl/fs.txt | 23 ++++++++++++++++++++++
 fs/pipe.c                   | 47 +++++++++++++++++++++++++++++++++++++++++++--
 include/linux/pipe_fs_i.h   |  4 ++++
 include/linux/sched.h       |  1 +
 kernel/sysctl.c             | 14 ++++++++++++++
 5 files changed, 87 insertions(+), 2 deletions(-)

diff --git a/Documentation/sysctl/fs.txt b/Documentation/sysctl/fs.txt
index 88152f2..302b5ed 100644
--- a/Documentation/sysctl/fs.txt
+++ b/Documentation/sysctl/fs.txt
@@ -32,6 +32,8 @@ Currently, these files are in /proc/sys/fs:
 - nr_open
 - overflowuid
 - overflowgid
+- pipe-user-pages-hard
+- pipe-user-pages-soft
 - protected_hardlinks
 - protected_symlinks
 - suid_dumpable
@@ -159,6 +161,27 @@ The default is 65534.
 
 ==============================================================
 
+pipe-user-pages-hard:
+
+Maximum total number of pages a non-privileged user may allocate for pipes.
+Once this limit is reached, no new pipes may be allocated until usage goes
+below the limit again. When set to 0, no limit is applied, which is the default
+setting.
+
+==============================================================
+
+pipe-user-pages-soft:
+
+Maximum total number of pages a non-privileged user may allocate for pipes
+before the pipe size gets limited to a single page. Once this limit is reached,
+new pipes will be limited to a single page in size for this user in order to
+limit total memory usage, and trying to increase them using fcntl() will be
+denied until usage goes below the limit again. The default value allows
+allocating up to 1024 pipes at their default size. When set to 0, no limit is
+applied.
+
+==============================================================
+
 protected_hardlinks:
 
 A long-standing class of security issues is the hardlink-based
diff --git a/fs/pipe.c b/fs/pipe.c
index 50267e6..c281867 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -39,6 +39,12 @@ unsigned int pipe_max_size = 1048576;
  */
 unsigned int pipe_min_size = PAGE_SIZE;
 
+/* Maximum allocatable pages per user. Hard limit is unset by default, soft
+ * matches default values.
+ */
+unsigned long pipe_user_pages_hard;
+unsigned long pipe_user_pages_soft = PIPE_DEF_BUFFERS * INR_OPEN_CUR;
+
 /*
  * We use a start+len construction, which provides full use of the 
  * allocated memory.
@@ -794,20 +800,49 @@ pipe_fasync(int fd, struct file *filp, int on)
 	return retval;
 }
 
+static void account_pipe_buffers(struct pipe_inode_info *pipe,
+                                 unsigned long old, unsigned long new)
+{
+	atomic_long_add(new - old, &pipe->user->pipe_bufs);
+}
+
+static bool too_many_pipe_buffers_soft(struct user_struct *user)
+{
+	return pipe_user_pages_soft &&
+	       atomic_long_read(&user->pipe_bufs) >= pipe_user_pages_soft;
+}
+
+static bool too_many_pipe_buffers_hard(struct user_struct *user)
+{
+	return pipe_user_pages_hard &&
+	       atomic_long_read(&user->pipe_bufs) >= pipe_user_pages_hard;
+}
+
 struct pipe_inode_info *alloc_pipe_info(void)
 {
 	struct pipe_inode_info *pipe;
 
 	pipe = kzalloc(sizeof(struct pipe_inode_info), GFP_KERNEL);
 	if (pipe) {
-		pipe->bufs = kzalloc(sizeof(struct pipe_buffer) * PIPE_DEF_BUFFERS, GFP_KERNEL);
+		unsigned long pipe_bufs = PIPE_DEF_BUFFERS;
+		struct user_struct *user = get_current_user();
+
+		if (!too_many_pipe_buffers_hard(user)) {
+			if (too_many_pipe_buffers_soft(user))
+				pipe_bufs = 1;
+			pipe->bufs = kzalloc(sizeof(struct pipe_buffer) * pipe_bufs, GFP_KERNEL);
+		}
+
 		if (pipe->bufs) {
 			init_waitqueue_head(&pipe->wait);
 			pipe->r_counter = pipe->w_counter = 1;
-			pipe->buffers = PIPE_DEF_BUFFERS;
+			pipe->buffers = pipe_bufs;
+			pipe->user = user;
+			account_pipe_buffers(pipe, 0, pipe_bufs);
 			mutex_init(&pipe->mutex);
 			return pipe;
 		}
+		free_uid(user);
 		kfree(pipe);
 	}
 
@@ -818,6 +853,8 @@ void free_pipe_info(struct pipe_inode_info *pipe)
 {
 	int i;
 
+	account_pipe_buffers(pipe, pipe->buffers, 0);
+	free_uid(pipe->user);
 	for (i = 0; i < pipe->buffers; i++) {
 		struct pipe_buffer *buf = pipe->bufs + i;
 		if (buf->ops)
@@ -1208,6 +1245,7 @@ static long pipe_set_size(struct pipe_inode_info *pipe, unsigned long nr_pages)
 			memcpy(bufs + head, pipe->bufs, tail * sizeof(struct pipe_buffer));
 	}
 
+	account_pipe_buffers(pipe, pipe->buffers, nr_pages);
 	pipe->curbuf = 0;
 	kfree(pipe->bufs);
 	pipe->bufs = bufs;
@@ -1279,6 +1317,11 @@ long pipe_fcntl(struct file *file, unsigned int cmd, unsigned long arg)
 		if (!capable(CAP_SYS_RESOURCE) && size > pipe_max_size) {
 			ret = -EPERM;
 			goto out;
+		} else if ((too_many_pipe_buffers_hard(pipe->user) ||
+			    too_many_pipe_buffers_soft(pipe->user)) &&
+		           !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN)) {
+			ret = -EPERM;
+			goto out;
 		}
 		ret = pipe_set_size(pipe, nr_pages);
 		break;
diff --git a/include/linux/pipe_fs_i.h b/include/linux/pipe_fs_i.h
index ab57526..b3374f6 100644
--- a/include/linux/pipe_fs_i.h
+++ b/include/linux/pipe_fs_i.h
@@ -42,6 +42,7 @@ struct pipe_buffer {
  *	@fasync_readers: reader side fasync
  *	@fasync_writers: writer side fasync
  *	@bufs: the circular array of pipe buffers
+ *	@user: the user who created this pipe
  **/
 struct pipe_inode_info {
 	struct mutex mutex;
@@ -57,6 +58,7 @@ struct pipe_inode_info {
 	struct fasync_struct *fasync_readers;
 	struct fasync_struct *fasync_writers;
 	struct pipe_buffer *bufs;
+	struct user_struct *user;
 };
 
 /*
@@ -140,6 +142,8 @@ void pipe_unlock(struct pipe_inode_info *);
 void pipe_double_lock(struct pipe_inode_info *, struct pipe_inode_info *);
 
 extern unsigned int pipe_max_size, pipe_min_size;
+extern unsigned long pipe_user_pages_hard;
+extern unsigned long pipe_user_pages_soft;
 int pipe_proc_fn(struct ctl_table *, int, void __user *, size_t *, loff_t *);
 
 
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 4781332..7728941 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -671,6 +671,7 @@ struct user_struct {
 #endif
 	unsigned long locked_shm; /* How many pages of mlocked shm ? */
 	unsigned long unix_inflight;	/* How many files in flight in unix sockets */
+	atomic_long_t pipe_bufs;  /* how many pages are allocated in pipe buffers */
 
 #ifdef CONFIG_KEYS
 	struct key *uid_keyring;	/* UID specific keyring */
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 9469f4c..4fd49fe 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1632,6 +1632,20 @@ static struct ctl_table fs_table[] = {
 		.proc_handler	= &pipe_proc_fn,
 		.extra1		= &pipe_min_size,
 	},
+	{
+		.procname	= "pipe-user-pages-hard",
+		.data		= &pipe_user_pages_hard,
+		.maxlen		= sizeof(pipe_user_pages_hard),
+		.mode		= 0644,
+		.proc_handler	= proc_doulongvec_minmax,
+	},
+	{
+		.procname	= "pipe-user-pages-soft",
+		.data		= &pipe_user_pages_soft,
+		.maxlen		= sizeof(pipe_user_pages_soft),
+		.mode		= 0644,
+		.proc_handler	= proc_doulongvec_minmax,
+	},
 	{ }
 };
 
-- 
2.5.5



* [PATCH 3.10.y 06/12] netfilter: x_tables: don't move to non-existent next rule
  2016-07-15 18:26 [PATCH 3.10.y 01/12] PCI/ACPI: Fix _OSC ordering to allow PCIe hotplug use when available Charles (Chas) Williams
                   ` (4 preceding siblings ...)
  2016-07-15 19:08   ` Charles (Chas) Williams
@ 2016-07-15 19:08 ` Charles (Chas) Williams
  2016-07-15 19:08 ` [PATCH 3.10.y 07/12] cdc_ncm: do not call usbnet_link_change from cdc_ncm_bind Charles (Chas) Williams
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: Charles (Chas) Williams @ 2016-07-15 19:08 UTC (permalink / raw)
  To: stable; +Cc: Florian Westphal, Pablo Neira Ayuso, Chas Williams

From: Florian Westphal <fw@strlen.de>

commit f24e230d257af1ad7476c6e81a8dc3127a74204e upstream.

Ben Hawkes says:

 In the mark_source_chains function (net/ipv4/netfilter/ip_tables.c) it
 is possible for a user-supplied ipt_entry structure to have a large
 next_offset field. This field is not bounds checked prior to writing a
 counter value at the supplied offset.

Base chains enforce an absolute verdict.

User-defined chains are supposed to end with an unconditional return;
xtables userspace adds one automatically.

But if such a return is missing, we will move to a non-existent next rule.
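
A condensed sketch of the fallthru walk and the added bounds check
(simplified from mark_source_chains(); the struct and function names
here are illustrative, not the kernel's):

	struct entry_sketch {
		unsigned int next_offset;	/* byte offset of the next rule */
	};

	/* Returns 0 (reject the ruleset) if a user-controlled next_offset
	 * would move the walk to a non-existent rule past the end of the
	 * user-supplied blob of 'blob_size' bytes. */
	int walk_chain(const char *entry0, unsigned int blob_size)
	{
		unsigned int pos = 0;

		for (;;) {
			const struct entry_sketch *e =
				(const struct entry_sketch *)(entry0 + pos);
			unsigned int newpos = pos + e->next_offset;

			if (e->next_offset == 0)	/* no forward progress */
				break;
			if (newpos >= blob_size)	/* the added check */
				return 0;
			pos = newpos;
		}
		return 1;
	}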

CVE-2016-3134

Reported-by: Ben Hawkes <hawkes@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Chas Williams <3chas3@gmail.com>
---
 net/ipv4/netfilter/arp_tables.c | 8 +++++---
 net/ipv4/netfilter/ip_tables.c  | 4 ++++
 net/ipv6/netfilter/ip6_tables.c | 4 ++++
 3 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index c8abe31..5fa96b7 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -430,6 +430,8 @@ static int mark_source_chains(const struct xt_table_info *newinfo,
 				size = e->next_offset;
 				e = (struct arpt_entry *)
 					(entry0 + pos + size);
+				if (pos + size >= newinfo->size)
+					return 0;
 				e->counters.pcnt = pos;
 				pos += size;
 			} else {
@@ -452,6 +454,8 @@ static int mark_source_chains(const struct xt_table_info *newinfo,
 				} else {
 					/* ... this is a fallthru */
 					newpos = pos + e->next_offset;
+					if (newpos >= newinfo->size)
+						return 0;
 				}
 				e = (struct arpt_entry *)
 					(entry0 + newpos);
@@ -675,10 +679,8 @@ static int translate_table(struct xt_table_info *newinfo, void *entry0,
 		}
 	}
 
-	if (!mark_source_chains(newinfo, repl->valid_hooks, entry0)) {
-		duprintf("Looping hook\n");
+	if (!mark_source_chains(newinfo, repl->valid_hooks, entry0))
 		return -ELOOP;
-	}
 
 	/* Finally, each sanity check must pass */
 	i = 0;
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index 651c107..1822fba 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -512,6 +512,8 @@ mark_source_chains(const struct xt_table_info *newinfo,
 				size = e->next_offset;
 				e = (struct ipt_entry *)
 					(entry0 + pos + size);
+				if (pos + size >= newinfo->size)
+					return 0;
 				e->counters.pcnt = pos;
 				pos += size;
 			} else {
@@ -533,6 +535,8 @@ mark_source_chains(const struct xt_table_info *newinfo,
 				} else {
 					/* ... this is a fallthru */
 					newpos = pos + e->next_offset;
+					if (newpos >= newinfo->size)
+						return 0;
 				}
 				e = (struct ipt_entry *)
 					(entry0 + newpos);
diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c
index 89a4e4d..95072593 100644
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -522,6 +522,8 @@ mark_source_chains(const struct xt_table_info *newinfo,
 				size = e->next_offset;
 				e = (struct ip6t_entry *)
 					(entry0 + pos + size);
+				if (pos + size >= newinfo->size)
+					return 0;
 				e->counters.pcnt = pos;
 				pos += size;
 			} else {
@@ -543,6 +545,8 @@ mark_source_chains(const struct xt_table_info *newinfo,
 				} else {
 					/* ... this is a fallthru */
 					newpos = pos + e->next_offset;
+					if (newpos >= newinfo->size)
+						return 0;
 				}
 				e = (struct ip6t_entry *)
 					(entry0 + newpos);
-- 
2.5.5



* [PATCH 3.10.y 07/12] cdc_ncm: do not call usbnet_link_change from cdc_ncm_bind
  2016-07-15 18:26 [PATCH 3.10.y 01/12] PCI/ACPI: Fix _OSC ordering to allow PCIe hotplug use when available Charles (Chas) Williams
                   ` (5 preceding siblings ...)
  2016-07-15 19:08 ` [PATCH 3.10.y 06/12] netfilter: x_tables: don't move to non-existent next rule Charles (Chas) Williams
@ 2016-07-15 19:08 ` Charles (Chas) Williams
  2016-07-15 19:08 ` [PATCH 3.10.y 08/12] KEYS: potential uninitialized variable Charles (Chas) Williams
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: Charles (Chas) Williams @ 2016-07-15 19:08 UTC (permalink / raw)
  To: stable; +Cc: Bjørn Mork, David S. Miller, Charles (Chas) Williams

From: Bjørn Mork <bjorn@mork.no>

commit 4d06dd537f95683aba3651098ae288b7cbff8274 upstream.

usbnet_link_change will call schedule_work and should be
avoided if bind fails. Otherwise we end up with scheduled
work referring to a netdev which has gone away.

Instead of making the call conditional, we can just defer
it to usbnet_probe, using the driver_info flag made for
this purpose.

CVE-2016-3951

Fixes: 8a34b0ae8778 ("usbnet: cdc_ncm: apply usbnet_link_change")
Reported-by: Andrey Konovalov <andreyknvl@gmail.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
[ciwillia@brocade.com: backported to 3.10: adjusted context]
Signed-off-by: Charles (Chas) Williams <ciwillia@brocade.com>
---
 drivers/net/usb/cdc_ncm.c | 20 +++++---------------
 1 file changed, 5 insertions(+), 15 deletions(-)

diff --git a/drivers/net/usb/cdc_ncm.c b/drivers/net/usb/cdc_ncm.c
index 74581cb..6ee9665 100644
--- a/drivers/net/usb/cdc_ncm.c
+++ b/drivers/net/usb/cdc_ncm.c
@@ -598,24 +598,13 @@ EXPORT_SYMBOL_GPL(cdc_ncm_select_altsetting);
 
 static int cdc_ncm_bind(struct usbnet *dev, struct usb_interface *intf)
 {
-	int ret;
-
 	/* MBIM backwards compatible function? */
 	cdc_ncm_select_altsetting(dev, intf);
 	if (cdc_ncm_comm_intf_is_mbim(intf->cur_altsetting))
 		return -ENODEV;
 
 	/* NCM data altsetting is always 1 */
-	ret = cdc_ncm_bind_common(dev, intf, 1);
-
-	/*
-	 * We should get an event when network connection is "connected" or
-	 * "disconnected". Set network connection in "disconnected" state
-	 * (carrier is OFF) during attach, so the IP network stack does not
-	 * start IPv6 negotiation and more.
-	 */
-	usbnet_link_change(dev, 0, 0);
-	return ret;
+	return cdc_ncm_bind_common(dev, intf, 1);
 }
 
 static void cdc_ncm_align_tail(struct sk_buff *skb, size_t modulus, size_t remainder, size_t max)
@@ -1161,7 +1150,8 @@ static void cdc_ncm_disconnect(struct usb_interface *intf)
 
 static const struct driver_info cdc_ncm_info = {
 	.description = "CDC NCM",
-	.flags = FLAG_POINTTOPOINT | FLAG_NO_SETINT | FLAG_MULTI_PACKET,
+	.flags = FLAG_POINTTOPOINT | FLAG_NO_SETINT | FLAG_MULTI_PACKET
+			| FLAG_LINK_INTR,
 	.bind = cdc_ncm_bind,
 	.unbind = cdc_ncm_unbind,
 	.check_connect = cdc_ncm_check_connect,
@@ -1175,7 +1165,7 @@ static const struct driver_info cdc_ncm_info = {
 static const struct driver_info wwan_info = {
 	.description = "Mobile Broadband Network Device",
 	.flags = FLAG_POINTTOPOINT | FLAG_NO_SETINT | FLAG_MULTI_PACKET
-			| FLAG_WWAN,
+			| FLAG_LINK_INTR | FLAG_WWAN,
 	.bind = cdc_ncm_bind,
 	.unbind = cdc_ncm_unbind,
 	.check_connect = cdc_ncm_check_connect,
@@ -1189,7 +1179,7 @@ static const struct driver_info wwan_info = {
 static const struct driver_info wwan_noarp_info = {
 	.description = "Mobile Broadband Network Device (NO ARP)",
 	.flags = FLAG_POINTTOPOINT | FLAG_NO_SETINT | FLAG_MULTI_PACKET
-			| FLAG_WWAN | FLAG_NOARP,
+			| FLAG_LINK_INTR | FLAG_WWAN | FLAG_NOARP,
 	.bind = cdc_ncm_bind,
 	.unbind = cdc_ncm_unbind,
 	.check_connect = cdc_ncm_check_connect,
-- 
2.5.5



* [PATCH 3.10.y 08/12] KEYS: potential uninitialized variable
  2016-07-15 18:26 [PATCH 3.10.y 01/12] PCI/ACPI: Fix _OSC ordering to allow PCIe hotplug use when available Charles (Chas) Williams
                   ` (6 preceding siblings ...)
  2016-07-15 19:08 ` [PATCH 3.10.y 07/12] cdc_ncm: do not call usbnet_link_change from cdc_ncm_bind Charles (Chas) Williams
@ 2016-07-15 19:08 ` Charles (Chas) Williams
  2016-07-15 19:08 ` [PATCH 3.10.y 09/12] USB: usbfs: fix potential infoleak in devio Charles (Chas) Williams
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: Charles (Chas) Williams @ 2016-07-15 19:08 UTC (permalink / raw)
  To: stable
  Cc: Dan Carpenter, David Howells, Linus Torvalds, Charles (Chas) Williams

From: Dan Carpenter <dan.carpenter@oracle.com>

commit 38327424b40bcebe2de92d07312c89360ac9229a upstream.

If __key_link_begin() failed then "edit" ("prealloc" in the 3.10 code
below) would be uninitialized.  I've added a check to fix that.

This allows a random user to crash the kernel, though it's quite
difficult to achieve.  There are three ways it can be done, as the user
would have to cause an error to occur in __key_link():

 (1) Cause the kernel to run out of memory.  In practice, this is difficult
     to achieve without ENOMEM cropping up elsewhere and aborting the
     attempt.

 (2) Revoke the destination keyring between the keyring ID being looked up
     and it being tested for revocation.  In practice, this is difficult to
     time correctly because the KEYCTL_REJECT function can only be used
     from the request-key upcall process.  Further, users can only make use
     of what's in /sbin/request-key.conf, though this does include a
     rejection debugging test - which means that the destination keyring
     has to be the caller's session keyring in practice.

 (3) Have just enough key quota available to create a key, a new session
     keyring for the upcall and a link in the session keyring, but not then
     sufficient quota to create a link in the nominated destination keyring
     so that it fails with EDQUOT.

The bug can be triggered using option (3) above using something like the
following:

	echo 80 >/proc/sys/kernel/keys/root_maxbytes
	keyctl request2 user debug:fred negate @t

The above sets the quota to something much lower (80) to make the bug
easier to trigger, but this is dependent on the system.  Note also that
the name of the keyring created contains a random number that may be
between 1 and 10 characters in size, so may throw the test off by
changing the amount of quota used.

Assuming the failure occurs, something like the following will be seen:

	kfree_debugcheck: out of range ptr 6b6b6b6b6b6b6b68h
	------------[ cut here ]------------
	kernel BUG at ../mm/slab.c:2821!
	...
	RIP: 0010:[<ffffffff811600f9>] kfree_debugcheck+0x20/0x25
	RSP: 0018:ffff8804014a7de8  EFLAGS: 00010092
	RAX: 0000000000000034 RBX: 6b6b6b6b6b6b6b68 RCX: 0000000000000000
	RDX: 0000000000040001 RSI: 00000000000000f6 RDI: 0000000000000300
	RBP: ffff8804014a7df0 R08: 0000000000000001 R09: 0000000000000000
	R10: ffff8804014a7e68 R11: 0000000000000054 R12: 0000000000000202
	R13: ffffffff81318a66 R14: 0000000000000000 R15: 0000000000000001
	...
	Call Trace:
	  kfree+0xde/0x1bc
	  assoc_array_cancel_edit+0x1f/0x36
	  __key_link_end+0x55/0x63
	  key_reject_and_link+0x124/0x155
	  keyctl_reject_key+0xb6/0xe0
	  keyctl_negate_key+0x10/0x12
	  SyS_keyctl+0x9f/0xe7
	  do_syscall_64+0x63/0x13a
	  entry_SYSCALL64_slow_path+0x25/0x25

CVE-2016-4470
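
For context, the fixed flow looks roughly like this (a simplified
sketch of the 3.10 code paths, not verbatim source; only the final
test is taken from the diff below, and "prealloc" is initialized only
when __key_link_begin() returns 0):

	if (keyring)
		link_ret = __key_link_begin(keyring, key->type,
					    key->description, &prealloc);
	...
	mutex_unlock(&key_construction_mutex);

	if (keyring && link_ret == 0)	/* prealloc valid only on success */
		__key_link_end(keyring, key->type, prealloc);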

Fixes: f70e2e06196a ('KEYS: Do preallocation for __key_link()')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[ciwillia@brocade.com: backported to 3.10: adjusted context]
Signed-off-by: Charles (Chas) Williams <ciwillia@brocade.com>
---
 security/keys/key.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/security/keys/key.c b/security/keys/key.c
index 8fb7c7b..6595b2d 100644
--- a/security/keys/key.c
+++ b/security/keys/key.c
@@ -580,7 +580,7 @@ int key_reject_and_link(struct key *key,
 
 	mutex_unlock(&key_construction_mutex);
 
-	if (keyring)
+	if (keyring && link_ret == 0)
 		__key_link_end(keyring, key->type, prealloc);
 
 	/* wake up anyone waiting for a key to be constructed */
-- 
2.5.5



* [PATCH 3.10.y 09/12] USB: usbfs: fix potential infoleak in devio
  2016-07-15 18:26 [PATCH 3.10.y 01/12] PCI/ACPI: Fix _OSC ordering to allow PCIe hotplug use when available Charles (Chas) Williams
                   ` (7 preceding siblings ...)
  2016-07-15 19:08 ` [PATCH 3.10.y 08/12] KEYS: potential uninitialized variable Charles (Chas) Williams
@ 2016-07-15 19:08 ` Charles (Chas) Williams
  2016-07-15 19:08 ` [PATCH 3.10.y 10/12] mm: migrate dirty page without clear_page_dirty_for_io etc Charles (Chas) Williams
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: Charles (Chas) Williams @ 2016-07-15 19:08 UTC (permalink / raw)
  To: stable
  Cc: Kangjie Lu, Kangjie Lu, Greg Kroah-Hartman, Charles (Chas) Williams

From: Kangjie Lu <kangjielu@gmail.com>

commit 681fef8380eb818c0b845fca5d2ab1dcbab114ee upstream.

The stack object “ci” has a total size of 8 bytes. Its last 3 bytes
are padding bytes, which are not initialized and are leaked to
userland via “copy_to_user”.
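
The layout makes the leak concrete.  On common ABIs the structure (a
sketch of the 3.10 definition in include/uapi/linux/usbdevice_fs.h)
occupies 8 bytes for 5 bytes of data:

	struct usbdevfs_connectinfo {
		unsigned int devnum;	/* 4 bytes */
		unsigned char slow;	/* 1 byte + 3 trailing padding bytes */
	};

A designated initializer zeroes the named members but is not guaranteed
to zero the padding, which is why the fix below switches to memset()
before filling in the fields.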

CVE-2016-4482

Signed-off-by: Kangjie Lu <kjlu@gatech.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ciwillia@brocade.com: backported to 3.10: adjusted context]
Signed-off-by: Charles (Chas) Williams <ciwillia@brocade.com>
---
 drivers/usb/core/devio.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/usb/core/devio.c b/drivers/usb/core/devio.c
index 62e532f..cfce807 100644
--- a/drivers/usb/core/devio.c
+++ b/drivers/usb/core/devio.c
@@ -1106,10 +1106,11 @@ static int proc_getdriver(struct dev_state *ps, void __user *arg)
 
 static int proc_connectinfo(struct dev_state *ps, void __user *arg)
 {
-	struct usbdevfs_connectinfo ci = {
-		.devnum = ps->dev->devnum,
-		.slow = ps->dev->speed == USB_SPEED_LOW
-	};
+	struct usbdevfs_connectinfo ci;
+
+	memset(&ci, 0, sizeof(ci));
+	ci.devnum = ps->dev->devnum;
+	ci.slow = ps->dev->speed == USB_SPEED_LOW;
 
 	if (copy_to_user(arg, &ci, sizeof(ci)))
 		return -EFAULT;
-- 
2.5.5



* [PATCH 3.10.y 10/12] mm: migrate dirty page without clear_page_dirty_for_io etc
  2016-07-15 18:26 [PATCH 3.10.y 01/12] PCI/ACPI: Fix _OSC ordering to allow PCIe hotplug use when available Charles (Chas) Williams
                   ` (8 preceding siblings ...)
  2016-07-15 19:08 ` [PATCH 3.10.y 09/12] USB: usbfs: fix potential infoleak in devio Charles (Chas) Williams
@ 2016-07-15 19:08 ` Charles (Chas) Williams
  2016-07-15 19:08 ` [PATCH 3.10.y 11/12] printk: do cond_resched() between lines while outputting to consoles Charles (Chas) Williams
  2016-07-15 19:08 ` [PATCH 3.10.y 12/12] HID: hiddev: validate num_values for HIDIOCGUSAGES, HIDIOCSUSAGES commands Charles (Chas) Williams
  11 siblings, 0 replies; 18+ messages in thread
From: Charles (Chas) Williams @ 2016-07-15 19:08 UTC (permalink / raw)
  To: stable
  Cc: Hugh Dickins, Christoph Lameter, Kirill A. Shutemov,
	Rik van Riel, Vlastimil Babka, Davidlohr Bueso, Oleg Nesterov,
	Sasha Levin, Dmitry Vyukov, KOSAKI Motohiro, Andrew Morton,
	Linus Torvalds, Charles (Chas) Williams

From: Hugh Dickins <hughd@google.com>

commit 42cb14b110a5698ccf26ce59c4441722605a3743 upstream.

clear_page_dirty_for_io() has accumulated writeback and memcg subtleties
since v2.6.16 first introduced page migration; and the set_page_dirty()
which completed its migration of PageDirty later had to be moderated to
__set_page_dirty_nobuffers(); then PageSwapBacked had to skip that too.

No actual problems have been seen with this procedure recently, but if
you look into what the clear_page_dirty_for_io(page)+set_page_dirty(newpage)
sequence is actually achieving, it turns out to be nothing more than moving
the PageDirty flag, and its NR_FILE_DIRTY stat, from one zone to another.

It would be good to avoid a pile of irrelevant decrements and
increments, improper event counting, and an unnecessary descent of the
radix_tree under tree_lock (to set the PAGECACHE_TAG_DIRTY which
radix_tree_replace_slot() left in place anyway).

Do the NR_FILE_DIRTY movement, like the other stats movements, while
interrupts are still disabled in migrate_page_move_mapping(); and don't even
bother if the zone is the same.  Do the PageDirty movement there under
tree_lock too, where old page is frozen and newpage not yet visible:
bearing in mind that as soon as newpage becomes visible in radix_tree, an
un-page-locked set_page_dirty() might interfere (or perhaps that's just
not possible: anything doing so should already hold an additional
reference to the old page, preventing its migration; but play safe).

But we do still need to transfer PageDirty in migrate_page_copy(), for
those who don't go the mapping route through migrate_page_move_mapping().
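
Condensed from the diff below, the resulting locking pattern in
migrate_page_move_mapping() is:

	spin_lock_irq(&mapping->tree_lock);
	dirty = PageDirty(page);	/* refs frozen, newpage not yet visible */
	if (dirty) {
		ClearPageDirty(page);
		SetPageDirty(newpage);
	}
	radix_tree_replace_slot(pslot, newpage);
	page_unfreeze_refs(page, expected_count - 1);
	spin_unlock(&mapping->tree_lock);
	/* irqs deliberately left disabled for the stats updates */
	if (newzone != oldzone) {
		/* move NR_FILE_PAGES, NR_SHMEM and, where the mapping
		 * accounts dirty pages, NR_FILE_DIRTY to the new zone */
	}
	local_irq_enable();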

CVE-2016-3070

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[ciwillia@brocade.com: backported to 3.10: adjusted context]
Signed-off-by: Charles (Chas) Williams <ciwillia@brocade.com>
---
 mm/migrate.c | 51 +++++++++++++++++++++++++++++++--------------------
 1 file changed, 31 insertions(+), 20 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index a88c12f..a61500f 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -30,6 +30,7 @@
 #include <linux/mempolicy.h>
 #include <linux/vmalloc.h>
 #include <linux/security.h>
+#include <linux/backing-dev.h>
 #include <linux/memcontrol.h>
 #include <linux/syscalls.h>
 #include <linux/hugetlb.h>
@@ -311,6 +312,8 @@ static int migrate_page_move_mapping(struct address_space *mapping,
 		struct page *newpage, struct page *page,
 		struct buffer_head *head, enum migrate_mode mode)
 {
+	struct zone *oldzone, *newzone;
+	int dirty;
 	int expected_count = 0;
 	void **pslot;
 
@@ -321,6 +324,9 @@ static int migrate_page_move_mapping(struct address_space *mapping,
 		return MIGRATEPAGE_SUCCESS;
 	}
 
+	oldzone = page_zone(page);
+	newzone = page_zone(newpage);
+
 	spin_lock_irq(&mapping->tree_lock);
 
 	pslot = radix_tree_lookup_slot(&mapping->page_tree,
@@ -361,6 +367,13 @@ static int migrate_page_move_mapping(struct address_space *mapping,
 		set_page_private(newpage, page_private(page));
 	}
 
+	/* Move dirty while page refs frozen and newpage not yet exposed */
+	dirty = PageDirty(page);
+	if (dirty) {
+		ClearPageDirty(page);
+		SetPageDirty(newpage);
+	}
+
 	radix_tree_replace_slot(pslot, newpage);
 
 	/*
@@ -370,6 +383,9 @@ static int migrate_page_move_mapping(struct address_space *mapping,
 	 */
 	page_unfreeze_refs(page, expected_count - 1);
 
+	spin_unlock(&mapping->tree_lock);
+	/* Leave irq disabled to prevent preemption while updating stats */
+
 	/*
 	 * If moved to a different zone then also account
 	 * the page for that zone. Other VM counters will be
@@ -380,13 +396,19 @@ static int migrate_page_move_mapping(struct address_space *mapping,
 	 * via NR_FILE_PAGES and NR_ANON_PAGES if they
 	 * are mapped to swap space.
 	 */
-	__dec_zone_page_state(page, NR_FILE_PAGES);
-	__inc_zone_page_state(newpage, NR_FILE_PAGES);
-	if (!PageSwapCache(page) && PageSwapBacked(page)) {
-		__dec_zone_page_state(page, NR_SHMEM);
-		__inc_zone_page_state(newpage, NR_SHMEM);
+	if (newzone != oldzone) {
+		__dec_zone_state(oldzone, NR_FILE_PAGES);
+		__inc_zone_state(newzone, NR_FILE_PAGES);
+		if (PageSwapBacked(page) && !PageSwapCache(page)) {
+			__dec_zone_state(oldzone, NR_SHMEM);
+			__inc_zone_state(newzone, NR_SHMEM);
+		}
+		if (dirty && mapping_cap_account_dirty(mapping)) {
+			__dec_zone_state(oldzone, NR_FILE_DIRTY);
+			__inc_zone_state(newzone, NR_FILE_DIRTY);
+		}
 	}
-	spin_unlock_irq(&mapping->tree_lock);
+	local_irq_enable();
 
 	return MIGRATEPAGE_SUCCESS;
 }
@@ -460,20 +482,9 @@ void migrate_page_copy(struct page *newpage, struct page *page)
 	if (PageMappedToDisk(page))
 		SetPageMappedToDisk(newpage);
 
-	if (PageDirty(page)) {
-		clear_page_dirty_for_io(page);
-		/*
-		 * Want to mark the page and the radix tree as dirty, and
-		 * redo the accounting that clear_page_dirty_for_io undid,
-		 * but we can't use set_page_dirty because that function
-		 * is actually a signal that all of the page has become dirty.
-		 * Whereas only part of our page may be dirty.
-		 */
-		if (PageSwapBacked(page))
-			SetPageDirty(newpage);
-		else
-			__set_page_dirty_nobuffers(newpage);
- 	}
+	/* Move dirty on pages not done by migrate_page_move_mapping() */
+	if (PageDirty(page))
+		SetPageDirty(newpage);
 
 	mlock_migrate_page(newpage, page);
 	ksm_migrate_page(newpage, page);
-- 
2.5.5



* [PATCH 3.10.y 11/12] printk: do cond_resched() between lines while outputting to consoles
  2016-07-15 18:26 [PATCH 3.10.y 01/12] PCI/ACPI: Fix _OSC ordering to allow PCIe hotplug use when available Charles (Chas) Williams
                   ` (9 preceding siblings ...)
  2016-07-15 19:08 ` [PATCH 3.10.y 10/12] mm: migrate dirty page without clear_page_dirty_for_io etc Charles (Chas) Williams
@ 2016-07-15 19:08 ` Charles (Chas) Williams
  2016-07-15 19:08 ` [PATCH 3.10.y 12/12] HID: hiddev: validate num_values for HIDIOCGUSAGES, HIDIOCSUSAGES commands Charles (Chas) Williams
  11 siblings, 0 replies; 18+ messages in thread
From: Charles (Chas) Williams @ 2016-07-15 19:08 UTC (permalink / raw)
  To: stable
  Cc: Tejun Heo, Dave Jones, Kyle McMartin, Andrew Morton,
	Linus Torvalds, Chas Williams

From: Tejun Heo <tj@kernel.org>

commit 8d91f8b15361dfb438ab6eb3b319e2ded43458ff upstream.

@console_may_schedule tracks whether console_sem was acquired through
lock or trylock.  If the former, we're inside a sleepable context and
console_conditional_schedule() performs cond_resched().  This allows
console drivers which use console_lock for synchronization to yield
while performing time-consuming operations such as scrolling.

However, the actual console outputting is performed while holding
irq-safe logbuf_lock, so console_unlock() clears @console_may_schedule
before it starts outputting lines.  Also, only a few drivers call
console_conditional_schedule() to begin with.  This means that when a
lot of lines need to be output by console_unlock(), for example on a
console registration, the task doing console_unlock() may not yield for
a long time on a non-preemptible kernel.

If this happens with a slow console device, for example a serial
console, the outputting task may occupy the cpu for a very long time,
long enough to trigger softlockup and/or RCU stall warnings, which in
turn pile up more messages, sometimes enough to trigger the next cycle
of warnings and incapacitate the system.

Fix it by making console_unlock() insert cond_resched() between lines if
@console_may_schedule.
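
Condensed from the diff below, the main loop of console_unlock() then
becomes:

	do_cond_resched = console_may_schedule;
	console_may_schedule = 0;

	for (;;) {
		/* ... pick the next record under logbuf_lock ... */
		call_console_drivers(level, text, len);
		start_critical_timings();
		local_irq_restore(flags);

		if (do_cond_resched)	/* console_sem taken via lock, not trylock */
			cond_resched();
	}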

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Calvin Owens <calvinowens@fb.com>
Acked-by: Jan Kara <jack@suse.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Kyle McMartin <kyle@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[ciwillia@brocade.com: adjust context for 3.10.y]
Signed-off-by: Chas Williams <ciwillia@brocade.com>
---
 include/linux/console.h |  1 +
 kernel/panic.c          |  3 +++
 kernel/printk.c         | 35 ++++++++++++++++++++++++++++++++++-
 3 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/include/linux/console.h b/include/linux/console.h
index 73bab0f..6877ffc 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -153,6 +153,7 @@ extern int console_trylock(void);
 extern void console_unlock(void);
 extern void console_conditional_schedule(void);
 extern void console_unblank(void);
+extern void console_flush_on_panic(void);
 extern struct tty_driver *console_device(int *);
 extern void console_stop(struct console *);
 extern void console_start(struct console *);
diff --git a/kernel/panic.c b/kernel/panic.c
index 167ec097..d3d74c4 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -22,6 +22,7 @@
 #include <linux/sysrq.h>
 #include <linux/init.h>
 #include <linux/nmi.h>
+#include <linux/console.h>
 
 #define PANIC_TIMER_STEP 100
 #define PANIC_BLINK_SPD 18
@@ -128,6 +129,8 @@ void panic(const char *fmt, ...)
 
 	bust_spinlocks(0);
 
+	console_flush_on_panic();
+
 	if (!panic_blink)
 		panic_blink = no_blink;
 
diff --git a/kernel/printk.c b/kernel/printk.c
index fd0154a..ee8f6be 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -2033,13 +2033,24 @@ void console_unlock(void)
 	static u64 seen_seq;
 	unsigned long flags;
 	bool wake_klogd = false;
-	bool retry;
+	bool do_cond_resched, retry;
 
 	if (console_suspended) {
 		up(&console_sem);
 		return;
 	}
 
+	/*
+	 * Console drivers are called under logbuf_lock, so
+	 * @console_may_schedule should be cleared before; however, we may
+	 * end up dumping a lot of lines, for example, if called from
+	 * console registration path, and should invoke cond_resched()
+	 * between lines if allowable.  Not doing so can cause a very long
+	 * scheduling stall on a slow console leading to RCU stall and
+	 * softlockup warnings which exacerbate the issue with more
+	 * messages practically incapacitating the system.
+	 */
+	do_cond_resched = console_may_schedule;
 	console_may_schedule = 0;
 
 	/* flush buffered message fragment immediately to console */
@@ -2096,6 +2107,9 @@ skip:
 		call_console_drivers(level, text, len);
 		start_critical_timings();
 		local_irq_restore(flags);
+
+		if (do_cond_resched)
+			cond_resched();
 	}
 	console_locked = 0;
 	mutex_release(&console_lock_dep_map, 1, _RET_IP_);
@@ -2164,6 +2178,25 @@ void console_unblank(void)
 	console_unlock();
 }
 
+/**
+ * console_flush_on_panic - flush console content on panic
+ *
+ * Immediately output all pending messages no matter what.
+ */
+void console_flush_on_panic(void)
+{
+	/*
+	 * If someone else is holding the console lock, trylock will fail
+	 * and may_schedule may be set.  Ignore and proceed to unlock so
+	 * that messages are flushed out.  As this can be called from any
+	 * context and we don't want to get preempted while flushing,
+	 * ensure may_schedule is cleared.
+	 */
+	console_trylock();
+	console_may_schedule = 0;
+	console_unlock();
+}
+
 /*
  * Return the console tty driver structure and its associated index
  */
-- 
2.5.5



* [PATCH 3.10.y 12/12] HID: hiddev: validate num_values for HIDIOCGUSAGES, HIDIOCSUSAGES commands
  2016-07-15 18:26 [PATCH 3.10.y 01/12] PCI/ACPI: Fix _OSC ordering to allow PCIe hotplug use when available Charles (Chas) Williams
                   ` (10 preceding siblings ...)
  2016-07-15 19:08 ` [PATCH 3.10.y 11/12] printk: do cond_resched() between lines while outputting to consoles Charles (Chas) Williams
@ 2016-07-15 19:08 ` Charles (Chas) Williams
  11 siblings, 0 replies; 18+ messages in thread
From: Charles (Chas) Williams @ 2016-07-15 19:08 UTC (permalink / raw)
  To: stable; +Cc: Scott Bauer, Jiri Kosina, Chas Williams

From: Scott Bauer <sbauer@plzdonthack.me>

commit 93a2001bdfd5376c3dc2158653034c20392d15c5 upstream.

This patch validates the num_values parameter from userland during the
HIDIOCGUSAGES and HIDIOCSUSAGES commands. Previously, if the report id was set
to HID_REPORT_ID_UNKNOWN, we would fail to validate the num_values parameter,
leading to a heap overflow.
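
For context, the overflow is reachable through the HIDIOCSUSAGES loop,
which uses the userland-supplied count as its bound (a simplified
sketch, not the verbatim 3.10 source):

	case HIDIOCSUSAGES:
		for (i = 0; i < uref_multi->num_values; i++)
			field->value[uref->usage_index + i] =
				uref_multi->values[i];	/* out of bounds when
							   num_values is unchecked */
		goto goodreturn;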

CVE-2016-5829

Cc: stable@vger.kernel.org
Signed-off-by: Scott Bauer <sbauer@plzdonthack.me>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Chas Williams <ciwillia@brocade.com>
---
 drivers/hid/usbhid/hiddev.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/hid/usbhid/hiddev.c b/drivers/hid/usbhid/hiddev.c
index 2f1ddca..700145b 100644
--- a/drivers/hid/usbhid/hiddev.c
+++ b/drivers/hid/usbhid/hiddev.c
@@ -516,13 +516,13 @@ static noinline int hiddev_ioctl_usage(struct hiddev *hiddev, unsigned int cmd,
 					goto inval;
 			} else if (uref->usage_index >= field->report_count)
 				goto inval;
-
-			else if ((cmd == HIDIOCGUSAGES || cmd == HIDIOCSUSAGES) &&
-				 (uref_multi->num_values > HID_MAX_MULTI_USAGES ||
-				  uref->usage_index + uref_multi->num_values > field->report_count))
-				goto inval;
 		}
 
+		if ((cmd == HIDIOCGUSAGES || cmd == HIDIOCSUSAGES) &&
+		    (uref_multi->num_values > HID_MAX_MULTI_USAGES ||
+		     uref->usage_index + uref_multi->num_values > field->report_count))
+			goto inval;
+
 		switch (cmd) {
 		case HIDIOCGUSAGE:
 			uref->value = field->value[uref->usage_index];
-- 
2.5.5



* Re: [PATCH 3.10.y 04/12] x86/mm: Add barriers and document switch_mm()-vs-flush synchronization
  2016-07-15 18:26   ` Charles (Chas) Williams
  (?)
@ 2016-07-16  9:15   ` Willy Tarreau
  2016-07-16 17:04       ` Charles (Chas) Williams
  -1 siblings, 1 reply; 18+ messages in thread
From: Willy Tarreau @ 2016-07-16  9:15 UTC (permalink / raw)
  To: Charles (Chas) Williams
  Cc: stable, Andy Lutomirski, Andrew Morton, Andy Lutomirski,
	Borislav Petkov, Brian Gerst, Dave Hansen, Denys Vlasenko,
	H. Peter Anvin, Linus Torvalds, Peter Zijlstra, Rik van Riel,
	Thomas Gleixner, linux-mm, Ingo Molnar, Luis Henriques

Hi Chas,

On Fri, Jul 15, 2016 at 02:26:26PM -0400, Charles (Chas) Williams wrote:
> From: Andy Lutomirski <luto@kernel.org>
> 
> commit 71b3c126e61177eb693423f2e18a1914205b165e upstream.
> 
> When switch_mm() activates a new PGD, it also sets a bit that
> tells other CPUs that the PGD is in use so that TLB flush IPIs
> will be sent.  In order for that to work correctly, the bit
> needs to be visible prior to loading the PGD and therefore
> starting to fill the local TLB.
> 
> Document all the barriers that make this work correctly and add
> a couple that were missing.
> 
> CVE-2016-2069

I'm fine with queuing these patches for 3.10, but patches 4, 9 and 12
of your series are not in 3.14, and I only apply patches to 3.10 if
they are already present in 3.14 (or if there's a good reason of course).
Please could you check that you already submitted them ? If so I'll just
wait for them to pop up there. It's important for us to ensure that users
upgrading from extended LTS kernels to normal LTS kernels are never hit
by a bug that was previously fixed in the older one and not yet in the
newer one.

Thanks,
Willy



* Re: [PATCH 3.10.y 04/12] x86/mm: Add barriers and document switch_mm()-vs-flush synchronization
  2016-07-16  9:15   ` Willy Tarreau
@ 2016-07-16 17:04       ` Charles (Chas) Williams
  0 siblings, 0 replies; 18+ messages in thread
From: Charles (Chas) Williams @ 2016-07-16 17:04 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: stable, Andy Lutomirski, Andrew Morton, Andy Lutomirski,
	Borislav Petkov, Brian Gerst, Dave Hansen, Denys Vlasenko,
	H. Peter Anvin, Linus Torvalds, Peter Zijlstra, Rik van Riel,
	Thomas Gleixner, linux-mm, Ingo Molnar, Luis Henriques

I didn't submit these for 3.14 -- I will do so on Monday.

On 07/16/2016 05:15 AM, Willy Tarreau wrote:
> Hi Chas,
>
> On Fri, Jul 15, 2016 at 02:26:26PM -0400, Charles (Chas) Williams wrote:
>> From: Andy Lutomirski <luto@kernel.org>
>>
>> commit 71b3c126e61177eb693423f2e18a1914205b165e upstream.
>>
>> When switch_mm() activates a new PGD, it also sets a bit that
>> tells other CPUs that the PGD is in use so that TLB flush IPIs
>> will be sent.  In order for that to work correctly, the bit
>> needs to be visible prior to loading the PGD and therefore
>> starting to fill the local TLB.
>>
>> Document all the barriers that make this work correctly and add
>> a couple that were missing.
>>
>> CVE-2016-2069
>
> I'm fine with queuing these patches for 3.10, but patches 4, 9 and 12
> of your series are not in 3.14, and I only apply patches to 3.10 if
> they are already present in 3.14 (or if there's a good reason of course).
> Please could you check whether you have already submitted them? If so I'll just
> wait for them to pop up there. It's important for us to ensure that users
> upgrading from extended LTS kernels to normal LTS kernels are never hit
> by a bug that was previously fixed in the older one and not yet in the
> newer one.
>
> Thanks,
> Willy
>


