* [PATCH 3.14.y 1/9] udp: properly support MSG_PEEK with truncated buffers
@ 2016-07-18 20:53 Charles (Chas) Williams
  2016-07-18 20:53 ` [PATCH 3.14.y 2/9] USB: fix invalid memory access in hub_activate() Charles (Chas) Williams
                   ` (8 more replies)
  0 siblings, 9 replies; 17+ messages in thread
From: Charles (Chas) Williams @ 2016-07-18 20:53 UTC (permalink / raw)
  To: stable
  Cc: Eric Dumazet, David S. Miller, Luis Henriques, Charles (Chas) Williams

From: Eric Dumazet <edumazet@google.com>

commit 197c949e7798fbf28cfadc69d9ca0c2abbf93191 upstream.

The backport of this upstream commit into stable kernels:
89c22d8c3b27 ("net: Fix skb csum races when peeking")
exposed a bug in the UDP stack's MSG_PEEK support when the user provides
a buffer smaller than the skb payload.

In this case,
skb_copy_and_csum_datagram_iovec(skb, sizeof(struct udphdr),
                                 msg->msg_iov);
returns -EFAULT.

This bug does not happen in upstream kernels, since Al Viro did a great
job of replacing this with:
skb_copy_and_csum_datagram_msg(skb, sizeof(struct udphdr), msg);
This variant is safe with short buffers.

For the time being, instead of reverting Herbert Xu's patch and adding back
the invalid skb->ip_summed changes, simply store the result of
udp_lib_checksum_complete() so that we avoid computing the checksum a
second time, and avoid the problematic
skb_copy_and_csum_datagram_iovec() call.

This patch can be applied on recent kernels as it avoids a double
checksumming, then backported to stable kernels as a bug fix.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
[ luis: backported to 3.16: adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Signed-off-by: Charles (Chas) Williams <ciwillia@brocade.com>
---
 net/ipv4/udp.c | 6 ++++--
 net/ipv6/udp.c | 6 ++++--
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index b0fe135..f305c4b 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1233,6 +1233,7 @@ int udp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 	int peeked, off = 0;
 	int err;
 	int is_udplite = IS_UDPLITE(sk);
+	bool checksum_valid = false;
 	bool slow;
 
 	if (flags & MSG_ERRQUEUE)
@@ -1258,11 +1259,12 @@ try_again:
 	 */
 
 	if (copied < ulen || UDP_SKB_CB(skb)->partial_cov) {
-		if (udp_lib_checksum_complete(skb))
+		checksum_valid = !udp_lib_checksum_complete(skb);
+		if (!checksum_valid)
 			goto csum_copy_err;
 	}
 
-	if (skb_csum_unnecessary(skb))
+	if (checksum_valid || skb_csum_unnecessary(skb))
 		err = skb_copy_datagram_iovec(skb, sizeof(struct udphdr),
 					      msg->msg_iov, copied);
 	else {
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index d2013c7..639401c 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -389,6 +389,7 @@ int udpv6_recvmsg(struct kiocb *iocb, struct sock *sk,
 	int peeked, off = 0;
 	int err;
 	int is_udplite = IS_UDPLITE(sk);
+	bool checksum_valid = false;
 	int is_udp4;
 	bool slow;
 
@@ -420,11 +421,12 @@ try_again:
 	 */
 
 	if (copied < ulen || UDP_SKB_CB(skb)->partial_cov) {
-		if (udp_lib_checksum_complete(skb))
+		checksum_valid = !udp_lib_checksum_complete(skb);
+		if (!checksum_valid)
 			goto csum_copy_err;
 	}
 
-	if (skb_csum_unnecessary(skb))
+	if (checksum_valid || skb_csum_unnecessary(skb))
 		err = skb_copy_datagram_iovec(skb, sizeof(struct udphdr),
 					      msg->msg_iov, copied);
 	else {
-- 
2.5.5
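
The idea behind the udp patch above can be sketched in userspace C: when the caller's buffer is shorter than the payload, the checksum is verified once up front and the verdict is remembered, so the copy step does not need to checksum again. This is only an illustration under stated assumptions. `toy_pkt`, `toy_sum()` and `toy_recv()` are hypothetical names, the XOR "checksum" is a stand-in for the real one, and the kernel's fused copy-and-checksum path is modelled here as a separate verify followed by a plain copy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an skb; none of these names exist in the kernel. */
struct toy_pkt {
	const uint8_t *data;
	size_t len;
	uint8_t csum;           /* toy checksum: XOR of every payload byte */
	unsigned int csum_runs; /* how often the checksum was computed */
};

static uint8_t toy_sum(struct toy_pkt *p)
{
	uint8_t s = 0;

	for (size_t i = 0; i < p->len; i++)
		s ^= p->data[i];
	p->csum_runs++;
	return s;
}

/* Copy up to buflen bytes; returns the count copied or -1 on a bad checksum. */
static int toy_recv(struct toy_pkt *p, uint8_t *buf, size_t buflen)
{
	size_t copied = buflen < p->len ? buflen : p->len;
	bool checksum_valid = false;

	assert(buf != NULL);

	if (copied < p->len) {
		/* Short buffer: verify the whole payload first, keep the verdict. */
		checksum_valid = (toy_sum(p) == p->csum);
		if (!checksum_valid)
			return -1;
	}

	/* Verify here only if nobody has done so already. */
	if (!checksum_valid && toy_sum(p) != p->csum)
		return -1;

	memcpy(buf, p->data, copied);
	return (int)copied;
}
```

With a 4-byte payload and a 2-byte buffer, `toy_recv()` computes the checksum exactly once; without the remembered `checksum_valid` it would run twice on the truncated path.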



* [PATCH 3.14.y 2/9] USB: fix invalid memory access in hub_activate()
  2016-07-18 20:53 [PATCH 3.14.y 1/9] udp: properly support MSG_PEEK with truncated buffers Charles (Chas) Williams
@ 2016-07-18 20:53 ` Charles (Chas) Williams
  2016-07-18 20:53 ` [PATCH 3.14.y 3/9] cdc_ncm: do not call usbnet_link_change from cdc_ncm_bind Charles (Chas) Williams
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Charles (Chas) Williams @ 2016-07-18 20:53 UTC (permalink / raw)
  To: stable
  Cc: Alan Stern, Greg Kroah-Hartman, Luis Henriques, Charles (Chas) Williams

From: Alan Stern <stern@rowland.harvard.edu>

commit e50293ef9775c5f1cf3fcc093037dd6a8c5684ea upstream.

Commit 8520f38099cc ("USB: change hub initialization sleeps to
delayed_work") changed the hub_activate() routine to make part of it
run in a workqueue.  However, the commit failed to take a reference to
the usb_hub structure or to lock the hub interface while doing so.  As
a result, if a hub is plugged in and quickly unplugged before the work
routine can run, the routine will try to access memory that has been
deallocated.  Or, if the hub is unplugged while the routine is
running, the memory may be deallocated while it is in active use.

This patch fixes the problem by taking a reference to the usb_hub at
the start of hub_activate() and releasing it at the end (when the work
is finished), and by locking the hub interface while the work routine
is running.  It also adds a check at the start of the routine to see
if the hub has already been disconnected, in which case nothing should be
done.

CVE-2015-8816

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Alexandru Cornea <alexandru.cornea@intel.com>
Tested-by: Alexandru Cornea <alexandru.cornea@intel.com>
Fixes: 8520f38099cc ("USB: change hub initialization sleeps to delayed_work")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ luis: backported to 3.16:
  - Added forward declaration of hub_release() which mainline had with commit
    32a6958998c5 ("usb: hub: convert khubd into workqueue") ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Signed-off-by: Charles (Chas) Williams <ciwillia@brocade.com>
---
 drivers/usb/core/hub.c | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index dcee3f0..f46ac92 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -106,6 +106,7 @@ EXPORT_SYMBOL_GPL(ehci_cf_port_reset_rwsem);
 #define HUB_DEBOUNCE_STEP	  25
 #define HUB_DEBOUNCE_STABLE	 100
 
+static void hub_release(struct kref *kref);
 static int usb_reset_and_verify_device(struct usb_device *udev);
 
 static inline char *portspeed(struct usb_hub *hub, int portstatus)
@@ -1023,10 +1024,20 @@ static void hub_activate(struct usb_hub *hub, enum hub_activation_type type)
 	unsigned delay;
 
 	/* Continue a partial initialization */
-	if (type == HUB_INIT2)
-		goto init2;
-	if (type == HUB_INIT3)
+	if (type == HUB_INIT2 || type == HUB_INIT3) {
+		device_lock(hub->intfdev);
+
+		/* Was the hub disconnected while we were waiting? */
+		if (hub->disconnected) {
+			device_unlock(hub->intfdev);
+			kref_put(&hub->kref, hub_release);
+			return;
+		}
+		if (type == HUB_INIT2)
+			goto init2;
 		goto init3;
+	}
+	kref_get(&hub->kref);
 
 	/* The superspeed hub except for root hub has to use Hub Depth
 	 * value as an offset into the route string to locate the bits
@@ -1220,6 +1231,7 @@ static void hub_activate(struct usb_hub *hub, enum hub_activation_type type)
 			PREPARE_DELAYED_WORK(&hub->init_work, hub_init_func3);
 			schedule_delayed_work(&hub->init_work,
 					msecs_to_jiffies(delay));
+			device_unlock(hub->intfdev);
 			return;		/* Continues at init3: below */
 		} else {
 			msleep(delay);
@@ -1240,6 +1252,11 @@ static void hub_activate(struct usb_hub *hub, enum hub_activation_type type)
 	/* Allow autosuspend if it was suppressed */
 	if (type <= HUB_INIT3)
 		usb_autopm_put_interface_async(to_usb_interface(hub->intfdev));
+
+	if (type == HUB_INIT2 || type == HUB_INIT3)
+		device_unlock(hub->intfdev);
+
+	kref_put(&hub->kref, hub_release);
 }
 
 /* Implement the continuations for the delays above */
-- 
2.5.5
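
The lifetime rule that the hub_activate() patch above enforces can be modelled in a few lines of single-threaded C: the first stage pins the object before deferring work, and the deferred stage checks for disconnection and always drops the pin. This is a rough sketch, not the kernel code. All `toy_*` names are hypothetical, a plain integer replaces `struct kref`, and the interface locking that the real patch adds is left out because nothing runs concurrently here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of a hub object with a reference count. */
struct toy_hub {
	int refcount;
	bool disconnected;
	int ports_initialised;
};

static int toy_hubs_released; /* counts how many hubs were freed */

static struct toy_hub *toy_hub_new(void)
{
	struct toy_hub *hub = calloc(1, sizeof(*hub));

	assert(hub != NULL);
	hub->refcount = 1; /* reference owned by the driver binding */
	return hub;
}

static void toy_hub_put(struct toy_hub *hub)
{
	assert(hub->refcount > 0);
	if (--hub->refcount == 0) {
		toy_hubs_released++;
		free(hub);
	}
}

/* First stage: pin the hub before handing it to deferred work. */
static void toy_activate_start(struct toy_hub *hub)
{
	hub->refcount++;
}

/* Deferred stage: returns true if initialisation actually ran. */
static bool toy_activate_finish(struct toy_hub *hub)
{
	bool ran = false;

	if (!hub->disconnected) {
		hub->ports_initialised = 1;
		ran = true;
	}
	toy_hub_put(hub); /* drop the reference taken in toy_activate_start() */
	return ran;
}

/* Unplug: mark the hub dead and drop the binding's reference. */
static void toy_disconnect(struct toy_hub *hub)
{
	hub->disconnected = true;
	toy_hub_put(hub);
}
```

Calling `toy_activate_start()`, then `toy_disconnect()`, then `toy_activate_finish()` leaves the hub alive until the deferred stage has finished with it; without the extra reference the last call would touch freed memory.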



* [PATCH 3.14.y 3/9] cdc_ncm: do not call usbnet_link_change from cdc_ncm_bind
  2016-07-18 20:53 [PATCH 3.14.y 1/9] udp: properly support MSG_PEEK with truncated buffers Charles (Chas) Williams
  2016-07-18 20:53 ` [PATCH 3.14.y 2/9] USB: fix invalid memory access in hub_activate() Charles (Chas) Williams
@ 2016-07-18 20:53 ` Charles (Chas) Williams
  2016-08-14 14:43   ` Greg KH
  2016-07-18 20:53 ` [PATCH 3.14.y 4/9] KEYS: potential uninitialized variable Charles (Chas) Williams
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Charles (Chas) Williams @ 2016-07-18 20:53 UTC (permalink / raw)
  To: stable; +Cc: Bjørn Mork, David S. Miller, Charles (Chas) Williams

From: Bjørn Mork <bjorn@mork.no>

commit 4d06dd537f95683aba3651098ae288b7cbff8274 upstream.

usbnet_link_change will call schedule_work and should be
avoided if bind is failing. Otherwise we will end up with
scheduled work referring to a netdev which has gone away.

Instead of making the call conditional, we can just defer
it to usbnet_probe, using the driver_info flag made for
this purpose.

CVE-2016-3951

Fixes: 8a34b0ae8778 ("usbnet: cdc_ncm: apply usbnet_link_change")
Reported-by: Andrey Konovalov <andreyknvl@gmail.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
[ciwillia@brocade.com: backported to 3.14: adjusted context]
Signed-off-by: Charles (Chas) Williams <ciwillia@brocade.com>
---
 drivers/net/usb/cdc_ncm.c | 20 +++++---------------
 1 file changed, 5 insertions(+), 15 deletions(-)

diff --git a/drivers/net/usb/cdc_ncm.c b/drivers/net/usb/cdc_ncm.c
index c663722..1f24dfc 100644
--- a/drivers/net/usb/cdc_ncm.c
+++ b/drivers/net/usb/cdc_ncm.c
@@ -564,24 +564,13 @@ EXPORT_SYMBOL_GPL(cdc_ncm_select_altsetting);
 
 static int cdc_ncm_bind(struct usbnet *dev, struct usb_interface *intf)
 {
-	int ret;
-
 	/* MBIM backwards compatible function? */
 	cdc_ncm_select_altsetting(dev, intf);
 	if (cdc_ncm_comm_intf_is_mbim(intf->cur_altsetting))
 		return -ENODEV;
 
 	/* NCM data altsetting is always 1 */
-	ret = cdc_ncm_bind_common(dev, intf, 1);
-
-	/*
-	 * We should get an event when network connection is "connected" or
-	 * "disconnected". Set network connection in "disconnected" state
-	 * (carrier is OFF) during attach, so the IP network stack does not
-	 * start IPv6 negotiation and more.
-	 */
-	usbnet_link_change(dev, 0, 0);
-	return ret;
+	return cdc_ncm_bind_common(dev, intf, 1);
 }
 
 static void cdc_ncm_align_tail(struct sk_buff *skb, size_t modulus, size_t remainder, size_t max)
@@ -1110,7 +1099,8 @@ static int cdc_ncm_check_connect(struct usbnet *dev)
 
 static const struct driver_info cdc_ncm_info = {
 	.description = "CDC NCM",
-	.flags = FLAG_POINTTOPOINT | FLAG_NO_SETINT | FLAG_MULTI_PACKET,
+	.flags = FLAG_POINTTOPOINT | FLAG_NO_SETINT | FLAG_MULTI_PACKET
+			| FLAG_LINK_INTR,
 	.bind = cdc_ncm_bind,
 	.unbind = cdc_ncm_unbind,
 	.check_connect = cdc_ncm_check_connect,
@@ -1124,7 +1114,7 @@ static const struct driver_info cdc_ncm_info = {
 static const struct driver_info wwan_info = {
 	.description = "Mobile Broadband Network Device",
 	.flags = FLAG_POINTTOPOINT | FLAG_NO_SETINT | FLAG_MULTI_PACKET
-			| FLAG_WWAN,
+			| FLAG_LINK_INTR | FLAG_WWAN,
 	.bind = cdc_ncm_bind,
 	.unbind = cdc_ncm_unbind,
 	.check_connect = cdc_ncm_check_connect,
@@ -1138,7 +1128,7 @@ static const struct driver_info wwan_info = {
 static const struct driver_info wwan_noarp_info = {
 	.description = "Mobile Broadband Network Device (NO ARP)",
 	.flags = FLAG_POINTTOPOINT | FLAG_NO_SETINT | FLAG_MULTI_PACKET
-			| FLAG_WWAN | FLAG_NOARP,
+			| FLAG_LINK_INTR | FLAG_WWAN | FLAG_NOARP,
 	.bind = cdc_ncm_bind,
 	.unbind = cdc_ncm_unbind,
 	.check_connect = cdc_ncm_check_connect,
-- 
2.5.5
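
The cdc_ncm change above moves the link-state notification out of bind and lets the probe path issue it, keyed on a driver flag, only once bind has succeeded. A minimal userspace sketch of that ordering follows. `toy_probe()`, `toy_link_change()` and `TOY_FLAG_LINK_INTR` are hypothetical stand-ins, and the assumption that the probe path checks the flag after a successful bind is inferred from the commit message rather than shown in the diff.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_FLAG_LINK_INTR 0x1u /* hypothetical stand-in for FLAG_LINK_INTR */

struct toy_driver_info {
	unsigned int flags;
	int (*bind)(void);
};

static int toy_work_scheduled; /* counts queued link-change work items */

/* Stand-in for a call that schedules deferred work against the device. */
static void toy_link_change(void)
{
	toy_work_scheduled++;
}

static int toy_probe(const struct toy_driver_info *info)
{
	int ret;

	assert(info != NULL && info->bind != NULL);

	ret = info->bind();
	if (ret < 0)
		return ret; /* bind failed: device goes away, nothing was queued */

	/* Bind succeeded, so work referring to the device is now safe. */
	if (info->flags & TOY_FLAG_LINK_INTR)
		toy_link_change();
	return 0;
}

static int toy_bind_ok(void)
{
	return 0;
}

static int toy_bind_fail(void)
{
	return -19; /* the numeric value of ENODEV on Linux */
}
```

Because bind no longer schedules anything itself, a failing bind cannot leave work behind that refers to a device which is about to disappear.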



* [PATCH 3.14.y 4/9] KEYS: potential uninitialized variable
  2016-07-18 20:53 [PATCH 3.14.y 1/9] udp: properly support MSG_PEEK with truncated buffers Charles (Chas) Williams
  2016-07-18 20:53 ` [PATCH 3.14.y 2/9] USB: fix invalid memory access in hub_activate() Charles (Chas) Williams
  2016-07-18 20:53 ` [PATCH 3.14.y 3/9] cdc_ncm: do not call usbnet_link_change from cdc_ncm_bind Charles (Chas) Williams
@ 2016-07-18 20:53 ` Charles (Chas) Williams
  2016-07-18 20:53 ` [PATCH 3.14.y 5/9] USB: usbfs: fix potential infoleak in devio Charles (Chas) Williams
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Charles (Chas) Williams @ 2016-07-18 20:53 UTC (permalink / raw)
  To: stable
  Cc: Dan Carpenter, David Howells, Linus Torvalds, Charles (Chas) Williams

From: Dan Carpenter <dan.carpenter@oracle.com>

commit 38327424b40bcebe2de92d07312c89360ac9229a upstream.

If __key_link_begin() failed then "edit" would be uninitialized.  I've
added a check to fix that.

This allows a random user to crash the kernel, though it's quite
difficult to achieve.  There are three ways it can be done as the user
would have to cause an error to occur in __key_link():

 (1) Cause the kernel to run out of memory.  In practice, this is difficult
     to achieve without ENOMEM cropping up elsewhere and aborting the
     attempt.

 (2) Revoke the destination keyring between the keyring ID being looked up
     and it being tested for revocation.  In practice, this is difficult to
     time correctly because the KEYCTL_REJECT function can only be used
     from the request-key upcall process.  Further, users can only make use
     of what's in /sbin/request-key.conf, though this does include a
     rejection debugging test - which means that the destination keyring
     has to be the caller's session keyring in practice.

 (3) Have just enough key quota available to create a key, a new session
     keyring for the upcall and a link in the session keyring, but not then
     sufficient quota to create a link in the nominated destination keyring
     so that it fails with EDQUOT.

The bug can be triggered using option (3) above using something like the
following:

	echo 80 >/proc/sys/kernel/keys/root_maxbytes
	keyctl request2 user debug:fred negate @t

The above sets the quota to something much lower (80) to make the bug
easier to trigger, but this is dependent on the system.  Note also that
the name of the keyring created contains a random number that may be
between 1 and 10 characters in size, so may throw the test off by
changing the amount of quota used.

Assuming the failure occurs, something like the following will be seen:

	kfree_debugcheck: out of range ptr 6b6b6b6b6b6b6b68h
	------------[ cut here ]------------
	kernel BUG at ../mm/slab.c:2821!
	...
	RIP: 0010:[<ffffffff811600f9>] kfree_debugcheck+0x20/0x25
	RSP: 0018:ffff8804014a7de8  EFLAGS: 00010092
	RAX: 0000000000000034 RBX: 6b6b6b6b6b6b6b68 RCX: 0000000000000000
	RDX: 0000000000040001 RSI: 00000000000000f6 RDI: 0000000000000300
	RBP: ffff8804014a7df0 R08: 0000000000000001 R09: 0000000000000000
	R10: ffff8804014a7e68 R11: 0000000000000054 R12: 0000000000000202
	R13: ffffffff81318a66 R14: 0000000000000000 R15: 0000000000000001
	...
	Call Trace:
	  kfree+0xde/0x1bc
	  assoc_array_cancel_edit+0x1f/0x36
	  __key_link_end+0x55/0x63
	  key_reject_and_link+0x124/0x155
	  keyctl_reject_key+0xb6/0xe0
	  keyctl_negate_key+0x10/0x12
	  SyS_keyctl+0x9f/0xe7
	  do_syscall_64+0x63/0x13a
	  entry_SYSCALL64_slow_path+0x25/0x25

CVE-2016-4470

Fixes: f70e2e06196a ('KEYS: Do preallocation for __key_link()')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[ciwillia@brocade.com: backported to 3.14: adjusted context]
Signed-off-by: Charles (Chas) Williams <ciwillia@brocade.com>
---
 security/keys/key.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/security/keys/key.c b/security/keys/key.c
index 6e21c11..9478d66 100644
--- a/security/keys/key.c
+++ b/security/keys/key.c
@@ -575,7 +575,7 @@ int key_reject_and_link(struct key *key,
 
 	mutex_unlock(&key_construction_mutex);
 
-	if (keyring)
+	if (keyring && link_ret == 0)
 		__key_link_end(keyring, &key->index_key, edit);
 
 	/* wake up anyone waiting for a key to be constructed */
-- 
2.5.5
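
The one-line KEYS fix above follows a common pattern: a cleanup call may only consume an out-parameter if the call that was supposed to fill it in succeeded. The sketch below models that with hypothetical `toy_*` functions. Unlike the kernel variable, `edit` is initialised to NULL here purely so the sketch stays well-defined, and the return value is simplified.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_edit {
	int slots;
};

static int toy_end_calls; /* counts how often the cleanup step ran */

/* On failure *edit is left untouched, mirroring the behaviour the
 * commit message describes for __key_link_begin(). */
static int toy_link_begin(int fail, struct toy_edit **edit)
{
	if (fail)
		return -1;
	*edit = malloc(sizeof(**edit));
	assert(*edit != NULL);
	(*edit)->slots = 1;
	return 0;
}

static void toy_link_end(struct toy_edit *edit)
{
	toy_end_calls++;
	free(edit);
}

static int toy_reject_and_link(int have_keyring, int fail_begin)
{
	/* NULL only keeps this sketch well-defined; the kernel variable
	 * starts out uninitialised. */
	struct toy_edit *edit = NULL;
	int link_ret = 0;

	if (have_keyring)
		link_ret = toy_link_begin(fail_begin, &edit);

	/* ... the key would be marked as negative here ... */

	/* The fix: only finish the edit if beginning it succeeded. */
	if (have_keyring && link_ret == 0)
		toy_link_end(edit);

	return link_ret;
}
```

When `toy_link_begin()` fails, `toy_link_end()` is never reached, which is the property the added `link_ret == 0` test provides in key_reject_and_link().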



* [PATCH 3.14.y 5/9] USB: usbfs: fix potential infoleak in devio
  2016-07-18 20:53 [PATCH 3.14.y 1/9] udp: properly support MSG_PEEK with truncated buffers Charles (Chas) Williams
                   ` (2 preceding siblings ...)
  2016-07-18 20:53 ` [PATCH 3.14.y 4/9] KEYS: potential uninitialized variable Charles (Chas) Williams
@ 2016-07-18 20:53 ` Charles (Chas) Williams
  2016-08-14 14:44   ` Greg Kroah-Hartman
  2016-07-18 20:58 ` [PATCH 3.14.y 6/9] mm: migrate dirty page without clear_page_dirty_for_io etc Charles (Chas) Williams
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Charles (Chas) Williams @ 2016-07-18 20:53 UTC (permalink / raw)
  To: stable
  Cc: Kangjie Lu, Kangjie Lu, Greg Kroah-Hartman, Charles (Chas) Williams

From: Kangjie Lu <kangjielu@gmail.com>

commit 681fef8380eb818c0b845fca5d2ab1dcbab114ee upstream.

The stack object “ci” has a total size of 8 bytes. Its last 3 bytes
are padding bytes which are not initialized and leaked to userland
via “copy_to_user”.

CVE-2016-4482

Signed-off-by: Kangjie Lu <kjlu@gatech.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ciwillia@brocade.com: backported to 3.14: adjusted context]
Signed-off-by: Charles (Chas) Williams <ciwillia@brocade.com>
---
 drivers/usb/core/devio.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/usb/core/devio.c b/drivers/usb/core/devio.c
index 8016aaa..a787664 100644
--- a/drivers/usb/core/devio.c
+++ b/drivers/usb/core/devio.c
@@ -1104,10 +1104,11 @@ static int proc_getdriver(struct dev_state *ps, void __user *arg)
 
 static int proc_connectinfo(struct dev_state *ps, void __user *arg)
 {
-	struct usbdevfs_connectinfo ci = {
-		.devnum = ps->dev->devnum,
-		.slow = ps->dev->speed == USB_SPEED_LOW
-	};
+	struct usbdevfs_connectinfo ci;
+
+	memset(&ci, 0, sizeof(ci));
+	ci.devnum = ps->dev->devnum;
+	ci.slow = ps->dev->speed == USB_SPEED_LOW;
 
 	if (copy_to_user(arg, &ci, sizeof(ci)))
 		return -EFAULT;
-- 
2.5.5
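
The devio fix above replaces a designated initializer with memset() followed by member assignments, so that padding bytes are zero before the struct is copied out. A small userspace sketch of the same pattern follows. `toy_connectinfo` is a hypothetical struct with the same two-field shape, and the check assumes, as mainstream compilers behave in practice, that storing to a member leaves the padding bytes alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical struct: an int followed by a single byte, so the compiler
 * typically adds trailing padding to round the size up. */
struct toy_connectinfo {
	unsigned int devnum;
	unsigned char slow;
};

static void toy_fill_connectinfo(struct toy_connectinfo *ci,
				 unsigned int devnum, bool low_speed)
{
	assert(ci != NULL);

	/* Zero every byte, padding included, before setting the fields. */
	memset(ci, 0, sizeof(*ci));
	ci->devnum = devnum;
	ci->slow = low_speed;
}

/* Returns true if every byte after the last named member is zero. */
static bool toy_tail_is_zero(const struct toy_connectinfo *ci)
{
	unsigned char raw[sizeof(*ci)];

	memcpy(raw, ci, sizeof(raw));
	for (size_t i = offsetof(struct toy_connectinfo, slow) + 1;
	     i < sizeof(raw); i++) {
		if (raw[i] != 0)
			return false;
	}
	return true;
}
```

Pre-filling a struct with a non-zero pattern and then calling `toy_fill_connectinfo()` shows the effect: whatever stale bytes sat in the padding are gone before any copy to another address space could expose them.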



* [PATCH 3.14.y 6/9] mm: migrate dirty page without clear_page_dirty_for_io etc
  2016-07-18 20:53 [PATCH 3.14.y 1/9] udp: properly support MSG_PEEK with truncated buffers Charles (Chas) Williams
                   ` (3 preceding siblings ...)
  2016-07-18 20:53 ` [PATCH 3.14.y 5/9] USB: usbfs: fix potential infoleak in devio Charles (Chas) Williams
@ 2016-07-18 20:58 ` Charles (Chas) Williams
  2016-07-18 20:59 ` [PATCH 3.14.y 7/9] printk: do cond_resched() between lines while outputting to consoles Charles (Chas) Williams
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Charles (Chas) Williams @ 2016-07-18 20:58 UTC (permalink / raw)
  To: stable
  Cc: Hugh Dickins, Christoph Lameter, Kirill A. Shutemov,
	Rik van Riel, Vlastimil Babka, Davidlohr Bueso, Oleg Nesterov,
	Sasha Levin, Dmitry Vyukov, KOSAKI Motohiro, Andrew Morton,
	Linus Torvalds, Charles (Chas) Williams

From: Hugh Dickins <hughd@google.com>

commit 42cb14b110a5698ccf26ce59c4441722605a3743 upstream.

clear_page_dirty_for_io() has accumulated writeback and memcg subtleties
since v2.6.16 first introduced page migration; and the set_page_dirty()
which completed its migration of PageDirty, later had to be moderated to
__set_page_dirty_nobuffers(); then PageSwapBacked had to skip that too.

No actual problems seen with this procedure recently, but if you look into
what the clear_page_dirty_for_io(page)+set_page_dirty(newpage) is actually
achieving, it turns out to be nothing more than moving the PageDirty flag,
and its NR_FILE_DIRTY stat from one zone to another.

It would be good to avoid a pile of irrelevant decrementations and
incrementations, and improper event counting, and unnecessary descent of
the radix_tree under tree_lock (to set the PAGECACHE_TAG_DIRTY which
radix_tree_replace_slot() left in place anyway).

Do the NR_FILE_DIRTY movement, like the other stats movements, while
interrupts are still disabled in migrate_page_move_mapping(); and don't even
bother if the zone is the same.  Do the PageDirty movement there under
tree_lock too, where old page is frozen and newpage not yet visible:
bearing in mind that as soon as newpage becomes visible in radix_tree, an
un-page-locked set_page_dirty() might interfere (or perhaps that's just
not possible: anything doing so should already hold an additional
reference to the old page, preventing its migration; but play safe).

But we do still need to transfer PageDirty in migrate_page_copy(), for
those who don't go the mapping route through migrate_page_move_mapping().

CVE-2016-3070

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[ciwillia@brocade.com: backported to 3.14: adjusted context]
Signed-off-by: Charles (Chas) Williams <ciwillia@brocade.com>
---
 mm/migrate.c | 51 +++++++++++++++++++++++++++++++--------------------
 1 file changed, 31 insertions(+), 20 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 3acac4a..98c5464 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -30,6 +30,7 @@
 #include <linux/mempolicy.h>
 #include <linux/vmalloc.h>
 #include <linux/security.h>
+#include <linux/backing-dev.h>
 #include <linux/memcontrol.h>
 #include <linux/syscalls.h>
 #include <linux/hugetlb.h>
@@ -344,6 +345,8 @@ int migrate_page_move_mapping(struct address_space *mapping,
 		struct buffer_head *head, enum migrate_mode mode,
 		int extra_count)
 {
+	struct zone *oldzone, *newzone;
+	int dirty;
 	int expected_count = 1 + extra_count;
 	void **pslot;
 
@@ -354,6 +357,9 @@ int migrate_page_move_mapping(struct address_space *mapping,
 		return MIGRATEPAGE_SUCCESS;
 	}
 
+	oldzone = page_zone(page);
+	newzone = page_zone(newpage);
+
 	spin_lock_irq(&mapping->tree_lock);
 
 	pslot = radix_tree_lookup_slot(&mapping->page_tree,
@@ -394,6 +400,13 @@ int migrate_page_move_mapping(struct address_space *mapping,
 		set_page_private(newpage, page_private(page));
 	}
 
+	/* Move dirty while page refs frozen and newpage not yet exposed */
+	dirty = PageDirty(page);
+	if (dirty) {
+		ClearPageDirty(page);
+		SetPageDirty(newpage);
+	}
+
 	radix_tree_replace_slot(pslot, newpage);
 
 	/*
@@ -403,6 +416,9 @@ int migrate_page_move_mapping(struct address_space *mapping,
 	 */
 	page_unfreeze_refs(page, expected_count - 1);
 
+	spin_unlock(&mapping->tree_lock);
+	/* Leave irq disabled to prevent preemption while updating stats */
+
 	/*
 	 * If moved to a different zone then also account
 	 * the page for that zone. Other VM counters will be
@@ -413,13 +429,19 @@ int migrate_page_move_mapping(struct address_space *mapping,
 	 * via NR_FILE_PAGES and NR_ANON_PAGES if they
 	 * are mapped to swap space.
 	 */
-	__dec_zone_page_state(page, NR_FILE_PAGES);
-	__inc_zone_page_state(newpage, NR_FILE_PAGES);
-	if (!PageSwapCache(page) && PageSwapBacked(page)) {
-		__dec_zone_page_state(page, NR_SHMEM);
-		__inc_zone_page_state(newpage, NR_SHMEM);
+	if (newzone != oldzone) {
+		__dec_zone_state(oldzone, NR_FILE_PAGES);
+		__inc_zone_state(newzone, NR_FILE_PAGES);
+		if (PageSwapBacked(page) && !PageSwapCache(page)) {
+			__dec_zone_state(oldzone, NR_SHMEM);
+			__inc_zone_state(newzone, NR_SHMEM);
+		}
+		if (dirty && mapping_cap_account_dirty(mapping)) {
+			__dec_zone_state(oldzone, NR_FILE_DIRTY);
+			__inc_zone_state(newzone, NR_FILE_DIRTY);
+		}
 	}
-	spin_unlock_irq(&mapping->tree_lock);
+	local_irq_enable();
 
 	return MIGRATEPAGE_SUCCESS;
 }
@@ -543,20 +565,9 @@ void migrate_page_copy(struct page *newpage, struct page *page)
 	if (PageMappedToDisk(page))
 		SetPageMappedToDisk(newpage);
 
-	if (PageDirty(page)) {
-		clear_page_dirty_for_io(page);
-		/*
-		 * Want to mark the page and the radix tree as dirty, and
-		 * redo the accounting that clear_page_dirty_for_io undid,
-		 * but we can't use set_page_dirty because that function
-		 * is actually a signal that all of the page has become dirty.
-		 * Whereas only part of our page may be dirty.
-		 */
-		if (PageSwapBacked(page))
-			SetPageDirty(newpage);
-		else
-			__set_page_dirty_nobuffers(newpage);
- 	}
+	/* Move dirty on pages not done by migrate_page_move_mapping() */
+	if (PageDirty(page))
+		SetPageDirty(newpage);
 
 	/*
 	 * Copy NUMA information to the new page, to prevent over-eager
-- 
2.5.5
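
The accounting change in the migrate patch above boils down to two steps: move the dirty flag from the old page to the new one, and touch the per-zone counters only when the two pages live in different zones. The sketch below models that with hypothetical `toy_zone` and `toy_page` types. It ignores locking, interrupt state and the mapping_cap_account_dirty() condition, all of which matter in the real function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-zone counters. */
struct toy_zone {
	long nr_file_pages;
	long nr_file_dirty;
};

struct toy_page {
	struct toy_zone *zone;
	bool dirty;
};

static void toy_move_mapping(struct toy_page *page, struct toy_page *newpage)
{
	struct toy_zone *oldzone, *newzone;
	bool dirty;

	assert(page != NULL && newpage != NULL);
	oldzone = page->zone;
	newzone = newpage->zone;

	/* Move the dirty flag itself, nothing more. */
	dirty = page->dirty;
	if (dirty) {
		page->dirty = false;
		newpage->dirty = true;
	}

	/* Counters only change when the page really changes zone. */
	if (newzone != oldzone) {
		oldzone->nr_file_pages--;
		newzone->nr_file_pages++;
		if (dirty) {
			oldzone->nr_file_dirty--;
			newzone->nr_file_dirty++;
		}
	}
}
```

For a migration within one zone the function only flips the two flags, which is the "don't even bother if the zone is the same" shortcut from the commit message.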



* [PATCH 3.14.y 7/9] printk: do cond_resched() between lines while outputting to consoles
  2016-07-18 20:53 [PATCH 3.14.y 1/9] udp: properly support MSG_PEEK with truncated buffers Charles (Chas) Williams
                   ` (4 preceding siblings ...)
  2016-07-18 20:58 ` [PATCH 3.14.y 6/9] mm: migrate dirty page without clear_page_dirty_for_io etc Charles (Chas) Williams
@ 2016-07-18 20:59 ` Charles (Chas) Williams
  2016-07-18 20:59 ` [PATCH 3.14.y 8/9] HID: hiddev: validate num_values for HIDIOCGUSAGES, HIDIOCSUSAGES commands Charles (Chas) Williams
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Charles (Chas) Williams @ 2016-07-18 20:59 UTC (permalink / raw)
  To: stable
  Cc: Tejun Heo, Dave Jones, Kyle McMartin, Andrew Morton,
	Linus Torvalds, Chas Williams

From: Tejun Heo <tj@kernel.org>

commit 8d91f8b15361dfb438ab6eb3b319e2ded43458ff upstream.

@console_may_schedule tracks whether console_sem was acquired through
lock or trylock.  If the former, we're inside a sleepable context and
console_conditional_schedule() performs cond_resched().  This allows
console drivers which use console_lock for synchronization to yield
while performing time-consuming operations such as scrolling.

However, the actual console outputting is performed while holding
irq-safe logbuf_lock, so console_unlock() clears @console_may_schedule
before it starts outputting lines.  Also, only a few drivers call
console_conditional_schedule() to begin with.  This means that when a
lot of lines need to be output by console_unlock(), for example on a
console registration, the task doing console_unlock() may not yield for
a long time on a non-preemptible kernel.

If this happens with a slow console device, for example a serial
console, the outputting task may occupy the cpu for a very long time.
Long enough to trigger softlockup and/or RCU stall warnings, which in
turn pile more messages, sometimes enough to trigger the next cycle of
warnings incapacitating the system.

Fix it by making console_unlock() insert cond_resched() between lines if
@console_may_schedule.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Calvin Owens <calvinowens@fb.com>
Acked-by: Jan Kara <jack@suse.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Kyle McMartin <kyle@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[ciwillia@brocade.com: adjust context for 3.14.y]
Signed-off-by: Chas Williams <ciwillia@brocade.com>
---
 include/linux/console.h |  1 +
 kernel/panic.c          |  3 +++
 kernel/printk/printk.c  | 35 ++++++++++++++++++++++++++++++++++-
 3 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/include/linux/console.h b/include/linux/console.h
index 7571a16..ac1599b 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -150,6 +150,7 @@ extern int console_trylock(void);
 extern void console_unlock(void);
 extern void console_conditional_schedule(void);
 extern void console_unblank(void);
+extern void console_flush_on_panic(void);
 extern struct tty_driver *console_device(int *);
 extern void console_stop(struct console *);
 extern void console_start(struct console *);
diff --git a/kernel/panic.c b/kernel/panic.c
index 6d63003..16458b3 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -23,6 +23,7 @@
 #include <linux/sysrq.h>
 #include <linux/init.h>
 #include <linux/nmi.h>
+#include <linux/console.h>
 
 #define PANIC_TIMER_STEP 100
 #define PANIC_BLINK_SPD 18
@@ -133,6 +134,8 @@ void panic(const char *fmt, ...)
 
 	bust_spinlocks(0);
 
+	console_flush_on_panic();
+
 	if (!panic_blink)
 		panic_blink = no_blink;
 
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 02e7fb4..2e0406f 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2011,13 +2011,24 @@ void console_unlock(void)
 	static u64 seen_seq;
 	unsigned long flags;
 	bool wake_klogd = false;
-	bool retry;
+	bool do_cond_resched, retry;
 
 	if (console_suspended) {
 		up(&console_sem);
 		return;
 	}
 
+	/*
+	 * Console drivers are called under logbuf_lock, so
+	 * @console_may_schedule should be cleared before; however, we may
+	 * end up dumping a lot of lines, for example, if called from
+	 * console registration path, and should invoke cond_resched()
+	 * between lines if allowable.  Not doing so can cause a very long
+	 * scheduling stall on a slow console leading to RCU stall and
+	 * softlockup warnings which exacerbate the issue with more
+	 * messages practically incapacitating the system.
+	 */
+	do_cond_resched = console_may_schedule;
 	console_may_schedule = 0;
 
 	/* flush buffered message fragment immediately to console */
@@ -2074,6 +2085,9 @@ skip:
 		call_console_drivers(level, text, len);
 		start_critical_timings();
 		local_irq_restore(flags);
+
+		if (do_cond_resched)
+			cond_resched();
 	}
 	console_locked = 0;
 	mutex_release(&console_lock_dep_map, 1, _RET_IP_);
@@ -2142,6 +2156,25 @@ void console_unblank(void)
 	console_unlock();
 }
 
+/**
+ * console_flush_on_panic - flush console content on panic
+ *
+ * Immediately output all pending messages no matter what.
+ */
+void console_flush_on_panic(void)
+{
+	/*
+	 * If someone else is holding the console lock, trylock will fail
+	 * and may_schedule may be set.  Ignore and proceed to unlock so
+	 * that messages are flushed out.  As this can be called from any
+	 * context and we don't want to get preempted while flushing,
+	 * ensure may_schedule is cleared.
+	 */
+	console_trylock();
+	console_may_schedule = 0;
+	console_unlock();
+}
+
 /*
  * Return the console tty driver structure and its associated index
  */
-- 
2.5.5
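
The console_unlock() part of the printk patch above uses a simple trick: remember whether yielding is allowed in a local variable before the shared flag is cleared, then yield between lines based on that local copy. A userspace sketch of the control flow follows. The `toy_*` names are hypothetical, a counter stands in for cond_resched(), and emitting a line is reduced to measuring its length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int toy_may_schedule;     /* set by the blocking lock path only */
static int toy_yield_count;      /* how often we offered to reschedule */
static size_t toy_bytes_emitted; /* stand-in for real console output */

static void toy_cond_resched(void)
{
	toy_yield_count++;
}

/* Flush all pending lines; returns the number of lines emitted. */
static int toy_console_unlock(const char *const *lines, int nlines)
{
	/* Snapshot the permission first, then clear the shared flag. */
	int do_cond_resched = toy_may_schedule;
	int emitted = 0;

	assert(nlines == 0 || lines != NULL);
	toy_may_schedule = 0;

	for (int i = 0; i < nlines; i++) {
		/* In the kernel this part runs with a spinlock held. */
		toy_bytes_emitted += strlen(lines[i]);
		emitted++;

		/* Lock dropped again: safe point to let others run. */
		if (do_cond_resched)
			toy_cond_resched();
	}
	return emitted;
}
```

If the flag was set on entry, the loop offers to reschedule once per line; if it was clear, as after a trylock, the loop never yields.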



* [PATCH 3.14.y 8/9] HID: hiddev: validate num_values for HIDIOCGUSAGES, HIDIOCSUSAGES commands
  2016-07-18 20:53 [PATCH 3.14.y 1/9] udp: properly support MSG_PEEK with truncated buffers Charles (Chas) Williams
                   ` (5 preceding siblings ...)
  2016-07-18 20:59 ` [PATCH 3.14.y 7/9] printk: do cond_resched() between lines while outputting to consoles Charles (Chas) Williams
@ 2016-07-18 20:59 ` Charles (Chas) Williams
  2016-07-18 20:59   ` Charles (Chas) Williams
  2016-08-14 14:42 ` [PATCH 3.14.y 1/9] udp: properly support MSG_PEEK with truncated buffers Greg KH
  8 siblings, 0 replies; 17+ messages in thread
From: Charles (Chas) Williams @ 2016-07-18 20:59 UTC (permalink / raw)
  To: stable; +Cc: Scott Bauer, Jiri Kosina, Chas Williams

From: Scott Bauer <sbauer@plzdonthack.me>

commit 93a2001bdfd5376c3dc2158653034c20392d15c5 upstream.

This patch validates the num_values parameter from userland during the
HIDIOCGUSAGES and HIDIOCSUSAGES commands. Previously, if the report id was set
to HID_REPORT_ID_UNKNOWN, we would fail to validate the num_values parameter
leading to a heap overflow.

CVE-2016-5829

Cc: stable@vger.kernel.org
Signed-off-by: Scott Bauer <sbauer@plzdonthack.me>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Chas Williams <ciwillia@brocade.com>
---
 drivers/hid/usbhid/hiddev.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/hid/usbhid/hiddev.c b/drivers/hid/usbhid/hiddev.c
index 2f1ddca..700145b 100644
--- a/drivers/hid/usbhid/hiddev.c
+++ b/drivers/hid/usbhid/hiddev.c
@@ -516,13 +516,13 @@ static noinline int hiddev_ioctl_usage(struct hiddev *hiddev, unsigned int cmd,
 					goto inval;
 			} else if (uref->usage_index >= field->report_count)
 				goto inval;
-
-			else if ((cmd == HIDIOCGUSAGES || cmd == HIDIOCSUSAGES) &&
-				 (uref_multi->num_values > HID_MAX_MULTI_USAGES ||
-				  uref->usage_index + uref_multi->num_values > field->report_count))
-				goto inval;
 		}
 
+		if ((cmd == HIDIOCGUSAGES || cmd == HIDIOCSUSAGES) &&
+		    (uref_multi->num_values > HID_MAX_MULTI_USAGES ||
+		     uref->usage_index + uref_multi->num_values > field->report_count))
+			goto inval;
+
 		switch (cmd) {
 		case HIDIOCGUSAGE:
 			uref->value = field->value[uref->usage_index];
-- 
2.5.5


^ permalink raw reply related	[flat|nested] 17+ messages in thread
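
A minimal user-space sketch of the bounds check that the hunk above moves out of the else-if chain, so that it also runs when the report id is HID_REPORT_ID_UNKNOWN. The helper name multi_usage_request_ok() and the limit of 1024 are assumptions made for illustration; the sketch also rewrites the sum as a subtraction to avoid unsigned wraparound, which is a variation and not what the patch itself does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for HID_MAX_MULTI_USAGES; the value is assumed here. */
#define MAX_MULTI_USAGES 1024u

/*
 * Hypothetical helper: true when a multi-usage request stays inside the
 * field, i.e. usage_index + num_values does not exceed report_count.
 */
bool multi_usage_request_ok(uint32_t usage_index, uint32_t num_values,
			    uint32_t report_count)
{
	if (num_values > MAX_MULTI_USAGES)
		return false;		/* more values than the ioctl allows */
	if (usage_index >= report_count)
		return false;		/* start index already out of range */
	/* Subtraction form of the range test; cannot wrap around. */
	return num_values <= report_count - usage_index;
}
```

With a report_count of 4, a request starting at index 2 for 3 values is rejected, while a request starting at index 0 for 4 values is accepted.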

* [PATCH 3.14.y 9/9] x86/mm: Add barriers and document switch_mm()-vs-flush synchronization
  2016-07-18 20:53 [PATCH 3.14.y 1/9] udp: properly support MSG_PEEK with truncated buffers Charles (Chas) Williams
@ 2016-07-18 20:59   ` Charles (Chas) Williams
  2016-07-18 20:53 ` [PATCH 3.14.y 3/9] cdc_ncm: do not call usbnet_link_change from cdc_ncm_bind Charles (Chas) Williams
                     ` (7 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Charles (Chas) Williams @ 2016-07-18 20:59 UTC (permalink / raw)
  To: stable
  Cc: Andy Lutomirski, Andrew Morton, Andy Lutomirski, Borislav Petkov,
	Brian Gerst, Dave Hansen, Denys Vlasenko, H. Peter Anvin,
	Linus Torvalds, Peter Zijlstra, Rik van Riel, Thomas Gleixner,
	linux-mm, Ingo Molnar, Luis Henriques, Charles (Chas) Williams

From: Andy Lutomirski <luto@kernel.org>

commit 71b3c126e61177eb693423f2e18a1914205b165e upstream.

When switch_mm() activates a new PGD, it also sets a bit that
tells other CPUs that the PGD is in use so that TLB flush IPIs
will be sent.  In order for that to work correctly, the bit
needs to be visible prior to loading the PGD and therefore
starting to fill the local TLB.

Document all the barriers that make this work correctly and add
a couple that were missing.

CVE-2016-2069

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
[ luis: backported to 3.16:
  - dropped N/A comment in flush_tlb_mm_range()
  - adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
[ciwillia@brocade.com: backported to 3.14: adjusted context]
Signed-off-by: Charles (Chas) Williams <ciwillia@brocade.com>
---
 arch/x86/include/asm/mmu_context.h | 32 +++++++++++++++++++++++++++++++-
 arch/x86/mm/tlb.c                  | 25 ++++++++++++++++++++++---
 2 files changed, 53 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index be12c53..c0d2f6b 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -42,7 +42,32 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
 #endif
 		cpumask_set_cpu(cpu, mm_cpumask(next));
 
-		/* Re-load page tables */
+		/*
+		 * Re-load page tables.
+		 *
+		 * This logic has an ordering constraint:
+		 *
+		 *  CPU 0: Write to a PTE for 'next'
+		 *  CPU 0: load bit 1 in mm_cpumask.  if nonzero, send IPI.
+		 *  CPU 1: set bit 1 in next's mm_cpumask
+		 *  CPU 1: load from the PTE that CPU 0 writes (implicit)
+		 *
+		 * We need to prevent an outcome in which CPU 1 observes
+		 * the new PTE value and CPU 0 observes bit 1 clear in
+		 * mm_cpumask.  (If that occurs, then the IPI will never
+		 * be sent, and CPU 0's TLB will contain a stale entry.)
+		 *
+		 * The bad outcome can occur if either CPU's load is
+		 * reordered before that CPU's store, so both CPUs must
+		 * execute full barriers to prevent this from happening.
+		 *
+		 * Thus, switch_mm needs a full barrier between the
+		 * store to mm_cpumask and any operation that could load
+		 * from next->pgd.  This barrier synchronizes with
+		 * remote TLB flushers.  Fortunately, load_cr3 is
+		 * serializing and thus acts as a full barrier.
+		 *
+		 */
 		load_cr3(next->pgd);
 
 		/* Stop flush ipis for the previous mm */
@@ -65,10 +90,15 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
 			 * schedule, protecting us from simultaneous changes.
 			 */
 			cpumask_set_cpu(cpu, mm_cpumask(next));
+
 			/*
 			 * We were in lazy tlb mode and leave_mm disabled
 			 * tlb flush IPI delivery. We must reload CR3
 			 * to make sure to use no freed page tables.
+			 *
+			 * As above, this is a barrier that forces
+			 * TLB repopulation to be ordered after the
+			 * store to mm_cpumask.
 			 */
 			load_cr3(next->pgd);
 			load_LDT_nolock(&next->context);
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index dd8dda1..46e82e7 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -152,7 +152,10 @@ void flush_tlb_current_task(void)
 	preempt_disable();
 
 	count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
+
+	/* This is an implicit full barrier that synchronizes with switch_mm. */
 	local_flush_tlb();
+
 	if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids)
 		flush_tlb_others(mm_cpumask(mm), mm, 0UL, TLB_FLUSH_ALL);
 	preempt_enable();
@@ -166,11 +169,19 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
 	unsigned long nr_base_pages;
 
 	preempt_disable();
-	if (current->active_mm != mm)
+	if (current->active_mm != mm) {
+		/* Synchronize with switch_mm. */
+		smp_mb();
+
 		goto flush_all;
+	}
 
 	if (!current->mm) {
 		leave_mm(smp_processor_id());
+
+		/* Synchronize with switch_mm. */
+		smp_mb();
+
 		goto flush_all;
 	}
 
@@ -222,10 +233,18 @@ void flush_tlb_page(struct vm_area_struct *vma, unsigned long start)
 	preempt_disable();
 
 	if (current->active_mm == mm) {
-		if (current->mm)
+		if (current->mm) {
+			/*
+			 * Implicit full barrier (INVLPG) that synchronizes
+			 * with switch_mm.
+			 */
 			__flush_tlb_one(start);
-		else
+		} else {
 			leave_mm(smp_processor_id());
+
+			/* Synchronize with switch_mm. */
+			smp_mb();
+		}
 	}
 
 	if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids)
-- 
2.5.5

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 17+ messages in thread
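
The ordering constraint documented in switch_mm() is an instance of the store-buffering pattern: each CPU does a store followed by a load of the location the other CPU stores to. The toy model below, a sketch and not kernel code, enumerates every total order of the four operations and counts those in which both loads return the old value, meaning CPU 0 sees the mask bit clear (so no IPI is sent) while CPU 1 still loads the old PTE. The names W_PTE, R_MASK, W_MASK, R_PTE and count_stale_orders() are invented for this illustration, and "barriers" simply means that each CPU's store is forced to stay ahead of its load.

```c
#include <stdbool.h>

/* The four memory operations in the ordering comment, two per CPU. */
enum op { W_PTE, R_MASK, W_MASK, R_PTE };

/* Position of operation o within a total order of the four operations. */
static int pos(const enum op order[4], enum op o)
{
	for (int i = 0; i < 4; i++)
		if (order[i] == o)
			return i;
	return -1;
}

/* Replay one total order; true if both loads returned the old value. */
static bool both_loads_stale(const enum op order[4])
{
	int pte = 0, mask = 0;		/* 0 = old value, 1 = new value */
	int cpu0_mask = -1, cpu1_pte = -1;

	for (int i = 0; i < 4; i++) {
		switch (order[i]) {
		case W_PTE:  pte = 1;          break;	/* CPU 0 store */
		case R_MASK: cpu0_mask = mask; break;	/* CPU 0 load  */
		case W_MASK: mask = 1;         break;	/* CPU 1 store */
		case R_PTE:  cpu1_pte = pte;   break;	/* CPU 1 load  */
		}
	}
	return cpu0_mask == 0 && cpu1_pte == 0;
}

/*
 * Count the total orders that end with both loads stale.  With barriers,
 * only orders that keep each CPU's store ahead of its load are allowed.
 */
int count_stale_orders(bool barriers)
{
	int bad = 0;

	for (int a = 0; a < 4; a++)
	for (int b = 0; b < 4; b++)
	for (int c = 0; c < 4; c++)
	for (int d = 0; d < 4; d++) {
		enum op order[4] = { (enum op)a, (enum op)b,
				     (enum op)c, (enum op)d };

		if (a == b || a == c || a == d || b == c || b == d || c == d)
			continue;	/* not a permutation */
		if (barriers && (pos(order, W_PTE) > pos(order, R_MASK) ||
				 pos(order, W_MASK) > pos(order, R_PTE)))
			continue;	/* a load passed its CPU's store */
		if (both_loads_stale(order))
			bad++;
	}
	return bad;
}
```

In this model count_stale_orders(false) returns 6 and count_stale_orders(true) returns 0: once neither load may move ahead of its own CPU's store, no interleaving leaves both CPUs with stale data, which is why a full barrier is needed on both sides.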

* Re: [PATCH 3.14.y 1/9] udp: properly support MSG_PEEK with truncated buffers
  2016-07-18 20:53 [PATCH 3.14.y 1/9] udp: properly support MSG_PEEK with truncated buffers Charles (Chas) Williams
                   ` (7 preceding siblings ...)
  2016-07-18 20:59   ` Charles (Chas) Williams
@ 2016-08-14 14:42 ` Greg KH
  2016-08-15  7:21   ` Michal Kubecek
  8 siblings, 1 reply; 17+ messages in thread
From: Greg KH @ 2016-08-14 14:42 UTC (permalink / raw)
  To: Charles (Chas) Williams
  Cc: stable, Eric Dumazet, David S. Miller, Luis Henriques

On Mon, Jul 18, 2016 at 04:53:15PM -0400, Charles (Chas) Williams wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> commit 197c949e7798fbf28cfadc69d9ca0c2abbf93191 upstream.

Why isn't this in 4.4-stable?  I can't take it into 3.14-stable unless I
also have it in 4.4, otherwise it would be a regression when people
upgraded, right?

And if this isn't in 4.4-stable, why not?  I'm guessing that the authors
didn't think it was necessary...

> Backport of this upstream commit into stable kernels :
> 89c22d8c3b27 ("net: Fix skb csum races when peeking")
> exposed a bug in udp stack vs MSG_PEEK support, when user provides
> a buffer smaller than skb payload.

Or maybe someone missed the above text?

Anyway, I'm not going to do anything with this until I learn more...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3.14.y 3/9] cdc_ncm: do not call usbnet_link_change from cdc_ncm_bind
  2016-07-18 20:53 ` [PATCH 3.14.y 3/9] cdc_ncm: do not call usbnet_link_change from cdc_ncm_bind Charles (Chas) Williams
@ 2016-08-14 14:43   ` Greg KH
  2016-08-14 14:52     ` Bjørn Mork
  0 siblings, 1 reply; 17+ messages in thread
From: Greg KH @ 2016-08-14 14:43 UTC (permalink / raw)
  To: Charles (Chas) Williams; +Cc: stable, Bjørn Mork, David S. Miller

On Mon, Jul 18, 2016 at 04:53:17PM -0400, Charles (Chas) Williams wrote:
> From: Bjørn Mork <bjorn@mork.no>
> 
> commit 4d06dd537f95683aba3651098ae288b7cbff8274 upstream.
> 
> usbnet_link_change will call schedule_work and should be
> avoided if bind is failing. Otherwise we will end up with
> scheduled work referring to a netdev which has gone away.
> 
> Instead of making the call conditional, we can just defer
> it to usbnet_probe, using the driver_info flag made for
> this purpose.
> 
> CVE-2016-3951
> 
> Fixes: 8a34b0ae8778 ("usbnet: cdc_ncm: apply usbnet_link_change")
> Reported-by: Andrey Konovalov <andreyknvl@gmail.com>
> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
> Signed-off-by: Bjørn Mork <bjorn@mork.no>
> Signed-off-by: David S. Miller <davem@davemloft.net>
> [ciwillia@brocade.com: backported to 3.14: adjusted context]
> Signed-off-by: Charles (Chas) Williams <ciwillia@brocade.com>
> ---
>  drivers/net/usb/cdc_ncm.c | 20 +++++---------------
>  1 file changed, 5 insertions(+), 15 deletions(-)

Another patch that isn't in 4.4-stable, why not?

greg k-h

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3.14.y 5/9] USB: usbfs: fix potential infoleak in devio
  2016-07-18 20:53 ` [PATCH 3.14.y 5/9] USB: usbfs: fix potential infoleak in devio Charles (Chas) Williams
@ 2016-08-14 14:44   ` Greg Kroah-Hartman
  2016-08-15 14:41     ` Charles (Chas) Williams
  0 siblings, 1 reply; 17+ messages in thread
From: Greg Kroah-Hartman @ 2016-08-14 14:44 UTC (permalink / raw)
  To: Charles (Chas) Williams; +Cc: stable, Kangjie Lu, Kangjie Lu

On Mon, Jul 18, 2016 at 04:53:19PM -0400, Charles (Chas) Williams wrote:
> From: Kangjie Lu <kangjielu@gmail.com>
> 
> commit 681fef8380eb818c0b845fca5d2ab1dcbab114ee upstream.
> 
> The stack object “ci” has a total size of 8 bytes. Its last 3 bytes
> are padding bytes which are not initialized and leaked to userland
> via “copy_to_user”.
> 
> CVE-2016-4482
> 
> Signed-off-by: Kangjie Lu <kjlu@gatech.edu>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> [ciwillia@brocade.com: backported to 3.14: adjusted context]
> Signed-off-by: Charles (Chas) Williams <ciwillia@brocade.com>
> ---
>  drivers/usb/core/devio.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)

Another one not in 4.4 :(


^ permalink raw reply	[flat|nested] 17+ messages in thread
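
The infoleak described in the quoted commit message comes from structure padding: the fields occupy 5 bytes, but on typical ABIs the object is padded to 8, and the trailing bytes keep whatever was on the stack. The sketch below shows one common remedy, zeroing the whole object before filling it in. It is an illustration only; the struct name connectinfo and the helper fill_connectinfo() are assumptions, and the actual fix in the patch is not shown in this thread.

```c
#include <stddef.h>
#include <string.h>

/* Same shape as the object the commit message describes (assumed layout). */
struct connectinfo {
	unsigned int devnum;
	unsigned char slow;
	/* the compiler typically adds 3 bytes of padding here */
};

/*
 * Hypothetical helper: clear the whole object, padding included, before
 * setting the fields, so that copying sizeof(*ci) bytes elsewhere does not
 * expose stale stack contents.
 */
void fill_connectinfo(struct connectinfo *ci, unsigned int devnum, int is_slow)
{
	memset(ci, 0, sizeof(*ci));
	ci->devnum = devnum;
	ci->slow = is_slow ? 1 : 0;
}
```

Initializing the two fields individually would leave the padding untouched, which is exactly the case the commit message warns about.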

* Re: [PATCH 3.14.y 3/9] cdc_ncm: do not call usbnet_link_change from cdc_ncm_bind
  2016-08-14 14:43   ` Greg KH
@ 2016-08-14 14:52     ` Bjørn Mork
  2016-08-14 15:05       ` Greg KH
  0 siblings, 1 reply; 17+ messages in thread
From: Bjørn Mork @ 2016-08-14 14:52 UTC (permalink / raw)
  To: Greg KH; +Cc: Charles (Chas) Williams, stable, David S. Miller

Greg KH <greg@kroah.com> writes:
> On Mon, Jul 18, 2016 at 04:53:17PM -0400, Charles (Chas) Williams wrote:
>> From: Bjørn Mork <bjorn@mork.no>
>> 
>> commit 4d06dd537f95683aba3651098ae288b7cbff8274 upstream.
>> 
>> usbnet_link_change will call schedule_work and should be
>> avoided if bind is failing. Otherwise we will end up with
>> scheduled work referring to a netdev which has gone away.
>> 
>> Instead of making the call conditional, we can just defer
>> it to usbnet_probe, using the driver_info flag made for
>> this purpose.
>> 
>> CVE-2016-3951
>> 
>> Fixes: 8a34b0ae8778 ("usbnet: cdc_ncm: apply usbnet_link_change")
>> Reported-by: Andrey Konovalov <andreyknvl@gmail.com>
>> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
>> Signed-off-by: Bjørn Mork <bjorn@mork.no>
>> Signed-off-by: David S. Miller <davem@davemloft.net>
>> [ciwillia@brocade.com: backported to 3.14: adjusted context]
>> Signed-off-by: Charles (Chas) Williams <ciwillia@brocade.com>
>> ---
>>  drivers/net/usb/cdc_ncm.c | 20 +++++---------------
>>  1 file changed, 5 insertions(+), 15 deletions(-)
>
> Another patch that isn't in 4.4-stable, why not?

Probably because I was sloppy when first posting it, and forgot to ask
David to queue it for stable.  I see that this has later been requested
by  Chas Williams:
https://www.mail-archive.com/netdev@vger.kernel.org/msg120134.html 

Yes, it should go into 4.4-stable.


Bjørn

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3.14.y 3/9] cdc_ncm: do not call usbnet_link_change from cdc_ncm_bind
  2016-08-14 14:52     ` Bjørn Mork
@ 2016-08-14 15:05       ` Greg KH
  0 siblings, 0 replies; 17+ messages in thread
From: Greg KH @ 2016-08-14 15:05 UTC (permalink / raw)
  To: Bjørn Mork; +Cc: Charles (Chas) Williams, stable, David S. Miller

On Sun, Aug 14, 2016 at 04:52:55PM +0200, Bjørn Mork wrote:
> Greg KH <greg@kroah.com> writes:
> > On Mon, Jul 18, 2016 at 04:53:17PM -0400, Charles (Chas) Williams wrote:
> >> From: Bjørn Mork <bjorn@mork.no>
> >> 
> >> commit 4d06dd537f95683aba3651098ae288b7cbff8274 upstream.
> >> 
> >> usbnet_link_change will call schedule_work and should be
> >> avoided if bind is failing. Otherwise we will end up with
> >> scheduled work referring to a netdev which has gone away.
> >> 
> >> Instead of making the call conditional, we can just defer
> >> it to usbnet_probe, using the driver_info flag made for
> >> this purpose.
> >> 
> >> CVE-2016-3951
> >> 
> >> Fixes: 8a34b0ae8778 ("usbnet: cdc_ncm: apply usbnet_link_change")
> >> Reported-by: Andrey Konovalov <andreyknvl@gmail.com>
> >> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
> >> Signed-off-by: Bjørn Mork <bjorn@mork.no>
> >> Signed-off-by: David S. Miller <davem@davemloft.net>
> >> [ciwillia@brocade.com: backported to 3.14: adjusted context]
> >> Signed-off-by: Charles (Chas) Williams <ciwillia@brocade.com>
> >> ---
> >>  drivers/net/usb/cdc_ncm.c | 20 +++++---------------
> >>  1 file changed, 5 insertions(+), 15 deletions(-)
> >
> > Another patch that isn't in 4.4-stable, why not?
> 
> Probably because I was sloppy when first posting it, and forgot to ask
> David to queue it for stable.  I see that this has later been requested
> by  Chas Williams:
> https://www.mail-archive.com/netdev@vger.kernel.org/msg120134.html 
> 
> Yes, it should go into 4.4-stable.

Thanks for letting me know, now queued up.

greg k-h

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3.14.y 1/9] udp: properly support MSG_PEEK with truncated buffers
  2016-08-14 14:42 ` [PATCH 3.14.y 1/9] udp: properly support MSG_PEEK with truncated buffers Greg KH
@ 2016-08-15  7:21   ` Michal Kubecek
  0 siblings, 0 replies; 17+ messages in thread
From: Michal Kubecek @ 2016-08-15  7:21 UTC (permalink / raw)
  To: Greg KH
  Cc: Charles (Chas) Williams, stable, Eric Dumazet, David S. Miller,
	Luis Henriques

On Sun, Aug 14, 2016 at 04:42:09PM +0200, Greg KH wrote:
> On Mon, Jul 18, 2016 at 04:53:15PM -0400, Charles (Chas) Williams wrote:
> > From: Eric Dumazet <edumazet@google.com>
> > 
> > commit 197c949e7798fbf28cfadc69d9ca0c2abbf93191 upstream.
> 
> Why isn't this in 4.4-stable?  I can't take it into 3.14-stable unless I
> also have it in 4.4, otherwise it would be a regression when people
> upgraded, right?
> 
> And if this isn't in 4.4-stable, why not?  I'm guessing that the authors
> didn't think it was necessary...

As the commit message says:

  This bug does not happen in upstream kernels since Al Viro did a great
  job to replace this into :
  skb_copy_and_csum_datagram_msg(skb, sizeof(struct udphdr), msg);
  This variant is safe vs short buffers.

That happened in 3.19 by commit 227158db1604 ("new helper:
skb_copy_and_csum_datagram_msg()"). The mainline commit is useful as it
prevents calculating the checksum twice but it's not really stable
material; on the other hand, it addresses a serious issue in stable
kernels 3.18 and older (those with backport of commit 89c22d8c3b27).

                                                         Michal Kubecek

^ permalink raw reply	[flat|nested] 17+ messages in thread
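
The approach the commit message describes (verify the checksum once when the user buffer is too short, remember the result, and then use a plain copy) can be sketched in user space as follows. This is a toy model under stated assumptions: struct dgram, sum_bytes(), full_check() and toy_recv() are invented names, the byte-sum is a placeholder and not the Internet checksum, and none of it is the kernel's code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy datagram: payload plus the checksum its sender claimed. */
struct dgram {
	const uint8_t *data;
	size_t len;
	uint8_t claimed_sum;
	int full_checks;	/* times the whole payload was summed */
};

/* Placeholder checksum: byte sum modulo 256. */
static uint8_t sum_bytes(const uint8_t *p, size_t n)
{
	uint8_t s = 0;

	while (n--)
		s += *p++;
	return s;
}

/* Verify the whole payload, counting how often that happens. */
static bool full_check(struct dgram *d)
{
	d->full_checks++;
	return sum_bytes(d->data, d->len) == d->claimed_sum;
}

/* Copy up to buflen bytes; returns bytes copied, or -1 on a bad checksum. */
long toy_recv(struct dgram *d, uint8_t *buf, size_t buflen)
{
	size_t copied = buflen < d->len ? buflen : d->len;
	bool checksum_valid = false;

	if (copied < d->len) {
		/* Short buffer: verify up front and remember the result. */
		checksum_valid = full_check(d);
		if (!checksum_valid)
			return -1;
	}

	memcpy(buf, d->data, copied);

	/* Whole payload fit and was not verified yet: check it now. */
	if (!checksum_valid && !full_check(d))
		return -1;

	return (long)copied;
}
```

Either path sums the payload exactly once per call, and the truncated path never depends on a combined copy-and-checksum step that needs the whole payload to fit in the caller's buffer.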

* Re: [PATCH 3.14.y 5/9] USB: usbfs: fix potential infoleak in devio
  2016-08-14 14:44   ` Greg Kroah-Hartman
@ 2016-08-15 14:41     ` Charles (Chas) Williams
  0 siblings, 0 replies; 17+ messages in thread
From: Charles (Chas) Williams @ 2016-08-15 14:41 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: stable, Kangjie Lu, Kangjie Lu



On 08/14/2016 10:44 AM, Greg Kroah-Hartman wrote:
> On Mon, Jul 18, 2016 at 04:53:19PM -0400, Charles (Chas) Williams wrote:
>> From: Kangjie Lu <kangjielu@gmail.com>
>>
>> commit 681fef8380eb818c0b845fca5d2ab1dcbab114ee upstream.
>>
>> The stack object “ci” has a total size of 8 bytes. Its last 3 bytes
>> are padding bytes which are not initialized and leaked to userland
>> via “copy_to_user”.
>>
>> CVE-2016-4482
>>
>> Signed-off-by: Kangjie Lu <kjlu@gatech.edu>
>> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>> [ciwillia@brocade.com: backported to 3.14: adjusted context]
>> Signed-off-by: Charles (Chas) Williams <ciwillia@brocade.com>
>> ---
>>  drivers/usb/core/devio.c | 9 +++++----
>>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> Another one not in 4.4 :(

I swear I sent this earlier.  It's on my update branch.  I can't
get to the list archive at the moment to check.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2016-08-15 14:41 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-07-18 20:53 [PATCH 3.14.y 1/9] udp: properly support MSG_PEEK with truncated buffers Charles (Chas) Williams
2016-07-18 20:53 ` [PATCH 3.14.y 2/9] USB: fix invalid memory access in hub_activate() Charles (Chas) Williams
2016-07-18 20:53 ` [PATCH 3.14.y 3/9] cdc_ncm: do not call usbnet_link_change from cdc_ncm_bind Charles (Chas) Williams
2016-08-14 14:43   ` Greg KH
2016-08-14 14:52     ` Bjørn Mork
2016-08-14 15:05       ` Greg KH
2016-07-18 20:53 ` [PATCH 3.14.y 4/9] KEYS: potential uninitialized variable Charles (Chas) Williams
2016-07-18 20:53 ` [PATCH 3.14.y 5/9] USB: usbfs: fix potential infoleak in devio Charles (Chas) Williams
2016-08-14 14:44   ` Greg Kroah-Hartman
2016-08-15 14:41     ` Charles (Chas) Williams
2016-07-18 20:58 ` [PATCH 3.14.y 6/9] mm: migrate dirty page without clear_page_dirty_for_io etc Charles (Chas) Williams
2016-07-18 20:59 ` [PATCH 3.14.y 7/9] printk: do cond_resched() between lines while outputting to consoles Charles (Chas) Williams
2016-07-18 20:59 ` [PATCH 3.14.y 8/9] HID: hiddev: validate num_values for HIDIOCGUSAGES, HIDIOCSUSAGES commands Charles (Chas) Williams
2016-07-18 20:59 ` [PATCH 3.14.y 9/9] x86/mm: Add barriers and document switch_mm()-vs-flush synchronization Charles (Chas) Williams
2016-07-18 20:59   ` Charles (Chas) Williams
2016-08-14 14:42 ` [PATCH 3.14.y 1/9] udp: properly support MSG_PEEK with truncated buffers Greg KH
2016-08-15  7:21   ` Michal Kubecek

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.