* [PATCH for-4.4 00/16] Stable commits from Ubuntu Xenial 4.4-lts
@ 2017-04-10 17:44 Sumit Semwal
  2017-04-10 17:44 ` [PATCH for-4.4 01/16] net/mlx4_core: Fix when to save some qp context flags for dynamic VST to VGT transitions Sumit Semwal
                   ` (15 more replies)
  0 siblings, 16 replies; 21+ messages in thread
From: Sumit Semwal @ 2017-04-10 17:44 UTC (permalink / raw)
  To: stable; +Cc: Sumit Semwal

Hi Greg,

For your consideration, a few patches that seem useful for 4.4-lts. These
are taken from the patches added for the 4.4-based Ubuntu Xenial tree. I've
build-tested these for x86 with allmodconfig.

These apply cleanly on top of 4.4.59.

Best regards,
Sumit.

Eugenia Emantayev (1):
  net/mlx4_en: Fix bad WQE issue

Gabriel Krisman Bertazi (1):
  blk-mq: Avoid memory reclaim when remapping queues

Guenter Roeck (1):
  usb: hub: Wait for connection to be reestablished after port reset

Jack Morgenstein (2):
  net/mlx4_core: Fix when to save some qp context flags for dynamic VST
    to VGT transitions
  net/mlx4_core: Fix racy CQ (Completion Queue) free

K. Y. Srinivasan (1):
  Drivers: hv: vmbus: Reduce the delay between retries in
    vmbus_post_msg()

NeilBrown (1):
  SUNRPC: fix refcounting problems with auth_gss messages.

Thomas Falcon (2):
  ibmveth: set correct gso_size and gso_type
  ibmveth: calculate gso_segs for large packets

Vitaly Kuznetsov (7):
  Drivers: hv: get rid of redundant messagecount in
    create_gpadl_header()
  Drivers: hv: don't leak memory in vmbus_establish_gpadl()
  Drivers: hv: get rid of timeout in vmbus_open()
  Tools: hv: kvp: ensure kvp device fd is closed on exec
  Drivers: hv: balloon: keep track of where ha_region starts
  Drivers: hv: balloon: account for gaps in hot add regions
  hv: don't reset hv_context.tsc_page on crash

 block/blk-mq.c                                     |   6 +-
 drivers/hv/channel.c                               |  51 ++++----
 drivers/hv/connection.c                            |   8 +-
 drivers/hv/hv.c                                    |   5 +-
 drivers/hv/hv_balloon.c                            | 136 +++++++++++++++------
 drivers/net/ethernet/ibm/ibmveth.c                 |  73 ++++++++++-
 drivers/net/ethernet/ibm/ibmveth.h                 |   1 +
 drivers/net/ethernet/mellanox/mlx4/cq.c            |  38 +++---
 drivers/net/ethernet/mellanox/mlx4/en_rx.c         |   8 +-
 .../net/ethernet/mellanox/mlx4/resource_tracker.c  |   5 +-
 drivers/usb/core/hub.c                             |  11 +-
 net/sunrpc/auth_gss/auth_gss.c                     |   7 +-
 tools/hv/hv_kvp_daemon.c                           |   2 +-
 13 files changed, 248 insertions(+), 103 deletions(-)

-- 
2.7.4


* [PATCH for-4.4 01/16] net/mlx4_core: Fix when to save some qp context flags for dynamic VST to VGT transitions
  2017-04-10 17:44 [PATCH for-4.4 00/16] Stable commits from Ubuntu Xenial 4.4-lts Sumit Semwal
@ 2017-04-10 17:44 ` Sumit Semwal
  2017-04-12 13:41   ` Greg KH
  2017-04-10 17:44 ` [PATCH for-4.4 02/16] net/mlx4_core: Fix racy CQ (Completion Queue) free Sumit Semwal
                   ` (14 subsequent siblings)
  15 siblings, 1 reply; 21+ messages in thread
From: Sumit Semwal @ 2017-04-10 17:44 UTC (permalink / raw)
  To: stable; +Cc: Jack Morgenstein, Tariq Toukan, David S . Miller, Sumit Semwal

From: Jack Morgenstein <jackm@dev.mellanox.co.il>

[Upstream commit 61a4577c9a4419b99e647744923517d47255da35]

Save the qp context flags byte containing the flag disabling vlan stripping
in the RESET to INIT qp transition, rather than in the INIT to RTR
transition. Per the firmware spec, the flags in this byte are active
in the RESET to INIT transition.

As a result of saving the flags in the incorrect qp transition, when
switching dynamically from VGT to VST and back to VGT, the vlan
remained stripped (as is required for VST) and did not return to
not-stripped (as is required for VGT).

Fixes: f0f829bf42cd ("net/mlx4_core: Add immediate activate for VGT->VST->VGT")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
 [sumits: cherry picked for 4.4.y]
---
 drivers/net/ethernet/mellanox/mlx4/resource_tracker.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
index d314d96..d1fc7fa 100644
--- a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
+++ b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
@@ -2955,6 +2955,9 @@ int mlx4_RST2INIT_QP_wrapper(struct mlx4_dev *dev, int slave,
 		put_res(dev, slave, srqn, RES_SRQ);
 		qp->srq = srq;
 	}
+
+	/* Save param3 for dynamic changes from VST back to VGT */
+	qp->param3 = qpc->param3;
 	put_res(dev, slave, rcqn, RES_CQ);
 	put_res(dev, slave, mtt_base, RES_MTT);
 	res_end_move(dev, slave, RES_QP, qpn);
@@ -3747,7 +3750,6 @@ int mlx4_INIT2RTR_QP_wrapper(struct mlx4_dev *dev, int slave,
 	int qpn = vhcr->in_modifier & 0x7fffff;
 	struct res_qp *qp;
 	u8 orig_sched_queue;
-	__be32	orig_param3 = qpc->param3;
 	u8 orig_vlan_control = qpc->pri_path.vlan_control;
 	u8 orig_fvl_rx = qpc->pri_path.fvl_rx;
 	u8 orig_pri_path_fl = qpc->pri_path.fl;
@@ -3789,7 +3791,6 @@ out:
 	 */
 	if (!err) {
 		qp->sched_queue = orig_sched_queue;
-		qp->param3	= orig_param3;
 		qp->vlan_control = orig_vlan_control;
 		qp->fvl_rx	=  orig_fvl_rx;
 		qp->pri_path_fl = orig_pri_path_fl;
-- 
2.7.4


* [PATCH for-4.4 02/16] net/mlx4_core: Fix racy CQ (Completion Queue) free
  2017-04-10 17:44 [PATCH for-4.4 00/16] Stable commits from Ubuntu Xenial 4.4-lts Sumit Semwal
  2017-04-10 17:44 ` [PATCH for-4.4 01/16] net/mlx4_core: Fix when to save some qp context flags for dynamic VST to VGT transitions Sumit Semwal
@ 2017-04-10 17:44 ` Sumit Semwal
  2017-04-10 17:44 ` [PATCH for-4.4 03/16] net/mlx4_en: Fix bad WQE issue Sumit Semwal
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Sumit Semwal @ 2017-04-10 17:44 UTC (permalink / raw)
  To: stable
  Cc: Jack Morgenstein, Matan Barak, Tariq Toukan, David S . Miller,
	Sumit Semwal

From: Jack Morgenstein <jackm@dev.mellanox.co.il>

[Upstream commit 291c566a28910614ce42d0ffe82196eddd6346f4]

In functions mlx4_cq_completion() and mlx4_cq_event(), the
radix_tree_lookup requires an rcu_read_lock.
This is mandatory: if another core frees the CQ, it could
run the radix_tree_node_rcu_free() call_rcu() callback while
it's being used by the radix tree lookup function.

Additionally, in function mlx4_cq_event(), since we are adding
the rcu lock around the radix-tree lookup, we no longer need to take
the spinlock. Also, the synchronize_irq() call for the async event
eliminates the need for incrementing the cq reference count in
mlx4_cq_event().

Other changes:
1. In function mlx4_cq_free(), replace spin_lock_irq with spin_lock:
   we no longer take this spinlock in the interrupt context.
   The spinlock here, therefore, simply protects against different
   threads simultaneously invoking mlx4_cq_free() for different cq's.

2. In function mlx4_cq_free(), we move the radix tree delete to before
   the synchronize_irq() calls. This guarantees that we will not
   access this cq during any subsequent interrupts, and therefore can
   safely free the CQ after the synchronize_irq calls. The rcu_read_lock
   in the interrupt handlers only needs to protect against corrupting the
   radix tree; the interrupt handlers may access the cq outside the
   rcu_read_lock due to the synchronize_irq calls which protect against
   premature freeing of the cq.

3. In function mlx4_cq_event(), we change the mlx4_warn message to mlx4_dbg.

4. We leave the cq reference count mechanism in place, because it is
   still needed for the cq completion tasklet mechanism.

Fixes: 6d90aa5cf17b ("net/mlx4_core: Make sure there are no pending async events when freeing CQ")
Fixes: 225c7b1feef1 ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
 [sumits: cherry-picked for 4.4.y]
---
 drivers/net/ethernet/mellanox/mlx4/cq.c | 38 +++++++++++++++++----------------
 1 file changed, 20 insertions(+), 18 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/cq.c b/drivers/net/ethernet/mellanox/mlx4/cq.c
index 3348e64..6eba580 100644
--- a/drivers/net/ethernet/mellanox/mlx4/cq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cq.c
@@ -101,13 +101,19 @@ void mlx4_cq_completion(struct mlx4_dev *dev, u32 cqn)
 {
 	struct mlx4_cq *cq;
 
+	rcu_read_lock();
 	cq = radix_tree_lookup(&mlx4_priv(dev)->cq_table.tree,
 			       cqn & (dev->caps.num_cqs - 1));
+	rcu_read_unlock();
+
 	if (!cq) {
 		mlx4_dbg(dev, "Completion event for bogus CQ %08x\n", cqn);
 		return;
 	}
 
+	/* Acessing the CQ outside of rcu_read_lock is safe, because
+	 * the CQ is freed only after interrupt handling is completed.
+	 */
 	++cq->arm_sn;
 
 	cq->comp(cq);
@@ -118,23 +124,19 @@ void mlx4_cq_event(struct mlx4_dev *dev, u32 cqn, int event_type)
 	struct mlx4_cq_table *cq_table = &mlx4_priv(dev)->cq_table;
 	struct mlx4_cq *cq;
 
-	spin_lock(&cq_table->lock);
-
+	rcu_read_lock();
 	cq = radix_tree_lookup(&cq_table->tree, cqn & (dev->caps.num_cqs - 1));
-	if (cq)
-		atomic_inc(&cq->refcount);
-
-	spin_unlock(&cq_table->lock);
+	rcu_read_unlock();
 
 	if (!cq) {
-		mlx4_warn(dev, "Async event for bogus CQ %08x\n", cqn);
+		mlx4_dbg(dev, "Async event for bogus CQ %08x\n", cqn);
 		return;
 	}
 
+	/* Acessing the CQ outside of rcu_read_lock is safe, because
+	 * the CQ is freed only after interrupt handling is completed.
+	 */
 	cq->event(cq, event_type);
-
-	if (atomic_dec_and_test(&cq->refcount))
-		complete(&cq->free);
 }
 
 static int mlx4_SW2HW_CQ(struct mlx4_dev *dev, struct mlx4_cmd_mailbox *mailbox,
@@ -301,9 +303,9 @@ int mlx4_cq_alloc(struct mlx4_dev *dev, int nent,
 	if (err)
 		return err;
 
-	spin_lock_irq(&cq_table->lock);
+	spin_lock(&cq_table->lock);
 	err = radix_tree_insert(&cq_table->tree, cq->cqn, cq);
-	spin_unlock_irq(&cq_table->lock);
+	spin_unlock(&cq_table->lock);
 	if (err)
 		goto err_icm;
 
@@ -347,9 +349,9 @@ int mlx4_cq_alloc(struct mlx4_dev *dev, int nent,
 	return 0;
 
 err_radix:
-	spin_lock_irq(&cq_table->lock);
+	spin_lock(&cq_table->lock);
 	radix_tree_delete(&cq_table->tree, cq->cqn);
-	spin_unlock_irq(&cq_table->lock);
+	spin_unlock(&cq_table->lock);
 
 err_icm:
 	mlx4_cq_free_icm(dev, cq->cqn);
@@ -368,15 +370,15 @@ void mlx4_cq_free(struct mlx4_dev *dev, struct mlx4_cq *cq)
 	if (err)
 		mlx4_warn(dev, "HW2SW_CQ failed (%d) for CQN %06x\n", err, cq->cqn);
 
+	spin_lock(&cq_table->lock);
+	radix_tree_delete(&cq_table->tree, cq->cqn);
+	spin_unlock(&cq_table->lock);
+
 	synchronize_irq(priv->eq_table.eq[MLX4_CQ_TO_EQ_VECTOR(cq->vector)].irq);
 	if (priv->eq_table.eq[MLX4_CQ_TO_EQ_VECTOR(cq->vector)].irq !=
 	    priv->eq_table.eq[MLX4_EQ_ASYNC].irq)
 		synchronize_irq(priv->eq_table.eq[MLX4_EQ_ASYNC].irq);
 
-	spin_lock_irq(&cq_table->lock);
-	radix_tree_delete(&cq_table->tree, cq->cqn);
-	spin_unlock_irq(&cq_table->lock);
-
 	if (atomic_dec_and_test(&cq->refcount))
 		complete(&cq->free);
 	wait_for_completion(&cq->free);
-- 
2.7.4


* [PATCH for-4.4 03/16] net/mlx4_en: Fix bad WQE issue
  2017-04-10 17:44 [PATCH for-4.4 00/16] Stable commits from Ubuntu Xenial 4.4-lts Sumit Semwal
  2017-04-10 17:44 ` [PATCH for-4.4 01/16] net/mlx4_core: Fix when to save some qp context flags for dynamic VST to VGT transitions Sumit Semwal
  2017-04-10 17:44 ` [PATCH for-4.4 02/16] net/mlx4_core: Fix racy CQ (Completion Queue) free Sumit Semwal
@ 2017-04-10 17:44 ` Sumit Semwal
  2017-04-12 13:45   ` Greg KH
  2017-04-10 17:44 ` [PATCH for-4.4 04/16] SUNRPC: fix refcounting problems with auth_gss messages Sumit Semwal
                   ` (12 subsequent siblings)
  15 siblings, 1 reply; 21+ messages in thread
From: Sumit Semwal @ 2017-04-10 17:44 UTC (permalink / raw)
  To: stable; +Cc: Eugenia Emantayev, Tariq Toukan, David S . Miller, Sumit Semwal

From: Eugenia Emantayev <eugenia@mellanox.com>

[Upstream commit 6496bbf0ec481966ef9ffe5b6660d8d1b55c60cc]

The single send WQE in the RX buffer should be stamped with software
ownership in order to prevent the firmware from taking the QP-in-error
flow once UPDATE_QP is called.

Fixes: 9f519f68cfff ('mlx4_en: Not using Shared Receive Queues')
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
 [sumits: cherry-picked for 4.4.y]
---
 drivers/net/ethernet/mellanox/mlx4/en_rx.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)
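
For illustration only, a userspace sketch of the stamping step described
above: write a big-endian marker into the first dword of the buffer, then
advance past the send slot. SKETCH_STAMP_SHIFT (31) and SKETCH_TXBB_SIZE (64)
are assumed values for this sketch, and the helper names are hypothetical;
the driver's own definitions live in the mlx4 headers.

```c
#include <arpa/inet.h>	/* htonl, ntohl */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed values for this sketch only. */
#define SKETCH_STAMP_SHIFT	31
#define SKETCH_TXBB_SIZE	64

/*
 * Stamp the first dword of buf with a big-endian ownership marker and
 * return the address where the RX section would start.
 */
static uint8_t *stamp_first_wqe(uint8_t *buf)
{
	uint32_t stamp = htonl(UINT32_C(1) << SKETCH_STAMP_SHIFT);

	assert(buf != NULL);
	memcpy(buf, &stamp, sizeof(stamp));
	return buf + SKETCH_TXBB_SIZE;
}

/* Read the first dword back in host byte order. */
static uint32_t first_dword(const uint8_t *buf)
{
	uint32_t v;

	assert(buf != NULL);
	memcpy(&v, buf, sizeof(v));
	return ntohl(v);
}
```

Because the stamp is stored in network byte order, the first byte in memory
is 0x80 regardless of host endianness.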

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index 28a4b34..82bf1b5 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -439,8 +439,14 @@ int mlx4_en_activate_rx_rings(struct mlx4_en_priv *priv)
 		ring->cqn = priv->rx_cq[ring_ind]->mcq.cqn;
 
 		ring->stride = stride;
-		if (ring->stride <= TXBB_SIZE)
+		if (ring->stride <= TXBB_SIZE) {
+			/* Stamp first unused send wqe */
+			__be32 *ptr = (__be32 *)ring->buf;
+			__be32 stamp = cpu_to_be32(1 << STAMP_SHIFT);
+			*ptr = stamp;
+			/* Move pointer to start of rx section */
 			ring->buf += TXBB_SIZE;
+		}
 
 		ring->log_stride = ffs(ring->stride) - 1;
 		ring->buf_size = ring->size * ring->stride;
-- 
2.7.4


* [PATCH for-4.4 04/16] SUNRPC: fix refcounting problems with auth_gss messages.
  2017-04-10 17:44 [PATCH for-4.4 00/16] Stable commits from Ubuntu Xenial 4.4-lts Sumit Semwal
                   ` (2 preceding siblings ...)
  2017-04-10 17:44 ` [PATCH for-4.4 03/16] net/mlx4_en: Fix bad WQE issue Sumit Semwal
@ 2017-04-10 17:44 ` Sumit Semwal
  2017-04-10 17:44 ` [PATCH for-4.4 05/16] ibmveth: set correct gso_size and gso_type Sumit Semwal
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Sumit Semwal @ 2017-04-10 17:44 UTC (permalink / raw)
  To: stable; +Cc: NeilBrown, Trond Myklebust, Sumit Semwal

From: NeilBrown <neilb@suse.com>

[Upstream commit 1cded9d2974fe4fe339fc0ccd6638b80d465ab2c]

There are two problems with refcounting of auth_gss messages.

First, the reference on the pipe->pipe list (taken by a call
to rpc_queue_upcall()) is not counted.  It seems to be
assumed that a message in pipe->pipe will always also be in
pipe->in_downcall, where it is correctly reference counted.

However there is no guarantee of this.  I have a report of a
NULL dereference in rpc_pipe_read() which suggests a msg
that has been freed is still on the pipe->pipe list.

One way I imagine this might happen is:
- message is queued for uid=U and auth->service=S1
- rpc.gssd reads this message and starts processing.
  This removes the message from pipe->pipe
- message is queued for uid=U and auth->service=S2
- rpc.gssd replies to the first message. gss_pipe_downcall()
  calls __gss_find_upcall(pipe, U, NULL) and it finds the
  *second* message, as new messages are placed at the head
  of ->in_downcall, and the service type is not checked.
- This second message is removed from ->in_downcall and freed
  by gss_release_msg() (even though it is still on pipe->pipe)
- rpc.gssd tries to read another message, and dereferences a pointer
  to this message that has just been freed.

I fix this by incrementing the reference count before calling
rpc_queue_upcall(), and decrementing it if that fails, or normally in
gss_pipe_destroy_msg().

It seems strange that the reply doesn't target the message more
precisely, but I don't know all the details.  In any case, I think the
reference counting irregularity became a measurable bug when the
extra arg was added to __gss_find_upcall(), hence the Fixes: line
below.

The second problem is that if rpc_queue_upcall() fails, the new
message is not freed. gss_alloc_msg() set the ->count to 1,
gss_add_msg() increments this to 2, gss_unhash_msg() decrements to 1,
then the pointer is discarded so the memory never gets freed.

Fixes: 9130b8dbc6ac ("SUNRPC: allow for upcalls for same uid but different gss service")
Cc: stable@vger.kernel.org
Link: https://bugzilla.opensuse.org/show_bug.cgi?id=1011250
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
---
 net/sunrpc/auth_gss/auth_gss.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)
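
For illustration only, a toy model of the reference counts on the
rpc_queue_upcall() failure path, following the sequence given in the commit
message (1 after allocation, 2 after hashing, back to 1 after unhashing).
The toy_* names are hypothetical and a plain int stands in for the atomic
->count; this is a sketch of the arithmetic, not the SUNRPC code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy message: only the reference count is modelled. */
struct toy_msg {
	int count;	/* stands in for the atomic ->count */
	bool freed;
};

/* Models allocation: the count starts at 1. */
static void toy_alloc(struct toy_msg *m)
{
	m->count = 1;
	m->freed = false;
}

/* Models hashing the message: one more reference. */
static void toy_add(struct toy_msg *m)
{
	m->count++;
}

/* Models unhashing the message: drops the hash reference. */
static void toy_unhash(struct toy_msg *m)
{
	m->count--;
}

/* Models releasing a reference: frees on the last put. */
static void toy_release(struct toy_msg *m)
{
	assert(m->count > 0);
	if (--m->count == 0)
		m->freed = true;
}

/* Failure path before the fix: the pointer is dropped with count == 1. */
static int fail_path_before(struct toy_msg *m)
{
	toy_alloc(m);
	toy_add(m);
	/* queueing the upcall fails here */
	toy_unhash(m);
	return m->count;
}

/* Failure path after the fix: the extra pipe reference is undone and
 * the allocation reference is released. */
static int fail_path_after(struct toy_msg *m)
{
	toy_alloc(m);
	toy_add(m);
	m->count++;	/* reference for the pipe list, taken before queueing */
	/* queueing the upcall fails here */
	toy_unhash(m);
	m->count--;	/* undo the pipe-list reference */
	toy_release(m);	/* drop the allocation reference */
	return m->count;
}
```

In this model the old path ends with one reference that nobody holds, while
the new path reaches zero and the message is freed.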

diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
index 06095cc..1f0687d 100644
--- a/net/sunrpc/auth_gss/auth_gss.c
+++ b/net/sunrpc/auth_gss/auth_gss.c
@@ -541,9 +541,13 @@ gss_setup_upcall(struct gss_auth *gss_auth, struct rpc_cred *cred)
 		return gss_new;
 	gss_msg = gss_add_msg(gss_new);
 	if (gss_msg == gss_new) {
-		int res = rpc_queue_upcall(gss_new->pipe, &gss_new->msg);
+		int res;
+		atomic_inc(&gss_msg->count);
+		res = rpc_queue_upcall(gss_new->pipe, &gss_new->msg);
 		if (res) {
 			gss_unhash_msg(gss_new);
+			atomic_dec(&gss_msg->count);
+			gss_release_msg(gss_new);
 			gss_msg = ERR_PTR(res);
 		}
 	} else
@@ -836,6 +840,7 @@ gss_pipe_destroy_msg(struct rpc_pipe_msg *msg)
 			warn_gssd();
 		gss_release_msg(gss_msg);
 	}
+	gss_release_msg(gss_msg);
 }
 
 static void gss_pipe_dentry_destroy(struct dentry *dir,
-- 
2.7.4


* [PATCH for-4.4 05/16] ibmveth: set correct gso_size and gso_type
  2017-04-10 17:44 [PATCH for-4.4 00/16] Stable commits from Ubuntu Xenial 4.4-lts Sumit Semwal
                   ` (3 preceding siblings ...)
  2017-04-10 17:44 ` [PATCH for-4.4 04/16] SUNRPC: fix refcounting problems with auth_gss messages Sumit Semwal
@ 2017-04-10 17:44 ` Sumit Semwal
  2017-04-10 17:44 ` [PATCH for-4.4 06/16] ibmveth: calculate gso_segs for large packets Sumit Semwal
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Sumit Semwal @ 2017-04-10 17:44 UTC (permalink / raw)
  To: stable; +Cc: Thomas Falcon, David S . Miller, Sumit Semwal

From: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>

[Upstream commit 7b5967389f5a8dfb9d32843830f5e2717e20995d]

This patch is based on an earlier one submitted
by Jon Maxwell with the following commit message:

"We recently encountered a bug where a few customers using ibmveth on the
same LPAR hit an issue where a TCP session hung when large receive was
enabled. Closer analysis revealed that the session was stuck because the
one side was advertising a zero window repeatedly.

We narrowed this down to the fact the ibmveth driver did not set gso_size
which is translated by TCP into the MSS later up the stack. The MSS is
used to calculate the TCP window size and as that was abnormally large,
it was calculating a zero window, even though the socket's receive buffer
was completely empty."

We rely on the Virtual I/O Server partition in a pseries
environment to provide the MSS through the TCP header checksum
field. The stipulation is that users should not disable checksum
offloading if rx packet aggregation is enabled through VIOS.

Some firmware offerings provide the MSS in the RX buffer.
This is signalled by a bit in the RX queue descriptor.

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Reviewed-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Jonathan Maxwell <jmaxwell37@gmail.com>
Reviewed-by: David Dai <zdai@us.ibm.com>
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
 [sumits: cherry-picked for 4.4.y]
---
 drivers/net/ethernet/ibm/ibmveth.c | 65 ++++++++++++++++++++++++++++++++++++--
 drivers/net/ethernet/ibm/ibmveth.h |  1 +
 2 files changed, 64 insertions(+), 2 deletions(-)
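
For illustration only, a userspace sketch of the fallback described above,
where the MSS arrives in the TCP checksum field. It handles only IPv4, works
on a raw byte buffer that starts at the IP header, and uses a hypothetical
helper name; it is not the driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_IPPROTO_TCP	6

/*
 * Return the MSS carried in the TCP checksum field of an IPv4/TCP packet
 * and zero that field. Returns 0 if the buffer is too short or the packet
 * is not IPv4/TCP.
 */
static uint16_t mss_from_tcp_check(uint8_t *pkt, size_t len)
{
	size_t ip_hlen;
	uint8_t *check;
	uint16_t mss;

	assert(pkt != NULL);
	if (len < 20 || (pkt[0] >> 4) != 4)
		return 0;

	ip_hlen = (size_t)(pkt[0] & 0x0f) * 4;	/* header length in bytes */
	if (ip_hlen < 20 || pkt[9] != SKETCH_IPPROTO_TCP ||
	    len < ip_hlen + 20)
		return 0;

	/* The checksum occupies bytes 16-17 of the TCP header. */
	check = pkt + ip_hlen + 16;
	mss = (uint16_t)((check[0] << 8) | check[1]);	/* network order */
	check[0] = 0;
	check[1] = 0;
	return mss;
}
```

The field is cleared after reading because it no longer holds a checksum,
which matches the stipulation that checksum offload stay enabled.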

diff --git a/drivers/net/ethernet/ibm/ibmveth.c b/drivers/net/ethernet/ibm/ibmveth.c
index 7af870a..855c43d 100644
--- a/drivers/net/ethernet/ibm/ibmveth.c
+++ b/drivers/net/ethernet/ibm/ibmveth.c
@@ -58,7 +58,7 @@ static struct kobj_type ktype_veth_pool;
 
 static const char ibmveth_driver_name[] = "ibmveth";
 static const char ibmveth_driver_string[] = "IBM Power Virtual Ethernet Driver";
-#define ibmveth_driver_version "1.05"
+#define ibmveth_driver_version "1.06"
 
 MODULE_AUTHOR("Santiago Leon <santil@linux.vnet.ibm.com>");
 MODULE_DESCRIPTION("IBM Power Virtual Ethernet Driver");
@@ -137,6 +137,11 @@ static inline int ibmveth_rxq_frame_offset(struct ibmveth_adapter *adapter)
 	return ibmveth_rxq_flags(adapter) & IBMVETH_RXQ_OFF_MASK;
 }
 
+static inline int ibmveth_rxq_large_packet(struct ibmveth_adapter *adapter)
+{
+	return ibmveth_rxq_flags(adapter) & IBMVETH_RXQ_LRG_PKT;
+}
+
 static inline int ibmveth_rxq_frame_length(struct ibmveth_adapter *adapter)
 {
 	return be32_to_cpu(adapter->rx_queue.queue_addr[adapter->rx_queue.index].length);
@@ -1172,6 +1177,45 @@ map_failed:
 	goto retry_bounce;
 }
 
+static void ibmveth_rx_mss_helper(struct sk_buff *skb, u16 mss, int lrg_pkt)
+{
+	int offset = 0;
+
+	/* only TCP packets will be aggregated */
+	if (skb->protocol == htons(ETH_P_IP)) {
+		struct iphdr *iph = (struct iphdr *)skb->data;
+
+		if (iph->protocol == IPPROTO_TCP) {
+			offset = iph->ihl * 4;
+			skb_shinfo(skb)->gso_type = SKB_GSO_TCPV4;
+		} else {
+			return;
+		}
+	} else if (skb->protocol == htons(ETH_P_IPV6)) {
+		struct ipv6hdr *iph6 = (struct ipv6hdr *)skb->data;
+
+		if (iph6->nexthdr == IPPROTO_TCP) {
+			offset = sizeof(struct ipv6hdr);
+			skb_shinfo(skb)->gso_type = SKB_GSO_TCPV6;
+		} else {
+			return;
+		}
+	} else {
+		return;
+	}
+	/* if mss is not set through Large Packet bit/mss in rx buffer,
+	 * expect that the mss will be written to the tcp header checksum.
+	 */
+	if (lrg_pkt) {
+		skb_shinfo(skb)->gso_size = mss;
+	} else if (offset) {
+		struct tcphdr *tcph = (struct tcphdr *)(skb->data + offset);
+
+		skb_shinfo(skb)->gso_size = ntohs(tcph->check);
+		tcph->check = 0;
+	}
+}
+
 static int ibmveth_poll(struct napi_struct *napi, int budget)
 {
 	struct ibmveth_adapter *adapter =
@@ -1180,6 +1224,7 @@ static int ibmveth_poll(struct napi_struct *napi, int budget)
 	int frames_processed = 0;
 	unsigned long lpar_rc;
 	struct iphdr *iph;
+	u16 mss = 0;
 
 restart_poll:
 	while (frames_processed < budget) {
@@ -1197,9 +1242,21 @@ restart_poll:
 			int length = ibmveth_rxq_frame_length(adapter);
 			int offset = ibmveth_rxq_frame_offset(adapter);
 			int csum_good = ibmveth_rxq_csum_good(adapter);
+			int lrg_pkt = ibmveth_rxq_large_packet(adapter);
 
 			skb = ibmveth_rxq_get_buffer(adapter);
 
+			/* if the large packet bit is set in the rx queue
+			 * descriptor, the mss will be written by PHYP eight
+			 * bytes from the start of the rx buffer, which is
+			 * skb->data at this stage
+			 */
+			if (lrg_pkt) {
+				__be64 *rxmss = (__be64 *)(skb->data + 8);
+
+				mss = (u16)be64_to_cpu(*rxmss);
+			}
+
 			new_skb = NULL;
 			if (length < rx_copybreak)
 				new_skb = netdev_alloc_skb(netdev, length);
@@ -1233,11 +1290,15 @@ restart_poll:
 					if (iph->check == 0xffff) {
 						iph->check = 0;
 						iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl);
-						adapter->rx_large_packets++;
 					}
 				}
 			}
 
+			if (length > netdev->mtu + ETH_HLEN) {
+				ibmveth_rx_mss_helper(skb, mss, lrg_pkt);
+				adapter->rx_large_packets++;
+			}
+
 			napi_gro_receive(napi, skb);	/* send it up */
 
 			netdev->stats.rx_packets++;
diff --git a/drivers/net/ethernet/ibm/ibmveth.h b/drivers/net/ethernet/ibm/ibmveth.h
index 4eade67..7acda04 100644
--- a/drivers/net/ethernet/ibm/ibmveth.h
+++ b/drivers/net/ethernet/ibm/ibmveth.h
@@ -209,6 +209,7 @@ struct ibmveth_rx_q_entry {
 #define IBMVETH_RXQ_TOGGLE		0x80000000
 #define IBMVETH_RXQ_TOGGLE_SHIFT	31
 #define IBMVETH_RXQ_VALID		0x40000000
+#define IBMVETH_RXQ_LRG_PKT		0x04000000
 #define IBMVETH_RXQ_NO_CSUM		0x02000000
 #define IBMVETH_RXQ_CSUM_GOOD		0x01000000
 #define IBMVETH_RXQ_OFF_MASK		0x0000FFFF
-- 
2.7.4


* [PATCH for-4.4 06/16] ibmveth: calculate gso_segs for large packets
  2017-04-10 17:44 [PATCH for-4.4 00/16] Stable commits from Ubuntu Xenial 4.4-lts Sumit Semwal
                   ` (4 preceding siblings ...)
  2017-04-10 17:44 ` [PATCH for-4.4 05/16] ibmveth: set correct gso_size and gso_type Sumit Semwal
@ 2017-04-10 17:44 ` Sumit Semwal
  2017-04-10 17:44 ` [PATCH for-4.4 07/16] Drivers: hv: get rid of redundant messagecount in create_gpadl_header() Sumit Semwal
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Sumit Semwal @ 2017-04-10 17:44 UTC (permalink / raw)
  To: stable; +Cc: Thomas Falcon, David S . Miller, Sumit Semwal

From: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>

[Upstream commit 94acf164dc8f1184e8d0737be7125134c2701dbe]

Include calculations to compute the number of segments
that comprise an aggregated large packet.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Jonathan Maxwell <jmaxwell37@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
 [sumits: cherry-picked for 4.4.y]
---
 drivers/net/ethernet/ibm/ibmveth.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)
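
For illustration only, the segment count amounts to a rounded-up division
of the payload length by the MSS. The helper name and parameters below are
hypothetical; the driver works on the skb directly.

```c
#include <assert.h>

/*
 * Number of segments in an aggregated packet:
 * ceil((total_len - header_len) / mss), or 0 when no MSS is known.
 * tcp_doff is the TCP data offset in 32-bit words.
 */
static unsigned int gso_segs_for(unsigned int total_len,
				 unsigned int ip_hlen,
				 unsigned int tcp_doff,
				 unsigned int mss)
{
	unsigned int hdr_len;

	assert(tcp_doff >= 5);	/* a TCP header is at least five words */
	hdr_len = ip_hlen + tcp_doff * 4;
	if (mss == 0 || total_len < hdr_len)
		return 0;
	return (total_len - hdr_len + mss - 1) / mss;
}
```

For example, 4000 payload bytes with an MSS of 1460 give three segments,
since two full segments cover only 2920 bytes.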

diff --git a/drivers/net/ethernet/ibm/ibmveth.c b/drivers/net/ethernet/ibm/ibmveth.c
index 855c43d..f9e4988 100644
--- a/drivers/net/ethernet/ibm/ibmveth.c
+++ b/drivers/net/ethernet/ibm/ibmveth.c
@@ -1179,7 +1179,9 @@ map_failed:
 
 static void ibmveth_rx_mss_helper(struct sk_buff *skb, u16 mss, int lrg_pkt)
 {
+	struct tcphdr *tcph;
 	int offset = 0;
+	int hdr_len;
 
 	/* only TCP packets will be aggregated */
 	if (skb->protocol == htons(ETH_P_IP)) {
@@ -1206,14 +1208,20 @@ static void ibmveth_rx_mss_helper(struct sk_buff *skb, u16 mss, int lrg_pkt)
 	/* if mss is not set through Large Packet bit/mss in rx buffer,
 	 * expect that the mss will be written to the tcp header checksum.
 	 */
+	tcph = (struct tcphdr *)(skb->data + offset);
 	if (lrg_pkt) {
 		skb_shinfo(skb)->gso_size = mss;
 	} else if (offset) {
-		struct tcphdr *tcph = (struct tcphdr *)(skb->data + offset);
-
 		skb_shinfo(skb)->gso_size = ntohs(tcph->check);
 		tcph->check = 0;
 	}
+
+	if (skb_shinfo(skb)->gso_size) {
+		hdr_len = offset + tcph->doff * 4;
+		skb_shinfo(skb)->gso_segs =
+				DIV_ROUND_UP(skb->len - hdr_len,
+					     skb_shinfo(skb)->gso_size);
+	}
 }
 
 static int ibmveth_poll(struct napi_struct *napi, int budget)
-- 
2.7.4


* [PATCH for-4.4 07/16] Drivers: hv: get rid of redundant messagecount in create_gpadl_header()
  2017-04-10 17:44 [PATCH for-4.4 00/16] Stable commits from Ubuntu Xenial 4.4-lts Sumit Semwal
                   ` (5 preceding siblings ...)
  2017-04-10 17:44 ` [PATCH for-4.4 06/16] ibmveth: calculate gso_segs for large packets Sumit Semwal
@ 2017-04-10 17:44 ` Sumit Semwal
  2017-04-10 17:44 ` [PATCH for-4.4 08/16] Drivers: hv: don't leak memory in vmbus_establish_gpadl() Sumit Semwal
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Sumit Semwal @ 2017-04-10 17:44 UTC (permalink / raw)
  To: stable
  Cc: Vitaly Kuznetsov, K . Y . Srinivasan, Greg Kroah-Hartman, Sumit Semwal

From: Vitaly Kuznetsov <vkuznets@redhat.com>

[ Upstream commit 4d63763296ab7865a98bc29cc7d77145815ef89f ]

We use messagecount only once in vmbus_establish_gpadl() to check if
it is safe to iterate through the submsglist. We can just initialize
the list header in all cases in create_gpadl_header() instead.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
 [sumits: cherry picked for 4.4.y]
---
 drivers/hv/channel.c | 38 ++++++++++++++++----------------------
 1 file changed, 16 insertions(+), 22 deletions(-)
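
For illustration only, a minimal userspace model of why an initialized list
head makes the message count unnecessary: walking an empty circular list
visits nothing. The toy_* names are hypothetical stand-ins and not the
kernel's list.h API.

```c
#include <assert.h>
#include <stddef.h>

/* Circular doubly linked list node, modelled on the kernel's list_head. */
struct toy_list {
	struct toy_list *next, *prev;
};

/* An empty list is a head that points at itself. */
static void toy_init_list_head(struct toy_list *h)
{
	h->next = h;
	h->prev = h;
}

/* Append node n just before the head, i.e. at the tail. */
static void toy_list_add_tail(struct toy_list *n, struct toy_list *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Walk the list; an initialized empty head yields zero iterations. */
static size_t toy_list_count(const struct toy_list *h)
{
	const struct toy_list *cur;
	size_t n = 0;

	assert(h->next != NULL);	/* the head must be initialized */
	for (cur = h->next; cur != h; cur = cur->next)
		n++;
	return n;
}
```

With the head always initialized, the caller can iterate unconditionally,
and the single-message case simply runs the loop body zero times.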

diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index 1ef37c7..fb1e3df 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -223,8 +223,7 @@ EXPORT_SYMBOL_GPL(vmbus_open);
  * create_gpadl_header - Creates a gpadl for the specified buffer
  */
 static int create_gpadl_header(void *kbuffer, u32 size,
-					 struct vmbus_channel_msginfo **msginfo,
-					 u32 *messagecount)
+			       struct vmbus_channel_msginfo **msginfo)
 {
 	int i;
 	int pagecount;
@@ -268,7 +267,6 @@ static int create_gpadl_header(void *kbuffer, u32 size,
 			gpadl_header->range[0].pfn_array[i] = slow_virt_to_phys(
 				kbuffer + PAGE_SIZE * i) >> PAGE_SHIFT;
 		*msginfo = msgheader;
-		*messagecount = 1;
 
 		pfnsum = pfncount;
 		pfnleft = pagecount - pfncount;
@@ -308,7 +306,6 @@ static int create_gpadl_header(void *kbuffer, u32 size,
 			}
 
 			msgbody->msgsize = msgsize;
-			(*messagecount)++;
 			gpadl_body =
 				(struct vmbus_channel_gpadl_body *)msgbody->msg;
 
@@ -337,6 +334,8 @@ static int create_gpadl_header(void *kbuffer, u32 size,
 		msgheader = kzalloc(msgsize, GFP_KERNEL);
 		if (msgheader == NULL)
 			goto nomem;
+
+		INIT_LIST_HEAD(&msgheader->submsglist);
 		msgheader->msgsize = msgsize;
 
 		gpadl_header = (struct vmbus_channel_gpadl_header *)
@@ -351,7 +350,6 @@ static int create_gpadl_header(void *kbuffer, u32 size,
 				kbuffer + PAGE_SIZE * i) >> PAGE_SHIFT;
 
 		*msginfo = msgheader;
-		*messagecount = 1;
 	}
 
 	return 0;
@@ -376,7 +374,6 @@ int vmbus_establish_gpadl(struct vmbus_channel *channel, void *kbuffer,
 	struct vmbus_channel_gpadl_body *gpadl_body;
 	struct vmbus_channel_msginfo *msginfo = NULL;
 	struct vmbus_channel_msginfo *submsginfo;
-	u32 msgcount;
 	struct list_head *curr;
 	u32 next_gpadl_handle;
 	unsigned long flags;
@@ -385,7 +382,7 @@ int vmbus_establish_gpadl(struct vmbus_channel *channel, void *kbuffer,
 	next_gpadl_handle =
 		(atomic_inc_return(&vmbus_connection.next_gpadl_handle) - 1);
 
-	ret = create_gpadl_header(kbuffer, size, &msginfo, &msgcount);
+	ret = create_gpadl_header(kbuffer, size, &msginfo);
 	if (ret)
 		return ret;
 
@@ -408,24 +405,21 @@ int vmbus_establish_gpadl(struct vmbus_channel *channel, void *kbuffer,
 	if (ret != 0)
 		goto cleanup;
 
-	if (msgcount > 1) {
-		list_for_each(curr, &msginfo->submsglist) {
+	list_for_each(curr, &msginfo->submsglist) {
+		submsginfo = (struct vmbus_channel_msginfo *)curr;
+		gpadl_body =
+			(struct vmbus_channel_gpadl_body *)submsginfo->msg;
 
-			submsginfo = (struct vmbus_channel_msginfo *)curr;
-			gpadl_body =
-			     (struct vmbus_channel_gpadl_body *)submsginfo->msg;
+		gpadl_body->header.msgtype =
+			CHANNELMSG_GPADL_BODY;
+		gpadl_body->gpadl = next_gpadl_handle;
 
-			gpadl_body->header.msgtype =
-				CHANNELMSG_GPADL_BODY;
-			gpadl_body->gpadl = next_gpadl_handle;
+		ret = vmbus_post_msg(gpadl_body,
+				     submsginfo->msgsize -
+				     sizeof(*submsginfo));
+		if (ret != 0)
+			goto cleanup;
 
-			ret = vmbus_post_msg(gpadl_body,
-					       submsginfo->msgsize -
-					       sizeof(*submsginfo));
-			if (ret != 0)
-				goto cleanup;
-
-		}
 	}
 	wait_for_completion(&msginfo->waitevent);
 
-- 
2.7.4


* [PATCH for-4.4 08/16] Drivers: hv: don't leak memory in vmbus_establish_gpadl()
  2017-04-10 17:44 [PATCH for-4.4 00/16] Stable commits from Ubuntu Xenial 4.4-lts Sumit Semwal
                   ` (6 preceding siblings ...)
  2017-04-10 17:44 ` [PATCH for-4.4 07/16] Drivers: hv: get rid of redundant messagecount in create_gpadl_header() Sumit Semwal
@ 2017-04-10 17:44 ` Sumit Semwal
  2017-04-10 17:44 ` [PATCH for-4.4 09/16] Drivers: hv: get rid of timeout in vmbus_open() Sumit Semwal
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Sumit Semwal @ 2017-04-10 17:44 UTC (permalink / raw)
  To: stable
  Cc: Vitaly Kuznetsov, K . Y . Srinivasan, Greg Kroah-Hartman, Sumit Semwal

From: Vitaly Kuznetsov <vkuznets@redhat.com>

[ Upstream commit 7cc80c98070ccc7940fc28811c92cca0a681015d ]

In some cases create_gpadl_header() allocates submessages but we never
free them.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
---
 drivers/hv/channel.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index fb1e3df..ec61ad8 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -373,7 +373,7 @@ int vmbus_establish_gpadl(struct vmbus_channel *channel, void *kbuffer,
 	struct vmbus_channel_gpadl_header *gpadlmsg;
 	struct vmbus_channel_gpadl_body *gpadl_body;
 	struct vmbus_channel_msginfo *msginfo = NULL;
-	struct vmbus_channel_msginfo *submsginfo;
+	struct vmbus_channel_msginfo *submsginfo, *tmp;
 	struct list_head *curr;
 	u32 next_gpadl_handle;
 	unsigned long flags;
@@ -430,6 +430,10 @@ cleanup:
 	spin_lock_irqsave(&vmbus_connection.channelmsg_lock, flags);
 	list_del(&msginfo->msglistentry);
 	spin_unlock_irqrestore(&vmbus_connection.channelmsg_lock, flags);
+	list_for_each_entry_safe(submsginfo, tmp, &msginfo->submsglist,
+				 msglistentry) {
+		kfree(submsginfo);
+	}
 
 	kfree(msginfo);
 	return ret;
-- 
2.7.4
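
The cleanup loop added above follows the usual "safe" list-walk pattern:
remember the next node before freeing the current one. A minimal userspace
sketch of that pattern follows. It is not the kernel's list API; list_node,
submsg, list_init(), list_append(), queue_submsgs() and free_submsgs() are
hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Minimal stand-in for the kernel's circular doubly linked list. */
struct list_node {
	struct list_node *next, *prev;
};

/* Hypothetical submessage; only the embedded list node matters here. */
struct submsg {
	struct list_node entry;
	int payload;
};

static void list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

static void list_append(struct list_node *node, struct list_node *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Allocate n submessages and queue them; returns how many were queued. */
static int queue_submsgs(struct list_node *head, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		struct submsg *m = calloc(1, sizeof(*m));

		if (!m)
			break;
		m->payload = i;
		list_append(&m->entry, head);
	}
	return i;
}

/* Free every queued submessage; returns how many were freed. */
static int free_submsgs(struct list_node *head)
{
	struct list_node *cur = head->next;
	int freed = 0;

	while (cur != head) {
		/* Read the successor first: cur is invalid after free(). */
		struct list_node *next = cur->next;

		free((char *)cur - offsetof(struct submsg, entry));
		freed++;
		cur = next;
	}
	list_init(head);	/* leave an empty, reusable list behind */
	return freed;
}
```

On an initialised but empty list the loop body never runs, so the same
cleanup path is safe whether or not any submessages were allocated.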


* [PATCH for-4.4 09/16] Drivers: hv: get rid of timeout in vmbus_open()
  2017-04-10 17:44 [PATCH for-4.4 00/16] Stable commits from Ubuntu Xenial 4.4-lts Sumit Semwal
                   ` (7 preceding siblings ...)
  2017-04-10 17:44 ` [PATCH for-4.4 08/16] Drivers: hv: don't leak memory in vmbus_establish_gpadl() Sumit Semwal
@ 2017-04-10 17:44 ` Sumit Semwal
  2017-04-10 17:44 ` [PATCH for-4.4 10/16] Drivers: hv: vmbus: Reduce the delay between retries in vmbus_post_msg() Sumit Semwal
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Sumit Semwal @ 2017-04-10 17:44 UTC (permalink / raw)
  To: stable
  Cc: Vitaly Kuznetsov, K . Y . Srinivasan, Greg Kroah-Hartman, Sumit Semwal

From: Vitaly Kuznetsov <vkuznets@redhat.com>

[ Upstream commit 396e287fa2ff46e83ae016cdcb300c3faa3b02f6 ]

vmbus_teardown_gpadl() can wait forever when it is called after the 5
second timeout in vmbus_open() expires. The gpadl teardown operation never
succeeds for an opened channel, and the timeout isn't always long enough
for the open to complete. As a guest, we can always trust the host to
respond to our request (and there is nothing we can do if it doesn't), so
wait for the open response without a timeout.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
 [sumits: cherry picked for 4.4.y]
---
 drivers/hv/channel.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index ec61ad8..800598c 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -73,7 +73,6 @@ int vmbus_open(struct vmbus_channel *newchannel, u32 send_ringbuffer_size,
 	void *in, *out;
 	unsigned long flags;
 	int ret, err = 0;
-	unsigned long t;
 	struct page *page;
 
 	spin_lock_irqsave(&newchannel->lock, flags);
@@ -183,11 +182,7 @@ int vmbus_open(struct vmbus_channel *newchannel, u32 send_ringbuffer_size,
 		goto error1;
 	}
 
-	t = wait_for_completion_timeout(&open_info->waitevent, 5*HZ);
-	if (t == 0) {
-		err = -ETIMEDOUT;
-		goto error1;
-	}
+	wait_for_completion(&open_info->waitevent);
 
 	spin_lock_irqsave(&vmbus_connection.channelmsg_lock, flags);
 	list_del(&open_info->msglistentry);
-- 
2.7.4


* [PATCH for-4.4 10/16] Drivers: hv: vmbus: Reduce the delay between retries in vmbus_post_msg()
  2017-04-10 17:44 [PATCH for-4.4 00/16] Stable commits from Ubuntu Xenial 4.4-lts Sumit Semwal
                   ` (8 preceding siblings ...)
  2017-04-10 17:44 ` [PATCH for-4.4 09/16] Drivers: hv: get rid of timeout in vmbus_open() Sumit Semwal
@ 2017-04-10 17:44 ` Sumit Semwal
  2017-04-10 17:44 ` [PATCH for-4.4 11/16] Tools: hv: kvp: ensure kvp device fd is closed on exec Sumit Semwal
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Sumit Semwal @ 2017-04-10 17:44 UTC (permalink / raw)
  To: stable; +Cc: K. Y. Srinivasan, Greg Kroah-Hartman, Sumit Semwal

From: "K. Y. Srinivasan" <kys@microsoft.com>

[ Upstream commit 8de0d7e951826d7592e0ba1da655b175c4aa0923 ]

The current delay between retries is unnecessarily high and is negatively
affecting the time it takes to boot the system.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
---
 drivers/hv/connection.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
index 4fc2e88..2bbc530 100644
--- a/drivers/hv/connection.c
+++ b/drivers/hv/connection.c
@@ -429,7 +429,7 @@ int vmbus_post_msg(void *buffer, size_t buflen)
 	union hv_connection_id conn_id;
 	int ret = 0;
 	int retries = 0;
-	u32 msec = 1;
+	u32 usec = 1;
 
 	conn_id.asu32 = 0;
 	conn_id.u.id = VMBUS_MESSAGE_CONNECTION_ID;
@@ -462,9 +462,9 @@ int vmbus_post_msg(void *buffer, size_t buflen)
 		}
 
 		retries++;
-		msleep(msec);
-		if (msec < 2048)
-			msec *= 2;
+		udelay(usec);
+		if (usec < 2048)
+			usec *= 2;
 	}
 	return ret;
 }
-- 
2.7.4
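
The retry loop keeps its doubling schedule (1, 2, 4, ... capped at 2048);
only the unit changes from milliseconds to microseconds. A rough sketch of
the cumulative delay for a given number of retries is below. backoff_total()
is a hypothetical helper, and the retry count passed to it is an assumption,
since the loop bound is not visible in this diff.

```c
#include <stdint.h>

/*
 * Sum of the delays issued over `retries` failed attempts, in whatever
 * unit the loop sleeps in (msec before the patch, usec after it).
 */
static uint64_t backoff_total(unsigned int retries)
{
	uint64_t total = 0;
	uint32_t delay = 1;
	unsigned int i;

	for (i = 0; i < retries; i++) {
		total += delay;
		/* Same cap as the driver loop: stop doubling at 2048. */
		if (delay < 2048)
			delay *= 2;
	}
	return total;
}
```

For example, ten consecutive retries add up to 1023 units: roughly a second
of sleeping with msleep(), versus about a millisecond of busy-waiting with
udelay(). These figures are lower bounds, since msleep() may sleep longer
than requested.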


* [PATCH for-4.4 11/16] Tools: hv: kvp: ensure kvp device fd is closed on exec
  2017-04-10 17:44 [PATCH for-4.4 00/16] Stable commits from Ubuntu Xenial 4.4-lts Sumit Semwal
                   ` (9 preceding siblings ...)
  2017-04-10 17:44 ` [PATCH for-4.4 10/16] Drivers: hv: vmbus: Reduce the delay between retries in vmbus_post_msg() Sumit Semwal
@ 2017-04-10 17:44 ` Sumit Semwal
  2017-04-10 17:44 ` [PATCH for-4.4 12/16] Drivers: hv: balloon: keep track of where ha_region starts Sumit Semwal
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Sumit Semwal @ 2017-04-10 17:44 UTC (permalink / raw)
  To: stable
  Cc: Vitaly Kuznetsov, K . Y . Srinivasan, Greg Kroah-Hartman, Sumit Semwal

From: Vitaly Kuznetsov <vkuznets@redhat.com>

[ Upstream commit 26840437cbd6d3625ea6ab34e17cd34bb810c861 ]

The KVP daemon does fork()/exec() (via popen()), so we need to mark our fds
close-on-exec to avoid sharing them with child processes. The immediate
consequence I see of not doing so is SELinux complaining about 'ip' trying
to access '/dev/vmbus/hv_kvp'.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
---
 tools/hv/hv_kvp_daemon.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/hv/hv_kvp_daemon.c b/tools/hv/hv_kvp_daemon.c
index 0d9f48e..bc7adb8 100644
--- a/tools/hv/hv_kvp_daemon.c
+++ b/tools/hv/hv_kvp_daemon.c
@@ -1433,7 +1433,7 @@ int main(int argc, char *argv[])
 	openlog("KVP", 0, LOG_USER);
 	syslog(LOG_INFO, "KVP starting; pid is:%d", getpid());
 
-	kvp_fd = open("/dev/vmbus/hv_kvp", O_RDWR);
+	kvp_fd = open("/dev/vmbus/hv_kvp", O_RDWR | O_CLOEXEC);
 
 	if (kvp_fd < 0) {
 		syslog(LOG_ERR, "open /dev/vmbus/hv_kvp failed; error: %d %s",
-- 
2.7.4
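
The effect of O_CLOEXEC can be checked from userspace by reading back the
descriptor flags. The sketch below uses only standard POSIX calls;
fd_is_cloexec() and open_cloexec_demo() are hypothetical helpers, and
/dev/null in the usage note stands in for the KVP device so that it runs on
any Linux system.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for O_CLOEXEC under strict -std= modes */
#endif
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if fd is close-on-exec, 0 if it is not, -1 on error. */
static int fd_is_cloexec(int fd)
{
	int flags = fcntl(fd, F_GETFD);

	if (flags < 0)
		return -1;
	return (flags & FD_CLOEXEC) ? 1 : 0;
}

/* Open path read-only with extra_flags and report its close-on-exec state. */
static int open_cloexec_demo(const char *path, int extra_flags)
{
	int fd = open(path, O_RDONLY | extra_flags);
	int ret;

	if (fd < 0)
		return -1;
	ret = fd_is_cloexec(fd);
	close(fd);
	return ret;
}
```

open_cloexec_demo("/dev/null", O_CLOEXEC) reports the flag as set, while
open_cloexec_demo("/dev/null", 0) reports it as clear. Setting the flag at
open() time also avoids the window that a separate fcntl(F_SETFD) call
would leave in a multi-threaded program.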


* [PATCH for-4.4 12/16] Drivers: hv: balloon: keep track of where ha_region starts
  2017-04-10 17:44 [PATCH for-4.4 00/16] Stable commits from Ubuntu Xenial 4.4-lts Sumit Semwal
                   ` (10 preceding siblings ...)
  2017-04-10 17:44 ` [PATCH for-4.4 11/16] Tools: hv: kvp: ensure kvp device fd is closed on exec Sumit Semwal
@ 2017-04-10 17:44 ` Sumit Semwal
  2017-04-10 17:44 ` [PATCH for-4.4 13/16] Drivers: hv: balloon: account for gaps in hot add regions Sumit Semwal
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Sumit Semwal @ 2017-04-10 17:44 UTC (permalink / raw)
  To: stable
  Cc: Vitaly Kuznetsov, K . Y . Srinivasan, Greg Kroah-Hartman, Sumit Semwal

From: Vitaly Kuznetsov <vkuznets@redhat.com>

[ Upstream commit 7cf3b79ec85ee1a5bbaaf936bb1d050dc652983b ]

Windows 2012 (non-R2) does not specify a hot add region in hot add requests
and the logic in hot_add_req() is trying to find a 128Mb-aligned region
covering the request. It may also happen that the host's requests are not
128Mb aligned and the created ha_region will start before the first
specified PFN. We can't online these non-present pages but we don't
remember the real start of the region.

This is a regression introduced by the commit 5abbbb75d733 ("Drivers: hv:
hv_balloon: don't lose memory when onlining order is not natural"). While
the idea of keeping the 'moving window' was wrong (as there is no guarantee
that hot add requests come ordered) we should still keep track of
covered_start_pfn. This is not a revert, the logic is different.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
---
 drivers/hv/hv_balloon.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
index 43af913..1542d89 100644
--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ -430,13 +430,14 @@ struct dm_info_msg {
  * currently hot added. We hot add in multiples of 128M
  * chunks; it is possible that we may not be able to bring
  * online all the pages in the region. The range
- * covered_end_pfn defines the pages that can
+ * covered_start_pfn:covered_end_pfn defines the pages that can
  * be brough online.
  */
 
 struct hv_hotadd_state {
 	struct list_head list;
 	unsigned long start_pfn;
+	unsigned long covered_start_pfn;
 	unsigned long covered_end_pfn;
 	unsigned long ha_end_pfn;
 	unsigned long end_pfn;
@@ -682,7 +683,8 @@ static void hv_online_page(struct page *pg)
 
 	list_for_each(cur, &dm_device.ha_region_list) {
 		has = list_entry(cur, struct hv_hotadd_state, list);
-		cur_start_pgp = (unsigned long)pfn_to_page(has->start_pfn);
+		cur_start_pgp = (unsigned long)
+			pfn_to_page(has->covered_start_pfn);
 		cur_end_pgp = (unsigned long)pfn_to_page(has->covered_end_pfn);
 
 		if (((unsigned long)pg >= cur_start_pgp) &&
@@ -854,6 +856,7 @@ static unsigned long process_hot_add(unsigned long pg_start,
 		list_add_tail(&ha_region->list, &dm_device.ha_region_list);
 		ha_region->start_pfn = rg_start;
 		ha_region->ha_end_pfn = rg_start;
+		ha_region->covered_start_pfn = pg_start;
 		ha_region->covered_end_pfn = pg_start;
 		ha_region->end_pfn = rg_start + rg_size;
 	}
-- 
2.7.4
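
To see why the region start and the first backed PFN can differ, here is a
simplified sketch of the 128Mb rounding described in the commit message. It
is not the driver code: struct region and make_region() are hypothetical,
and HA_CHUNK_PAGES assumes 4 KiB pages (128 MiB / 4 KiB = 0x8000 pages).

```c
/* Assumed chunk size: 128 MiB expressed in 4 KiB pages. */
#define HA_CHUNK_PAGES 0x8000UL

struct region {
	unsigned long start_pfn;		/* chunk-aligned start */
	unsigned long end_pfn;			/* chunk-aligned end (exclusive) */
	unsigned long covered_start_pfn;	/* first PFN the host gave us */
};

/* Build a chunk-aligned region around an unaligned hot add request. */
static struct region make_region(unsigned long pg_start, unsigned long pfn_cnt)
{
	struct region r;
	unsigned long chunks = (pfn_cnt + HA_CHUNK_PAGES - 1) / HA_CHUNK_PAGES;

	r.start_pfn = (pg_start / HA_CHUNK_PAGES) * HA_CHUNK_PAGES;
	r.end_pfn = r.start_pfn + chunks * HA_CHUNK_PAGES;
	/* Pages in [start_pfn, covered_start_pfn) are not backed. */
	r.covered_start_pfn = pg_start;
	return r;
}
```

With an illustrative request of 330752 pages starting at PFN 0x108200, the
region spans 0x108000 - 0x160000 while the first backed page is 0x108200,
so the 0x200 pages at the front of the region must never be onlined.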


* [PATCH for-4.4 13/16] Drivers: hv: balloon: account for gaps in hot add regions
  2017-04-10 17:44 [PATCH for-4.4 00/16] Stable commits from Ubuntu Xenial 4.4-lts Sumit Semwal
                   ` (11 preceding siblings ...)
  2017-04-10 17:44 ` [PATCH for-4.4 12/16] Drivers: hv: balloon: keep track of where ha_region starts Sumit Semwal
@ 2017-04-10 17:44 ` Sumit Semwal
  2017-04-10 17:44 ` [PATCH for-4.4 14/16] hv: don't reset hv_context.tsc_page on crash Sumit Semwal
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Sumit Semwal @ 2017-04-10 17:44 UTC (permalink / raw)
  To: stable
  Cc: Vitaly Kuznetsov, K . Y . Srinivasan, Greg Kroah-Hartman, Sumit Semwal

From: Vitaly Kuznetsov <vkuznets@redhat.com>

[ Upstream commit cb7a5724c7e1bfb5766ad1c3beba14cc715991cf ]

I'm observing the following hot add requests from the WS2012 host:

hot_add_req: start_pfn = 0x108200 count = 330752
hot_add_req: start_pfn = 0x158e00 count = 193536
hot_add_req: start_pfn = 0x188400 count = 239616

As the host doesn't specify hot add regions, we try to create a
128Mb-aligned region covering the first request: we create the 0x108000 -
0x160000 region and we add 0x108000 - 0x158e00 memory. The second request
passes the pfn_covered() check, we enlarge the region to 0x108000 -
0x190000 and add 0x158e00 - 0x188200 memory. The problem emerges with the
third request as it starts at 0x188400 so there is a 0x200 gap which is
not covered. As the end of our region is 0x190000 now it again passes the
pfn_covered() check where we just adjust the covered_end_pfn and make it
0x188400 instead of 0x188200, which means that we'll try to online
0x188200-0x188400 pages, but these pages were never assigned to us and we
crash.

We can't react to such requests by creating new hot add regions as it may
happen that the whole suggested range falls into the previously identified
128Mb-aligned area so we'll end up adding nothing or create intersecting
regions and our current logic doesn't allow that. Instead, create a list of
such 'gaps' and check for them in the page online callback.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
 [sumits: cherry picked for 4.4.y]
---
 drivers/hv/hv_balloon.c | 131 ++++++++++++++++++++++++++++++++++--------------
 1 file changed, 94 insertions(+), 37 deletions(-)

diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
index 1542d89..354da7f 100644
--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ -441,6 +441,16 @@ struct hv_hotadd_state {
 	unsigned long covered_end_pfn;
 	unsigned long ha_end_pfn;
 	unsigned long end_pfn;
+	/*
+	 * A list of gaps.
+	 */
+	struct list_head gap_list;
+};
+
+struct hv_hotadd_gap {
+	struct list_head list;
+	unsigned long start_pfn;
+	unsigned long end_pfn;
 };
 
 struct balloon_state {
@@ -596,18 +606,46 @@ static struct notifier_block hv_memory_nb = {
 	.priority = 0
 };
 
+/* Check if the particular page is backed and can be onlined and online it. */
+static void hv_page_online_one(struct hv_hotadd_state *has, struct page *pg)
+{
+	unsigned long cur_start_pgp;
+	unsigned long cur_end_pgp;
+	struct hv_hotadd_gap *gap;
+
+	cur_start_pgp = (unsigned long)pfn_to_page(has->covered_start_pfn);
+	cur_end_pgp = (unsigned long)pfn_to_page(has->covered_end_pfn);
 
-static void hv_bring_pgs_online(unsigned long start_pfn, unsigned long size)
+	/* The page is not backed. */
+	if (((unsigned long)pg < cur_start_pgp) ||
+	    ((unsigned long)pg >= cur_end_pgp))
+		return;
+
+	/* Check for gaps. */
+	list_for_each_entry(gap, &has->gap_list, list) {
+		cur_start_pgp = (unsigned long)
+			pfn_to_page(gap->start_pfn);
+		cur_end_pgp = (unsigned long)
+			pfn_to_page(gap->end_pfn);
+		if (((unsigned long)pg >= cur_start_pgp) &&
+		    ((unsigned long)pg < cur_end_pgp)) {
+			return;
+		}
+	}
+
+	/* This frame is currently backed; online the page. */
+	__online_page_set_limits(pg);
+	__online_page_increment_counters(pg);
+	__online_page_free(pg);
+}
+
+static void hv_bring_pgs_online(struct hv_hotadd_state *has,
+				unsigned long start_pfn, unsigned long size)
 {
 	int i;
 
-	for (i = 0; i < size; i++) {
-		struct page *pg;
-		pg = pfn_to_page(start_pfn + i);
-		__online_page_set_limits(pg);
-		__online_page_increment_counters(pg);
-		__online_page_free(pg);
-	}
+	for (i = 0; i < size; i++)
+		hv_page_online_one(has, pfn_to_page(start_pfn + i));
 }
 
 static void hv_mem_hot_add(unsigned long start, unsigned long size,
@@ -684,26 +722,24 @@ static void hv_online_page(struct page *pg)
 	list_for_each(cur, &dm_device.ha_region_list) {
 		has = list_entry(cur, struct hv_hotadd_state, list);
 		cur_start_pgp = (unsigned long)
-			pfn_to_page(has->covered_start_pfn);
-		cur_end_pgp = (unsigned long)pfn_to_page(has->covered_end_pfn);
+			pfn_to_page(has->start_pfn);
+		cur_end_pgp = (unsigned long)pfn_to_page(has->end_pfn);
 
-		if (((unsigned long)pg >= cur_start_pgp) &&
-			((unsigned long)pg < cur_end_pgp)) {
-			/*
-			 * This frame is currently backed; online the
-			 * page.
-			 */
-			__online_page_set_limits(pg);
-			__online_page_increment_counters(pg);
-			__online_page_free(pg);
-		}
+		/* The page belongs to a different HAS. */
+		if (((unsigned long)pg < cur_start_pgp) ||
+		    ((unsigned long)pg >= cur_end_pgp))
+			continue;
+
+		hv_page_online_one(has, pg);
+		break;
 	}
 }
 
-static bool pfn_covered(unsigned long start_pfn, unsigned long pfn_cnt)
+static int pfn_covered(unsigned long start_pfn, unsigned long pfn_cnt)
 {
 	struct list_head *cur;
 	struct hv_hotadd_state *has;
+	struct hv_hotadd_gap *gap;
 	unsigned long residual, new_inc;
 
 	if (list_empty(&dm_device.ha_region_list))
@@ -718,6 +754,24 @@ static bool pfn_covered(unsigned long start_pfn, unsigned long pfn_cnt)
 		 */
 		if (start_pfn < has->start_pfn || start_pfn >= has->end_pfn)
 			continue;
+
+		/*
+		 * If the current start pfn is not where the covered_end
+		 * is, create a gap and update covered_end_pfn.
+		 */
+		if (has->covered_end_pfn != start_pfn) {
+			gap = kzalloc(sizeof(struct hv_hotadd_gap), GFP_ATOMIC);
+			if (!gap)
+				return -ENOMEM;
+
+			INIT_LIST_HEAD(&gap->list);
+			gap->start_pfn = has->covered_end_pfn;
+			gap->end_pfn = start_pfn;
+			list_add_tail(&gap->list, &has->gap_list);
+
+			has->covered_end_pfn = start_pfn;
+		}
+
 		/*
 		 * If the current hot add-request extends beyond
 		 * our current limit; extend it.
@@ -734,19 +788,10 @@ static bool pfn_covered(unsigned long start_pfn, unsigned long pfn_cnt)
 			has->end_pfn += new_inc;
 		}
 
-		/*
-		 * If the current start pfn is not where the covered_end
-		 * is, update it.
-		 */
-
-		if (has->covered_end_pfn != start_pfn)
-			has->covered_end_pfn = start_pfn;
-
-		return true;
-
+		return 1;
 	}
 
-	return false;
+	return 0;
 }
 
 static unsigned long handle_pg_range(unsigned long pg_start,
@@ -785,6 +830,8 @@ static unsigned long handle_pg_range(unsigned long pg_start,
 			if (pgs_ol > pfn_cnt)
 				pgs_ol = pfn_cnt;
 
+			has->covered_end_pfn +=  pgs_ol;
+			pfn_cnt -= pgs_ol;
 			/*
 			 * Check if the corresponding memory block is already
 			 * online by checking its last previously backed page.
@@ -793,10 +840,8 @@ static unsigned long handle_pg_range(unsigned long pg_start,
 			 */
 			if (start_pfn > has->start_pfn &&
 			    !PageReserved(pfn_to_page(start_pfn - 1)))
-				hv_bring_pgs_online(start_pfn, pgs_ol);
+				hv_bring_pgs_online(has, start_pfn, pgs_ol);
 
-			has->covered_end_pfn +=  pgs_ol;
-			pfn_cnt -= pgs_ol;
 		}
 
 		if ((has->ha_end_pfn < has->end_pfn) && (pfn_cnt > 0)) {
@@ -834,13 +879,19 @@ static unsigned long process_hot_add(unsigned long pg_start,
 					unsigned long rg_size)
 {
 	struct hv_hotadd_state *ha_region = NULL;
+	int covered;
 
 	if (pfn_cnt == 0)
 		return 0;
 
-	if (!dm_device.host_specified_ha_region)
-		if (pfn_covered(pg_start, pfn_cnt))
+	if (!dm_device.host_specified_ha_region) {
+		covered = pfn_covered(pg_start, pfn_cnt);
+		if (covered < 0)
+			return 0;
+
+		if (covered)
 			goto do_pg_range;
+	}
 
 	/*
 	 * If the host has specified a hot-add range; deal with it first.
@@ -852,6 +903,7 @@ static unsigned long process_hot_add(unsigned long pg_start,
 			return 0;
 
 		INIT_LIST_HEAD(&ha_region->list);
+		INIT_LIST_HEAD(&ha_region->gap_list);
 
 		list_add_tail(&ha_region->list, &dm_device.ha_region_list);
 		ha_region->start_pfn = rg_start;
@@ -1584,6 +1636,7 @@ static int balloon_remove(struct hv_device *dev)
 	struct hv_dynmem_device *dm = hv_get_drvdata(dev);
 	struct list_head *cur, *tmp;
 	struct hv_hotadd_state *has;
+	struct hv_hotadd_gap *gap, *tmp_gap;
 
 	if (dm->num_pages_ballooned != 0)
 		pr_warn("Ballooned pages: %d\n", dm->num_pages_ballooned);
@@ -1600,6 +1653,10 @@ static int balloon_remove(struct hv_device *dev)
 #endif
 	list_for_each_safe(cur, tmp, &dm->ha_region_list) {
 		has = list_entry(cur, struct hv_hotadd_state, list);
+		list_for_each_entry_safe(gap, tmp_gap, &has->gap_list, list) {
+			list_del(&gap->list);
+			kfree(gap);
+		}
 		list_del(&has->list);
 		kfree(has);
 	}
-- 
2.7.4
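
The per-page decision made in the online callback can be sketched in
userspace as a range check followed by a gap check. This is a simplified
model, not the driver code: struct pfn_range, struct hot_add_region and
pfn_can_online() are hypothetical, and an array replaces the kernel list.

```c
#include <stdbool.h>
#include <stddef.h>

/* Half-open PFN range [start_pfn, end_pfn). */
struct pfn_range {
	unsigned long start_pfn;
	unsigned long end_pfn;
};

/* Hypothetical, array-based model of a hot add region and its gaps. */
struct hot_add_region {
	unsigned long covered_start_pfn;
	unsigned long covered_end_pfn;
	const struct pfn_range *gaps;
	size_t nr_gaps;
};

/* A page may be onlined only if it is covered and not inside any gap. */
static bool pfn_can_online(const struct hot_add_region *has, unsigned long pfn)
{
	size_t i;

	/* Outside the covered range: the page is not backed. */
	if (pfn < has->covered_start_pfn || pfn >= has->covered_end_pfn)
		return false;

	/* Inside a recorded gap: the host never gave us this page. */
	for (i = 0; i < has->nr_gaps; i++) {
		if (pfn >= has->gaps[i].start_pfn &&
		    pfn < has->gaps[i].end_pfn)
			return false;
	}
	return true;
}
```

Using the numbers from the commit message, a single gap 0x188200 - 0x188400
makes PFN 0x188200 ineligible while 0x188100 and 0x188400 stay eligible.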


* [PATCH for-4.4 14/16] hv: don't reset hv_context.tsc_page on crash
  2017-04-10 17:44 [PATCH for-4.4 00/16] Stable commits from Ubuntu Xenial 4.4-lts Sumit Semwal
                   ` (12 preceding siblings ...)
  2017-04-10 17:44 ` [PATCH for-4.4 13/16] Drivers: hv: balloon: account for gaps in hot add regions Sumit Semwal
@ 2017-04-10 17:44 ` Sumit Semwal
  2017-04-10 17:44 ` [PATCH for-4.4 15/16] blk-mq: Avoid memory reclaim when remapping queues Sumit Semwal
  2017-04-10 17:44 ` [PATCH for-4.4 16/16] usb: hub: Wait for connection to be reestablished after port reset Sumit Semwal
  15 siblings, 0 replies; 21+ messages in thread
From: Sumit Semwal @ 2017-04-10 17:44 UTC (permalink / raw)
  To: stable
  Cc: Vitaly Kuznetsov, K . Y . Srinivasan, Greg Kroah-Hartman, Sumit Semwal

From: Vitaly Kuznetsov <vkuznets@redhat.com>

[ Upstream commit 56ef6718a1d8d77745033c5291e025ce18504159 ]

It may happen that secondary CPUs are still alive on crash, and resetting
hv_context.tsc_page will then cause a subsequent crash in
read_hv_clock_tsc() as we don't check for it being non-NULL there. Leaving
the pointer set is safe as we're not freeing this page in the crash case
anyway.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
---
 drivers/hv/hv.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index ddbf7e7..8ce1f2e 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -305,9 +305,10 @@ void hv_cleanup(bool crash)
 
 		hypercall_msr.as_uint64 = 0;
 		wrmsrl(HV_X64_MSR_REFERENCE_TSC, hypercall_msr.as_uint64);
-		if (!crash)
+		if (!crash) {
 			vfree(hv_context.tsc_page);
-		hv_context.tsc_page = NULL;
+			hv_context.tsc_page = NULL;
+		}
 	}
 #endif
 }
-- 
2.7.4


* [PATCH for-4.4 15/16] blk-mq: Avoid memory reclaim when remapping queues
  2017-04-10 17:44 [PATCH for-4.4 00/16] Stable commits from Ubuntu Xenial 4.4-lts Sumit Semwal
                   ` (13 preceding siblings ...)
  2017-04-10 17:44 ` [PATCH for-4.4 14/16] hv: don't reset hv_context.tsc_page on crash Sumit Semwal
@ 2017-04-10 17:44 ` Sumit Semwal
  2017-04-10 17:44 ` [PATCH for-4.4 16/16] usb: hub: Wait for connection to be reestablished after port reset Sumit Semwal
  15 siblings, 0 replies; 21+ messages in thread
From: Sumit Semwal @ 2017-04-10 17:44 UTC (permalink / raw)
  To: stable
  Cc: Gabriel Krisman Bertazi, Brian King, Douglas Miller, linux-block,
	linux-scsi, Jens Axboe, Sumit Semwal

From: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>

[ Upstream commit 36e1f3d107867b25c616c2fd294f5a1c9d4e5d09 ]

While stressing memory and IO at the same time as we changed SMT settings,
we were able to consistently trigger deadlocks in the mm system, which
froze the entire machine.

I think that under memory stress conditions, the large allocations
performed by blk_mq_init_rq_map may trigger a reclaim, which stalls
waiting on the block layer remapping completion, thus deadlocking the
system.  The trace below was collected after the machine stalled,
waiting for the hotplug event completion.

The simplest fix for this is to make allocations in this path
non-reclaimable, with GFP_NOIO.  With this patch, we couldn't hit the
issue anymore.

This should apply on top of Jens's for-next branch cleanly.

Changes since v1:
  - Use GFP_NOIO instead of GFP_NOWAIT.

 Call Trace:
[c000000f0160aaf0] [c000000f0160ab50] 0xc000000f0160ab50 (unreliable)
[c000000f0160acc0] [c000000000016624] __switch_to+0x2e4/0x430
[c000000f0160ad20] [c000000000b1a880] __schedule+0x310/0x9b0
[c000000f0160ae00] [c000000000b1af68] schedule+0x48/0xc0
[c000000f0160ae30] [c000000000b1b4b0] schedule_preempt_disabled+0x20/0x30
[c000000f0160ae50] [c000000000b1d4fc] __mutex_lock_slowpath+0xec/0x1f0
[c000000f0160aed0] [c000000000b1d678] mutex_lock+0x78/0xa0
[c000000f0160af00] [d000000019413cac] xfs_reclaim_inodes_ag+0x33c/0x380 [xfs]
[c000000f0160b0b0] [d000000019415164] xfs_reclaim_inodes_nr+0x54/0x70 [xfs]
[c000000f0160b0f0] [d0000000194297f8] xfs_fs_free_cached_objects+0x38/0x60 [xfs]
[c000000f0160b120] [c0000000003172c8] super_cache_scan+0x1f8/0x210
[c000000f0160b190] [c00000000026301c] shrink_slab.part.13+0x21c/0x4c0
[c000000f0160b2d0] [c000000000268088] shrink_zone+0x2d8/0x3c0
[c000000f0160b380] [c00000000026834c] do_try_to_free_pages+0x1dc/0x520
[c000000f0160b450] [c00000000026876c] try_to_free_pages+0xdc/0x250
[c000000f0160b4e0] [c000000000251978] __alloc_pages_nodemask+0x868/0x10d0
[c000000f0160b6f0] [c000000000567030] blk_mq_init_rq_map+0x160/0x380
[c000000f0160b7a0] [c00000000056758c] blk_mq_map_swqueue+0x33c/0x360
[c000000f0160b820] [c000000000567904] blk_mq_queue_reinit+0x64/0xb0
[c000000f0160b850] [c00000000056a16c] blk_mq_queue_reinit_notify+0x19c/0x250
[c000000f0160b8a0] [c0000000000f5d38] notifier_call_chain+0x98/0x100
[c000000f0160b8f0] [c0000000000c5fb0] __cpu_notify+0x70/0xe0
[c000000f0160b930] [c0000000000c63c4] notify_prepare+0x44/0xb0
[c000000f0160b9b0] [c0000000000c52f4] cpuhp_invoke_callback+0x84/0x250
[c000000f0160ba10] [c0000000000c570c] cpuhp_up_callbacks+0x5c/0x120
[c000000f0160ba60] [c0000000000c7cb8] _cpu_up+0xf8/0x1d0
[c000000f0160bac0] [c0000000000c7eb0] do_cpu_up+0x120/0x150
[c000000f0160bb40] [c0000000006fe024] cpu_subsys_online+0x64/0xe0
[c000000f0160bb90] [c0000000006f5124] device_online+0xb4/0x120
[c000000f0160bbd0] [c0000000006f5244] online_store+0xb4/0xc0
[c000000f0160bc20] [c0000000006f0a68] dev_attr_store+0x68/0xa0
[c000000f0160bc60] [c0000000003ccc30] sysfs_kf_write+0x80/0xb0
[c000000f0160bca0] [c0000000003cbabc] kernfs_fop_write+0x17c/0x250
[c000000f0160bcf0] [c00000000030fe6c] __vfs_write+0x6c/0x1e0
[c000000f0160bd90] [c000000000311490] vfs_write+0xd0/0x270
[c000000f0160bde0] [c0000000003131fc] SyS_write+0x6c/0x110
[c000000f0160be30] [c000000000009204] system_call+0x38/0xec

Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Cc: Brian King <brking@linux.vnet.ibm.com>
Cc: Douglas Miller <dougmill@linux.vnet.ibm.com>
Cc: linux-block@vger.kernel.org
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
---
 block/blk-mq.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index d8d63c3..0d1af3e 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1470,7 +1470,7 @@ static struct blk_mq_tags *blk_mq_init_rq_map(struct blk_mq_tag_set *set,
 	INIT_LIST_HEAD(&tags->page_list);
 
 	tags->rqs = kzalloc_node(set->queue_depth * sizeof(struct request *),
-				 GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY,
+				 GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY,
 				 set->numa_node);
 	if (!tags->rqs) {
 		blk_mq_free_tags(tags);
@@ -1496,7 +1496,7 @@ static struct blk_mq_tags *blk_mq_init_rq_map(struct blk_mq_tag_set *set,
 
 		do {
 			page = alloc_pages_node(set->numa_node,
-				GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY | __GFP_ZERO,
+				GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY | __GFP_ZERO,
 				this_order);
 			if (page)
 				break;
@@ -1517,7 +1517,7 @@ static struct blk_mq_tags *blk_mq_init_rq_map(struct blk_mq_tag_set *set,
 		 * Allow kmemleak to scan these pages as they contain pointers
 		 * to additional allocations like via ops->init_request().
 		 */
-		kmemleak_alloc(p, order_to_size(this_order), 1, GFP_KERNEL);
+		kmemleak_alloc(p, order_to_size(this_order), 1, GFP_NOIO);
 		entries_per_page = order_to_size(this_order) / rq_size;
 		to_do = min(entries_per_page, set->queue_depth - i);
 		left -= to_do * rq_size;
-- 
2.7.4


* [PATCH for-4.4 16/16] usb: hub: Wait for connection to be reestablished after port reset
  2017-04-10 17:44 [PATCH for-4.4 00/16] Stable commits from Ubuntu Xenial 4.4-lts Sumit Semwal
                   ` (14 preceding siblings ...)
  2017-04-10 17:44 ` [PATCH for-4.4 15/16] blk-mq: Avoid memory reclaim when remapping queues Sumit Semwal
@ 2017-04-10 17:44 ` Sumit Semwal
  15 siblings, 0 replies; 21+ messages in thread
From: Sumit Semwal @ 2017-04-10 17:44 UTC (permalink / raw)
  To: stable; +Cc: Guenter Roeck, Douglas Anderson, Greg Kroah-Hartman, Sumit Semwal

From: Guenter Roeck <linux@roeck-us.net>

[ Upstream commit 22547c4cc4fe20698a6a85a55b8788859134b8e4 ]

On a system with a defective USB device connected to a USB hub,
an endless sequence of port connect events was observed. The sequence
of events as observed is as follows:

- Port reports connected event (port status=USB_PORT_STAT_CONNECTION).
- Event handler debounces port and resets it by calling hub_port_reset().
- hub_port_reset() calls hub_port_wait_reset() to wait for the reset
  to complete.
- The reset completes, but USB_PORT_STAT_CONNECTION is not immediately
  set in the port status register.
- hub_port_wait_reset() returns -ENOTCONN.
- Port initialization sequence is aborted.
- A few milliseconds later, the port again reports a connected event,
  and the sequence repeats.

This continues either forever or, randomly, stops if the connection
is already re-established when the port status is read. It results in
a high rate of udev events. This in turn destabilizes userspace since
the above sequence holds the device mutex pretty much continuously
and prevents userspace from actually reading the device status.

To prevent the problem from happening, let's wait for the connection
to be re-established after a port reset. If the device was actually
disconnected, the code will still return an error, but it will do so
only after the long reset timeout.

Cc: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
---
 drivers/usb/core/hub.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index 9e62c93..7c2d87b 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -2602,8 +2602,15 @@ static int hub_port_wait_reset(struct usb_hub *hub, int port1,
 		if (ret < 0)
 			return ret;
 
-		/* The port state is unknown until the reset completes. */
-		if (!(portstatus & USB_PORT_STAT_RESET))
+		/*
+		 * The port state is unknown until the reset completes.
+		 *
+		 * On top of that, some chips may require additional time
+		 * to re-establish a connection after the reset is complete,
+		 * so also wait for the connection to be re-established.
+		 */
+		if (!(portstatus & USB_PORT_STAT_RESET) &&
+		    (portstatus & USB_PORT_STAT_CONNECTION))
 			break;
 
 		/* switch to the long delay after two short delay failures */
-- 
2.7.4


* Re: [PATCH for-4.4 01/16] net/mlx4_core: Fix when to save some qp context flags for dynamic VST to VGT transitions
  2017-04-10 17:44 ` [PATCH for-4.4 01/16] net/mlx4_core: Fix when to save some qp context flags for dynamic VST to VGT transitions Sumit Semwal
@ 2017-04-12 13:41   ` Greg KH
  2017-04-12 15:14     ` Sumit Semwal
  0 siblings, 1 reply; 21+ messages in thread
From: Greg KH @ 2017-04-12 13:41 UTC (permalink / raw)
  To: Sumit Semwal; +Cc: stable, Jack Morgenstein, Tariq Toukan, David S . Miller

On Mon, Apr 10, 2017 at 11:14:17PM +0530, Sumit Semwal wrote:
> From: Jack Morgenstein <jackm@dev.mellanox.co.il>
> 
> [Upstream commit 61a4577c9a4419b99e647744923517d47255da35]

Huh?  That commit is the v4.4.59 commit id.  It's not a commit in
Linus's tree at all.  What went wrong here?

thanks,

greg k-h


* Re: [PATCH for-4.4 03/16] net/mlx4_en: Fix bad WQE issue
  2017-04-10 17:44 ` [PATCH for-4.4 03/16] net/mlx4_en: Fix bad WQE issue Sumit Semwal
@ 2017-04-12 13:45   ` Greg KH
  2017-04-12 14:38     ` Sumit Semwal
  0 siblings, 1 reply; 21+ messages in thread
From: Greg KH @ 2017-04-12 13:45 UTC (permalink / raw)
  To: Sumit Semwal; +Cc: stable, Eugenia Emantayev, Tariq Toukan, David S . Miller

On Mon, Apr 10, 2017 at 11:14:19PM +0530, Sumit Semwal wrote:
> From: Eugenia Emantayev <eugenia@mellanox.com>
> 
> [Upstream commit 6496bbf0ec481966ef9ffe5b6660d8d1b55c60cc]
> 
> Single send WQE in RX buffer should be stamped with software
> ownership in order to prevent the flow of QP in error in FW
> once UPDATE_QP is called.
> 
> Fixes: 9f519f68cfff ('mlx4_en: Not using Shared Receive Queues')
> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
> Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
> Signed-off-by: David S. Miller <davem@davemloft.net>
> Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
>  [sumits: cherry-picked for 4.4.y]

I can't take a patch for 4.4 that is not also in 4.9, sorry.  Please fix
up this series, and send me what is needed for 4.10, 4.9, and 4.4, and
re-check the git commit ids, I don't want to have to do it all for
you...

thanks,

greg k-h


* Re: [PATCH for-4.4 03/16] net/mlx4_en: Fix bad WQE issue
  2017-04-12 13:45   ` Greg KH
@ 2017-04-12 14:38     ` Sumit Semwal
  0 siblings, 0 replies; 21+ messages in thread
From: Sumit Semwal @ 2017-04-12 14:38 UTC (permalink / raw)
  To: Greg KH; +Cc: stable, Eugenia Emantayev, Tariq Toukan, David S . Miller

Hi Greg,

Apologies for the copy-paste error in the commit ID; I'll double-check
the whole series.

On 12 April 2017 at 19:15, Greg KH <greg@kroah.com> wrote:
> On Mon, Apr 10, 2017 at 11:14:19PM +0530, Sumit Semwal wrote:
>> From: Eugenia Emantayev <eugenia@mellanox.com>
>>
>> [Upstream commit 6496bbf0ec481966ef9ffe5b6660d8d1b55c60cc]
>>
>> Single send WQE in RX buffer should be stamped with software
>> ownership in order to prevent the flow of QP in error in FW
>> once UPDATE_QP is called.
>>
>> Fixes: 9f519f68cfff ('mlx4_en: Not using Shared Receive Queues')
>> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
>> Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
>> Signed-off-by: David S. Miller <davem@davemloft.net>
>> Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
>>  [sumits: cherry-picked for 4.4.y]
>
> I can't take a patch for 4.4 that is not also in 4.9, sorry.  Please fix
> up this series, and send me what is needed for 4.10, 4.9, and 4.4, and
> re-check the git commit ids, I don't want to have to do it all for
> you...
I didn't realise that having patches in 4.9 was a prerequisite to
adding patches to 4.4; will check applicability for 4.9/4.10 as well.

Of course, I don't want _you_ to do these kinds of checks for patches
submitted to stable. I will be more careful next time.

>
> thanks,
>
> greg k-h

Best,
Sumit.


* Re: [PATCH for-4.4 01/16] net/mlx4_core: Fix when to save some qp context flags for dynamic VST to VGT transitions
  2017-04-12 13:41   ` Greg KH
@ 2017-04-12 15:14     ` Sumit Semwal
  0 siblings, 0 replies; 21+ messages in thread
From: Sumit Semwal @ 2017-04-12 15:14 UTC (permalink / raw)
  To: Greg KH; +Cc: stable, Jack Morgenstein, Tariq Toukan, David S . Miller

Hi Greg,

On 12 April 2017 at 19:11, Greg KH <greg@kroah.com> wrote:
> On Mon, Apr 10, 2017 at 11:14:17PM +0530, Sumit Semwal wrote:
>> From: Jack Morgenstein <jackm@dev.mellanox.co.il>
>>
>> [Upstream commit 61a4577c9a4419b99e647744923517d47255da35]
>
> Huh?  That commit is the v4.4.59 commit id.  It's not a commit in
> Linus's tree at all.  What went wrong here?
>
I checked all the patches in this series, and it was a stupid
copy/paste problem, only for this one patch, unfortunately right at
the beginning :/.

I have corrected this and updated the series; I will post it after
checking patch applicability in 4.9/4.10.

> thanks,
>
> greg k-h

Best,
Sumit.
