linux-rdma.vger.kernel.org archive mirror
* [PATCH 2.6.26-4.14] IB/ipoib: Arm "send_cq" to process completions in due time
@ 2020-06-12 19:41 Gerd Rausch
  2020-06-12 19:55 ` Jason Gunthorpe
  0 siblings, 1 reply; 8+ messages in thread
From: Gerd Rausch @ 2020-06-12 19:41 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Sasha Levin, Jason Gunthorpe, Doug Ledford,
	Sean Hefty, Hal Rosenstock, linux-rdma

This issue appears to exist only in Linux versions
2.6.26 through 4.14, inclusive:

With the introduction of commit
f56bcd8013566 ("IPoIB: Use separate CQ for UD send completions")

work completions are only processed once there are
more than 17 outstanding TX work requests.

Unfortunately, that also delays the completion handler
and keeps alive the references held by the "skb",
since "dev_kfree_skb_any" won't be called
for a very long time.

E.g. we've observed "nf_conntrack_cleanup_net_list" spin
     around for hours until "net->ct.count" goes down to zero
     on a sufficiently idle interface.
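
To illustrate the stall described above, here is a minimal user-space sketch, not driver code. The names "tx_model", "send_one" and "reap_all" are hypothetical, the threshold of 17 is taken from the wording of this message rather than from the kernel headers, and the model assumes every posted send has already completed in hardware.

```c
#include <assert.h>

/* Threshold quoted in the message ("more than 17"); not read from kernel headers. */
#define REAP_THRESHOLD 17u

/* Hypothetical stand-in for the TX side of the driver. */
struct tx_model {
	unsigned int held;	/* sent buffers not yet freed */
};

/* Free every buffer whose send has completed; returns how many were freed. */
static unsigned int reap_all(struct tx_model *tx)
{
	unsigned int n = tx->held;

	tx->held = 0;
	return n;
}

/* Post one send; completions are reaped only once the threshold is exceeded. */
static void send_one(struct tx_model *tx)
{
	tx->held++;
	assert(tx->held <= REAP_THRESHOLD + 1);
	if (tx->held > REAP_THRESHOLD)
		reap_all(tx);
}
```

In this model, an interface that sends a handful of packets and then goes idle keeps all of those buffers until enough later traffic pushes the count past the threshold.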

This fix arms the TX CQ after those "poll_tx" loops,
so that "ipoib_send_comp_handler" is invoked
when the next completion arrives.

While it's obvious that processing completions one-by-one
is more costly than doing so in bulk,
holding on to "skb" resources for a potentially unlimited
amount of time appears to be a less favorable trade-off.

This issue appears to no longer exist in Linux 4.15
and newer, because the following commit does
call "ib_req_notify_cq" on "send_cq":
8966e28d2e40c ("IB/ipoib: Use NAPI in UD/TX flows")

Fixes: f56bcd8013566 ("IPoIB: Use separate CQ for UD send completions")
Signed-off-by: Gerd Rausch <gerd.rausch@oracle.com>
---
 drivers/infiniband/ulp/ipoib/ipoib_ib.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
index 18f732aa15101..b26b31b9e455e 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -491,8 +491,13 @@ static void drain_tx_cq(struct net_device *dev)
 	struct ipoib_dev_priv *priv = netdev_priv(dev);
 
 	netif_tx_lock(dev);
-	while (poll_tx(priv))
-		; /* nothing */
+
+	do {
+		while (poll_tx(priv))
+			; /* nothing */
+	} while (ib_req_notify_cq(priv->send_cq,
+				  IB_CQ_NEXT_COMP |
+				  IB_CQ_REPORT_MISSED_EVENTS) > 0);
 
 	if (netif_queue_stopped(dev))
 		mod_timer(&priv->poll_timer, jiffies + 1);
@@ -628,9 +633,14 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb,
 		++priv->tx_head;
 	}
 
-	if (unlikely(priv->tx_outstanding > MAX_SEND_CQE))
-		while (poll_tx(priv))
-			; /* nothing */
+	if (unlikely(priv->tx_outstanding > MAX_SEND_CQE)) {
+		do {
+			while (poll_tx(priv))
+				; /* nothing */
+		} while (ib_req_notify_cq(priv->send_cq,
+					  IB_CQ_NEXT_COMP |
+					  IB_CQ_REPORT_MISSED_EVENTS) > 0);
+	}
 }
 
 static void __ipoib_reap_ah(struct net_device *dev)
-- 
2.24.1
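
The poll-then-arm loop used in both hunks above can be sketched in user space as follows. This is an illustration only: "mock_cq", "mock_poll", "mock_req_notify" and "drain_and_arm" are hypothetical names, and the mock only imitates the documented behaviour that "ib_req_notify_cq" with "IB_CQ_REPORT_MISSED_EVENTS" returns a positive value when completions may have been added since the last poll. The "late" field simulates completions that race in between the last poll and the arming.

```c
#include <assert.h>

/* Hypothetical stand-in for a completion queue. */
struct mock_cq {
	int pending;	/* completions currently queued */
	int late;	/* completions that arrive just before the next arm */
	int armed;	/* nonzero once a notification has been requested */
};

/* Reap one completion; returns 1 if one was reaped, 0 if the queue is empty. */
static int mock_poll(struct mock_cq *cq)
{
	assert(cq->pending >= 0);
	if (cq->pending == 0)
		return 0;
	cq->pending--;
	return 1;
}

/*
 * Arm the queue. Returns > 0 if completions are queued at arming time,
 * in which case the caller has to poll again and then re-arm.
 */
static int mock_req_notify(struct mock_cq *cq)
{
	cq->pending += cq->late;	/* the simulated race */
	cq->late = 0;
	cq->armed = 1;
	return cq->pending > 0;
}

/* Same loop shape as in the patch; returns the number of completions reaped. */
static int drain_and_arm(struct mock_cq *cq)
{
	int reaped = 0;

	do {
		while (mock_poll(cq))
			reaped++;
	} while (mock_req_notify(cq) > 0);

	return reaped;
}
```

The outer loop is what closes the race: a completion that slips in after the inner loop found the queue empty is picked up by the next pass instead of waiting for the next event.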



* Re: [PATCH 2.6.26-4.14] IB/ipoib: Arm "send_cq" to process completions in due time
  2020-06-12 19:41 [PATCH 2.6.26-4.14] IB/ipoib: Arm "send_cq" to process completions in due time Gerd Rausch
@ 2020-06-12 19:55 ` Jason Gunthorpe
  2020-06-12 20:44   ` Gerd Rausch
  0 siblings, 1 reply; 8+ messages in thread
From: Jason Gunthorpe @ 2020-06-12 19:55 UTC (permalink / raw)
  To: Gerd Rausch
  Cc: Greg Kroah-Hartman, Sasha Levin, Doug Ledford, Sean Hefty,
	Hal Rosenstock, linux-rdma

On Fri, Jun 12, 2020 at 12:41:16PM -0700, Gerd Rausch wrote:
> This issue appears to only exist in Linux versions
> 2.6.26 through 4.14 inclusively:
> 
> With the introduction of commit
> f56bcd8013566 ("IPoIB: Use separate CQ for UD send completions")
> 
> work completions are only processed once there are
> more than 17 outstanding TX work requests.
> 
> Unfortunately, that also delays the processing of the
> completion handler and holds on to references
> held by the "skb" since "dev_kfree_skb_any"
> won't be called for a very long time.
> 
> E.g. we've observed "nf_conntrack_cleanup_net_list" spin
>      around for hours until "net->ct.count" goes down to zero
>      on a sufficiently idle interface.
> 
> This fix arms the TX CQ after those "poll_tx" loops,
> in order for "ipoib_send_comp_handler" to do its thing:
> 
> While it's obvious that processing completions one-by-one
> is more costly than doing so in bulk,
> holding on to "skb" resources for a potentially unlimited
> amount of time appears to be a less favorable trade-off.
> 
> This issue appears to no longer exist in Linux-4.15
> and younger, because the following commit does
> call "ib_req_notify_cq" on "send_cq":
> 8966e28d2e40c ("IB/ipoib: Use NAPI in UD/TX flows")

I'm not really clear what you want to happen to this patch - are you
proposing a stable patch that is not just a backport? Why can't you
backport the fix above instead?

You'll need to follow everything in

Documentation/process/stable-kernel-rules.rst

Or the stable maintainers won't even look at this.

Jason


* Re: [PATCH 2.6.26-4.14] IB/ipoib: Arm "send_cq" to process completions in due time
  2020-06-12 19:55 ` Jason Gunthorpe
@ 2020-06-12 20:44   ` Gerd Rausch
  2020-06-16 12:08     ` Greg Kroah-Hartman
  0 siblings, 1 reply; 8+ messages in thread
From: Gerd Rausch @ 2020-06-12 20:44 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Greg Kroah-Hartman, Sasha Levin, Doug Ledford, Sean Hefty,
	Hal Rosenstock, linux-rdma

Hi Jason,

On 12/06/2020 12.55, Jason Gunthorpe wrote:
> On Fri, Jun 12, 2020 at 12:41:16PM -0700, Gerd Rausch wrote:
>> This issue appears to no longer exist in Linux-4.15
>> and younger, because the following commit does
>> call "ib_req_notify_cq" on "send_cq":
>> 8966e28d2e40c ("IB/ipoib: Use NAPI in UD/TX flows")
> 
> I'm not really clear what you want to happen to this patch - are you
> proposing a stable patch that is not just a backport? Why can't you
> backport the fix above instead?

I considered it, but backporting commit 8966e28d2e40c ("IB/ipoib: Use NAPI in UD/TX flows")
with all the dependencies it may have carries a considerably higher risk
than just arming the TX CQ.

> 
> You'll need to follow everything in
> 
> Documentation/process/stable-kernel-rules.rst
> 
> Or the stable maintainers won't even look at this.
> 

Thanks,

  Gerd



* Re: [PATCH 2.6.26-4.14] IB/ipoib: Arm "send_cq" to process completions in due time
  2020-06-12 20:44   ` Gerd Rausch
@ 2020-06-16 12:08     ` Greg Kroah-Hartman
  2020-06-16 16:35       ` Gerd Rausch
  0 siblings, 1 reply; 8+ messages in thread
From: Greg Kroah-Hartman @ 2020-06-16 12:08 UTC (permalink / raw)
  To: Gerd Rausch
  Cc: Jason Gunthorpe, Sasha Levin, Doug Ledford, Sean Hefty,
	Hal Rosenstock, linux-rdma

On Fri, Jun 12, 2020 at 01:44:55PM -0700, Gerd Rausch wrote:
> Hi Jason,
> 
> On 12/06/2020 12.55, Jason Gunthorpe wrote:
> > On Fri, Jun 12, 2020 at 12:41:16PM -0700, Gerd Rausch wrote:
> >> This issue appears to no longer exist in Linux-4.15
> >> and younger, because the following commit does
> >> call "ib_req_notify_cq" on "send_cq":
> >> 8966e28d2e40c ("IB/ipoib: Use NAPI in UD/TX flows")
> > 
> > I'm not really clear what you want to happen to this patch - are you
> > proposing a stable patch that is not just a backport? Why can't you
> > backport the fix above instead?
> 
> I considered backporting commit 8966e28d2e40c ("IB/ipoib: Use NAPI in UD/TX flows")
> with all the dependencies it may have a considerably higher risk
> than just arming the TX CQ.

90% of the time when we apply a patch that does NOT match the upstream
tree, it has a bug in it and needs to have another fix or something
else.

So please, if at all possible, stick to the upstream tree;
backporting the current patches is the best thing to do.

thanks,

greg k-h


* Re: [PATCH 2.6.26-4.14] IB/ipoib: Arm "send_cq" to process completions in due time
  2020-06-16 12:08     ` Greg Kroah-Hartman
@ 2020-06-16 16:35       ` Gerd Rausch
  2020-06-17  5:03         ` Leon Romanovsky
  0 siblings, 1 reply; 8+ messages in thread
From: Gerd Rausch @ 2020-06-16 16:35 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Jason Gunthorpe, Sasha Levin, Doug Ledford, Sean Hefty,
	Hal Rosenstock, linux-rdma

Hi,

On 16/06/2020 05.08, Greg Kroah-Hartman wrote:
>> I considered backporting commit 8966e28d2e40c ("IB/ipoib: Use NAPI in UD/TX flows")
>> with all the dependencies it may have a considerably higher risk
>> than just arming the TX CQ.
> 
> 90% of the time when we apply a patch that does NOT match the upstream
> tree, it has a bug in it and needs to have another fix or something
> else.
> 
> So please, if at all possible, stick to the upstream tree, so
> backporting the current patches are the best thing to do.
> 

Jason,

With Mellanox writing and fixing the vast majority of the code found
in IB/IPoIB, do you or one of your colleagues want to look into this?

It would be considerably less error-prone if the authors of that code
did the riskier work of backporting.

AFAIK, Mellanox also has the regression tests to ensure that everything
still works after this re-write as it did before.

Thanks,

 Gerd



* Re: [PATCH 2.6.26-4.14] IB/ipoib: Arm "send_cq" to process completions in due time
  2020-06-16 16:35       ` Gerd Rausch
@ 2020-06-17  5:03         ` Leon Romanovsky
  2020-07-13 14:53           ` Greg Kroah-Hartman
  0 siblings, 1 reply; 8+ messages in thread
From: Leon Romanovsky @ 2020-06-17  5:03 UTC (permalink / raw)
  To: Gerd Rausch
  Cc: Greg Kroah-Hartman, Jason Gunthorpe, Sasha Levin, Doug Ledford,
	Sean Hefty, Hal Rosenstock, linux-rdma

On Tue, Jun 16, 2020 at 09:35:38AM -0700, Gerd Rausch wrote:
> Hi,
>
> On 16/06/2020 05.08, Greg Kroah-Hartman wrote:
> >> I considered backporting commit 8966e28d2e40c ("IB/ipoib: Use NAPI in UD/TX flows")
> >> with all the dependencies it may have a considerably higher risk
> >> than just arming the TX CQ.
> >
> > 90% of the time when we apply a patch that does NOT match the upstream
> > tree, it has a bug in it and needs to have another fix or something
> > else.
> >
> > So please, if at all possible, stick to the upstream tree, so
> > backporting the current patches are the best thing to do.
> >
>
> Jason,
>
> With Mellanox writing and fixing the vast majority of the code found
> in IB/IPoIB, do you or one of your colleagues want to look into this?
>
> It would be considerably less error-prone if the authors of that code
> did that more risky work of backporting.
>
> AFAIK, Mellanox also has the regression tests to ensure that everything
> still works after this re-write as it did before.

Please approach your Mellanox FAE representatives, they will know how to
handle it internally.

Thanks

>
> Thanks,
>
>  Gerd
>


* Re: [PATCH 2.6.26-4.14] IB/ipoib: Arm "send_cq" to process completions in due time
  2020-06-17  5:03         ` Leon Romanovsky
@ 2020-07-13 14:53           ` Greg Kroah-Hartman
  2020-07-14  7:02             ` Leon Romanovsky
  0 siblings, 1 reply; 8+ messages in thread
From: Greg Kroah-Hartman @ 2020-07-13 14:53 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Gerd Rausch, Jason Gunthorpe, Sasha Levin, Doug Ledford,
	Sean Hefty, Hal Rosenstock, linux-rdma

On Wed, Jun 17, 2020 at 08:03:41AM +0300, Leon Romanovsky wrote:
> On Tue, Jun 16, 2020 at 09:35:38AM -0700, Gerd Rausch wrote:
> > Hi,
> >
> > On 16/06/2020 05.08, Greg Kroah-Hartman wrote:
> > >> I considered backporting commit 8966e28d2e40c ("IB/ipoib: Use NAPI in UD/TX flows")
> > >> with all the dependencies it may have a considerably higher risk
> > >> than just arming the TX CQ.
> > >
> > > 90% of the time when we apply a patch that does NOT match the upstream
> > > tree, it has a bug in it and needs to have another fix or something
> > > else.
> > >
> > > So please, if at all possible, stick to the upstream tree, so
> > > backporting the current patches are the best thing to do.
> > >
> >
> > Jason,
> >
> > With Mellanox writing and fixing the vast majority of the code found
> > in IB/IPoIB, do you or one of your colleagues want to look into this?
> >
> > It would be considerably less error-prone if the authors of that code
> > did that more risky work of backporting.
> >
> > AFAIK, Mellanox also has the regression tests to ensure that everything
> > still works after this re-write as it did before.
> 
> Please approach your Mellanox FAE representatives, they will know how to
> handle it internally.

Ah, so you all don't care about any IB fixes for 4.14 and older kernels
anymore?  If so, great, please let us know so we will not do any
backporting anymore, that will save us time!

thanks,

greg k-h


* Re: [PATCH 2.6.26-4.14] IB/ipoib: Arm "send_cq" to process completions in due time
  2020-07-13 14:53           ` Greg Kroah-Hartman
@ 2020-07-14  7:02             ` Leon Romanovsky
  0 siblings, 0 replies; 8+ messages in thread
From: Leon Romanovsky @ 2020-07-14  7:02 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Gerd Rausch, Jason Gunthorpe, Sasha Levin, Doug Ledford,
	Sean Hefty, Hal Rosenstock, linux-rdma

On Mon, Jul 13, 2020 at 04:53:44PM +0200, Greg Kroah-Hartman wrote:
> On Wed, Jun 17, 2020 at 08:03:41AM +0300, Leon Romanovsky wrote:
> > On Tue, Jun 16, 2020 at 09:35:38AM -0700, Gerd Rausch wrote:
> > > Hi,
> > >
> > > On 16/06/2020 05.08, Greg Kroah-Hartman wrote:
> > > >> I considered backporting commit 8966e28d2e40c ("IB/ipoib: Use NAPI in UD/TX flows")
> > > >> with all the dependencies it may have a considerably higher risk
> > > >> than just arming the TX CQ.
> > > >
> > > > 90% of the time when we apply a patch that does NOT match the upstream
> > > > tree, it has a bug in it and needs to have another fix or something
> > > > else.
> > > >
> > > > So please, if at all possible, stick to the upstream tree, so
> > > > backporting the current patches are the best thing to do.
> > > >
> > >
> > > Jason,
> > >
> > > With Mellanox writing and fixing the vast majority of the code found
> > > in IB/IPoIB, do you or one of your colleagues want to look into this?
> > >
> > > It would be considerably less error-prone if the authors of that code
> > > did that more risky work of backporting.
> > >
> > > AFAIK, Mellanox also has the regression tests to ensure that everything
> > > still works after this re-write as it did before.
> >
> > Please approach your Mellanox FAE representatives, they will know how to
> > handle it internally.
>
> Ah, so you all don't care about any IB fixes for 4.14 and older kernels
> anymore?  If so, great, please let us know so we will not do any
> backporting anymore, that will save us time!

Greg,

This is not what I said. As a Mellanox employee, I can't commit any
internal resources; the FAE path is the standard way for our customers
to get proper attention.

Thanks

>
> thanks,
>
> greg k-h

