netdev.vger.kernel.org archive mirror
From: "Jubran, Samih" <sameehj@amazon.com>
To: "Machulsky, Zorik" <zorik@amazon.com>,
	Josh Triplett <josh@joshtriplett.org>
Cc: "Belgazal, Netanel" <netanel@amazon.com>,
	"Kiyanovski, Arthur" <akiyano@amazon.com>,
	"Tzalik, Guy" <gtzalik@amazon.com>,
	"Bshara, Saeed" <saeedb@amazon.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: RE: Re: [PATCH] ena: Speed up initialization 90x by reducing poll delays
Date: Wed, 11 Mar 2020 13:24:17 +0000
Message-ID: <eb427583ff2444dcae18e1e37fb27918@EX13D11EUB003.ant.amazon.com>

Hi Josh,

Thanks for taking the time to write this patch. I hit a bug while testing it. I haven't pinpointed the root cause yet, but it looks to me like a race in the netlink infrastructure.

Here is the bug scenario:
1. Create a c5.24xlarge instance in AWS in the N. Virginia region using the default Amazon Linux 2 AMI
2. Apply your patch on top of net-next v5.2 and install the kernel (currently I'm able to boot net-next v5.2 only; higher versions of net-next suffer from errors during boot)
3. Run "rmmod ena && insmod ena.ko" twice

Result:
The interface is not in the up state

Expected result:
The interface should be in the up state

What I know so far:
* ena_probe() seems to finish with no errors whatsoever
* adding prints / delays to ena_probe() causes the bug to vanish or become less likely to occur, depending on how much delay I add
* ena_up() is not called at all when the bug occurs, so it's something to do with netlink not invoking dev_open()

Did you face such issues? Do you have any idea what might be causing this?

> -----Original Message-----
> From: linux-kernel-owner@vger.kernel.org
> <linux-kernel-owner@vger.kernel.org> On Behalf Of Machulsky, Zorik
> <zorik@amazon.com>
> Sent: Tuesday, March 3, 2020 2:54 AM
> To: Josh Triplett <josh@joshtriplett.org>
> Cc: Belgazal, Netanel <netanel@amazon.com>; Kiyanovski, Arthur
> <akiyano@amazon.com>; Tzalik, Guy <gtzalik@amazon.com>; Bshara, Saeed
> <saeedb@amazon.com>; netdev@vger.kernel.org;
> linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] ena: Speed up initialization 90x by reducing poll delays
> 
> 
> 
> On 3/2/20, 4:40 PM, "Josh Triplett" <josh@joshtriplett.org> wrote:
> 
> 
>     On Mon, Mar 02, 2020 at 11:16:32PM +0000, Machulsky, Zorik wrote:
>     >
>     > On 2/28/20, 4:29 PM, "Josh Triplett" <josh@joshtriplett.org> wrote:
>     >
>     >     Before initializing completion queue interrupts, the ena driver uses
>     >     polling to wait for responses on the admin command queue. The ena driver
>     >     waits 5ms between polls, but the hardware has generally finished long
>     >     before that. Reduce the poll time to 10us.
>     >
>     >     On a c5.12xlarge, this improves ena initialization time from 173.6ms to
>     >     1.920ms, an improvement of more than 90x. This improves server boot time
>     >     and time to network bringup.
>     >
>     > Thanks Josh,
>     > We agree that polling rate should be increased, but prefer not to do it aggressively and blindly.
>     > For example linear backoff approach might be a better choice. Please let us re-work a little this
>     > patch and bring it to review. Thanks!
> 
>     That's fine, as long as it has the same net improvement on boot time.
> 
>     I'd appreciate the opportunity to test any alternate approach you might
>     have.
> 
>     (Also, as long as you're working on this, you might wish to make a
>     similar change to the EFA driver, and to the FreeBSD drivers.)
> 
> Absolutely! Already forwarded this to the owners of these drivers.  Thanks!
> 
>     >     Before:
>     >     [    0.531722] calling  ena_init+0x0/0x63 @ 1
>     >     [    0.531722] ena: Elastic Network Adapter (ENA) v2.1.0K
>     >     [    0.531751] ena 0000:00:05.0: Elastic Network Adapter (ENA) v2.1.0K
>     >     [    0.531946] PCI Interrupt Link [LNKD] enabled at IRQ 11
>     >     [    0.547425] ena: ena device version: 0.10
>     >     [    0.547427] ena: ena controller version: 0.0.1 implementation version 1
>     >     [    0.709497] ena 0000:00:05.0: Elastic Network Adapter (ENA) found at mem febf4000, mac addr 06:c4:22:0e:dc:da, Placement policy: Low Latency
>     >     [    0.709508] initcall ena_init+0x0/0x63 returned 0 after 173616 usecs
>     >
>     >     After:
>     >     [    0.526965] calling  ena_init+0x0/0x63 @ 1
>     >     [    0.526966] ena: Elastic Network Adapter (ENA) v2.1.0K
>     >     [    0.527056] ena 0000:00:05.0: Elastic Network Adapter (ENA) v2.1.0K
>     >     [    0.527196] PCI Interrupt Link [LNKD] enabled at IRQ 11
>     >     [    0.527211] ena: ena device version: 0.10
>     >     [    0.527212] ena: ena controller version: 0.0.1 implementation version 1
>     >     [    0.528925] ena 0000:00:05.0: Elastic Network Adapter (ENA) found at mem febf4000, mac addr 06:c4:22:0e:dc:da, Placement policy: Low Latency
>     >     [    0.528934] initcall ena_init+0x0/0x63 returned 0 after 1920 usecs
> 
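
The linear backoff idea mentioned in the quoted thread could be sketched roughly as follows. This is a userspace illustration only, not the ena driver's code or the reworked patch: the 10us starting delay and the 5ms cap are borrowed from the numbers in the patch description, while the names poll_delay_us(), poll_with_linear_backoff() and countdown_done() and the linear step itself are assumptions made for the example.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Assumed bounds: start at the patch's 10us, never exceed the old 5ms. */
#define POLL_DELAY_MIN_US 10u
#define POLL_DELAY_MAX_US 5000u

/* Delay to sleep after the (0-based) poll attempt: 10us, 20us, 30us, ... capped. */
uint32_t poll_delay_us(uint32_t attempt)
{
	uint64_t d = (uint64_t)POLL_DELAY_MIN_US * ((uint64_t)attempt + 1);

	return d > POLL_DELAY_MAX_US ? POLL_DELAY_MAX_US : (uint32_t)d;
}

static void sleep_us(uint32_t us)
{
	struct timespec ts = {
		.tv_sec = us / 1000000u,
		.tv_nsec = (long)(us % 1000000u) * 1000L,
	};

	nanosleep(&ts, NULL);
}

/*
 * Poll done(ctx) until it reports completion or the requested sleep time
 * reaches timeout_us. Returns the number of polls made on success,
 * or -1 on timeout.
 */
int poll_with_linear_backoff(bool (*done)(void *), void *ctx,
			     uint64_t timeout_us)
{
	uint64_t waited = 0;

	assert(done != NULL);

	for (uint32_t attempt = 0; ; attempt++) {
		uint32_t d;

		if (done(ctx))
			return (int)attempt + 1;
		if (waited >= timeout_us)
			return -1;

		d = poll_delay_us(attempt);
		sleep_us(d);
		waited += d;
	}
}

/* Stand-in for a hardware completion check: done once the counter hits zero. */
bool countdown_done(void *ctx)
{
	int *remaining = ctx;

	if (*remaining == 0)
		return true;
	(*remaining)--;
	return false;
}
```

With this shape, a response that arrives quickly is noticed after a few short sleeps, while a slow one falls back toward the original 5ms interval instead of being polled every 10us indefinitely.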


