From: Alexander Lobakin <aleksander.lobakin@intel.com>
To: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Cc: Alexander H Duyck <alexander.duyck@gmail.com>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Magnus Karlsson <magnus.karlsson@intel.com>,
	Michal Kubiak <michal.kubiak@intel.com>,
	"Larysa Zaremba" <larysa.zaremba@intel.com>,
	Jesper Dangaard Brouer <hawk@kernel.org>,
	Ilias Apalodimas <ilias.apalodimas@linaro.org>,
	"Christoph Hellwig" <hch@lst.de>,
	Paul Menzel <pmenzel@molgen.mpg.de>, <netdev@vger.kernel.org>,
	<intel-wired-lan@lists.osuosl.org>,
	<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH net-next v2 03/12] iavf: optimize Rx buffer allocation a bunch
Date: Wed, 31 May 2023 17:22:37 +0200	[thread overview]
Message-ID: <7ddd93fd-437d-8d67-c693-1bbcf36b552a@intel.com> (raw)
In-Reply-To: <ZHcsBPr0EXx6hkkd@boxer>

From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Date: Wed, 31 May 2023 13:14:12 +0200

> On Tue, May 30, 2023 at 09:18:40AM -0700, Alexander H Duyck wrote:
> 
> FWIW I agree with what Alex is saying over here.

There are 2 Alexes, "choose wisely" :P

> 
>> On Thu, 2023-05-25 at 14:57 +0200, Alexander Lobakin wrote:
>>> The Rx hotpath code of IAVF is not well-optimized TBH. Before doing any
>>> further buffer model changes, shake it up a bit. Notably:
>>>
>>> 1. Cache more variables on the stack.
>>>    DMA device, Rx page size, NTC -- these are the most common things
>>>    used all throughout the hotpath, often in loops on each iteration.
>>>    Instead of fetching (or even calculating, as with the page size) them
>>>    from the ring all the time, cache them on the stack at the beginning
>>>    of the NAPI polling callback. NTC will be written back at the end,
>>>    the rest are used read-only, so no sync needed.
>>
>> The advantage of this is going to vary based on the attribute. One of
>> the reasons why I left most of this on the ring is that the section of
>> the ring holding most of these variables was meant to be read-mostly
>> and shouldn't have resulted in any additional overhead versus
>> accessing them from the stack.
> 
> I believe it depends on the ring struct layout, which varies across our
> drivers, no? On ice, making more use of the stack as described above
> improved perf.

It's +/- the same; most layout changes come from us moving stuff around
to optimize paddings and cachelines lol. Here it's the same as with ice:
I don't think it's driver-specific to get positive results from keeping
more hot variables on the stack. The sole time I was surprised was when
you were getting worse results storing xdp_buff on the stack vs on the
ring :D
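
For reference, a minimal sketch of the idea (all names here are
hypothetical, for illustration only -- not the actual iavf code):

/* Cache hot ring fields on the stack for one NAPI poll instead of
 * dereferencing the ring on every iteration. NTC is the only field
 * written back; the rest are read-only copies.
 */
static int my_clean_rx_irq(struct my_ring *ring, int budget)
{
	struct device *dev = ring->dev;		/* read-only copy */
	u32 size = ring->count;			/* read-only copy */
	u32 ntc = ring->next_to_clean;		/* written back below */
	int done = 0;

	while (done < budget && my_process_desc(dev, ring, ntc)) {
		if (unlikely(++ntc == size))
			ntc = 0;
		done++;
	}

	ring->next_to_clean = ntc;		/* single write-back */
	return done;
}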

> 
>>
>>> 2. Don't move the recycled buffers around the ring.
>>>    Passing the page of the just-recycled buffer to a different
>>>    buffer -- in this case, the first one that needs to be allocated --
>>>    on each new frame is fundamentally wrong. It involves a bunch of
>>>    fetches, branches and writes (and one Rx buffer struct is at least
>>>    32 bytes) where they're completely unneeded, yet buys nothing: the
>>>    result is the same as if we'd recycled it in place, at the same
>>>    position where it was used. So drop this and let the main refilling
>>>    function take care of all the buffers which were processed and now
>>>    need to be recycled/refilled.
>>
>> The next_to_alloc logic was put in place to deal with systems that are
>> experiencing memory issues. Specifically what can end up happening is
>> that the ring can stall due to failing memory allocations and the
>> memory can get stuck on the ring. For that reason we were essentially
>> defragmenting the buffers when we started suffering memory pressure so
>> that they could be reused and/or freed following immediate use.
>>
>> Basically what you are trading off is some exception handling for
>> performance by removing it.
> 
> With the mix of changes this patch carries, I find it hard to tell from
> the description which parts of the diff I should be looking at.

Huge piece removed before add_rx_frag_blah.
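
To illustrate the in-place idea, a rough sketch (hypothetical names,
not the actual diff):

/* Before: recycling migrated the page into the next-to-alloc slot,
 * touching two buffer structs per frame.
 */
new_buf = &ring->rx_bi[ring->next_to_alloc];
new_buf->page = old_buf->page;
new_buf->page_offset = old_buf->page_offset;
old_buf->page = NULL;

/* After: the buffer stays in its slot; the refill loop walking the
 * processed range either reuses the page in place or allocates a
 * fresh one into the same position.
 */
if (!my_page_is_reusable(buf->page))
	buf->page = my_alloc_page(ring, gfp);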

> 
>>
>>> 3. Don't allocate with %GFP_ATOMIC on ifup.
>>>    This involved introducing the @gfp parameter to a couple functions.
>>>    Doesn't change anything for Rx -> softirq.
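
The gfp change boils down to the callers passing the right flags for
their context (a sketch; assuming the @gfp parameter lands on the
refill helper roughly like this):

/* ifup / ndo_open path: process context, sleeping is fine */
iavf_alloc_rx_buffers(rx_ring, to_refill, GFP_KERNEL);

/* NAPI poll path: softirq context, must stay atomic */
iavf_alloc_rx_buffers(rx_ring, to_refill, GFP_ATOMIC);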

[...]

>>> + up to 2% performance.
>>>
>>
>> What is the test you saw the 2% performance improvement in? Is it
>> something XDP related or a full stack test?
> 
> +1, can you say more about that measurement?

See my prev reply to Alex.

> 
>>
>>> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
>>
>> Also, one thing I am not a huge fan of is a patch that is really a
>> patchset unto itself. With all 6 items called out here, I would have
>> preferred to see this as 6 patches, as it would have been easier to
>> review.
> 
> +1

+1 :D

[...]

>>>  	/* if we are the last buffer then there is nothing else to do */
>>>  #define IAVF_RXD_EOF BIT(IAVF_RX_DESC_STATUS_EOF_SHIFT)
>>>  	if (likely(iavf_test_staterr(rx_desc, IAVF_RXD_EOF)))
>>
>> You may want to see if you can get rid of this function entirely,
>> perhaps you do in a later patch. This function was added for ixgbe back
>> in the day to allow us to place the skb back in the ring for the RSC
>> based workloads where we had to deal with interleaved frames in the Rx
>> path.
>>
>> For example, one question here would be why are we passing skb? It
>> isn't used as far as I can tell.
>>
> This was used back when the skb was stored within the Rx buffer; now we
> just store the skb on the Rx ring struct. So good catch, this arg is
> redundant.

Also in my prev reply: I'm removing it later in the series hehe.
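
i.e. once the argument is gone, the helper can shrink to roughly this
(a sketch of the direction, not the final diff):

/* Bump NTC and report whether more descriptors of this frame follow;
 * the unused skb argument is dropped.
 */
static bool iavf_is_non_eop(struct iavf_ring *rx_ring,
			    union iavf_rx_desc *rx_desc)
{
	u32 ntc = rx_ring->next_to_clean + 1;

	/* fetch, update, and store next to clean */
	rx_ring->next_to_clean = (ntc < rx_ring->count) ? ntc : 0;

	/* if we are the last buffer then there is nothing else to do */
	if (likely(iavf_test_staterr(rx_desc, IAVF_RXD_EOF)))
		return false;

	rx_ring->rx_stats.non_eop_descs++;
	return true;
}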

> 
> I'll go and take a look at code on v3.

No changes apart from fixing the OcteonTX2 compilation =\

Thanks,
Olek


