From: Florian Fainelli <f.fainelli@gmail.com>
To: Gregory CLEMENT <gregory.clement@free-electrons.com>,
	Arnd Bergmann <arnd@arndb.de>
Cc: Jisheng Zhang <jszhang@marvell.com>,
	Thomas Petazzoni <thomas.petazzoni@free-electrons.com>,
	Jason Cooper <jason@lakedaemon.net>, Andrew Lunn <andrew@lunn.ch>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	Marcin Wojtas <mw@semihalf.com>,
	"David S. Miller" <davem@davemloft.net>,
	linux-arm-kernel@lists.infradead.org,
	Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Subject: Re: [PATCH net-next 1/4] net: mvneta: Convert to be 64 bits compatible
Date: Thu, 24 Nov 2016 11:04:38 -0800
Message-ID: <e6160675-9d75-f617-b154-d23a3d725a07@gmail.com>
In-Reply-To: <8760ncly5s.fsf@free-electrons.com>

On 24/11/2016 07:01, Gregory CLEMENT wrote:
> Hi Arnd,
>  
>  On Thu, Nov 24 2016, Arnd Bergmann <arnd@arndb.de> wrote:
> 
>> On Thursday, November 24, 2016 4:37:36 PM CET Jisheng Zhang wrote:
>>> solB (a SW shadow cookie) perhaps gives better performance: in the hot
>>> path, such as mvneta_rx(), the driver accesses buf_cookie and
>>> buf_phys_addr of rx_desc, which is allocated by dma_alloc_coherent()
>>> and is non-cacheable if the device isn't cache-coherent. I didn't
>>> measure the performance difference, because we actually use solA
>>> internally as well. In your experience, is the performance gain worth
>>> the more complex code?
>>
>> Yes, a read from uncached memory is fairly slow, so if you have a chance
>> to avoid that it will probably help. When adding complexity to the code,
>> it probably makes sense to take a runtime profile anyway to quantify
>> how much it gains.
>>
>> On machines that have cache-coherent DMA, accessing the descriptor
>> should be fine, as you already have to load the entire cache line
>> to read the status field.
>>
>> Looking at this snippet:
>>
>>                 rx_status = rx_desc->status;
>>                 rx_bytes = rx_desc->data_size - (ETH_FCS_LEN + MVNETA_MH_SIZE);
>>                 data = (unsigned char *)rx_desc->buf_cookie;
>>                 phys_addr = rx_desc->buf_phys_addr;
>>                 pool_id = MVNETA_RX_GET_BM_POOL_ID(rx_desc);
>>                 bm_pool = &pp->bm_priv->bm_pools[pool_id];
>>
>>                 if (!mvneta_rxq_desc_is_first_last(rx_status) ||
>>                     (rx_status & MVNETA_RXD_ERR_SUMMARY)) {
>> err_drop_frame_ret_pool:
>>                         /* Return the buffer to the pool */
>>                         mvneta_bm_pool_put_bp(pp->bm_priv, bm_pool,
>>                                               rx_desc->buf_phys_addr);
>> err_drop_frame:
>>
>>
>> I think there is more room for optimizing here, for a start: you read
>> the status field twice (the second one in MVNETA_RX_GET_BM_POOL_ID)
>> and you can cache the buf_phys_addr along with the virtual address
>> once you add that.
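
A minimal sketch of that first point, assuming the MVNETA_RXD_BM_POOL_MASK
and MVNETA_RXD_BM_POOL_SHIFT macros that MVNETA_RX_GET_BM_POOL_ID() is
built from: keep the status word in a local variable and derive the pool
id from that cached copy, instead of dereferencing the (potentially
uncached) descriptor a second time:

                rx_status = rx_desc->status;
                rx_bytes = rx_desc->data_size - (ETH_FCS_LEN + MVNETA_MH_SIZE);
                /* pool id derived from the already-loaded status word */
                pool_id = (rx_status & MVNETA_RXD_BM_POOL_MASK) >>
                          MVNETA_RXD_BM_POOL_SHIFT;
                bm_pool = &pp->bm_priv->bm_pools[pool_id];
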
> 
> I agree we can optimize this code, but it is not related to the 64-bit
> conversion. Indeed, this part runs when we use the HW buffer
> management, and currently it is not ready at all for 64 bits. The
> virtual address is handled directly by the hardware, but the cookie
> only has 32 bits to store it. So if we want to use HWBM on 64 bits we
> need to redesign the code (maybe by storing the virtual address in an
> array and passing the index in the cookie).

Can't you make sure that skb->data is aligned to a value big enough that
you can still cover the <N> bit physical address space of the adapter
within a 32-bit quantity if you drop the low bits that would be all zeroes?

That way, even though you only have 32 bits of storage in the cookie,
they don't have to be the actual low 32 bits of your original address,
but could be addr >> 8, for instance.
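
For illustration, a minimal sketch of that encoding (helper names are
hypothetical; this assumes buffers are 256-byte aligned, so a physical
address of up to 40 bits still fits in the 32-bit cookie):

static inline u32 mvneta_addr_to_cookie(dma_addr_t addr)
{
        /* drop the 8 alignment zeroes; assumes addr fits in 40 bits */
        return lower_32_bits(addr >> 8);
}

static inline dma_addr_t mvneta_cookie_to_addr(u32 cookie)
{
        /* shift the zeroes back in to recover the physical address */
        return (dma_addr_t)cookie << 8;
}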

As you indicate, using an index stored in the cookie might be a better
scheme though, since you could attach a lot more metadata to an index
into a local array (which could be in cached memory) as opposed to just
an address.
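
A minimal sketch of that index scheme (all names hypothetical; the shadow
array lives in ordinary cacheable memory, so the hot path performs a
single uncached read, of buf_cookie itself):

struct mvneta_buf_shadow {
        void *virt;             /* buffer virtual address */
        dma_addr_t phys;        /* matching DMA address */
};

/* one entry per RX descriptor, kept in cacheable memory */
static struct mvneta_buf_shadow shadow[MVNETA_MAX_RXD];

static u32 mvneta_buf_to_cookie(u32 idx, void *virt, dma_addr_t phys)
{
        shadow[idx].virt = virt;
        shadow[idx].phys = phys;
        return idx;             /* an index trivially fits in 32 bits */
}

static void *mvneta_cookie_to_buf(u32 cookie, dma_addr_t *phys)
{
        *phys = shadow[cookie].phys;    /* cacheable access */
        return shadow[cookie].virt;     /* cacheable access */
}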
-- 
Florian

Thread overview: 41+ messages
2016-11-22 16:48 [PATCH net-next 0/4] Extend mvneta to support Armada 3700 (ARM 64) Gregory CLEMENT
2016-11-22 16:48 ` [PATCH net-next 1/4] net: mvneta: Convert to be 64 bits compatible Gregory CLEMENT
2016-11-22 21:04   ` Arnd Bergmann
2016-11-23  9:53     ` Jisheng Zhang
2016-11-23 10:15       ` Arnd Bergmann
2016-11-23 11:03         ` Jisheng Zhang
2016-11-23 13:07         ` Gregory CLEMENT
2016-11-23 16:02           ` Marcin Wojtas
2016-11-24  8:37             ` Jisheng Zhang
2016-11-24  9:00               ` Arnd Bergmann
2016-11-24  9:11                 ` Jisheng Zhang
2016-11-24 15:01                 ` Gregory CLEMENT
2016-11-24 15:09                   ` Marcin Wojtas
2016-11-24 19:04                   ` Florian Fainelli [this message]
2016-11-22 16:48 ` [PATCH net-next 2/4] net: mvneta: Only disable mvneta_bm for 64-bits Gregory CLEMENT
2016-11-22 16:48 ` [PATCH net-next 3/4] net: mvneta: Add network support for Armada 3700 SoC Gregory CLEMENT
2016-11-22 16:48 ` [PATCH net-next 4/4] ARM64: dts: marvell: Add network support for Armada 3700 Gregory CLEMENT