From: Jisheng Zhang <jszhang@marvell.com> To: Arnd Bergmann <arnd@arndb.de> Cc: <linux-arm-kernel@lists.infradead.org>, Marcin Wojtas <mw@semihalf.com>, Gregory CLEMENT <gregory.clement@free-electrons.com>, Thomas Petazzoni <thomas.petazzoni@free-electrons.com>, Andrew Lunn <andrew@lunn.ch>, "Jason Cooper" <jason@lakedaemon.net>, <netdev@vger.kernel.org>, <linux-kernel@vger.kernel.org>, "David S. Miller" <davem@davemloft.net>, Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> Subject: Re: [PATCH net-next 1/4] net: mvneta: Convert to be 64 bits compatible Date: Thu, 24 Nov 2016 17:11:59 +0800 [thread overview] Message-ID: <20161124171159.2a82da4f@xhacker> (raw) In-Reply-To: <21520380.oWTKcrq8DS@wuerfel> On Thu, 24 Nov 2016 10:00:36 +0100 Arnd Bergmann <arnd@arndb.de> wrote: > On Thursday, November 24, 2016 4:37:36 PM CET Jisheng Zhang wrote: > > solB (a SW shadow cookie) perhaps gives a better performance: in hot path, > > such as mvneta_rx(), the driver accesses buf_cookie and buf_phys_addr of > > rx_desc which is allocated by dma_alloc_coherent, it's noncacheable if the > > device isn't cache-coherent. I didn't measure the performance difference, > > because in fact we take solA as well internally. From your experience, > > can the performance gain deserve the complex code? > > Yes, a read from uncached memory is fairly slow, so if you have a chance > to avoid that it will probably help. When adding complexity to the code, > it probably makes sense to take a runtime profile anyway quantify how > much it gains. > > On machines that have cache-coherent DMA, accessing the descriptor > should be fine, as you already have to load the entire cache line > to read the status field. > > Looking at this snippet: > > rx_status = rx_desc->status; > rx_bytes = rx_desc->data_size - (ETH_FCS_LEN + MVNETA_MH_SIZE); > data = (unsigned char *)rx_desc->buf_cookie; > phys_addr = rx_desc->buf_phys_addr; > pool_id = MVNETA_RX_GET_BM_POOL_ID(rx_desc); > bm_pool = &pp->bm_priv->bm_pools[pool_id]; > > if (!mvneta_rxq_desc_is_first_last(rx_status) || > (rx_status & MVNETA_RXD_ERR_SUMMARY)) { > err_drop_frame_ret_pool: > /* Return the buffer to the pool */ > mvneta_bm_pool_put_bp(pp->bm_priv, bm_pool, > rx_desc->buf_phys_addr); > err_drop_frame: > > > I think there is more room for optimizing if you start: you read > the status field twice (the second one in MVNETA_RX_GET_BM_POOL_ID) > and you can cache the buf_phys_addr along with the virtual address > once you add that. oh, yeah! buf_phy_addr could be included too. > > Generally speaking, I'd recommend using READ_ONCE()/WRITE_ONCE() > to access the descriptor fields, to ensure the compiler doesn't > add extra references as well as to annotate the expensive > operations. > > Arnd Got it. Thanks so much for the detailed guide.
WARNING: multiple messages have this Message-ID (diff)
From: jszhang@marvell.com (Jisheng Zhang) To: linux-arm-kernel@lists.infradead.org Subject: [PATCH net-next 1/4] net: mvneta: Convert to be 64 bits compatible Date: Thu, 24 Nov 2016 17:11:59 +0800 [thread overview] Message-ID: <20161124171159.2a82da4f@xhacker> (raw) In-Reply-To: <21520380.oWTKcrq8DS@wuerfel> On Thu, 24 Nov 2016 10:00:36 +0100 Arnd Bergmann <arnd@arndb.de> wrote: > On Thursday, November 24, 2016 4:37:36 PM CET Jisheng Zhang wrote: > > solB (a SW shadow cookie) perhaps gives a better performance: in hot path, > > such as mvneta_rx(), the driver accesses buf_cookie and buf_phys_addr of > > rx_desc which is allocated by dma_alloc_coherent, it's noncacheable if the > > device isn't cache-coherent. I didn't measure the performance difference, > > because in fact we take solA as well internally. From your experience, > > can the performance gain deserve the complex code? > > Yes, a read from uncached memory is fairly slow, so if you have a chance > to avoid that it will probably help. When adding complexity to the code, > it probably makes sense to take a runtime profile anyway quantify how > much it gains. > > On machines that have cache-coherent DMA, accessing the descriptor > should be fine, as you already have to load the entire cache line > to read the status field. > > Looking at this snippet: > > rx_status = rx_desc->status; > rx_bytes = rx_desc->data_size - (ETH_FCS_LEN + MVNETA_MH_SIZE); > data = (unsigned char *)rx_desc->buf_cookie; > phys_addr = rx_desc->buf_phys_addr; > pool_id = MVNETA_RX_GET_BM_POOL_ID(rx_desc); > bm_pool = &pp->bm_priv->bm_pools[pool_id]; > > if (!mvneta_rxq_desc_is_first_last(rx_status) || > (rx_status & MVNETA_RXD_ERR_SUMMARY)) { > err_drop_frame_ret_pool: > /* Return the buffer to the pool */ > mvneta_bm_pool_put_bp(pp->bm_priv, bm_pool, > rx_desc->buf_phys_addr); > err_drop_frame: > > > I think there is more room for optimizing if you start: you read > the status field twice (the second one in MVNETA_RX_GET_BM_POOL_ID) > and you can cache the buf_phys_addr along with the virtual address > once you add that. oh, yeah! buf_phy_addr could be included too. > > Generally speaking, I'd recommend using READ_ONCE()/WRITE_ONCE() > to access the descriptor fields, to ensure the compiler doesn't > add extra references as well as to annotate the expensive > operations. > > Arnd Got it. Thanks so much for the detailed guide.
next prev parent reply other threads:[~2016-11-24 9:17 UTC|newest] Thread overview: 41+ messages / expand[flat|nested] mbox.gz Atom feed top 2016-11-22 16:48 [PATCH net-next 0/4] Extend mvneta to support Armada 3700 (ARM 64) Gregory CLEMENT 2016-11-22 16:48 ` Gregory CLEMENT 2016-11-22 16:48 ` Gregory CLEMENT 2016-11-22 16:48 ` [PATCH net-next 1/4] net: mvneta: Convert to be 64 bits compatible Gregory CLEMENT 2016-11-22 16:48 ` Gregory CLEMENT 2016-11-22 21:04 ` Arnd Bergmann 2016-11-22 21:04 ` Arnd Bergmann 2016-11-23 9:53 ` Jisheng Zhang 2016-11-23 9:53 ` Jisheng Zhang 2016-11-23 9:53 ` Jisheng Zhang 2016-11-23 10:15 ` Arnd Bergmann 2016-11-23 10:15 ` Arnd Bergmann 2016-11-23 11:03 ` Jisheng Zhang 2016-11-23 11:03 ` Jisheng Zhang 2016-11-23 13:07 ` Gregory CLEMENT 2016-11-23 13:07 ` Gregory CLEMENT 2016-11-23 16:02 ` Marcin Wojtas 2016-11-23 16:02 ` Marcin Wojtas 2016-11-23 16:02 ` Marcin Wojtas 2016-11-24 8:37 ` Jisheng Zhang 2016-11-24 8:37 ` Jisheng Zhang 2016-11-24 8:37 ` Jisheng Zhang 2016-11-24 9:00 ` Arnd Bergmann 2016-11-24 9:00 ` Arnd Bergmann 2016-11-24 9:11 ` Jisheng Zhang [this message] 2016-11-24 9:11 ` Jisheng Zhang 2016-11-24 15:01 ` Gregory CLEMENT 2016-11-24 15:01 ` Gregory CLEMENT 2016-11-24 15:09 ` Marcin Wojtas 2016-11-24 15:09 ` Marcin Wojtas 2016-11-24 15:09 ` Marcin Wojtas 2016-11-24 19:04 ` Florian Fainelli 2016-11-24 19:04 ` Florian Fainelli 2016-11-22 16:48 ` [PATCH net-next 2/4] net: mvneta: Only disable mvneta_bm for 64-bits Gregory CLEMENT 2016-11-22 16:48 ` Gregory CLEMENT 2016-11-22 16:48 ` Gregory CLEMENT 2016-11-22 16:48 ` [PATCH net-next 3/4] net: mvneta: Add network support for Armada 3700 SoC Gregory CLEMENT 2016-11-22 16:48 ` Gregory CLEMENT 2016-11-22 16:48 ` [PATCH net-next 4/4] ARM64: dts: marvell: Add network support for Armada 3700 Gregory CLEMENT 2016-11-22 16:48 ` Gregory CLEMENT 2016-11-22 16:48 ` Gregory CLEMENT
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20161124171159.2a82da4f@xhacker \ --to=jszhang@marvell.com \ --cc=andrew@lunn.ch \ --cc=arnd@arndb.de \ --cc=davem@davemloft.net \ --cc=gregory.clement@free-electrons.com \ --cc=jason@lakedaemon.net \ --cc=linux-arm-kernel@lists.infradead.org \ --cc=linux-kernel@vger.kernel.org \ --cc=mw@semihalf.com \ --cc=netdev@vger.kernel.org \ --cc=sebastian.hesselbarth@gmail.com \ --cc=thomas.petazzoni@free-electrons.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.