From: Arnd Bergmann <arnd@arndb.de>
To: linux-arm-kernel@lists.infradead.org
Cc: zhangfei <zhangfei.gao@linaro.org>,
	mark.rutland@arm.com, devicetree@vger.kernel.org,
	f.fainelli@gmail.com, linux@arm.linux.org.uk,
	eric.dumazet@gmail.com, sergei.shtylyov@cogentembedded.com,
	netdev@vger.kernel.org, David.Laight@aculab.com,
	davem@davemloft.net
Subject: Re: [PATCH 3/3] net: hisilicon: new hip04 ethernet driver
Date: Wed, 02 Apr 2014 17:24:24 +0200
Message-ID: <4242558.6NaQec4f7j@wuerfel>
In-Reply-To: <533BDDBA.1060800@linaro.org>

On Wednesday 02 April 2014 17:51:54 zhangfei wrote:
> Dear Arnd
> 
> On 04/02/2014 05:21 PM, Arnd Bergmann wrote:
> > On Tuesday 01 April 2014 21:27:12 Zhangfei Gao wrote:
> >> +static int hip04_mac_start_xmit(struct sk_buff *skb, struct net_device *ndev)
> >
> > While it looks like there are no serious functionality bugs left, this
> > function is rather inefficient, as has been pointed out before:
> 
> Yes, it still needs more performance tuning in the next step.
> We need to enable the hardware cache flush feature, with the help of
> arm-smmu; as a result, dma_map_single etc. can be removed.

You cannot remove the dma_map_single call here, but the implementation
of that function will be different when you use the iommu_coherent_ops:
Instead of flushing the caches, it will create or remove an iommu entry
and return the bus address.

I remember you mentioned before that using the iommu on this particular
SoC actually gives you cache-coherent DMA, so you may also be able
to use arm_coherent_dma_ops if you can set up a static 1:1 mapping 
between bus and phys addresses.

> >> +{
> >> +       struct hip04_priv *priv = netdev_priv(ndev);
> >> +       struct net_device_stats *stats = &ndev->stats;
> >> +       unsigned int tx_head = priv->tx_head;
> >> +       struct tx_desc *desc = &priv->tx_desc[tx_head];
> >> +       dma_addr_t phys;
> >> +
> >> +       hip04_tx_reclaim(ndev, false);
> >> +       mod_timer(&priv->txtimer, jiffies + RECLAIM_PERIOD);
> >> +
> >> +       if (priv->tx_count >= TX_DESC_NUM) {
> >> +               netif_stop_queue(ndev);
> >> +               return NETDEV_TX_BUSY;
> >> +       }
> >
> > This is where you have two problems:
> >
> > - if the descriptor ring is full, you wait for RECLAIM_PERIOD,
> >    which is far too long at 500ms, because during that time you
> >    are not able to add further data to the stopped queue.
> 
> Understood.
> The idea here is to use the timer as little as possible.
> As experiments show, the best throughput is achieved when only xmit
> reclaims buffers.

I'm only talking about the case where that doesn't work: once you stop
the queue, the xmit function won't get called again until the timer
causes the reclaim to be done and restarts the queue.
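
To illustrate the stall, here is a minimal userspace sketch of the ring
accounting, not the driver's code. The names struct tx_ring, ring_xmit()
and ring_reclaim() are hypothetical; the "stopped" flag merely stands in
for netif_stop_queue()/netif_wake_queue().

```c
#include <assert.h>
#include <stdbool.h>

#define TX_DESC_NUM 64

/* Hypothetical model of the tx ring state, for illustration only. */
struct tx_ring {
	unsigned int count;	/* descriptors currently owned by hardware */
	bool stopped;		/* models the stopped/woken state of the queue */
};

/* xmit path: stops the queue and returns false when the ring is full */
bool ring_xmit(struct tx_ring *r)
{
	assert(r->count <= TX_DESC_NUM);
	if (r->count >= TX_DESC_NUM) {
		r->stopped = true;
		return false;
	}
	r->count++;
	return true;
}

/* reclaim path: frees up to 'done' descriptors and wakes a stopped queue */
void ring_reclaim(struct tx_ring *r, unsigned int done)
{
	if (done > r->count)
		done = r->count;
	r->count -= done;
	if (r->stopped && r->count < TX_DESC_NUM)
		r->stopped = false;
}
```

Once ring_xmit() has set "stopped", nothing calls it again, so the only
path that can clear the flag is ring_reclaim(). If that runs only from
the timer, the stall lasts up to a full RECLAIM_PERIOD.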

> > - As David Laight pointed out earlier, you must also ensure that
> >    you don't have too much /data/ pending in the descriptor ring
> >    when you stop the queue. For a 10mbit connection, you have already
> >    tested (as we discussed on IRC) that 64 descriptors with 1500 byte
> >    frames gives you a 68ms round-trip ping time, which is too much.
> 
> That is with iperf and ping running together; with ping alone, it is 0.7 ms.
> 
> >    Conversely, on 1gbit, having only 64 descriptors actually seems
> >    a little low, and you may be able to get better throughput if
> >    you extend the ring to e.g. 512 descriptors.
> 
> OK, will check the throughput with more xmit descriptors.
> But isn't it advised not to use too many descriptors for xmit, since
> there is no xmit interrupt?

The important part is to limit the time that data spends in the queue,
which is a function of the interface tx speed and the number of bytes
in the queue.
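
As a rough back-of-the-envelope sketch of that relationship (the helper
name tx_drain_time_us() is made up for illustration, and the estimate
ignores preamble, inter-frame gap and any other framing overhead):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Estimated time, in microseconds, for the link to drain queued_bytes.
 * 1 Mbit/s is one bit per microsecond, so bits / Mbit/s gives microseconds.
 */
uint64_t tx_drain_time_us(uint64_t queued_bytes, uint32_t link_mbps)
{
	assert(link_mbps > 0);
	return queued_bytes * 8 / link_mbps;
}
```

With 64 descriptors of 1500 bytes each, this gives 76800 us (about 77 ms)
at 10 Mbit/s, the same order as the ping time quoted above, but only
768 us at 1 Gbit/s. A 512-entry ring at 1 Gbit/s comes to 6144 us.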

> >> +       phys = dma_map_single(&ndev->dev, skb->data, skb->len, DMA_TO_DEVICE);
> >> +       if (dma_mapping_error(&ndev->dev, phys)) {
> >> +               dev_kfree_skb(skb);
> >> +               return NETDEV_TX_OK;
> >> +       }
> >> +
> >> +       priv->tx_skb[tx_head] = skb;
> >> +       priv->tx_phys[tx_head] = phys;
> >> +       desc->send_addr = cpu_to_be32(phys);
> >> +       desc->send_size = cpu_to_be16(skb->len);
> >> +       desc->cfg = cpu_to_be32(DESC_DEF_CFG);
> >> +       phys = priv->tx_desc_dma + tx_head * sizeof(struct tx_desc);
> >> +       desc->wb_addr = cpu_to_be32(phys);
> >
> > One detail: since you don't have cache-coherent DMA, "desc" will
> > reside in uncached memory, so you should try to minimize the number of
> > accesses. It's probably faster if you build the descriptor on the stack
> > and then atomically copy it over, rather than assigning one member at
> > a time.
> 
> Sorry, I don't quite understand; could you clarify?
> The phys and size etc. of skb->data change for each packet, so they
> need to be assigned.
> If the member contents stayed constant, they could be set at initialization.

I meant you should use 64-bit accesses here instead of multiple 32 and
16 bit accesses, but as David noted, it's actually not as big a deal
for the writes as it is for the reads from uncached memory.
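
Something along these lines, as a userspace sketch only. The 16-byte
layout below is an assumption (the real struct tx_desc is not shown in
this mail), union tx_desc_slot and tx_desc_publish() are hypothetical
names, and htonl()/htons() stand in for cpu_to_be32()/cpu_to_be16().
A real driver would also need the appropriate barriers and would have to
respect whatever store order the hardware expects.

```c
#include <arpa/inet.h>	/* htonl/htons as stand-ins for cpu_to_be32/16 */
#include <assert.h>	/* static_assert */
#include <stdint.h>

/* Assumed layout, for illustration only. */
struct tx_desc {
	uint32_t send_addr;
	uint16_t send_size;
	uint16_t reserved;
	uint32_t cfg;
	uint32_t wb_addr;
};

/* View the same descriptor either by field or as two 64-bit words. */
union tx_desc_slot {
	struct tx_desc d;
	uint64_t w[2];
};

static_assert(sizeof(struct tx_desc) == 16, "descriptor must be two 64-bit words");

/* Build the descriptor in cached stack memory, then copy it with two stores. */
void tx_desc_publish(volatile union tx_desc_slot *slot, uint32_t buf_addr,
		     uint16_t len, uint32_t cfg, uint32_t wb_addr)
{
	union tx_desc_slot tmp;

	tmp.d.send_addr = htonl(buf_addr);
	tmp.d.send_size = htons(len);
	tmp.d.reserved = 0;
	tmp.d.cfg = htonl(cfg);
	tmp.d.wb_addr = htonl(wb_addr);

	/* Two 64-bit writes instead of four narrower ones. */
	slot->w[0] = tmp.w[0];
	slot->w[1] = tmp.w[1];
}
```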

The important part is to avoid the line where you do 'if (desc->send_addr
!= 0)' as much as possible.

	Arnd
