All of lore.kernel.org
 help / color / mirror / Atom feed
From: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
To: Russell King - ARM Linux <linux@armlinux.org.uk>
Cc: Robin Murphy <robin.murphy@arm.com>,
	Mark Rutland <mark.rutland@arm.com>,
	devicetree@vger.kernel.org, Yehuda Yitschak <yehuday@marvell.com>,
	Jason Cooper <jason@lakedaemon.net>,
	Pawel Moll <pawel.moll@arm.com>,
	Ian Campbell <ijc+devicetree@hellion.org.uk>,
	netdev@vger.kernel.org, Hanna Hawa <hannah@marvell.com>,
	Nadav Haklai <nadavh@marvell.com>,
	Rob Herring <robh+dt@kernel.org>, Andrew Lunn <andrew@lunn.ch>,
	Kumar Gala <galak@codeaurora.org>,
	Gregory Clement <gregory.clement@free-electrons.com>,
	Stefan Chulski <stefanc@marvell.com>,
	Marcin Wojtas <mw@semihalf.com>,
	"David S. Miller" <davem@davemloft.net>,
	linux-arm-kernel@lists.infradead.org,
	Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Subject: Re: [PATCHv2 net-next 05/16] net: mvpp2: introduce PPv2.2 HW descriptors and adapt accessors
Date: Sat, 4 Feb 2017 14:59:38 +0100	[thread overview]
Message-ID: <20170204145938.6e6380d9@free-electrons.com> (raw)
In-Reply-To: <20170203155449.GC27312@n2100.armlinux.org.uk>

Hello,

On Fri, 3 Feb 2017 15:54:49 +0000, Russell King - ARM Linux wrote:

> Now for a bit of a whinge.  Reading through this code is rather annoying
> because of what's called a "physical" address which is actually a DMA
> address as far as the kernel is concerned - this makes it much harder
> when thinking about this issue because it causes all sorts of confusion.
> Please can the next revision of the patches start calling things by their
> right name - a dma_addr_t is a DMA address, not a physical address, even
> though _numerically_ it may be the same thing.  From the point of view
> of the kernel, you must not do phys_to_virt() on a dma_addr_t address.
> Thanks.

I do know that phys_addr_t is different from dma_addr_t. The case where
you have an IOMMU makes this very obvious. The fact that the driver
isn't completely correct in that respect indeed needs to be fixed.

> Taking a step backwards...
> 
> How do DMA addresses and this cookie get into the receive ring - from what
> I can see, the driver doesn't write these into the receive ring, it's the
> hardware that writes it, and the only route I can see that they get there
> is via writes performed in mvpp2_bm_pool_put().

Correct. Here is a quick summary of my understand of the HW operation.
The HW has a "buffer manager", i.e it can internally manage a list of
buffers. At initialization time, we provide to the HW a list of buffers
by pushing the DMA address and virtual address of each buffer into a
register. Thanks to that the HW then knows about all the buffers we
have given. Then, upon RX of a packet, the HW picks one of those
buffers, and then fills a RX descriptor with the DMA address of the
packet, and the VA in the "cookie" field. The "cookie" field can I
believe be used for whatever we want, it's just "free" space in the
descriptor.

> Now, from what I can see, the "buf_virt_addr" comes from:
> 
> +static void *mvpp2_frag_alloc(const struct mvpp2_bm_pool *pool)
> +{
> +       if (likely(pool->frag_size <= PAGE_SIZE))
> +               return netdev_alloc_frag(pool->frag_size);
> +       else
> +               return kmalloc(pool->frag_size, GFP_ATOMIC);
> +}
> 
> via mvpp2_buf_alloc().
> 
> Both kmalloc() and netdev_alloc_frag() guarantee that the virtual
> address will be in lowmem.

*That* is the thing I didn't realize until now. If the buffers are
always in lowmem, then it's much easier because we can indeed use
virt_to_phys()/phys_to_virt() on them. But it's indeed obvious they are
in lowmem, since we get a void* pointer for them.

> Given that, I would suggest changing mvpp2_bm_pool_put() as follows -
> and this is where my point above about separating the notion of "dma
> address" and "physical address" becomes very important:
> 
>  static inline void mvpp2_bm_pool_put(struct mvpp2_port *port, int pool,
> -                                    dma_addr_t buf_phys_addr,
> -                                    unsigned long buf_virt_addr)
> +                                    dma_addr_t dma, phys_addr_t phys)
>  {
> 
> and updating it to write "phys" as the existing buf_virt_addr.
> 
> In mvpp2_bm_bufs_add():
> 
>                 buf = mvpp2_buf_alloc(port, bm_pool, &phys_addr, GFP_KERNEL);
>                 if (!buf)
>                         break;
>  
>                 mvpp2_bm_pool_put(port, bm_pool->id, phys_addr,
> -                                 (unsigned long)buf);
> +                                 virt_to_phys(buf));
> 
> which I think means that mvpp2_rxdesc_virt_addr_get() can just become:
> 
>  	phys_addr_t cookie;
>  
>         /* PPv2.1 can only be used on 32 bits architectures, and there
>          * are 32 bits in buf_cookie which are enough to store the
>          * full virtual address, so things are easy.
>          */
>         if (port->priv->hw_version == MVPP21)
>                 cookie = rx_desc->pp21.buf_cookie;
>         else
>  		cookie = rx_desc->pp22.buf_cookie_misc & FORTY_BIT_MASK;
> 
>  	return phys_to_virt(cookie);

This makes complete sense. We use the cookie to store the phys_addr_t
rather than the virtual address. I might be missing something, but it
seems like a very good solution. Thanks for the suggestion, I'll try
this!

> I'd suggest against using DMA_BIT_MASK(40) there - because it's not a
> DMA address, even though it happens to resolve to the same number.

Sure, will update this as well.

> Again, I may have missed how the addresses end up getting into
> buf_cookie_misc, so what I suggest above may not be possible.

No, I think you got it right. At least it matches my own understanding
of the HW.

> I'd also suggest that there is some test in mvpp2_bm_bufs_add() to
> verify that the physical addresses and DMA addresses do fit within
> the available number of bits - if they don't we could end up scribbling
> over memory that we shouldn't be, and it looks like we have a failure
> path there to gracefully handle that situation - gracefully compared
> to a nasty BUG_ON().

For the DMA addresses, I have some additional changes on top of my v2
to use dma_set_mask() to ensure that DMA addresses fit in 40 bits. But
that's only for DMA addresses indeed.

Thanks again for your suggestion!

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

WARNING: multiple messages have this Message-ID (diff)
From: thomas.petazzoni@free-electrons.com (Thomas Petazzoni)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCHv2 net-next 05/16] net: mvpp2: introduce PPv2.2 HW descriptors and adapt accessors
Date: Sat, 4 Feb 2017 14:59:38 +0100	[thread overview]
Message-ID: <20170204145938.6e6380d9@free-electrons.com> (raw)
In-Reply-To: <20170203155449.GC27312@n2100.armlinux.org.uk>

Hello,

On Fri, 3 Feb 2017 15:54:49 +0000, Russell King - ARM Linux wrote:

> Now for a bit of a whinge.  Reading through this code is rather annoying
> because of what's called a "physical" address which is actually a DMA
> address as far as the kernel is concerned - this makes it much harder
> when thinking about this issue because it causes all sorts of confusion.
> Please can the next revision of the patches start calling things by their
> right name - a dma_addr_t is a DMA address, not a physical address, even
> though _numerically_ it may be the same thing.  From the point of view
> of the kernel, you must not do phys_to_virt() on a dma_addr_t address.
> Thanks.

I do know that phys_addr_t is different from dma_addr_t. The case where
you have an IOMMU makes this very obvious. The fact that the driver
isn't completely correct in that respect indeed needs to be fixed.

> Taking a step backwards...
> 
> How do DMA addresses and this cookie get into the receive ring - from what
> I can see, the driver doesn't write these into the receive ring, it's the
> hardware that writes it, and the only route I can see that they get there
> is via writes performed in mvpp2_bm_pool_put().

Correct. Here is a quick summary of my understand of the HW operation.
The HW has a "buffer manager", i.e it can internally manage a list of
buffers. At initialization time, we provide to the HW a list of buffers
by pushing the DMA address and virtual address of each buffer into a
register. Thanks to that the HW then knows about all the buffers we
have given. Then, upon RX of a packet, the HW picks one of those
buffers, and then fills a RX descriptor with the DMA address of the
packet, and the VA in the "cookie" field. The "cookie" field can I
believe be used for whatever we want, it's just "free" space in the
descriptor.

> Now, from what I can see, the "buf_virt_addr" comes from:
> 
> +static void *mvpp2_frag_alloc(const struct mvpp2_bm_pool *pool)
> +{
> +       if (likely(pool->frag_size <= PAGE_SIZE))
> +               return netdev_alloc_frag(pool->frag_size);
> +       else
> +               return kmalloc(pool->frag_size, GFP_ATOMIC);
> +}
> 
> via mvpp2_buf_alloc().
> 
> Both kmalloc() and netdev_alloc_frag() guarantee that the virtual
> address will be in lowmem.

*That* is the thing I didn't realize until now. If the buffers are
always in lowmem, then it's much easier because we can indeed use
virt_to_phys()/phys_to_virt() on them. But it's indeed obvious they are
in lowmem, since we get a void* pointer for them.

> Given that, I would suggest changing mvpp2_bm_pool_put() as follows -
> and this is where my point above about separating the notion of "dma
> address" and "physical address" becomes very important:
> 
>  static inline void mvpp2_bm_pool_put(struct mvpp2_port *port, int pool,
> -                                    dma_addr_t buf_phys_addr,
> -                                    unsigned long buf_virt_addr)
> +                                    dma_addr_t dma, phys_addr_t phys)
>  {
> 
> and updating it to write "phys" as the existing buf_virt_addr.
> 
> In mvpp2_bm_bufs_add():
> 
>                 buf = mvpp2_buf_alloc(port, bm_pool, &phys_addr, GFP_KERNEL);
>                 if (!buf)
>                         break;
>  
>                 mvpp2_bm_pool_put(port, bm_pool->id, phys_addr,
> -                                 (unsigned long)buf);
> +                                 virt_to_phys(buf));
> 
> which I think means that mvpp2_rxdesc_virt_addr_get() can just become:
> 
>  	phys_addr_t cookie;
>  
>         /* PPv2.1 can only be used on 32 bits architectures, and there
>          * are 32 bits in buf_cookie which are enough to store the
>          * full virtual address, so things are easy.
>          */
>         if (port->priv->hw_version == MVPP21)
>                 cookie = rx_desc->pp21.buf_cookie;
>         else
>  		cookie = rx_desc->pp22.buf_cookie_misc & FORTY_BIT_MASK;
> 
>  	return phys_to_virt(cookie);

This makes complete sense. We use the cookie to store the phys_addr_t
rather than the virtual address. I might be missing something, but it
seems like a very good solution. Thanks for the suggestion, I'll try
this!

> I'd suggest against using DMA_BIT_MASK(40) there - because it's not a
> DMA address, even though it happens to resolve to the same number.

Sure, will update this as well.

> Again, I may have missed how the addresses end up getting into
> buf_cookie_misc, so what I suggest above may not be possible.

No, I think you got it right. At least it matches my own understanding
of the HW.

> I'd also suggest that there is some test in mvpp2_bm_bufs_add() to
> verify that the physical addresses and DMA addresses do fit within
> the available number of bits - if they don't we could end up scribbling
> over memory that we shouldn't be, and it looks like we have a failure
> path there to gracefully handle that situation - gracefully compared
> to a nasty BUG_ON().

For the DMA addresses, I have some additional changes on top of my v2
to use dma_set_mask() to ensure that DMA addresses fit in 40 bits. But
that's only for DMA addresses indeed.

Thanks again for your suggestion!

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

  reply	other threads:[~2017-02-04 13:59 UTC|newest]

Thread overview: 92+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-12-28 16:46 [PATCHv2 net-next 00/16] net: mvpp2: add basic support for PPv2.2 Thomas Petazzoni
2016-12-28 16:46 ` Thomas Petazzoni
     [not found] ` <1482943592-12556-1-git-send-email-thomas.petazzoni-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>
2016-12-28 16:46   ` [PATCHv2 net-next 01/16] dt-bindings: net: update Marvell PPv2 binding for PPv2.2 support Thomas Petazzoni
2016-12-28 16:46     ` Thomas Petazzoni
     [not found]     ` <1482943592-12556-2-git-send-email-thomas.petazzoni-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>
2017-01-03 20:18       ` Rob Herring
2017-01-03 20:18         ` Rob Herring
2017-02-02 16:56         ` Thomas Petazzoni
2017-02-02 16:56           ` Thomas Petazzoni
2017-02-03 16:48           ` Russell King - ARM Linux
2017-02-03 16:48             ` Russell King - ARM Linux
2017-02-14 14:25             ` Thomas Petazzoni
2017-02-14 14:25               ` Thomas Petazzoni
     [not found]               ` <20170214152503.602878cb-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>
2017-02-21 10:12                 ` Thomas Petazzoni
2017-02-21 10:12                   ` Thomas Petazzoni
2017-01-07  9:32       ` Russell King - ARM Linux
2017-01-07  9:32         ` Russell King - ARM Linux
2017-02-02 16:44         ` Thomas Petazzoni
2017-02-02 16:44           ` Thomas Petazzoni
2016-12-28 16:46   ` [PATCHv2 net-next 02/16] net: mvpp2: add and use accessors for TX/RX descriptors Thomas Petazzoni
2016-12-28 16:46     ` Thomas Petazzoni
2016-12-28 16:46   ` [PATCHv2 net-next 03/16] net: mvpp2: add hw_version field in "struct mvpp2" Thomas Petazzoni
2016-12-28 16:46     ` Thomas Petazzoni
2016-12-28 16:46   ` [PATCHv2 net-next 07/16] net: mvpp2: adapt the mvpp2_rxq_*_pool_set functions to PPv2.2 Thomas Petazzoni
2016-12-28 16:46     ` Thomas Petazzoni
2016-12-28 16:46   ` [PATCHv2 net-next 14/16] net: mvpp2: adapt rxq distribution " Thomas Petazzoni
2016-12-28 16:46     ` Thomas Petazzoni
2016-12-28 16:46   ` [PATCHv2 net-next 16/16] net: mvpp2: finally add the PPv2.2 compatible string Thomas Petazzoni
2016-12-28 16:46     ` Thomas Petazzoni
2016-12-28 17:06   ` [PATCHv2 net-next 00/16] net: mvpp2: add basic support for PPv2.2 David Miller
2016-12-28 17:06     ` David Miller
     [not found]     ` <20161228.120644.1166014191192724301.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2016-12-28 21:08       ` Thomas Petazzoni
2016-12-28 21:08         ` Thomas Petazzoni
2016-12-28 16:46 ` [PATCHv2 net-next 04/16] net: mvpp2: introduce an intermediate union for the TX/RX descriptors Thomas Petazzoni
2016-12-28 16:46   ` Thomas Petazzoni
2016-12-28 16:46 ` [PATCHv2 net-next 05/16] net: mvpp2: introduce PPv2.2 HW descriptors and adapt accessors Thomas Petazzoni
2016-12-28 16:46   ` Thomas Petazzoni
2017-01-06 14:29   ` Russell King - ARM Linux
2017-01-06 14:29     ` Russell King - ARM Linux
2017-01-06 14:44     ` Robin Murphy
2017-01-06 14:44       ` Robin Murphy
     [not found]       ` <113811b6-79a4-9c66-d302-add9fb0c5b1a-5wv7dgnIgG8@public.gmane.org>
2017-02-03 13:24         ` Thomas Petazzoni
2017-02-03 13:24           ` Thomas Petazzoni
2017-02-03 14:05           ` Robin Murphy
2017-02-03 14:05             ` Robin Murphy
     [not found]             ` <231514f4-2e35-8bde-4469-aada833635aa-5wv7dgnIgG8@public.gmane.org>
2017-02-03 15:02               ` Thomas Petazzoni
2017-02-03 15:02                 ` Thomas Petazzoni
     [not found]                 ` <20170203160227.08b40c58-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>
2017-02-03 16:31                   ` Robin Murphy
2017-02-03 16:31                     ` Robin Murphy
2017-02-03 15:54             ` Russell King - ARM Linux
2017-02-03 15:54               ` Russell King - ARM Linux
2017-02-04 13:59               ` Thomas Petazzoni [this message]
2017-02-04 13:59                 ` Thomas Petazzoni
2017-02-06 12:43                 ` David Laight
2017-02-06 12:43                   ` David Laight
2017-02-06 12:43                   ` David Laight
2016-12-28 16:46 ` [PATCHv2 net-next 06/16] net: mvpp2: adjust the allocation/free of BM pools for PPv2.2 Thomas Petazzoni
2016-12-28 16:46   ` Thomas Petazzoni
2017-01-06 14:32   ` Russell King - ARM Linux
2017-01-06 14:32     ` Russell King - ARM Linux
2016-12-28 16:46 ` [PATCHv2 net-next 08/16] net: mvpp2: adapt mvpp2_defaults_set() to PPv2.2 Thomas Petazzoni
2016-12-28 16:46   ` Thomas Petazzoni
2016-12-28 16:46 ` [PATCHv2 net-next 09/16] net: mvpp2: adjust mvpp2_{rxq,txq}_init for PPv2.2 Thomas Petazzoni
2016-12-28 16:46   ` [PATCHv2 net-next 09/16] net: mvpp2: adjust mvpp2_{rxq, txq}_init " Thomas Petazzoni
2016-12-28 16:46 ` [PATCHv2 net-next 10/16] net: mvpp2: handle register mapping and access " Thomas Petazzoni
2016-12-28 16:46   ` Thomas Petazzoni
2017-01-06 14:46   ` Russell King - ARM Linux
2017-01-06 14:46     ` Russell King - ARM Linux
     [not found]     ` <20170106144648.GE14217-l+eeeJia6m9URfEZ8mYm6t73F7V6hmMc@public.gmane.org>
2017-03-02  8:45       ` Thomas Petazzoni
2017-03-02  8:45         ` Thomas Petazzoni
2016-12-28 16:46 ` [PATCHv2 net-next 11/16] net: mvpp2: handle misc PPv2.1/PPv2.2 differences Thomas Petazzoni
2016-12-28 16:46   ` Thomas Petazzoni
2017-01-07  9:38   ` Russell King - ARM Linux
2017-01-07  9:38     ` Russell King - ARM Linux
     [not found]     ` <20170107093834.GJ14217-l+eeeJia6m9URfEZ8mYm6t73F7V6hmMc@public.gmane.org>
2017-01-07 20:10       ` Russell King - ARM Linux
2017-01-07 20:10         ` Russell King - ARM Linux
2017-02-14 14:53     ` Thomas Petazzoni
2017-02-14 14:53       ` Thomas Petazzoni
2017-01-07 11:03   ` Russell King - ARM Linux
2017-01-07 11:03     ` Russell King - ARM Linux
2017-01-07 12:12     ` Marcin Wojtas
2017-01-07 12:12       ` Marcin Wojtas
     [not found]       ` <CAPv3WKeQ=fj2cKPyJ2NqCaAv55cOyWodujKwj3-v5iCrDYNcmA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-01-07 13:50         ` Russell King - ARM Linux
2017-01-07 13:50           ` Russell King - ARM Linux
2017-01-07 13:50           ` Russell King - ARM Linux
2016-12-28 16:46 ` [PATCHv2 net-next 12/16] net: mvpp2: add AXI bridge initialization for PPv2.2 Thomas Petazzoni
2016-12-28 16:46   ` Thomas Petazzoni
2016-12-28 16:46 ` [PATCHv2 net-next 13/16] net: mvpp2: rework RXQ interrupt group " Thomas Petazzoni
2016-12-28 16:46   ` Thomas Petazzoni
2016-12-28 16:46 ` [PATCHv2 net-next 15/16] net: mvpp2: add support for an additional clock needed " Thomas Petazzoni
2016-12-28 16:46   ` Thomas Petazzoni
     [not found]   ` <1482943592-12556-16-git-send-email-thomas.petazzoni-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>
2017-01-07  9:29     ` Russell King - ARM Linux
2017-01-07  9:29       ` Russell King - ARM Linux

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170204145938.6e6380d9@free-electrons.com \
    --to=thomas.petazzoni@free-electrons.com \
    --cc=andrew@lunn.ch \
    --cc=davem@davemloft.net \
    --cc=devicetree@vger.kernel.org \
    --cc=galak@codeaurora.org \
    --cc=gregory.clement@free-electrons.com \
    --cc=hannah@marvell.com \
    --cc=ijc+devicetree@hellion.org.uk \
    --cc=jason@lakedaemon.net \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux@armlinux.org.uk \
    --cc=mark.rutland@arm.com \
    --cc=mw@semihalf.com \
    --cc=nadavh@marvell.com \
    --cc=netdev@vger.kernel.org \
    --cc=pawel.moll@arm.com \
    --cc=robh+dt@kernel.org \
    --cc=robin.murphy@arm.com \
    --cc=sebastian.hesselbarth@gmail.com \
    --cc=stefanc@marvell.com \
    --cc=yehuday@marvell.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.