From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Thomas Hellstrom <thomas@shipmail.org>
Cc: dri-devel@lists.freedesktop.org, airlied@linux.ie,
	linux-kernel@vger.kernel.org, konrad@darnok.org
Subject: Re: [RFC PATCH v2] Utilize the PCI API in the TTM framework.
Date: Mon, 10 Jan 2011 10:21:35 -0500	[thread overview]
Message-ID: <20110110152135.GA9732@dumpdata.com> (raw)
In-Reply-To: <4D2B16F3.1070105@shipmail.org>

On Mon, Jan 10, 2011 at 03:25:55PM +0100, Thomas Hellstrom wrote:
> Konrad,
> 
> Before looking further into the patch series, I need to make sure
> I've completely understood the problem and why you've chosen this
> solution: Please see inline.

Of course.

.. snip ..
> >The problem above can be easily reproduced on bare-metal if you pass in
> >"swiotlb=force iommu=soft".
> >
> 
> At a first glance, this would seem to be a driver error since the
> drivers are not calling pci_page_sync(), however I understand that
> the TTM infrastructure and desire to avoid bounce buffers add more
> implications to this...

<nods>
> 
> >There are two ways of fixing this:
> >
> >  1). Use the 'dma_alloc_coherent' (or pci_alloc_consistent if there is
> >      struct pcidev present), instead of alloc_page for GFP_DMA32. The
> >      'dma_alloc_coherent' guarantees that the allocated page fits
> >      within the device dma_mask (or uses the default DMA32 if no device
> >      is passed in). This also guarantees that any subsequent call
> >      to the PCI API for this page will return the same DMA (bus) address
> >      as the first call (so pci_alloc_consistent, and then pci_map_page
> >      will give the same DMA bus address).
> 
> 
> I guess dma_alloc_coherent() will allocate *real* DMA32 pages? that
> brings up a couple of questions:
> 1) Is it possible to change caching policy on pages allocated using
> dma_alloc_coherent?

Yes. They are the same "form-factor" as any normal page, except
that the IOMMU makes extra efforts to set this page up.

> 2) What about accounting? In a *non-Xen* environment, will the
> number of coherent pages be less than the number of DMA32 pages, or
> will dma_alloc_coherent just translate into a alloc_page(GFP_DMA32)?

The code in the IOMMUs ends up calling __get_free_pages, which ends up
in alloc_pages. So the call does end up in alloc_page(flags).


native SWIOTLB (so no IOMMU): GFP_DMA32
GART (AMD's old IOMMU): GFP_DMA32

For the hardware IOMMUs:

AMD VI: if it is in Passthrough mode, it calls it with GFP_DMA32.
   If it is in DMA translation mode (normal mode) it allocates a page
   with __GFP_ZERO set and (__GFP_DMA | __GFP_HIGHMEM | __GFP_DMA32) cleared,
   and immediately translates the bus address.

The flags change a bit:
VT-d: if there is no identity mapping, and the PCI device is not one of the
   special ones (GFX, Azalia), then it will pass it with GFP_DMA32.
   If it is in identity mapping state, and the device is a GFX or Azalia sound
   card, then it will clear (__GFP_DMA | GFP_DMA32) and immediately translate
   the bus address.

However, the interesting thing is that I've passed in the 'NULL' as
the struct device (not intentionally - did not want to add more changes
to the API) so all of the IOMMUs end up doing GFP_DMA32.

But it does mess up the accounting with the AMD VI and VT-d as they strip
the __GFP_DMA32 flag off. That is a big problem, I presume?
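
To make the accounting concern concrete, here is a small userspace sketch
of the flag handling described above. It is only a model: the SKETCH_*
names and bit values are made up for illustration (the real definitions
are in the kernel's include/linux/gfp.h), and the two helpers are
hypothetical stand-ins for what a translating IOMMU and a
passthrough/SWIOTLB/GART path do with the caller's flags.

```c
#include <stdbool.h>

/* Illustrative bit values only; NOT the kernel's real gfp bits. */
typedef unsigned int sketch_gfp_t;
#define SKETCH_GFP_DMA      0x01u
#define SKETCH_GFP_HIGHMEM  0x02u
#define SKETCH_GFP_DMA32    0x04u
#define SKETCH_GFP_ZERO     0x08u
#define SKETCH_GFP_KERNEL   0x10u

/*
 * Models a translating IOMMU: the bus address gets remapped, so the
 * zone restrictions are dropped and a zeroed page is requested.
 */
static sketch_gfp_t sketch_translating_flags(sketch_gfp_t flags)
{
	flags &= ~(SKETCH_GFP_DMA | SKETCH_GFP_HIGHMEM | SKETCH_GFP_DMA32);
	flags |= SKETCH_GFP_ZERO;
	return flags;
}

/* Models the passthrough / SWIOTLB / GART case: flags are used as passed. */
static sketch_gfp_t sketch_passthrough_flags(sketch_gfp_t flags)
{
	return flags;
}

/* True if the allocation would still be served as a DMA32 request. */
static bool sketch_counts_as_dma32(sketch_gfp_t flags)
{
	return (flags & SKETCH_GFP_DMA32) != 0;
}
```

With a request of SKETCH_GFP_KERNEL | SKETCH_GFP_DMA32, the passthrough
helper keeps the DMA32 bit while the translating helper drops it, so a
caller that accounts the page as DMA32 would be counting a page that need
not come from the DMA32 zone.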

> 3) Same as above, but in a Xen environment, what will stop multiple
> guests to exhaust the coherent pages? It seems that the TTM
> accounting mechanisms will no longer be valid unless the number of
> available coherent pages are split across the guests?

Say I pass in four ATI Radeon cards (wherein each is a 32-bit card) to
four guests. Let's also assume that we are doing heavy operations in all
of the guests.  Since there is no communication between the TTM
accounting in each guest you could end up eating all of the 4GB physical
memory that is available to each guest. It could end up that the first
guest gets the lion's share of the 4GB memory, while the other ones get
less.

And if one was to do that on bare-metal, with four ATI Radeon cards, the
TTM accounting mechanism would realize it is nearing the watermark
and do.. something, right? What would it do actually?

I think the error path would be the same in both cases?

> 
> >  2). Use the pci_sync_range_* after sending a page to the graphics
> >      engine. If the bounce buffer is used then we end up copying the
> >      pages.
> 
> Is the reason for choosing 1) instead of 2) purely a performance concern?

Yes, and also not understanding where I should insert the pci_sync_range
calls in the drivers.

> 
> 
> Finally, I wanted to ask why we need to pass / store the dma address
> of the TTM pages? Isn't it possible to just call into the DMA / PCI
> api to obtain it, and the coherent allocation will make sure it
> doesn't change?

It won't change, but you need the dma address during de-allocation:
dma_free_coherent..
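
As a rough illustration of why the address has to be stored: in the
kernel, dma_alloc_coherent(dev, size, &dma_handle, gfp) hands back the
bus address through dma_handle, and dma_free_coherent(dev, size,
cpu_addr, dma_handle) wants it back. The userspace sketch below only
models that pairing. Every sketch_* name is hypothetical, the "bus
address" is faked as a 1:1 copy of the pointer, and none of it is the
actual TTM code.

```c
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096u

typedef uint64_t sketch_dma_addr_t;

/* One pool entry: CPU pointer plus the bus address needed at free time. */
struct sketch_dma_page {
	void *cpu_addr;
	sketch_dma_addr_t dma_addr;
};

/* Stand-in for dma_alloc_coherent(): reports a faked bus address. */
static void *sketch_alloc_coherent(size_t size, sketch_dma_addr_t *dma_handle)
{
	void *p = calloc(1, size);

	if (p)
		*dma_handle = (sketch_dma_addr_t)(uintptr_t)p; /* pretend 1:1 */
	return p;
}

/* Stand-in for dma_free_coherent(): refuses a handle that does not match. */
static int sketch_free_coherent(void *cpu_addr, sketch_dma_addr_t dma_handle)
{
	if (dma_handle != (sketch_dma_addr_t)(uintptr_t)cpu_addr)
		return -1;
	free(cpu_addr);
	return 0;
}

/* Allocate a page and remember both addresses in the entry. */
static int sketch_page_get(struct sketch_dma_page *pg)
{
	pg->cpu_addr = sketch_alloc_coherent(SKETCH_PAGE_SIZE, &pg->dma_addr);
	return pg->cpu_addr ? 0 : -1;
}

/* Free a page using the stored bus address, then clear the entry. */
static int sketch_page_put(struct sketch_dma_page *pg)
{
	int ret = sketch_free_coherent(pg->cpu_addr, pg->dma_addr);

	if (ret == 0) {
		pg->cpu_addr = NULL;
		pg->dma_addr = 0;
	}
	return ret;
}
```

Keeping the handle next to the page is what makes the free call
possible without asking the DMA layer for the address a second time.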
