linux-kernel.vger.kernel.org archive mirror
From: Thomas Hellstrom <thomas@shipmail.org>
To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: dri-devel@lists.freedesktop.org, airlied@linux.ie,
	linux-kernel@vger.kernel.org, konrad@darnok.org
Subject: Re: [RFC PATCH v2] Utilize the PCI API in the TTM framework.
Date: Mon, 10 Jan 2011 21:50:03 +0100
Message-ID: <4D2B70FB.3000504@shipmail.org>
In-Reply-To: <20110110164519.GA27066@dumpdata.com>

On 01/10/2011 05:45 PM, Konrad Rzeszutek Wilk wrote:
> .. snip ..
>    
>>>> 2) What about accounting? In a *non-Xen* environment, will the
>>>> number of coherent pages be less than the number of DMA32 pages, or
>>>> will dma_alloc_coherent just translate into an alloc_page(GFP_DMA32)?
>>>>          
>>> The code in the IOMMUs ends up calling __get_free_pages, which ends up
>>> in alloc_pages. So the call does end up in alloc_page(flags).
>>>
>>>
>>> native SWIOTLB (so no IOMMU): GFP_DMA32
>>> GART (AMD's old IOMMU): GFP_DMA32
>>>
>>> For the hardware IOMMUs:
>>>
>>> AMD VI: if it is in Passthrough mode, it calls it with GFP_DMA32.
>>>     If it is in DMA translation mode (normal mode) it allocates a page
>>>     with GFP_ZERO | ~(__GFP_DMA | __GFP_HIGHMEM | __GFP_DMA32) and immediately
>>>     translates the bus address.
>>>
>>> The flags change a bit:
>>> VT-d: if there is no identity mapping and the PCI device is not one of the
>>>     special ones (GFX, Azalia), then it will pass it with GFP_DMA32.
>>>     If it is in identity mapping state, and the device is a GFX or Azalia sound
>>>     card, then it will ~(__GFP_DMA | GFP_DMA32) and immediately translate
>>>     the bus address.
>>>
>>> However, the interesting thing is that I've passed in the 'NULL' as
>>> the struct device (not intentionally - did not want to add more changes
>>> to the API) so all of the IOMMUs end up doing GFP_DMA32.
>>>
>>> But it does mess up the accounting with the AMD-VI and VT-d as they strip
>>> the __GFP_DMA32 flag off. That is a big problem, I presume?
>>>        
>> Actually, I don't think it's a big problem. TTM allows a small
>> discrepancy between allocated pages and accounted pages to be able
>> to account on actual allocation result. IIRC, this means that a
>> DMA32 page will always be accounted as such, or at least we can make
>> it behave that way. As long as the device can always handle the
>> page, we should be fine.
>>      
> Excellent.
>    
>>      
>>>> 3) Same as above, but in a Xen environment, what will stop multiple
>>>> guests from exhausting the coherent pages? It seems that the TTM
>>>> accounting mechanisms will no longer be valid unless the number of
>>>> available coherent pages is split across the guests?
>>>>          
>>> Say I pass in four ATI Radeon cards (wherein each is a 32-bit card) to
>>> four guests. Let's also assume that we are doing heavy operations in all
>>> of the guests.  Since there is no communication between the TTM
>>> accounting in each guest, you could end up eating all of the 4GB physical
>>> memory that is available to each guest. It could end up that the first
>>> guest gets the lion's share of the 4GB memory, while the other ones get
>>> less.
>>>
>>> And if one was to do that on baremetal, with four ATI Radeon cards, the
>>> TTM accounting mechanism would realize it is nearing the watermark
>>> and do.. something, right? What would it do actually?
>>>
>>> I think the error path would be the same in both cases?
>>>        
>> Not really. The really dangerous situation is if TTM is allowed to
>> exhaust all GFP_KERNEL memory. Then any application or kernel task
>>      
> Ok, since GFP_KERNEL does not contain the GFP_DMA32 flag then
> this should be OK?
>    

No. Unless I'm missing something, on a machine with 4GB of memory or less,
GFP_DMA32 and GFP_KERNEL allocations come from the same pool of pages?
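
To illustrate the point about the shared pool, here is a rough sketch of
how memory splits into zones on x86-64. It is a simplified model, not
kernel code: it assumes ZONE_DMA32 covers everything below 4GB, ignores
the small ZONE_DMA and any MMIO hole, and the function names are made up
for this example.

```python
# Simplified x86-64 zone model. Assumptions (not from the kernel source):
# no MMIO hole, ZONE_DMA (<16MB) folded into ZONE_DMA32.
GIB = 1 << 30
DMA32_LIMIT = 4 * GIB  # ZONE_DMA32 covers physical addresses below 4GB


def zone_split(ram_bytes):
    """Return (dma32_bytes, normal_bytes) for a machine with ram_bytes of RAM."""
    dma32 = min(ram_bytes, DMA32_LIMIT)
    normal = ram_bytes - dma32
    return dma32, normal


def kernel_pool_shared_with_dma32(ram_bytes):
    """True when there is no ZONE_NORMAL, so GFP_KERNEL must use DMA32 pages."""
    _, normal = zone_split(ram_bytes)
    return normal == 0


if __name__ == "__main__":
    for ram_gib in (2, 4, 8):
        dma32, normal = zone_split(ram_gib * GIB)
        shared = kernel_pool_shared_with_dma32(ram_gib * GIB)
        print(f"{ram_gib}GB RAM: DMA32={dma32 // GIB}GB "
              f"NORMAL={normal // GIB}GB shared_pool={shared}")
```

With 4GB or less, the model has no ZONE_NORMAL at all, so every GFP_KERNEL
page necessarily comes out of the pages that GFP_DMA32 draws on.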

>
>> What *might* be possible, however, is that the GFP_KERNEL memory on
>> the host gets exhausted due to extensive TTM allocations in the
>> guest, but I guess that's a problem for XEN to resolve, not TTM.
>>      
> Hmm. I think I am missing something here. The GFP_KERNEL is any memory
> and the GFP_DMA32 is memory from the ZONE_DMA32. When we do start
> using the PCI-API, what happens underneath (so under Linux) is that
> "real PFNs" (Machine Frame Numbers) which are above the 0x100000 mark
> get swizzled in for the guest's PFNs (this is for the PCI devices
> that have the dma_mask set to 32bit). However, that is a Xen MMU
> accounting issue.
>    


So I was under the impression that when you allocate coherent memory in
the guest, the physical page comes from DMA32 memory in the host. On a
machine with 4GB or less, that would be the same as kernel memory. Now, if 4
guests each think they can allocate 2GB of coherent memory, you might
run out of kernel memory on the host?
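
As a back-of-the-envelope check of that scenario, the sketch below adds
up the per-guest limits and compares them with the host pool. The 4GB
host pool and the helper name are assumptions for illustration only; the
real limits depend on how TTM and Xen are configured.

```python
GIB = 1 << 30


def coherent_overcommit(host_pool_bytes, guest_limits_bytes):
    """Return by how much the summed guest limits exceed the host pool (0 if they fit)."""
    return max(0, sum(guest_limits_bytes) - host_pool_bytes)


# Assumed numbers: a 4GB host whose DMA32 pool is all of its memory,
# and four guests that each believe they may use 2GB of coherent memory.
host_pool = 4 * GIB
guest_limits = [2 * GIB] * 4
excess = coherent_overcommit(host_pool, guest_limits)

if __name__ == "__main__":
    print(f"guests may claim {sum(guest_limits) // GIB}GB, "
          f"host has {host_pool // GIB}GB, overcommit {excess // GIB}GB")
```

Since each guest accounts on its own, nothing in the guests would notice
that the combined limit is twice what the host can actually supply.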


Another thing that I was thinking of is what happens if you have a huge
GART and allocate a lot of coherent memory. Could that potentially
exhaust IOMMU resources?

>> /Thomas
>>
>> *) I think gem's flink still is vulnerable to this, though, so it
>>      
> Is there a good test-case for this?
>    


Not in code form. What you can do (for example in an OpenGL app) is to
write some code that tries to flink with a guessed bo name until it
succeeds. Then, repeatedly from within the app, try to flink the same
name until something crashes. I don't think the Linux OOM killer can
handle that situation. Should be fairly easy to put together.

/Thomas

