From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: Jerome Glisse <j.glisse@gmail.com>,
	Thomas Hellstrom <thellstrom@vmware.com>,
	Dave Airlie <airlied@redhat.com>,
	kamal@canonical.com, ben@decadent.org.uk,
	LKML <linux-kernel@vger.kernel.org>,
	"dri-devel@lists.freedesktop.org"
	<dri-devel@lists.freedesktop.org>,
	m.szyprowski@samsung.com
Subject: Re: CONFIG_DMA_CMA causes ttm performance problems/hangs.
Date: Tue, 12 Aug 2014 16:47:43 -0400
Message-ID: <20140812204743.GA15496@laptop.dumpdata.com>
In-Reply-To: <53EA0497.6060307@gmail.com>

On Tue, Aug 12, 2014 at 02:12:07PM +0200, Mario Kleiner wrote:
> On 08/11/2014 05:17 PM, Jerome Glisse wrote:
> >On Mon, Aug 11, 2014 at 12:11:21PM +0200, Thomas Hellstrom wrote:
> >>On 08/10/2014 08:02 PM, Mario Kleiner wrote:
> >>>On 08/10/2014 01:03 PM, Thomas Hellstrom wrote:
> >>>>On 08/10/2014 05:11 AM, Mario Kleiner wrote:
> >>>>>Resent this time without HTML formatting which lkml doesn't like.
> >>>>>Sorry.
> >>>>>
> >>>>>On 08/09/2014 03:58 PM, Thomas Hellstrom wrote:
> >>>>>>On 08/09/2014 03:33 PM, Konrad Rzeszutek Wilk wrote:
> >>>>>>>On August 9, 2014 1:39:39 AM EDT, Thomas Hellstrom
> >>>>>>><thellstrom@vmware.com> wrote:
> >>>>>>>>Hi.
> >>>>>>>>
> >>>>>>>Hey Thomas!
> >>>>>>>
> >>>>>>>>IIRC I don't think the TTM DMA pool allocates coherent pages more
> >>>>>>>>than one page at a time, and _if that's true_ it's pretty
> >>>>>>>>unnecessary for the dma subsystem to route those allocations to
> >>>>>>>>CMA. Maybe Konrad could shed some light over this?
> >>>>>>>It should allocate in batches and keep them in the TTM DMA pool for
> >>>>>>>some time to be reused.
> >>>>>>>
> >>>>>>>The pages that it gets are in 4kb granularity though.
> >>>>>>Then I feel inclined to say this is a DMA subsystem bug. Single page
> >>>>>>allocations shouldn't get routed to CMA.
> >>>>>>
> >>>>>>/Thomas
> >>>>>Yes, seems you're both right. I read through the code a bit more and
> >>>>>indeed the TTM DMA pool allocates only one page during each
> >>>>>dma_alloc_coherent() call, so it doesn't need CMA memory. The current
> >>>>>allocators don't check for single page CMA allocations and therefore
> >>>>>try to get it from the CMA area anyway, instead of skipping to the
> >>>>>much cheaper fallback.
> >>>>>
> >>>>>So the callers of dma_alloc_from_contiguous() could need that little
> >>>>>optimization of skipping it if only one page is requested. For
> >>>>>dma_generic_alloc_coherent
> >>>>><http://lxr.free-electrons.com/ident?i=dma_generic_alloc_coherent>
> >>>>>and intel_alloc_coherent
> >>>>><http://lxr.free-electrons.com/ident?i=intel_alloc_coherent>
> >>>>>this seems easy to do. Looking at the arm arch variants, e.g.,
> >>>>>
> >>>>>http://lxr.free-electrons.com/source/arch/arm/mm/dma-mapping.c#L1194
> >>>>>
> >>>>>and
> >>>>>
> >>>>>http://lxr.free-electrons.com/source/arch/arm64/mm/dma-mapping.c#L44
> >>>>>
> >>>>>i'm not sure if it is that easily done, as there aren't any fallbacks
> >>>>>for such a case and the code looks to me as if that's at least
> >>>>>somewhat intentional.
> >>>>>
> >>>>>As far as TTM goes, one quick one-line fix to prevent it from using
> >>>>>the CMA at least on SWIOTLB, NOMMU and Intel IOMMU (when using the
> >>>>>above methods) would be to clear the __GFP_WAIT
> >>>>><http://lxr.free-electrons.com/ident?i=__GFP_WAIT>
> >>>>>flag from the passed gfp_t flags. That would trigger the well-working
> >>>>>fallback. So, is __GFP_WAIT needed for those single page allocations
> >>>>>that go through __ttm_dma_alloc_page
> >>>>><http://lxr.free-electrons.com/ident?i=__ttm_dma_alloc_page>?
> >>>>>
> >>>>>It would be nice to have such a simple, non-intrusive one-line patch
> >>>>>that we still could get into 3.17 and then backported to older stable
> >>>>>kernels to avoid the same desktop hangs there if CMA is enabled. It
> >>>>>would be also nice for actual users of CMA to not use up lots of CMA
> >>>>>space for gpu's which don't need it. I think DMA_CMA was introduced
> >>>>>around 3.12.
> >>>>>
> >>>>I don't think that's a good idea. Omitting __GFP_WAIT would cause
> >>>>unnecessary memory allocation errors on systems under stress.
> >>>>I think this should be filed as a DMA subsystem kernel bug / regression
> >>>>and an appropriate solution should be worked out together with the DMA
> >>>>subsystem maintainers and then backported.
> >>>Ok, so it is needed. I'll file a bug report.
> >>>
> >>>>>The other problem is that probably TTM does not reuse pages from the
> >>>>>DMA pool. If i trace the __ttm_dma_alloc_page
> >>>>><http://lxr.free-electrons.com/ident?i=__ttm_dma_alloc_page>
> >>>>>and __ttm_dma_free_page calls for
> >>>>>those single page allocs/frees, then over a 20 second interval of
> >>>>>tracing and switching tabs in firefox, scrolling things around etc. i
> >>>>>find about as many alloc's as i find free's, e.g., 1607 allocs vs.
> >>>>>1648 frees.
> >>>>This is because historically the pools have been designed to keep only
> >>>>pages with nonstandard caching attributes since changing page caching
> >>>>attributes have been very slow but the kernel page allocators have been
> >>>>reasonably fast.
> >>>>
> >>>>/Thomas
> >>>Ok. A bit more ftracing showed my hang problem case goes through the
> >>>"if (is_cached)" paths, so the pool doesn't recycle anything and i see
> >>>it bouncing up and down by 4 pages all the time.
> >>>
> >>>But for the non-cached case, which i don't hit with my problem, could
> >>>one of you look at line 954...
> >>>
> >>>http://lxr.free-electrons.com/source/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c#L954
> >>>
> >>>... and tell me why that unconditional npages = count; assignment
> >>>makes sense? It seems to essentially disable all recycling for the dma
> >>>pool whenever the pool isn't filled up to/beyond its maximum with free
> >>>pages? When the pool is filled up, lots of stuff is recycled, but when
> >>>it is already somewhat below capacity, it gets "punished" by not
> >>>getting refilled? I'd just like to understand the logic behind that line.
> >>>
> >>>thanks,
> >>>-mario
> >>I'll happily forward that question to Konrad who wrote the code (or it
> >>may even stem from the ordinary page pool code which IIRC has Dave
> >>Airlie / Jerome Glisse as authors)
> >This is effectively bogus code, i now wonder how it came to stay alive.
> >Attached patch will fix that.
> 
> Yes, that makes sense to me. Fwiw,
> 
> Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com>

What about testing? Did it make the issue less of a problem or did it
disappear completely?

Thank you.
> 
> -mario
> 
