* Latest testing with drm-next-4.9-wip and latest LLVM/mesa stack - Regression in PowerPlay/DPM on CIK?
@ 2016-10-10 19:43 Shawn Starr
  2016-10-10 20:55 ` Shawn Starr
  0 siblings, 1 reply; 16+ messages in thread
From: Shawn Starr @ 2016-10-10 19:43 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW



Hello AMD Folks, 

Heaven and Valley stall badly whenever they transition to a new scene.
One Valley scene (flying around the mountains) renders at 9 fps where
it normally gets 30 fps.

I'm going to downgrade to a stock 4.8 kernel without drm-next-4.9-wip
to see whether it's a kernel regression. The GPU keeps revving up and
down, so maybe something is messed up with the clocks?

Thanks,
Shawn


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* Re: Latest testing with drm-next-4.9-wip and latest LLVM/mesa stack - Regression in PowerPlay/DPM on CIK?
  2016-10-10 19:43 Latest testing with drm-next-4.9-wip and latest LLVM/mesa stack - Regression in PowerPlay/DPM on CIK? Shawn Starr
@ 2016-10-10 20:55 ` Shawn Starr
  2016-10-10 23:36   ` Shawn Starr
  0 siblings, 1 reply; 16+ messages in thread
From: Shawn Starr @ 2016-10-10 20:55 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Hello,

It turns out it's not the kernel. I'm bisecting mesa/LLVM now to see where
this issue is coming from.

Thanks,
Shawn


On Monday, October 10, 2016 3:43:57 PM EDT Shawn Starr wrote:

Hello AMD Folks, 

Bad stalls in Heaven/Valley when transitioning to a new scene, Valley scene 
rendering 9fps normally 30fps (flying around mountain scene).

Going to downgrade kernel to 4.8 stock w/o drm-next-4.9-wip to see if its 
kernel regression, GPU is revving up/down maybe something messed up with CLKS?

Thanks,
Shawn




* Re: Latest testing with drm-next-4.9-wip and latest LLVM/mesa stack - Regression in PowerPlay/DPM on CIK?
  2016-10-10 20:55 ` Shawn Starr
@ 2016-10-10 23:36   ` Shawn Starr
  2016-10-13 18:28     ` mm: fix cache mode tracking in vm_insert_mixed() breaks AMDGPU [was: Re: Latest testing with drm-next-4.9-wip and latest LLVM/mesa stack - Regression in PowerPlay/DPM on CIK?] Shawn Starr
  0 siblings, 1 reply; 16+ messages in thread
From: Shawn Starr @ 2016-10-10 23:36 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On Monday, October 10, 2016 4:55:24 PM EDT Shawn Starr wrote:
> Hello,
> 
> It turns out its not kernel, bisecting mesa/LLVM now to see where this issue
> is happening from.

Correction: it is the kernel. 4.8-rc8 (commit
c2cbc38b9715bd8318062e600668fc30e5a3fbfa) is good.

Bisecting this now.

Thanks,
Shawn


* mm: fix cache mode tracking in vm_insert_mixed() breaks AMDGPU [was: Re: Latest testing with drm-next-4.9-wip and latest LLVM/mesa stack - Regression in PowerPlay/DPM on CIK?]
  2016-10-10 23:36   ` Shawn Starr
@ 2016-10-13 18:28     ` Shawn Starr
  2016-10-14  1:33       ` Michel Dänzer
  0 siblings, 1 reply; 16+ messages in thread
From: Shawn Starr @ 2016-10-13 18:28 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Hello AMD folks,

I have discovered a problem in Linus' master that affects AMDGPU. Nobody would
notice this in drm-next-4.9-wip, since the commit is not in that repo.


git bisect start
# good: [c8d2bc9bc39ebea8437fd974fdbc21847bb897a3] Linux 4.8
git bisect good c8d2bc9bc39ebea8437fd974fdbc21847bb897a3
# bad: [f29135b54bcbfe1fea97d94e2ae860bade1d5a31] Merge branch 'for-linus-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
git bisect bad f29135b54bcbfe1fea97d94e2ae860bade1d5a31
# good: [5691f0e9a3e7855832d5fd094801bf600347c2d0] Merge tag 'sound-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
git bisect good 5691f0e9a3e7855832d5fd094801bf600347c2d0
# good: [e89ac165a5ebd0a95650ed48d40b8b4e3a8991dc] staging: rts5208: fix comment blocks style in rtsx_chip.h
git bisect good e89ac165a5ebd0a95650ed48d40b8b4e3a8991dc
# good: [07021b43597f506cc525d139ed1a94e79cf184f2] Merge tag 'powerpc-4.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
git bisect good 07021b43597f506cc525d139ed1a94e79cf184f2
# good: [c913fc4146ba7c280e074558d0a461e5c6f07c8a] Merge tag 'armsoc-late' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
git bisect good c913fc4146ba7c280e074558d0a461e5c6f07c8a
# bad: [abb5a14fa20fdd400995926134b7be9eb8ce6048] Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
git bisect bad abb5a14fa20fdd400995926134b7be9eb8ce6048
# bad: [b9044ac8292fc94bee33f6f08acaed3ac55f0c75] Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
git bisect bad b9044ac8292fc94bee33f6f08acaed3ac55f0c75
# bad: [68ba0326b4e14988f9e0c24a6e12a85cf2acd1ca] proc: much faster /proc/vmstat
git bisect bad 68ba0326b4e14988f9e0c24a6e12a85cf2acd1ca
# good: [1d8bf926f8739bd35d054097907fef35d881e403] mm/bootmem.c: replace kzalloc() by kzalloc_node()
git bisect good 1d8bf926f8739bd35d054097907fef35d881e403
# bad: [cc30c5d6461a2813406f7f84d581643781922a82] mm/page_io.c: replace some BUG_ON()s with VM_BUG_ON_PAGE()
git bisect bad cc30c5d6461a2813406f7f84d581643781922a82
# good: [6fcb52a56ff60d240f06296b12827e7f20d45f63] thp: reduce usage of huge zero page's atomic counter
git bisect good 6fcb52a56ff60d240f06296b12827e7f20d45f63
# bad: [d943649831aba0fcdda37a0e9e25b332a634cf5e] mm, compaction: more reliably increase direct compaction priority
git bisect bad d943649831aba0fcdda37a0e9e25b332a634cf5e
# bad: [87744ab3832b83ba71b931f86f9cfdb000d07da5] mm: fix cache mode tracking in vm_insert_mixed()
git bisect bad 87744ab3832b83ba71b931f86f9cfdb000d07da5
# good: [d66ba15bde22703b3c0cec6782519cb0765a6777] memory-hotplug: fix store_mem_state() return value
git bisect good d66ba15bde22703b3c0cec6782519cb0765a6777
# first bad commit: [87744ab3832b83ba71b931f86f9cfdb000d07da5] mm: fix cache mode tracking in vm_insert_mixed()

87744ab3832b83ba71b931f86f9cfdb000d07da5 is the first bad commit
commit 87744ab3832b83ba71b931f86f9cfdb000d07da5
Author: Dan Williams <dan.j.williams@intel.com>
Date:   Fri Oct 7 17:00:18 2016 -0700

    mm: fix cache mode tracking in vm_insert_mixed()
    
    vm_insert_mixed() unlike vm_insert_pfn_prot() and vmf_insert_pfn_pmd(),
    fails to check the pgprot_t it uses for the mapping against the one
    recorded in the memtype tracking tree.  Add the missing call to
    track_pfn_insert() to preclude cases where incompatible aliased mappings
    are established for a given physical address range.
    
    Link: http://lkml.kernel.org/r/147328717909.35069.14256589123570653697.stgit@dwillia2-desk3.amr.corp.intel.com
    Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    Cc: David Airlie <airlied@linux.ie>
    Cc: Matthew Wilcox <mawilcox@microsoft.com>
    Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

:040000 040000 7517c0019fe49c1830b5a1d81f1dc099c5aab98a fd497a604a2af5995db2b8ed1e9c640bede6adf3 M      mm


Reverting this patch stops the graphics stalls.
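
For readers unfamiliar with what the bisect log above is doing: git bisect is
essentially a binary search for the first bad commit between a known-good and
a known-bad point. Below is a minimal sketch in C, assuming a linear history
indexed 0..n-1 where commit 0 is good, commit n-1 is bad, and every commit
after the culprit is also bad. Real kernel history is a DAG with merges, which
git handles itself; the names first_bad, is_bad_fn and bad_from are
hypothetical and only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical test callback: nonzero if the commit at idx shows the regression. */
typedef int (*is_bad_fn)(size_t idx, void *ctx);

/*
 * Return the index of the first bad commit in a linear history of n commits.
 * Assumes commit 0 is good, commit n-1 is bad, and badness never goes away
 * once introduced. If steps is non-NULL it receives the number of commits
 * that had to be tested.
 */
size_t first_bad(size_t n, is_bad_fn is_bad, void *ctx, unsigned *steps)
{
    assert(n >= 2);

    size_t good = 0;        /* highest index known to be good */
    size_t bad = n - 1;     /* lowest index known to be bad */
    unsigned tested = 0;

    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;

        if (is_bad(mid, ctx))
            bad = mid;      /* culprit is at mid or earlier */
        else
            good = mid;     /* culprit is after mid */
        tested++;
    }

    if (steps)
        *steps = tested;
    return bad;
}

/* Example predicate: every commit at or after the index stored in ctx is bad. */
int bad_from(size_t idx, void *ctx)
{
    return idx >= *(size_t *)ctx;
}
```

Each test halves the remaining range, which is why a merge window with
thousands of commits needs only around a dozen builds to narrow down.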

A friend of mine mentions,

"looks like a graphics thingy you depend on is requesting a mapping with a 
not-allowed cache mode, and now you are (rightfully) getting errors?"

Thanks,
Shawn



On Monday, October 10, 2016 7:36:28 PM EDT Shawn Starr wrote:
> On Monday, October 10, 2016 4:55:24 PM EDT Shawn Starr wrote:
> > Hello,
> > 
> > It turns out its not kernel, bisecting mesa/LLVM now to see where this
> > issue is happening from.
> 
> Correction, it is kernel, 4.8-rc8 is good from commit
> c2cbc38b9715bd8318062e600668fc30e5a3fbfa
> 
> Bisecting this now.
> 
> Thanks,
> Shawn



* Re: mm: fix cache mode tracking in vm_insert_mixed() breaks AMDGPU [was: Re: Latest testing with drm-next-4.9-wip and latest LLVM/mesa stack - Regression in PowerPlay/DPM on CIK?]
  2016-10-13 18:28     ` mm: fix cache mode tracking in vm_insert_mixed() breaks AMDGPU [was: Re: Latest testing with drm-next-4.9-wip and latest LLVM/mesa stack - Regression in PowerPlay/DPM on CIK?] Shawn Starr
@ 2016-10-14  1:33       ` Michel Dänzer
       [not found]         ` <10a1e298-df32-52a5-7694-b205794ca009-otUistvHUpPR7s880joybQ@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Michel Dänzer @ 2016-10-14  1:33 UTC (permalink / raw)
  To: Shawn Starr, Dan Williams
  Cc: dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW


[ Adding Dan Williams and dri-devel ]

On 14/10/16 03:28 AM, Shawn Starr wrote:
> Hello AMD folks,
> 
> I have discovered a problem in Linus master that affects AMDGPU, nobody would 
> notice this in drm-next-4.9-wip since its not in this repo.

[...]

> 87744ab3832b83ba71b931f86f9cfdb000d07da5 is the first bad commit
> commit 87744ab3832b83ba71b931f86f9cfdb000d07da5
> Author: Dan Williams <dan.j.williams@intel.com>
> Date:   Fri Oct 7 17:00:18 2016 -0700
> 
>     mm: fix cache mode tracking in vm_insert_mixed()
>     
>     vm_insert_mixed() unlike vm_insert_pfn_prot() and vmf_insert_pfn_pmd(),
>     fails to check the pgprot_t it uses for the mapping against the one
>     recorded in the memtype tracking tree.  Add the missing call to
>     track_pfn_insert() to preclude cases where incompatible aliased mappings
>     are established for a given physical address range.
>     
>     Link: http://lkml.kernel.org/r/
> 147328717909.35069.14256589123570653697.stgit@dwillia2-
> desk3.amr.corp.intel.com
>     Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>     Cc: David Airlie <airlied@linux.ie>
>     Cc: Matthew Wilcox <mawilcox@microsoft.com>
>     Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
>     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> 
> :040000 040000 7517c0019fe49c1830b5a1d81f1dc099c5aab98a 
> fd497a604a2af5995db2b8ed1e9c640bede6adf3 M      mm
> 
> 
> Removal of this patch stops graphics stalls.

Thanks for bisecting this Shawn.


> A friend of mine mentions,
> 
> "looks like a graphics thingy you depend on is requesting a mapping with a 
> not-allowed cache mode, and now you are (rightfully) getting errors?"

It would be nice to get some more specific pointers to what amdgpu (or
maybe ttm, since that calls vm_insert_mixed in ttm_bo_vm_fault) might be
doing wrong.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

* Re: mm: fix cache mode tracking in vm_insert_mixed() breaks AMDGPU [was: Re: Latest testing with drm-next-4.9-wip and latest LLVM/mesa stack - Regression in PowerPlay/DPM on CIK?]
       [not found]         ` <10a1e298-df32-52a5-7694-b205794ca009-otUistvHUpPR7s880joybQ@public.gmane.org>
@ 2016-10-16 18:41           ` Marek Olšák
  2016-10-16 20:53             ` Dave Airlie
  0 siblings, 1 reply; 16+ messages in thread
From: Marek Olšák @ 2016-10-16 18:41 UTC (permalink / raw)
  To: Michel Dänzer
  Cc: Dan Williams, Shawn Starr,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On Fri, Oct 14, 2016 at 3:33 AM, Michel Dänzer <michel@daenzer.net> wrote:
>
> [ Adding Dan Williams and dri-devel ]
>
> On 14/10/16 03:28 AM, Shawn Starr wrote:
>> Hello AMD folks,
>>
>> I have discovered a problem in Linus master that affects AMDGPU, nobody would
>> notice this in drm-next-4.9-wip since its not in this repo.
>
> [...]
>
>> 87744ab3832b83ba71b931f86f9cfdb000d07da5 is the first bad commit
>> commit 87744ab3832b83ba71b931f86f9cfdb000d07da5
>> Author: Dan Williams <dan.j.williams@intel.com>
>> Date:   Fri Oct 7 17:00:18 2016 -0700
>>
>>     mm: fix cache mode tracking in vm_insert_mixed()
>>
>>     vm_insert_mixed() unlike vm_insert_pfn_prot() and vmf_insert_pfn_pmd(),
>>     fails to check the pgprot_t it uses for the mapping against the one
>>     recorded in the memtype tracking tree.  Add the missing call to
>>     track_pfn_insert() to preclude cases where incompatible aliased mappings
>>     are established for a given physical address range.
>>
>>     Link: http://lkml.kernel.org/r/
>> 147328717909.35069.14256589123570653697.stgit@dwillia2-
>> desk3.amr.corp.intel.com
>>     Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>>     Cc: David Airlie <airlied@linux.ie>
>>     Cc: Matthew Wilcox <mawilcox@microsoft.com>
>>     Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
>>     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>>     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
>>
>> :040000 040000 7517c0019fe49c1830b5a1d81f1dc099c5aab98a
>> fd497a604a2af5995db2b8ed1e9c640bede6adf3 M      mm
>>
>>
>> Removal of this patch stops graphics stalls.
>
> Thanks for bisecting this Shawn.
>
>
>> A friend of mine mentions,
>>
>> "looks like a graphics thingy you depend on is requesting a mapping with a
>> not-allowed cache mode, and now you are (rightfully) getting errors?"
>
> It would be nice to get some more specific pointers what amdgpu (or
> maybe ttm, since that calls vm_insert_mixed in ttm_bo_vm_fault) might be
> doing wrong.

BTW, people have reported that rendering stalls every time TTM tries
to move a buffer, even if the move is only a few MB.

See FPS and num_bytes_moved here:
https://i.imgur.com/kNj2vqF.png

There are 5 big stalls. 4 of them are due to the mm commit.

Marek

* Re: mm: fix cache mode tracking in vm_insert_mixed() breaks AMDGPU [was: Re: Latest testing with drm-next-4.9-wip and latest LLVM/mesa stack - Regression in PowerPlay/DPM on CIK?]
  2016-10-16 18:41           ` Marek Olšák
@ 2016-10-16 20:53             ` Dave Airlie
  2016-10-17 21:25               ` Dan Williams
  0 siblings, 1 reply; 16+ messages in thread
From: Dave Airlie @ 2016-10-16 20:53 UTC (permalink / raw)
  To: Marek Olšák
  Cc: dri-devel, Michel Dänzer, Shawn Starr, Dan Williams, amd-gfx

On 17 October 2016 at 04:41, Marek Olšák <maraeo@gmail.com> wrote:
> On Fri, Oct 14, 2016 at 3:33 AM, Michel Dänzer <michel@daenzer.net> wrote:
>>
>> [ Adding Dan Williams and dri-devel ]
>>
>> On 14/10/16 03:28 AM, Shawn Starr wrote:
>>> Hello AMD folks,
>>>
>>> I have discovered a problem in Linus master that affects AMDGPU, nobody would
>>> notice this in drm-next-4.9-wip since its not in this repo.
>>
>> [...]
>>
>>> 87744ab3832b83ba71b931f86f9cfdb000d07da5 is the first bad commit
>>> commit 87744ab3832b83ba71b931f86f9cfdb000d07da5
>>> Author: Dan Williams <dan.j.williams@intel.com>
>>> Date:   Fri Oct 7 17:00:18 2016 -0700
>>>
>>>     mm: fix cache mode tracking in vm_insert_mixed()
>>>
>>>     vm_insert_mixed() unlike vm_insert_pfn_prot() and vmf_insert_pfn_pmd(),
>>>     fails to check the pgprot_t it uses for the mapping against the one
>>>     recorded in the memtype tracking tree.  Add the missing call to
>>>     track_pfn_insert() to preclude cases where incompatible aliased mappings
>>>     are established for a given physical address range.
>>>
>>>     Link: http://lkml.kernel.org/r/
>>> 147328717909.35069.14256589123570653697.stgit@dwillia2-
>>> desk3.amr.corp.intel.com
>>>     Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>>>     Cc: David Airlie <airlied@linux.ie>
>>>     Cc: Matthew Wilcox <mawilcox@microsoft.com>
>>>     Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
>>>     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>>>     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
>>>
>>> :040000 040000 7517c0019fe49c1830b5a1d81f1dc099c5aab98a
>>> fd497a604a2af5995db2b8ed1e9c640bede6adf3 M      mm
>>>
>>>
>>> Removal of this patch stops graphics stalls.
>>
>> Thanks for bisecting this Shawn.
>>
>>
>>> A friend of mine mentions,
>>>
>>> "looks like a graphics thingy you depend on is requesting a mapping with a
>>> not-allowed cache mode, and now you are (rightfully) getting errors?"
>>
>> It would be nice to get some more specific pointers what amdgpu (or
>> maybe ttm, since that calls vm_insert_mixed in ttm_bo_vm_fault) might be
>> doing wrong.

        /*
         * We'd like to use VM_PFNMAP on shared mappings, where
         * (vma->vm_flags & VM_SHARED) != 0, for performance reasons,
         * but for some reason VM_PFNMAP + x86 PAT + write-combine is very
         * bad for performance. Until that has been sorted out, use
         * VM_MIXEDMAP on all mappings. See freedesktop.org bug #75719
         */
        vma->vm_flags |= VM_MIXEDMAP;

We have that comment in the ttm code, which to me implies that mixed is
now doing the right thing, but is as slow as the interface we should be
using (VM_PFNMAP).

Dave.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


* Re: mm: fix cache mode tracking in vm_insert_mixed() breaks AMDGPU [was: Re: Latest testing with drm-next-4.9-wip and latest LLVM/mesa stack - Regression in PowerPlay/DPM on CIK?]
  2016-10-16 20:53             ` Dave Airlie
@ 2016-10-17 21:25               ` Dan Williams
  2016-10-17 22:01                 ` Dave Airlie
  0 siblings, 1 reply; 16+ messages in thread
From: Dan Williams @ 2016-10-17 21:25 UTC (permalink / raw)
  To: Dave Airlie; +Cc: dri-devel, Michel Dänzer, Shawn Starr, amd-gfx

On Sun, Oct 16, 2016 at 1:53 PM, Dave Airlie <airlied@gmail.com> wrote:
> On 17 October 2016 at 04:41, Marek Olšák <maraeo@gmail.com> wrote:
>> On Fri, Oct 14, 2016 at 3:33 AM, Michel Dänzer <michel@daenzer.net> wrote:
>>>
>>> [ Adding Dan Williams and dri-devel ]
>>>
>>> On 14/10/16 03:28 AM, Shawn Starr wrote:
>>>> Hello AMD folks,
>>>>
>>>> I have discovered a problem in Linus master that affects AMDGPU, nobody would
>>>> notice this in drm-next-4.9-wip since its not in this repo.
>>>
>>> [...]
>>>
>>>> 87744ab3832b83ba71b931f86f9cfdb000d07da5 is the first bad commit
>>>> commit 87744ab3832b83ba71b931f86f9cfdb000d07da5
>>>> Author: Dan Williams <dan.j.williams@intel.com>
>>>> Date:   Fri Oct 7 17:00:18 2016 -0700
>>>>
>>>>     mm: fix cache mode tracking in vm_insert_mixed()
>>>>
>>>>     vm_insert_mixed() unlike vm_insert_pfn_prot() and vmf_insert_pfn_pmd(),
>>>>     fails to check the pgprot_t it uses for the mapping against the one
>>>>     recorded in the memtype tracking tree.  Add the missing call to
>>>>     track_pfn_insert() to preclude cases where incompatible aliased mappings
>>>>     are established for a given physical address range.
>>>>
>>>>     Link: http://lkml.kernel.org/r/
>>>> 147328717909.35069.14256589123570653697.stgit@dwillia2-
>>>> desk3.amr.corp.intel.com
>>>>     Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>>>>     Cc: David Airlie <airlied@linux.ie>
>>>>     Cc: Matthew Wilcox <mawilcox@microsoft.com>
>>>>     Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
>>>>     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>>>>     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
>>>>
>>>> :040000 040000 7517c0019fe49c1830b5a1d81f1dc099c5aab98a
>>>> fd497a604a2af5995db2b8ed1e9c640bede6adf3 M      mm
>>>>
>>>>
>>>> Removal of this patch stops graphics stalls.
>>>
>>> Thanks for bisecting this Shawn.
>>>
>>>
>>>> A friend of mine mentions,
>>>>
>>>> "looks like a graphics thingy you depend on is requesting a mapping with a
>>>> not-allowed cache mode, and now you are (rightfully) getting errors?"
>>>
>>> It would be nice to get some more specific pointers what amdgpu (or
>>> maybe ttm, since that calls vm_insert_mixed in ttm_bo_vm_fault) might be
>>> doing wrong.
>
>        /*
>          * We'd like to use VM_PFNMAP on shared mappings, where
>          * (vma->vm_flags & VM_SHARED) != 0, for performance reasons,
>          * but for some reason VM_PFNMAP + x86 PAT + write-combine is very
>          * bad for performance. Until that has been sorted out, use
>          * VM_MIXEDMAP on all mappings. See freedesktop.org bug #75719
>          */
>         vma->vm_flags |= VM_MIXEDMAP;
>
> We have that comment in the ttm code, which to me implies that mixed is
> doing the right thing now, but that is slow, as the interface we
> should be using.
>

Aren't there only 2 possibilities for this regression?

1/ a memtype entry was never made so track_pfn_insert() returns an
uncached mapping

2/ a conflicting memtype entry exists and undefined behavior due to
mixed mapping types is avoided with the change.
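
Possibility 1/ can be illustrated with a toy user-space model. This is a
sketch under assumptions, not the kernel implementation: memtype_table,
memtype_reserve, memtype_lookup and checked_insert_mode are hypothetical
names, and the only behavior taken from the thread is that a lookup with no
matching entry yields an uncached type that then replaces whatever the caller
asked for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified cache modes; the real kernel has more. */
enum cache_mode { MODE_WB, MODE_WC, MODE_UC_MINUS };

struct memtype_entry {
    uint64_t start, end;        /* physical range [start, end) */
    enum cache_mode mode;
};

#define MAX_ENTRIES 16

/* Toy stand-in for the memtype tracking tree: a flat array of ranges. */
struct memtype_table {
    struct memtype_entry e[MAX_ENTRIES];
    size_t n;
};

/* Record a cache mode for a range. Returns 0 on success, -1 if the table is full. */
int memtype_reserve(struct memtype_table *t, uint64_t start, uint64_t end,
                    enum cache_mode mode)
{
    assert(start < end);
    if (t->n == MAX_ENTRIES)
        return -1;
    t->e[t->n++] = (struct memtype_entry){ start, end, mode };
    return 0;
}

/* Return the recorded mode for paddr, or uncached if nothing was recorded. */
enum cache_mode memtype_lookup(const struct memtype_table *t, uint64_t paddr)
{
    for (size_t i = 0; i < t->n; i++)
        if (paddr >= t->e[i].start && paddr < t->e[i].end)
            return t->e[i].mode;
    return MODE_UC_MINUS;       /* no entry: fall back to uncached */
}

/*
 * Mode a "checked" insert ends up mapping with. The requested mode is
 * deliberately ignored: the tracked (or fallback) mode always wins.
 */
enum cache_mode checked_insert_mode(const struct memtype_table *t,
                                    uint64_t paddr, enum cache_mode requested)
{
    (void)requested;
    return memtype_lookup(t, paddr);
}
```

In this model a driver that asks for write-combining on a range it never
reserved silently gets an uncached mapping, which would show up as a large
slowdown rather than as an error.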

* Re: mm: fix cache mode tracking in vm_insert_mixed() breaks AMDGPU [was: Re: Latest testing with drm-next-4.9-wip and latest LLVM/mesa stack - Regression in PowerPlay/DPM on CIK?]
  2016-10-17 21:25               ` Dan Williams
@ 2016-10-17 22:01                 ` Dave Airlie
  2016-10-18  3:48                   ` Dave Airlie
  2016-10-18  7:39                   ` Daniel Vetter
  0 siblings, 2 replies; 16+ messages in thread
From: Dave Airlie @ 2016-10-17 22:01 UTC (permalink / raw)
  To: Dan Williams; +Cc: dri-devel, Michel Dänzer, Shawn Starr, amd-gfx

On 18 October 2016 at 07:25, Dan Williams <dan.j.williams@intel.com> wrote:
> On Sun, Oct 16, 2016 at 1:53 PM, Dave Airlie <airlied@gmail.com> wrote:
>> On 17 October 2016 at 04:41, Marek Olšák <maraeo@gmail.com> wrote:
>>> On Fri, Oct 14, 2016 at 3:33 AM, Michel Dänzer <michel@daenzer.net> wrote:
>>>>
>>>> [ Adding Dan Williams and dri-devel ]
>>>>
>>>> On 14/10/16 03:28 AM, Shawn Starr wrote:
>>>>> Hello AMD folks,
>>>>>
>>>>> I have discovered a problem in Linus master that affects AMDGPU, nobody would
>>>>> notice this in drm-next-4.9-wip since its not in this repo.
>>>>
>>>> [...]
>>>>
>>>>> 87744ab3832b83ba71b931f86f9cfdb000d07da5 is the first bad commit
>>>>> commit 87744ab3832b83ba71b931f86f9cfdb000d07da5
>>>>> Author: Dan Williams <dan.j.williams@intel.com>
>>>>> Date:   Fri Oct 7 17:00:18 2016 -0700
>>>>>
>>>>>     mm: fix cache mode tracking in vm_insert_mixed()
>>>>>
>>>>>     vm_insert_mixed() unlike vm_insert_pfn_prot() and vmf_insert_pfn_pmd(),
>>>>>     fails to check the pgprot_t it uses for the mapping against the one
>>>>>     recorded in the memtype tracking tree.  Add the missing call to
>>>>>     track_pfn_insert() to preclude cases where incompatible aliased mappings
>>>>>     are established for a given physical address range.
>>>>>
>>>>>     Link: http://lkml.kernel.org/r/
>>>>> 147328717909.35069.14256589123570653697.stgit@dwillia2-
>>>>> desk3.amr.corp.intel.com
>>>>>     Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>>>>>     Cc: David Airlie <airlied@linux.ie>
>>>>>     Cc: Matthew Wilcox <mawilcox@microsoft.com>
>>>>>     Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
>>>>>     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>>>>>     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
>>>>>
>>>>> :040000 040000 7517c0019fe49c1830b5a1d81f1dc099c5aab98a
>>>>> fd497a604a2af5995db2b8ed1e9c640bede6adf3 M      mm
>>>>>
>>>>>
>>>>> Removal of this patch stops graphics stalls.
>>>>
>>>> Thanks for bisecting this Shawn.
>>>>
>>>>
>>>>> A friend of mine mentions,
>>>>>
>>>>> "looks like a graphics thingy you depend on is requesting a mapping with a
>>>>> not-allowed cache mode, and now you are (rightfully) getting errors?"
>>>>
>>>> It would be nice to get some more specific pointers what amdgpu (or
>>>> maybe ttm, since that calls vm_insert_mixed in ttm_bo_vm_fault) might be
>>>> doing wrong.
>>
>>        /*
>>          * We'd like to use VM_PFNMAP on shared mappings, where
>>          * (vma->vm_flags & VM_SHARED) != 0, for performance reasons,
>>          * but for some reason VM_PFNMAP + x86 PAT + write-combine is very
>>          * bad for performance. Until that has been sorted out, use
>>          * VM_MIXEDMAP on all mappings. See freedesktop.org bug #75719
>>          */
>>         vma->vm_flags |= VM_MIXEDMAP;
>>
>> We have that comment in the ttm code, which to me implies that mixed is
>> doing the right thing now, but that is slow, as the interface we
>> should be using.
>>
>
> Aren't there only 2 possibilities for this regression?
>
> 1/ a memtype entry was never made so track_pfn_insert() returns an
> uncached mapping
>
> 2/ a conflicting memtype entry exists and undefined behavior due to
> mixed mapping types is avoided with the change.

3/ The CPU usage through this path goes up and slows things down,
though I suspect it's more likely an uncached mapping showing up
when we don't expect it.

Dave.

* Re: mm: fix cache mode tracking in vm_insert_mixed() breaks AMDGPU [was: Re: Latest testing with drm-next-4.9-wip and latest LLVM/mesa stack - Regression in PowerPlay/DPM on CIK?]
  2016-10-17 22:01                 ` Dave Airlie
@ 2016-10-18  3:48                   ` Dave Airlie
  2016-10-18 13:53                     ` Dan Williams
  2016-10-18  7:39                   ` Daniel Vetter
  1 sibling, 1 reply; 16+ messages in thread
From: Dave Airlie @ 2016-10-18  3:48 UTC (permalink / raw)
  To: Dan Williams; +Cc: dri-devel, Michel Dänzer, Shawn Starr, amd-gfx

On 18 October 2016 at 08:01, Dave Airlie <airlied@gmail.com> wrote:
> On 18 October 2016 at 07:25, Dan Williams <dan.j.williams@intel.com> wrote:
>> On Sun, Oct 16, 2016 at 1:53 PM, Dave Airlie <airlied@gmail.com> wrote:
>>> On 17 October 2016 at 04:41, Marek Olšák <maraeo@gmail.com> wrote:
>>>> On Fri, Oct 14, 2016 at 3:33 AM, Michel Dänzer <michel@daenzer.net> wrote:
>>>>>
>>>>> [ Adding Dan Williams and dri-devel ]
>>>>>
>>>>> On 14/10/16 03:28 AM, Shawn Starr wrote:
>>>>>> Hello AMD folks,
>>>>>>
>>>>>> I have discovered a problem in Linus master that affects AMDGPU, nobody would
>>>>>> notice this in drm-next-4.9-wip since its not in this repo.
>>>>>
>>>>> [...]
>>>>>
>>>>>> 87744ab3832b83ba71b931f86f9cfdb000d07da5 is the first bad commit
>>>>>> commit 87744ab3832b83ba71b931f86f9cfdb000d07da5
>>>>>> Author: Dan Williams <dan.j.williams@intel.com>
>>>>>> Date:   Fri Oct 7 17:00:18 2016 -0700
>>>>>>
>>>>>>     mm: fix cache mode tracking in vm_insert_mixed()
>>>>>>
>>>>>>     vm_insert_mixed() unlike vm_insert_pfn_prot() and vmf_insert_pfn_pmd(),
>>>>>>     fails to check the pgprot_t it uses for the mapping against the one
>>>>>>     recorded in the memtype tracking tree.  Add the missing call to
>>>>>>     track_pfn_insert() to preclude cases where incompatible aliased mappings
>>>>>>     are established for a given physical address range.
>>>>>>
>>>>>>     Link: http://lkml.kernel.org/r/
>>>>>> 147328717909.35069.14256589123570653697.stgit@dwillia2-
>>>>>> desk3.amr.corp.intel.com
>>>>>>     Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>>>>>>     Cc: David Airlie <airlied@linux.ie>
>>>>>>     Cc: Matthew Wilcox <mawilcox@microsoft.com>
>>>>>>     Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
>>>>>>     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>>>>>>     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
>>>>>>
>>>>>> :040000 040000 7517c0019fe49c1830b5a1d81f1dc099c5aab98a
>>>>>> fd497a604a2af5995db2b8ed1e9c640bede6adf3 M      mm
>>>>>>
>>>>>>
>>>>>> Removal of this patch stops graphics stalls.
>>>>>
>>>>> Thanks for bisecting this Shawn.
>>>>>
>>>>>
>>>>>> A friend of mine mentions,
>>>>>>
>>>>>> "looks like a graphics thingy you depend on is requesting a mapping with a
>>>>>> not-allowed cache mode, and now you are (rightfully) getting errors?"
>>>>>
>>>>> It would be nice to get some more specific pointers what amdgpu (or
>>>>> maybe ttm, since that calls vm_insert_mixed in ttm_bo_vm_fault) might be
>>>>> doing wrong.
>>>
>>>        /*
>>>          * We'd like to use VM_PFNMAP on shared mappings, where
>>>          * (vma->vm_flags & VM_SHARED) != 0, for performance reasons,
>>>          * but for some reason VM_PFNMAP + x86 PAT + write-combine is very
>>>          * bad for performance. Until that has been sorted out, use
>>>          * VM_MIXEDMAP on all mappings. See freedesktop.org bug #75719
>>>          */
>>>         vma->vm_flags |= VM_MIXEDMAP;
>>>
>>> We have that comment in the ttm code, which to me implies that mixed is
>>> doing the right thing now, but that is slow, as the interface we
>>> should be using.
>>>
>>
>> Aren't there only 2 possibilities for this regression?
>>
>> 1/ a memtype entry was never made so track_pfn_insert() returns an
>> uncached mapping
>>
>> 2/ a conflicting memtype entry exists and undefined behavior due to
>> mixed mapping types is avoided with the change.
>
> 3/ The CPU usage through this path goes up, and slows things down,
> though I suspect you it's more an uncached mapping showing up
> when we don't expect it.

It's looking like number 1: there is no mapping, so now we get uncached
where we used to get write through.

difference in page prot 7f7bbc0e0000, pfn 20000000000e71e4,
8000000000000037, 800000000000002f

0x2f is the vma pg prot which has PWT set in it, 0x37 is the returned
pgprot which lacks that bit.

not sure where to go from here, suggestions?
Dave.
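
The two pgprot values in the message above can be decoded bit by bit. The
sketch below uses the architectural x86-64 page-table entry bit positions;
the macro and function names (PTE_*, pte_describe, pte_diff) are made up for
this example and are not the kernel's. How a given PWT/PCD/PAT combination
maps to an actual memory type depends on how the PAT MSR is programmed, which
this sketch does not model.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* x86-64 page-table entry flag bits (architectural positions). */
#define PTE_PRESENT (1ULL << 0)
#define PTE_RW      (1ULL << 1)
#define PTE_USER    (1ULL << 2)
#define PTE_PWT     (1ULL << 3)   /* page-level write-through */
#define PTE_PCD     (1ULL << 4)   /* page-level cache disable */
#define PTE_ACCESS  (1ULL << 5)
#define PTE_DIRTY   (1ULL << 6)
#define PTE_PAT     (1ULL << 7)   /* PAT bit in a 4K entry */
#define PTE_GLOBAL  (1ULL << 8)
#define PTE_NX      (1ULL << 63)  /* no-execute */

struct flag_name {
    uint64_t bit;
    const char *name;
};

static const struct flag_name pte_flags[] = {
    { PTE_PRESENT, "P" },   { PTE_RW, "RW" },     { PTE_USER, "US" },
    { PTE_PWT, "PWT" },     { PTE_PCD, "PCD" },   { PTE_ACCESS, "A" },
    { PTE_DIRTY, "D" },     { PTE_PAT, "PAT" },   { PTE_GLOBAL, "G" },
    { PTE_NX, "NX" },
};

/* Write the space-separated names of all set flags into buf and return buf. */
char *pte_describe(uint64_t prot, char *buf, size_t cap)
{
    assert(cap > 0);

    size_t used = 0;
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof(pte_flags) / sizeof(pte_flags[0]); i++) {
        if (!(prot & pte_flags[i].bit))
            continue;
        int n = snprintf(buf + used, cap - used, "%s%s",
                         used ? " " : "", pte_flags[i].name);
        if (n < 0 || (size_t)n >= cap - used)
            break;              /* output truncated; stop here */
        used += (size_t)n;
    }
    return buf;
}

/* Bits that differ between two protection values. */
uint64_t pte_diff(uint64_t a, uint64_t b)
{
    return a ^ b;
}
```

Calling pte_diff() on 0x8000000000000037 and 0x800000000000002f leaves only
the PWT and PCD bits set: the vma value has PWT without PCD, and the returned
value has PCD without PWT, which matches the description in the message.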

* Re: mm: fix cache mode tracking in vm_insert_mixed() breaks AMDGPU [was: Re: Latest testing with drm-next-4.9-wip and latest LLVM/mesa stack - Regression in PowerPlay/DPM on CIK?]
  2016-10-17 22:01                 ` Dave Airlie
  2016-10-18  3:48                   ` Dave Airlie
@ 2016-10-18  7:39                   ` Daniel Vetter
  1 sibling, 0 replies; 16+ messages in thread
From: Daniel Vetter @ 2016-10-18  7:39 UTC (permalink / raw)
  To: Dave Airlie
  Cc: Dan Williams, amd-gfx, Shawn Starr, Michel Dänzer, dri-devel

On Tue, Oct 18, 2016 at 08:01:01AM +1000, Dave Airlie wrote:
> On 18 October 2016 at 07:25, Dan Williams <dan.j.williams@intel.com> wrote:
> > On Sun, Oct 16, 2016 at 1:53 PM, Dave Airlie <airlied@gmail.com> wrote:
> >> On 17 October 2016 at 04:41, Marek Olšák <maraeo@gmail.com> wrote:
> >>> On Fri, Oct 14, 2016 at 3:33 AM, Michel Dänzer <michel@daenzer.net> wrote:
> >>>>
> >>>> [ Adding Dan Williams and dri-devel ]
> >>>>
> >>>> On 14/10/16 03:28 AM, Shawn Starr wrote:
> >>>>> Hello AMD folks,
> >>>>>
> >>>>> I have discovered a problem in Linus master that affects AMDGPU, nobody would
> >>>>> notice this in drm-next-4.9-wip since it's not in this repo.
> >>>>
> >>>> [...]
> >>>>
> >>>>> 87744ab3832b83ba71b931f86f9cfdb000d07da5 is the first bad commit
> >>>>> commit 87744ab3832b83ba71b931f86f9cfdb000d07da5
> >>>>> Author: Dan Williams <dan.j.williams@intel.com>
> >>>>> Date:   Fri Oct 7 17:00:18 2016 -0700
> >>>>>
> >>>>>     mm: fix cache mode tracking in vm_insert_mixed()
> >>>>>
> >>>>>     vm_insert_mixed() unlike vm_insert_pfn_prot() and vmf_insert_pfn_pmd(),
> >>>>>     fails to check the pgprot_t it uses for the mapping against the one
> >>>>>     recorded in the memtype tracking tree.  Add the missing call to
> >>>>>     track_pfn_insert() to preclude cases where incompatible aliased mappings
> >>>>>     are established for a given physical address range.
> >>>>>
> >>>>>     Link: http://lkml.kernel.org/r/147328717909.35069.14256589123570653697.stgit@dwillia2-desk3.amr.corp.intel.com
> >>>>>     Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> >>>>>     Cc: David Airlie <airlied@linux.ie>
> >>>>>     Cc: Matthew Wilcox <mawilcox@microsoft.com>
> >>>>>     Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
> >>>>>     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> >>>>>     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> >>>>>
> >>>>> :040000 040000 7517c0019fe49c1830b5a1d81f1dc099c5aab98a fd497a604a2af5995db2b8ed1e9c640bede6adf3 M      mm
> >>>>>
> >>>>>
> >>>>> Removal of this patch stops graphics stalls.
> >>>>
> >>>> Thanks for bisecting this Shawn.
> >>>>
> >>>>
> >>>>> A friend of mine mentions,
> >>>>>
> >>>>> "looks like a graphics thingy you depend on is requesting a mapping with a
> >>>>> not-allowed cache mode, and now you are (rightfully) getting errors?"
> >>>>
> >>>> It would be nice to get some more specific pointers what amdgpu (or
> >>>> maybe ttm, since that calls vm_insert_mixed in ttm_bo_vm_fault) might be
> >>>> doing wrong.
> >>
> >>         /*
> >>          * We'd like to use VM_PFNMAP on shared mappings, where
> >>          * (vma->vm_flags & VM_SHARED) != 0, for performance reasons,
> >>          * but for some reason VM_PFNMAP + x86 PAT + write-combine is very
> >>          * bad for performance. Until that has been sorted out, use
> >>          * VM_MIXEDMAP on all mappings. See freedesktop.org bug #75719
> >>          */
> >>         vma->vm_flags |= VM_MIXEDMAP;
> >>
> >> We have that comment in the ttm code, which to me implies that mixed is
> >> doing the right thing now, but that is slow, as the interface we
> >> should be using.
> >>
> >
> > Aren't there only 2 possibilities for this regression?
> >
> > 1/ a memtype entry was never made so track_pfn_insert() returns an
> > uncached mapping
> >
> > 2/ a conflicting memtype entry exists and undefined behavior due to
> > mixed mapping types is avoided with the change.
> 
> 3/ The CPU usage through this path goes up, and slows things down,
> though I suspect it's more an uncached mapping showing up
> when we don't expect it.

Sounds reasonable, at least we (=i915 folks) know pte caching type
tracking is ridiculously expensive. In 4.9 we have our own pte walker and
upfront (at driver load) caching type checking to avoid all that. It's in
i915_mm.c, but probably should be moved into core kernel code (next to the
io_mapping stuff, which we reused as the tracking structure).
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: mm: fix cache mode tracking in vm_insert_mixed() breaks AMDGPU [was: Re: Latest testing with drm-next-4.9-wip and latest LLVM/mesa stack - Regression in PowerPlay/DPM on CIK?]
  2016-10-18  3:48                   ` Dave Airlie
@ 2016-10-18 13:53                     ` Dan Williams
  2016-10-19  6:42                       ` Dave Airlie
  0 siblings, 1 reply; 16+ messages in thread
From: Dan Williams @ 2016-10-18 13:53 UTC (permalink / raw)
  To: Dave Airlie; +Cc: dri-devel, Michel Dänzer, Shawn Starr, amd-gfx

On Mon, Oct 17, 2016 at 8:48 PM, Dave Airlie <airlied@gmail.com> wrote:
[..]
>>> Aren't there only 2 possibilities for this regression?
>>>
>>> 1/ a memtype entry was never made so track_pfn_insert() returns an
>>> uncached mapping
>>>
>>> 2/ a conflicting memtype entry exists and undefined behavior due to
>>> mixed mapping types is avoided with the change.
>>
>> 3/ The CPU usage through this path goes up, and slows things down,
>> though I suspect it's more an uncached mapping showing up
>> when we don't expect it.
>
> It's looking like number 1, there is no mapping, now we get uncached
> where we used to get write through.
>
> difference in page prot 7f7bbc0e0000, pfn 20000000000e71e4,
> 8000000000000037, 800000000000002f
>
> 0x2f is the vma pg prot which has PWT set in it, 0x37 is the returned
> pgprot which lacks that bit.
>
> not sure where to go from here, suggestions?

If the driver established an ioremap_wt() across the range, or just
called reserve_memtype() directly that should restore WT mappings.
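
To make case 1/ and the proposed fix concrete, here is a toy userspace
model, not the kernel implementation. A flat array stands in for the
memtype tracking tree, and every name in it (toy_memtype,
toy_lookup_memtype, toy_track_insert) is hypothetical. It only assumes
what was said earlier in the thread: with no entry covering the address
the lookup yields an uncached mode, and once a range is reserved the
reserved mode comes back instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy cache modes; the names are illustrative, not the kernel's enum. */
enum toy_cache_mode { TOY_WB, TOY_WC, TOY_UC_MINUS, TOY_WT };

/* One reserved physical range [start, end) and its cache mode. */
struct toy_memtype {
    uint64_t start;
    uint64_t end;
    enum toy_cache_mode mode;
};

/* Linear search standing in for the real tracking tree lookup. */
static enum toy_cache_mode
toy_lookup_memtype(const struct toy_memtype *tab, size_t n, uint64_t paddr)
{
    for (size_t i = 0; i < n; i++) {
        if (paddr >= tab[i].start && paddr < tab[i].end)
            return tab[i].mode;
    }
    /* Assumption from the thread: no entry means an uncached mapping. */
    return TOY_UC_MINUS;
}

/*
 * Model of the check added to the insert path: the mode the caller
 * asked for is ignored and replaced by whatever the table records.
 */
static enum toy_cache_mode
toy_track_insert(const struct toy_memtype *tab, size_t n, uint64_t paddr,
                 enum toy_cache_mode requested)
{
    (void)requested;
    return toy_lookup_memtype(tab, n, paddr);
}
```

With an empty table, a request for TOY_WT at any address comes back as
TOY_UC_MINUS, which is the regression as described. Adding one entry
that covers the address (the toy analogue of reserving the range up
front) makes the same request come back as TOY_WT.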

Although Daniel's suggestion to use the i915 mapping helpers sounds
like it avoids problem 3/ as well.

* Re: mm: fix cache mode tracking in vm_insert_mixed() breaks AMDGPU [was: Re: Latest testing with drm-next-4.9-wip and latest LLVM/mesa stack - Regression in PowerPlay/DPM on CIK?]
  2016-10-18 13:53                     ` Dan Williams
@ 2016-10-19  6:42                       ` Dave Airlie
  2016-10-19 10:33                         ` Marek Olšák
  0 siblings, 1 reply; 16+ messages in thread
From: Dave Airlie @ 2016-10-19  6:42 UTC (permalink / raw)
  To: Dan Williams; +Cc: dri-devel, Michel Dänzer, Shawn Starr, amd-gfx

On 18 October 2016 at 23:53, Dan Williams <dan.j.williams@intel.com> wrote:
> On Mon, Oct 17, 2016 at 8:48 PM, Dave Airlie <airlied@gmail.com> wrote:
> [..]
>>>> Aren't there only 2 possibilities for this regression?
>>>>
>>>> 1/ a memtype entry was never made so track_pfn_insert() returns an
>>>> uncached mapping
>>>>
>>>> 2/ a conflicting memtype entry exists and undefined behavior due to
>>>> mixed mapping types is avoided with the change.
>>>
>>> 3/ The CPU usage through this path goes up, and slows things down,
>>> though I suspect it's more an uncached mapping showing up
>>> when we don't expect it.
>>
>> It's looking like number 1, there is no mapping, now we get uncached
>> where we used to get write through.
>>
>> difference in page prot 7f7bbc0e0000, pfn 20000000000e71e4,
>> 8000000000000037, 800000000000002f
>>
>> 0x2f is the vma pg prot which has PWT set in it, 0x37 is the returned
>> pgprot which lacks that bit.
>>
>> not sure where to go from here, suggestions?
>
> If the driver established an ioremap_wt() across the range, or just
> called reserve_memtype() directly that should restore WT mappings.
>
> Although Daniel's suggestion to use the i915 mapping helpers sounds
> like it avoids problem 3/ as well.

Well we shouldn't be doing that many VRAM mappings on the CPU so
I doubt we'll hit the overheads here that often.

Ideally we'd always use DMA to move stuff in/out of VRAM, but there
are some places where we still do WC VRAM writes for uploads.

So I've sent the patches; any major opinions on them? We can't just
ioremap_wc the whole BAR, as on 32-bit that just messes things up,
and it's unnecessary anyway.

Dave.

* Re: mm: fix cache mode tracking in vm_insert_mixed() breaks AMDGPU [was: Re: Latest testing with drm-next-4.9-wip and latest LLVM/mesa stack - Regression in PowerPlay/DPM on CIK?]
  2016-10-19  6:42                       ` Dave Airlie
@ 2016-10-19 10:33                         ` Marek Olšák
  2016-10-20  1:11                           ` Michel Dänzer
  0 siblings, 1 reply; 16+ messages in thread
From: Marek Olšák @ 2016-10-19 10:33 UTC (permalink / raw)
  To: Dave Airlie
  Cc: dri-devel, Dan Williams, Shawn Starr, Michel Dänzer, amd-gfx

On Wed, Oct 19, 2016 at 8:42 AM, Dave Airlie <airlied@gmail.com> wrote:
> On 18 October 2016 at 23:53, Dan Williams <dan.j.williams@intel.com> wrote:
>> On Mon, Oct 17, 2016 at 8:48 PM, Dave Airlie <airlied@gmail.com> wrote:
>> [..]
>>>>> Aren't there only 2 possibilities for this regression?
>>>>>
>>>>> 1/ a memtype entry was never made so track_pfn_insert() returns an
>>>>> uncached mapping
>>>>>
>>>>> 2/ a conflicting memtype entry exists and undefined behavior due to
>>>>> mixed mapping types is avoided with the change.
>>>>
>>>> 3/ The CPU usage through this path goes up, and slows things down,
>>>> though I suspect it's more an uncached mapping showing up
>>>> when we don't expect it.
>>>
>>> It's looking like number 1, there is no mapping, now we get uncached
>>> where we used to get write through.
>>>
>>> difference in page prot 7f7bbc0e0000, pfn 20000000000e71e4,
>>> 8000000000000037, 800000000000002f
>>>
>>> 0x2f is the vma pg prot which has PWT set in it, 0x37 is the returned
>>> pgprot which lacks that bit.
>>>
>>> not sure where to go from here, suggestions?
>>
>> If the driver established an ioremap_wt() across the range, or just
>> called reserve_memtype() directly that should restore WT mappings.
>>
>> Although Daniel's suggestion to use the i915 mapping helpers sounds
>> like it avoids problem 3/ as well.
>
> Well we shouldn't be doing that many VRAM mappings on the CPU so
> I doubt we'll hit the overheads here that often.
>
> Ideally we'd always use DMA to move stuff in/out of VRAM, but there
> are some places where we still do WC VRAM writes for uploads.

WC VRAM for uploads is better than WC GART IMO.

Marek

* Re: mm: fix cache mode tracking in vm_insert_mixed() breaks AMDGPU [was: Re: Latest testing with drm-next-4.9-wip and latest LLVM/mesa stack - Regression in PowerPlay/DPM on CIK?]
  2016-10-19 10:33                         ` Marek Olšák
@ 2016-10-20  1:11                           ` Michel Dänzer
       [not found]                             ` <2ebd438d-21e6-cee8-3062-0ef84ab6c347-otUistvHUpPR7s880joybQ@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Michel Dänzer @ 2016-10-20  1:11 UTC (permalink / raw)
  To: Marek Olšák, Dave Airlie; +Cc: Dan Williams, amd-gfx, dri-devel

On 19/10/16 07:33 PM, Marek Olšák wrote:
> On Wed, Oct 19, 2016 at 8:42 AM, Dave Airlie <airlied@gmail.com> wrote:
>> On 18 October 2016 at 23:53, Dan Williams <dan.j.williams@intel.com> wrote:
>>> On Mon, Oct 17, 2016 at 8:48 PM, Dave Airlie <airlied@gmail.com> wrote:
>>> [..]
>>>>>> Aren't there only 2 possibilities for this regression?
>>>>>>
>>>>>> 1/ a memtype entry was never made so track_pfn_insert() returns an
>>>>>> uncached mapping
>>>>>>
>>>>>> 2/ a conflicting memtype entry exists and undefined behavior due to
>>>>>> mixed mapping types is avoided with the change.
>>>>>
>>>>> 3/ The CPU usage through this path goes up, and slows things down,
>>>>> though I suspect it's more an uncached mapping showing up
>>>>> when we don't expect it.
>>>>
>>>> It's looking like number 1, there is no mapping, now we get uncached
>>>> where we used to get write through.
>>>>
>>>> difference in page prot 7f7bbc0e0000, pfn 20000000000e71e4,
>>>> 8000000000000037, 800000000000002f
>>>>
>>>> 0x2f is the vma pg prot which has PWT set in it, 0x37 is the returned
>>>> pgprot which lacks that bit.
>>>>
>>>> not sure where to go from here, suggestions?
>>>
>>> If the driver established an ioremap_wt() across the range, or just
>>> called reserve_memtype() directly that should restore WT mappings.
>>>
>>> Although Daniel's suggestion to use the i915 mapping helpers sounds
>>> like it avoids problem 3/ as well.
>>
>> Well we shouldn't be doing that many VRAM mappings on the CPU so
>> I doubt we'll hit the overheads here that often.
>>
>> Ideally we'd always use DMA to move stuff in/out of VRAM, but there
>> are some places where we still do WC VRAM writes for uploads.
> 
> WC VRAM for uploads is better than WC GART IMO.

It's not a simple choice I'm afraid. While writing directly to WC VRAM
can be faster than writing to WC GART and then DMA'ing to VRAM, doing so
increases pressure on the first 256MB of VRAM. That's why I disabled
direct VRAM writes for streaming uploads again in
https://cgit.freedesktop.org/mesa/mesa/commit/?id=7b4276d7acf2e0f77044cb50caa6ad936fa78786
. It's possible that something has changed since then though, feel free
to play with enabling it again.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

* Re: mm: fix cache mode tracking in vm_insert_mixed() breaks AMDGPU [was: Re: Latest testing with drm-next-4.9-wip and latest LLVM/mesa stack - Regression in PowerPlay/DPM on CIK?]
       [not found]                             ` <2ebd438d-21e6-cee8-3062-0ef84ab6c347-otUistvHUpPR7s880joybQ@public.gmane.org>
@ 2016-10-20  9:06                               ` Marek Olšák
  0 siblings, 0 replies; 16+ messages in thread
From: Marek Olšák @ 2016-10-20  9:06 UTC (permalink / raw)
  To: Michel Dänzer
  Cc: Dan Williams, Dave Airlie,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On Thu, Oct 20, 2016 at 3:11 AM, Michel Dänzer <michel@daenzer.net> wrote:
> On 19/10/16 07:33 PM, Marek Olšák wrote:
>> On Wed, Oct 19, 2016 at 8:42 AM, Dave Airlie <airlied@gmail.com> wrote:
>>> On 18 October 2016 at 23:53, Dan Williams <dan.j.williams@intel.com> wrote:
>>>> On Mon, Oct 17, 2016 at 8:48 PM, Dave Airlie <airlied@gmail.com> wrote:
>>>> [..]
>>>>>>> Aren't there only 2 possibilities for this regression?
>>>>>>>
>>>>>>> 1/ a memtype entry was never made so track_pfn_insert() returns an
>>>>>>> uncached mapping
>>>>>>>
>>>>>>> 2/ a conflicting memtype entry exists and undefined behavior due to
>>>>>>> mixed mapping types is avoided with the change.
>>>>>>
>>>>>> 3/ The CPU usage through this path goes up, and slows things down,
>>>>>> though I suspect it's more an uncached mapping showing up
>>>>>> when we don't expect it.
>>>>>
>>>>> It's looking like number 1, there is no mapping, now we get uncached
>>>>> where we used to get write through.
>>>>>
>>>>> difference in page prot 7f7bbc0e0000, pfn 20000000000e71e4,
>>>>> 8000000000000037, 800000000000002f
>>>>>
>>>>> 0x2f is the vma pg prot which has PWT set in it, 0x37 is the returned
>>>>> pgprot which lacks that bit.
>>>>>
>>>>> not sure where to go from here, suggestions?
>>>>
>>>> If the driver established an ioremap_wt() across the range, or just
>>>> called reserve_memtype() directly that should restore WT mappings.
>>>>
>>>> Although Daniel's suggestion to use the i915 mapping helpers sounds
>>>> like it avoids problem 3/ as well.
>>>
>>> Well we shouldn't be doing that many VRAM mappings on the CPU so
>>> I doubt we'll hit the overheads here that often.
>>>
>>> Ideally we'd always use DMA to move stuff in/out of VRAM, but there
>>> are some places where we still do WC VRAM writes for uploads.
>>
>> WC VRAM for uploads is better than WC GART IMO.
>
> It's not a simple choice I'm afraid. While writing directly to WC VRAM
> can be faster than writing to WC GART and then DMA'ing to VRAM, doing so
> increases pressure on the first 256MB of VRAM. That's why I disabled
> direct VRAM writes for streaming uploads again in
> https://cgit.freedesktop.org/mesa/mesa/commit/?id=7b4276d7acf2e0f77044cb50caa6ad936fa78786
> . It's possible that something has changed since then though, feel free
> to play with enabling it again.

amdgpu should handle any memory pressure gracefully. radeon is not so
robust though.

Marek
