dri-devel.lists.freedesktop.org archive mirror
* nouveau. swiotlb: coherent allocation failed for device 0000:01:00.0 size=2097152
From: Mike Galbraith @ 2017-12-18 18:06 UTC
  To: LKML; +Cc: nouveau, Ben Skeggs, dri-devel

Greetings,

Kernel-bound workloads seem to trigger the splat below for whatever reason.
I only see this when beating up NFS.  There was a kworker wakeup
latency issue, but with a band-aid applied to fix that up, I can still
trigger this.

[ 1313.811031] nouveau 0000:01:00.0: swiotlb buffer is full (sz: 2097152 bytes)
[ 1313.811035] swiotlb: coherent allocation failed for device 0000:01:00.0 size=2097152
[ 1313.811038] CPU: 6 PID: 3026 Comm: Xorg Tainted: G            E    4.15.0.g1291a0d5-master #355
[ 1313.811040] Hardware name: MEDION MS-7848/MS-7848, BIOS M7848W08.20C 09/23/2013
[ 1313.811041] Call Trace:
[ 1313.811049]  dump_stack+0x7c/0xb6
[ 1313.811053]  swiotlb_alloc_coherent+0x13f/0x150
[ 1313.811060]  ttm_dma_pool_alloc_new_pages+0x106/0x3c0 [ttm]
[ 1313.811066]  ttm_dma_pool_get_pages+0x10a/0x1e0 [ttm]
[ 1313.811070]  ttm_dma_populate+0x21f/0x2f0 [ttm]
[ 1313.811075]  ttm_tt_bind+0x2f/0x60 [ttm]
[ 1313.811079]  ttm_bo_handle_move_mem+0x51f/0x580 [ttm]
[ 1313.811084]  ? ttm_bo_handle_move_mem+0x5/0x580 [ttm]
[ 1313.811088]  ttm_bo_validate+0x10c/0x120 [ttm]
[ 1313.811092]  ? ttm_bo_validate+0x5/0x120 [ttm]
[ 1313.811106]  ? drm_mode_setcrtc+0x20e/0x540 [drm]
[ 1313.811109]  ttm_bo_init_reserved+0x290/0x490 [ttm]
[ 1313.811114]  ttm_bo_init+0x52/0xb0 [ttm]
[ 1313.811141]  ? nv10_bo_put_tile_region+0x60/0x60 [nouveau]
[ 1313.811163]  nouveau_bo_new+0x465/0x5e0 [nouveau]
[ 1313.811184]  ? nv10_bo_put_tile_region+0x60/0x60 [nouveau]
[ 1313.811203]  nouveau_gem_new+0x66/0x110 [nouveau]
[ 1313.811223]  ? nouveau_gem_new+0x110/0x110 [nouveau]
[ 1313.811241]  nouveau_gem_ioctl_new+0x48/0xc0 [nouveau]
[ 1313.811249]  drm_ioctl_kernel+0x64/0xb0 [drm]
[ 1313.811257]  drm_ioctl+0x2a4/0x360 [drm]
[ 1313.811276]  ? nouveau_gem_new+0x110/0x110 [nouveau]
[ 1313.811285]  ? drm_ioctl+0x5/0x360 [drm]
[ 1313.811304]  nouveau_drm_ioctl+0x50/0xb0 [nouveau]
[ 1313.811308]  do_vfs_ioctl+0x90/0x690
[ 1313.811311]  ? do_vfs_ioctl+0x5/0x690
[ 1313.811313]  SyS_ioctl+0x3b/0x70
[ 1313.811316]  entry_SYSCALL_64_fastpath+0x1f/0x91
[ 1313.811320] RIP: 0033:0x7f3234746227
[ 1313.811321] RSP: 002b:00007ffc3ace0408 EFLAGS: 00003246 ORIG_RAX: 0000000000000010
[ 1313.811324] RAX: ffffffffffffffda RBX: 00000000025515d0 RCX: 00007f3234746227
[ 1313.811325] RDX: 00007ffc3ace0460 RSI: 00000000c0306480 RDI: 000000000000000b
[ 1313.811326] RBP: 0000000000824120 R08: 0000000002548f80 R09: 00000000025490d0
[ 1313.811328] R10: 0000000000000000 R11: 0000000000003246 R12: 000000000000093d
[ 1313.811329] R13: 0000000002aff74c R14: 0000000000824150 R15: 0000000000000000
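
For readers decoding the register dump: ORIG_RAX 0x10 is the ioctl
system call on x86-64, and RSI carries the ioctl request number.  A
minimal sketch of pulling that number apart, assuming the generic Linux
ioctl bit layout (8 bits number, 8 bits type, 14 bits size, 2 bits
direction).  The helper name decode_ioctl is made up for illustration.

```python
# Field widths of an ioctl request number in the generic Linux layout
# (assumption: this is the layout in use on the x86-64 box in the log).
NR_BITS, TYPE_BITS, SIZE_BITS, DIR_BITS = 8, 8, 14, 2

NR_SHIFT = 0
TYPE_SHIFT = NR_SHIFT + NR_BITS
SIZE_SHIFT = TYPE_SHIFT + TYPE_BITS
DIR_SHIFT = SIZE_SHIFT + SIZE_BITS

DIR_NAMES = {0: "none", 1: "write", 2: "read", 3: "read|write"}


def decode_ioctl(request):
    """Split an ioctl request number into direction, size, type and number."""
    return {
        "dir": DIR_NAMES[(request >> DIR_SHIFT) & ((1 << DIR_BITS) - 1)],
        "size": (request >> SIZE_SHIFT) & ((1 << SIZE_BITS) - 1),
        "type": chr((request >> TYPE_SHIFT) & ((1 << TYPE_BITS) - 1)),
        "nr": (request >> NR_SHIFT) & ((1 << NR_BITS) - 1),
    }


# RSI value taken from the register dump above.
decoded = decode_ioctl(0xC0306480)
print(decoded)
```

The type character comes out as 'd', the DRM ioctl base, with a 48-byte
read|write argument and number 0x80.  That is consistent with the
nouveau_gem_ioctl_new frame in the trace.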
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


* Re: nouveau. swiotlb: coherent allocation failed for device 0000:01:00.0 size=2097152
From: Tobias Klausmann @ 2017-12-18 19:01 UTC
  To: Mike Galbraith, LKML; +Cc: nouveau, Ben Skeggs, dri-devel


On 12/18/17 7:06 PM, Mike Galbraith wrote:
> Greetings,
>
> Kernel bound workloads seem to trigger the below for whatever reason.
>   I only see this when beating up NFS.  There was a kworker wakeup
> latency issue, but with a bandaid applied to fix that up, I can still
> trigger this.


Hi,

I have seen this one as well on my system, but I could not find an
easy way to trigger it for bisecting purposes. If you can trigger it
conveniently, a bisect would be nice!

Greetings,

Tobias


>
> [ 1313.811031] nouveau 0000:01:00.0: swiotlb buffer is full (sz: 2097152 bytes)
> [ 1313.811035] swiotlb: coherent allocation failed for device 0000:01:00.0 size=2097152
> [ 1313.811038] CPU: 6 PID: 3026 Comm: Xorg Tainted: G            E    4.15.0.g1291a0d5-master #355
> [ 1313.811040] Hardware name: MEDION MS-7848/MS-7848, BIOS M7848W08.20C 09/23/2013
> [ 1313.811041] Call Trace:
> [ 1313.811049]  dump_stack+0x7c/0xb6
> [ 1313.811053]  swiotlb_alloc_coherent+0x13f/0x150
> [ 1313.811060]  ttm_dma_pool_alloc_new_pages+0x106/0x3c0 [ttm]
> [ 1313.811066]  ttm_dma_pool_get_pages+0x10a/0x1e0 [ttm]
> [ 1313.811070]  ttm_dma_populate+0x21f/0x2f0 [ttm]
> [ 1313.811075]  ttm_tt_bind+0x2f/0x60 [ttm]
> [ 1313.811079]  ttm_bo_handle_move_mem+0x51f/0x580 [ttm]
> [ 1313.811084]  ? ttm_bo_handle_move_mem+0x5/0x580 [ttm]
> [ 1313.811088]  ttm_bo_validate+0x10c/0x120 [ttm]
> [ 1313.811092]  ? ttm_bo_validate+0x5/0x120 [ttm]
> [ 1313.811106]  ? drm_mode_setcrtc+0x20e/0x540 [drm]
> [ 1313.811109]  ttm_bo_init_reserved+0x290/0x490 [ttm]
> [ 1313.811114]  ttm_bo_init+0x52/0xb0 [ttm]
> [ 1313.811141]  ? nv10_bo_put_tile_region+0x60/0x60 [nouveau]
> [ 1313.811163]  nouveau_bo_new+0x465/0x5e0 [nouveau]
> [ 1313.811184]  ? nv10_bo_put_tile_region+0x60/0x60 [nouveau]
> [ 1313.811203]  nouveau_gem_new+0x66/0x110 [nouveau]
> [ 1313.811223]  ? nouveau_gem_new+0x110/0x110 [nouveau]
> [ 1313.811241]  nouveau_gem_ioctl_new+0x48/0xc0 [nouveau]
> [ 1313.811249]  drm_ioctl_kernel+0x64/0xb0 [drm]
> [ 1313.811257]  drm_ioctl+0x2a4/0x360 [drm]
> [ 1313.811276]  ? nouveau_gem_new+0x110/0x110 [nouveau]
> [ 1313.811285]  ? drm_ioctl+0x5/0x360 [drm]
> [ 1313.811304]  nouveau_drm_ioctl+0x50/0xb0 [nouveau]
> [ 1313.811308]  do_vfs_ioctl+0x90/0x690
> [ 1313.811311]  ? do_vfs_ioctl+0x5/0x690
> [ 1313.811313]  SyS_ioctl+0x3b/0x70
> [ 1313.811316]  entry_SYSCALL_64_fastpath+0x1f/0x91
> [ 1313.811320] RIP: 0033:0x7f3234746227
> [ 1313.811321] RSP: 002b:00007ffc3ace0408 EFLAGS: 00003246 ORIG_RAX: 0000000000000010
> [ 1313.811324] RAX: ffffffffffffffda RBX: 00000000025515d0 RCX: 00007f3234746227
> [ 1313.811325] RDX: 00007ffc3ace0460 RSI: 00000000c0306480 RDI: 000000000000000b
> [ 1313.811326] RBP: 0000000000824120 R08: 0000000002548f80 R09: 00000000025490d0
> [ 1313.811328] R10: 0000000000000000 R11: 0000000000003246 R12: 000000000000093d
> [ 1313.811329] R13: 0000000002aff74c R14: 0000000000824150 R15: 0000000000000000

* Re: nouveau. swiotlb: coherent allocation failed for device 0000:01:00.0 size=2097152
From: Mike Galbraith @ 2017-12-18 19:12 UTC
  To: Tobias Klausmann, LKML; +Cc: nouveau, Ben Skeggs, dri-devel

On Mon, 2017-12-18 at 20:01 +0100, Tobias Klausmann wrote:
> On 12/18/17 7:06 PM, Mike Galbraith wrote:
> > Greetings,
> >
> > Kernel bound workloads seem to trigger the below for whatever reason.
> >   I only see this when beating up NFS.  There was a kworker wakeup
> > latency issue, but with a bandaid applied to fix that up, I can still
> > trigger this.
> 
> 
> Hi,
> 
> i have seen this one as well with my system, but i could not find an 
> easy way to trigger it for bisecting purpose. If you can trigger it 
> conveniently, a bisect would be nice!

Workload permitting.  To reproduce, mount your box via NFS, cd to somewhere
in the NFS mount, and just do bonnie -s <memory size>.  There, maybe
you'll beat me to it.  I hope so, I have multiple kernels doing the
annoying "baby birds in a nest" thing at me literally endlessly :)

	-Mike

* Re: nouveau. swiotlb: coherent allocation failed for device 0000:01:00.0 size=2097152
From: Michel Dänzer @ 2017-12-19 10:37 UTC
  To: Tobias Klausmann, Mike Galbraith
  Cc: nouveau, Christian König, LKML, dri-devel, Ben Skeggs

On 2017-12-18 08:01 PM, Tobias Klausmann wrote:
> On 12/18/17 7:06 PM, Mike Galbraith wrote:
>> Greetings,
>>
>> Kernel bound workloads seem to trigger the below for whatever reason.
>>   I only see this when beating up NFS.  There was a kworker wakeup
>> latency issue, but with a bandaid applied to fix that up, I can still
>> trigger this.
> 
> 
> Hi,
> 
> i have seen this one as well with my system, but i could not find an
> easy way to trigger it for bisecting purpose. If you can trigger it
> conveniently, a bisect would be nice!

I'm seeing this (with the amdgpu and radeon drivers) when restic takes a
backup, creating memory pressure. I happen to have just finished
bisecting; the result is:

648bc3574716400acc06f99915815f80d9563783 is the first bad commit
commit 648bc3574716400acc06f99915815f80d9563783
Author: Christian König <christian.koenig@amd.com>
Date:   Thu Jul 6 09:59:43 2017 +0200

    drm/ttm: add transparent huge page support for DMA allocations v2

    Try to allocate huge pages when it makes sense.

    v2: fix comment and use ifdef
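
The failing size in the log lines up with that commit.  A rough check,
assuming 4 KiB base pages: 2097152 bytes is 512 such pages, i.e. an
order-9 allocation, which is the 2 MiB transparent huge page size on
x86-64.  The helper name pages_and_order is hypothetical.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB base pages


def pages_and_order(size_bytes, page_size=PAGE_SIZE):
    """Return (number of base pages, power-of-two order) covering size_bytes."""
    pages = -(-size_bytes // page_size)   # ceiling division
    order = (pages - 1).bit_length()      # smallest n with 2**n >= pages
    return pages, order


# size= value from the "coherent allocation failed" message.
pages, order = pages_and_order(2097152)
print(pages, order)
```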


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

* Re: nouveau. swiotlb: coherent allocation failed for device 0000:01:00.0 size=2097152
From: Michel Dänzer @ 2017-12-19 10:39 UTC
  To: Tobias Klausmann, Mike Galbraith
  Cc: nouveau, Ben Skeggs, LKML, dri-devel, Christian König

On 2017-12-19 11:37 AM, Michel Dänzer wrote:
> On 2017-12-18 08:01 PM, Tobias Klausmann wrote:
>> On 12/18/17 7:06 PM, Mike Galbraith wrote:
>>> Greetings,
>>>
>>> Kernel bound workloads seem to trigger the below for whatever reason.
>>>   I only see this when beating up NFS.  There was a kworker wakeup
>>> latency issue, but with a bandaid applied to fix that up, I can still
>>> trigger this.
>>
>>
>> Hi,
>>
>> i have seen this one as well with my system, but i could not find an
>> easy way to trigger it for bisecting purpose. If you can trigger it
>> conveniently, a bisect would be nice!
> 
> I'm seeing this (with the amdgpu and radeon drivers) when restic takes a
> backup, creating memory pressure. I happen to have just finished
> bisecting, the result is:
> 
> 648bc3574716400acc06f99915815f80d9563783 is the first bad commit
> commit 648bc3574716400acc06f99915815f80d9563783
> Author: Christian König <christian.koenig@amd.com>
> Date:   Thu Jul 6 09:59:43 2017 +0200
> 
>     drm/ttm: add transparent huge page support for DMA allocations v2
> 
>     Try to allocate huge pages when it makes sense.
> 
>     v2: fix comment and use ifdef
> 
> 

BTW, I haven't noticed any bad effects other than the dmesg splats, so
maybe it's just noise about transient failures for which there is a
proper fallback in place.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

* Re: nouveau. swiotlb: coherent allocation failed for device 0000:01:00.0 size=2097152
From: Christian König @ 2017-12-19 13:45 UTC
  To: Michel Dänzer, Tobias Klausmann, Mike Galbraith
  Cc: nouveau, LKML, dri-devel, Ben Skeggs

On 19.12.2017 at 11:39, Michel Dänzer wrote:
> On 2017-12-19 11:37 AM, Michel Dänzer wrote:
>> On 2017-12-18 08:01 PM, Tobias Klausmann wrote:
>>> On 12/18/17 7:06 PM, Mike Galbraith wrote:
>>>> Greetings,
>>>>
>>>> Kernel bound workloads seem to trigger the below for whatever reason.
>>>>    I only see this when beating up NFS.  There was a kworker wakeup
>>>> latency issue, but with a bandaid applied to fix that up, I can still
>>>> trigger this.
>>>
>>> Hi,
>>>
>>> i have seen this one as well with my system, but i could not find an
>>> easy way to trigger it for bisecting purpose. If you can trigger it
>>> conveniently, a bisect would be nice!
>> I'm seeing this (with the amdgpu and radeon drivers) when restic takes a
>> backup, creating memory pressure. I happen to have just finished
>> bisecting, the result is:
>>
>> 648bc3574716400acc06f99915815f80d9563783 is the first bad commit
>> commit 648bc3574716400acc06f99915815f80d9563783
>> Author: Christian König <christian.koenig@amd.com>
>> Date:   Thu Jul 6 09:59:43 2017 +0200
>>
>>      drm/ttm: add transparent huge page support for DMA allocations v2
>>
>>      Try to allocate huge pages when it makes sense.
>>
>>      v2: fix comment and use ifdef
>>
>>
> BTW, I haven't noticed any bad effects other than the dmesg splats, so
> maybe it's just noise about transient failures for which there is a
> proper fallback in place.

Yeah, I think that is exactly what happens here.

We try to allocate a huge page, but fail and so fall back to using 
multiple 4k pages instead.

Going to send out a patch to suppress the warning.

Thanks for bisecting this,
Christian.
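
The behaviour described above (try a huge page, fall back to 4k pages,
keep quiet about the expected failure) can be illustrated with a toy
model.  This is only a sketch and not the TTM or swiotlb code:
ToyAllocator, populate and the warn flag are hypothetical names, and the
"allocator" is simply a cap on how large a contiguous chunk it will
hand out.

```python
HUGE_PAGE = 2 * 1024 * 1024   # 2 MiB, the size seen in the log
SMALL_PAGE = 4096             # the 4k fallback size


class ToyAllocator:
    """Hypothetical allocator that refuses chunks above max_contiguous."""

    def __init__(self, max_contiguous):
        self.max_contiguous = max_contiguous
        self.log = []  # stands in for dmesg

    def alloc(self, size, warn=True):
        if size > self.max_contiguous:
            if warn:
                self.log.append(f"coherent allocation failed size={size}")
            return None
        return bytearray(size)


def populate(allocator, total_size):
    """Fill total_size bytes, preferring huge pages, falling back to 4k."""
    chunks = []
    remaining = total_size
    # Opportunistic attempt: failure is expected under pressure, so no warning.
    while remaining >= HUGE_PAGE:
        chunk = allocator.alloc(HUGE_PAGE, warn=False)
        if chunk is None:
            break
        chunks.append(chunk)
        remaining -= HUGE_PAGE
    # Fallback path: a failure here is a real error and is reported.
    while remaining > 0:
        chunk = allocator.alloc(SMALL_PAGE)
        if chunk is None:
            raise MemoryError("out of memory even for 4k pages")
        chunks.append(chunk)
        remaining -= SMALL_PAGE
    return chunks


fragmented = ToyAllocator(max_contiguous=64 * 1024)
chunks = populate(fragmented, HUGE_PAGE)
print(len(chunks), fragmented.log)
```

With the cap at 64 KiB the huge attempt fails silently and the request
is satisfied by 512 small chunks, leaving the log empty.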
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


* Re: nouveau. swiotlb: coherent allocation failed for device 0000:01:00.0 size=2097152
From: Ilia Mirkin @ 2017-12-31 18:27 UTC
  To: Christian König
  Cc: nouveau, Michel Dänzer, Mike Galbraith, LKML, dri-devel, Ben Skeggs

On Tue, Dec 19, 2017 at 8:45 AM, Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
> On 19.12.2017 at 11:39, Michel Dänzer wrote:
>>
>> On 2017-12-19 11:37 AM, Michel Dänzer wrote:
>>>
>>> On 2017-12-18 08:01 PM, Tobias Klausmann wrote:
>>>>
>>>> On 12/18/17 7:06 PM, Mike Galbraith wrote:
>>>>>
>>>>> Greetings,
>>>>>
>>>>> Kernel bound workloads seem to trigger the below for whatever reason.
>>>>>    I only see this when beating up NFS.  There was a kworker wakeup
>>>>> latency issue, but with a bandaid applied to fix that up, I can still
>>>>> trigger this.
>>>>
>>>>
>>>> Hi,
>>>>
>>>> i have seen this one as well with my system, but i could not find an
>>>> easy way to trigger it for bisecting purpose. If you can trigger it
>>>> conveniently, a bisect would be nice!
>>>
>>> I'm seeing this (with the amdgpu and radeon drivers) when restic takes a
>>> backup, creating memory pressure. I happen to have just finished
>>> bisecting, the result is:
>>>
>>> 648bc3574716400acc06f99915815f80d9563783 is the first bad commit
>>> commit 648bc3574716400acc06f99915815f80d9563783
>>> Author: Christian König <christian.koenig@amd.com>
>>> Date:   Thu Jul 6 09:59:43 2017 +0200
>>>
>>>      drm/ttm: add transparent huge page support for DMA allocations v2
>>>
>>>      Try to allocate huge pages when it makes sense.
>>>
>>>      v2: fix comment and use ifdef
>>>
>>>
>> BTW, I haven't noticed any bad effects other than the dmesg splats, so
>> maybe it's just noise about transient failures for which there is a
>> proper fallback in place.
>
>
> Yeah, I think that is exactly what happens here.
>
> We try to allocate a huge page, but fail and so fall back to using multiple
> 4k pages instead.
>
> Going to send out a patch to suppress the warning.

Hi Christian,

Did you ever send out such a patch? I didn't see one on the list, but
perhaps I missed it. One definitely hasn't made it upstream yet. (I
just hit the issue myself with Linus's tree from last night.)

Thanks,

  -ilia

* Re: nouveau. swiotlb: coherent allocation failed for device 0000:01:00.0 size=2097152
From: Mike Galbraith @ 2017-12-31 20:53 UTC
  To: Ilia Mirkin, Christian König
  Cc: nouveau, Michel Dänzer, LKML, dri-devel, Ben Skeggs

On Sun, 2017-12-31 at 13:27 -0500, Ilia Mirkin wrote:
> On Tue, Dec 19, 2017 at 8:45 AM, Christian König
> <ckoenig.leichtzumerken@gmail.com> wrote:
> > > On 19.12.2017 at 11:39, Michel Dänzer wrote:
> >>
> >> On 2017-12-19 11:37 AM, Michel Dänzer wrote:
> >>>
> >>> On 2017-12-18 08:01 PM, Tobias Klausmann wrote:
> >>>>
> >>>> On 12/18/17 7:06 PM, Mike Galbraith wrote:
> >>>>>
> >>>>> Greetings,
> >>>>>
> >>>>> Kernel bound workloads seem to trigger the below for whatever reason.
> >>>>>    I only see this when beating up NFS.  There was a kworker wakeup
> >>>>> latency issue, but with a bandaid applied to fix that up, I can still
> >>>>> trigger this.
> >>>>
> >>>>
> >>>> Hi,
> >>>>
> >>>> i have seen this one as well with my system, but i could not find an
> >>>> easy way to trigger it for bisecting purpose. If you can trigger it
> >>>> conveniently, a bisect would be nice!
> >>>
> >>> I'm seeing this (with the amdgpu and radeon drivers) when restic takes a
> >>> backup, creating memory pressure. I happen to have just finished
> >>> bisecting, the result is:
> >>>
> >>> 648bc3574716400acc06f99915815f80d9563783 is the first bad commit
> >>> commit 648bc3574716400acc06f99915815f80d9563783
> >>> Author: Christian König <christian.koenig@amd.com>
> >>> Date:   Thu Jul 6 09:59:43 2017 +0200
> >>>
> >>>      drm/ttm: add transparent huge page support for DMA allocations v2
> >>>
> >>>      Try to allocate huge pages when it makes sense.
> >>>
> >>>      v2: fix comment and use ifdef
> >>>
> >>>
> >> BTW, I haven't noticed any bad effects other than the dmesg splats, so
> >> maybe it's just noise about transient failures for which there is a
> >> proper fallback in place.
> >
> >
> > Yeah, I think that is exactly what happens here.
> >
> > We try to allocate a huge page, but fail and so fall back to using multiple
> > 4k pages instead.
> >
> > Going to send out a patch to suppress the warning.
> 
> Hi Christian,
> 
> Did you ever send out such a patch? I didn't see one on the list, but
> perhaps I missed it. One definitely hasn't made it upstream yet. (I
> just hit the issue myself with Linus's tree from last night.)

Actually, that wants a bit more methinks, because while the stack dump
goes away, you still get spammed; it just comes in smaller chunks.

	-Mike

* Re: nouveau. swiotlb: coherent allocation failed for device 0000:01:00.0 size=2097152
From: Ilia Mirkin @ 2018-01-01 18:08 UTC
  To: Mike Galbraith
  Cc: nouveau, Michel Dänzer, LKML, dri-devel, Ben Skeggs,
	Christian König

On Sun, Dec 31, 2017 at 3:53 PM, Mike Galbraith <efault@gmx.de> wrote:
> On Sun, 2017-12-31 at 13:27 -0500, Ilia Mirkin wrote:
>> On Tue, Dec 19, 2017 at 8:45 AM, Christian König
>> <ckoenig.leichtzumerken@gmail.com> wrote:
>> > On 19.12.2017 at 11:39, Michel Dänzer wrote:
>> >>
>> >> On 2017-12-19 11:37 AM, Michel Dänzer wrote:
>> >>>
>> >>> On 2017-12-18 08:01 PM, Tobias Klausmann wrote:
>> >>>>
>> >>>> On 12/18/17 7:06 PM, Mike Galbraith wrote:
>> >>>>>
>> >>>>> Greetings,
>> >>>>>
>> >>>>> Kernel bound workloads seem to trigger the below for whatever reason.
>> >>>>>    I only see this when beating up NFS.  There was a kworker wakeup
>> >>>>> latency issue, but with a bandaid applied to fix that up, I can still
>> >>>>> trigger this.
>> >>>>
>> >>>>
>> >>>> Hi,
>> >>>>
>> >>>> i have seen this one as well with my system, but i could not find an
>> >>>> easy way to trigger it for bisecting purpose. If you can trigger it
>> >>>> conveniently, a bisect would be nice!
>> >>>
>> >>> I'm seeing this (with the amdgpu and radeon drivers) when restic takes a
>> >>> backup, creating memory pressure. I happen to have just finished
>> >>> bisecting, the result is:
>> >>>
>> >>> 648bc3574716400acc06f99915815f80d9563783 is the first bad commit
>> >>> commit 648bc3574716400acc06f99915815f80d9563783
>> >>> Author: Christian König <christian.koenig@amd.com>
>> >>> Date:   Thu Jul 6 09:59:43 2017 +0200
>> >>>
>> >>>      drm/ttm: add transparent huge page support for DMA allocations v2
>> >>>
>> >>>      Try to allocate huge pages when it makes sense.
>> >>>
>> >>>      v2: fix comment and use ifdef
>> >>>
>> >>>
>> >> BTW, I haven't noticed any bad effects other than the dmesg splats, so
>> >> maybe it's just noise about transient failures for which there is a
>> >> proper fallback in place.
>> >
>> >
>> > Yeah, I think that is exactly what happens here.
>> >
>> > We try to allocate a huge page, but fail and so fall back to using multiple
>> > 4k pages instead.
>> >
>> > Going to send out a patch to suppress the warning.
>>
>> Hi Christian,
>>
>> Did you ever send out such a patch? I didn't see one on the list, but
>> perhaps I missed it. One definitely hasn't made it upstream yet. (I
>> just hit the issue myself with Linus's tree from last night.)
>
> Actually, that wants a bit more methinks, because while the stack dump
> goes away, you still get spammed, it just comes in smaller chunks.

OK, well this has to either be fixed or reverted. Right now it's
complaining all the time for me after like a day of uptime.

  -ilia

* Re: nouveau. swiotlb: coherent allocation failed for device 0000:01:00.0 size=2097152
From: Christian König @ 2018-01-02  9:43 UTC
  To: Ilia Mirkin, Mike Galbraith
  Cc: nouveau, Michel Dänzer, LKML, dri-devel, Ben Skeggs,
	Tobias Klausmann, Christian König

On 01.01.2018 at 19:08, Ilia Mirkin wrote:
> On Sun, Dec 31, 2017 at 3:53 PM, Mike Galbraith <efault@gmx.de> wrote:
>> On Sun, 2017-12-31 at 13:27 -0500, Ilia Mirkin wrote:
>>> On Tue, Dec 19, 2017 at 8:45 AM, Christian König
>>> <ckoenig.leichtzumerken@gmail.com> wrote:
>>>> On 19.12.2017 at 11:39, Michel Dänzer wrote:
>>>>> On 2017-12-19 11:37 AM, Michel Dänzer wrote:
>>>>>> On 2017-12-18 08:01 PM, Tobias Klausmann wrote:
>>>>>>> On 12/18/17 7:06 PM, Mike Galbraith wrote:
>>>>>>>> Greetings,
>>>>>>>>
>>>>>>>> Kernel bound workloads seem to trigger the below for whatever reason.
>>>>>>>>     I only see this when beating up NFS.  There was a kworker wakeup
>>>>>>>> latency issue, but with a bandaid applied to fix that up, I can still
>>>>>>>> trigger this.
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> i have seen this one as well with my system, but i could not find an
>>>>>>> easy way to trigger it for bisecting purpose. If you can trigger it
>>>>>>> conveniently, a bisect would be nice!
>>>>>> I'm seeing this (with the amdgpu and radeon drivers) when restic takes a
>>>>>> backup, creating memory pressure. I happen to have just finished
>>>>>> bisecting, the result is:
>>>>>>
>>>>>> 648bc3574716400acc06f99915815f80d9563783 is the first bad commit
>>>>>> commit 648bc3574716400acc06f99915815f80d9563783
>>>>>> Author: Christian König <christian.koenig@amd.com>
>>>>>> Date:   Thu Jul 6 09:59:43 2017 +0200
>>>>>>
>>>>>>       drm/ttm: add transparent huge page support for DMA allocations v2
>>>>>>
>>>>>>       Try to allocate huge pages when it makes sense.
>>>>>>
>>>>>>       v2: fix comment and use ifdef
>>>>>>
>>>>>>
>>>>> BTW, I haven't noticed any bad effects other than the dmesg splats, so
>>>>> maybe it's just noise about transient failures for which there is a
>>>>> proper fallback in place.
>>>>
>>>> Yeah, I think that is exactly what happens here.
>>>>
>>>> We try to allocate a huge page, but fail and so fall back to using multiple
>>>> 4k pages instead.
>>>>
>>>> Going to send out a patch to suppress the warning.
>>> Hi Christian,
>>>
>>> Did you ever send out such a patch? I didn't see one on the list, but
>>> perhaps I missed it. One definitely hasn't made it upstream yet. (I
>>> just hit the issue myself with Linus's tree from last night.)
>> Actually, that wants a bit more methinks, because while the stack dump
>> goes away, you still get spammed, it just comes in smaller chunks.
> OK, well this has to either be fixed or reverted. Right now it's
> complaining all the time for me after like a day of uptime.

I've already sent out a patch to Konrad Rzeszutek Wilk, and he wanted to
queue it up.

But there is another warning I'm currently working on; I just didn't have
time for it during the holidays.

Regards,
Christian.
