* Trouble with TTM patches w/nouveau in linux-next
From: Mikko Perttunen @ 2021-06-09 13:47 UTC
  To: christian.koenig; +Cc: matthew.auld, ray.huang, dri-devel, nouveau, linux-tegra

Hi,

Recently, nouveau has been failing to initialize on linux-next on my
Tegra186 Jetson TX2 board. Specifically, it looks like BO allocation is
failing when initializing the sync subsystem:

[   21.858149] nouveau 17000000.gpu: DRM: failed to initialise sync subsystem, -28

I have been bisecting and have found two patches that affect this.
Things first break at

d02117f8efaa drm/ttm: remove special handling for non GEM drivers

which starts returning error code -12. Then, at

d79025c7f5e3 drm/ttm: always initialize the full ttm_resource v2

the error code changes to the above -28.

If I check out the commit just before d79025c7f5e3 and revert d02117f8efaa,
things work again. There are a bunch of other TTM commits between this
and HEAD, so reverting these on top of HEAD doesn't work. However, I
checked that both yesterday's and today's linux-next trees are also broken.
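
The bisection described above amounts to a binary search over a linear
commit history for the first commit at which a test starts failing. A
minimal sketch of that search follows. The commit labels and the is_good
predicate are hypothetical stand-ins; in practice git bisect drives the
search and the test is booting the kernel and checking whether nouveau
initializes.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_good: Callable[[str], bool]) -> str:
    """Return the first commit for which is_good() is False.

    Assumes commits are ordered oldest to newest, that commits[0] is good,
    that commits[-1] is bad, and that every commit after the first bad one
    is bad as well.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):
            lo = mid
        else:
            hi = mid
    return commits[hi]


# Hypothetical history: only the position of the breaking commit matters here.
history = ["c0", "c1", "c2", "c3", "c4", "c5", "c6", "c7"]
broken_from = history.index("c5")
print(first_bad_commit(history, lambda c: history.index(c) < broken_from))
```

Running the same search with a different predicate (for example, "does not
fail with -28" instead of "does not fail at all") is one way a second commit
that changes the error code could be located.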

Thank you,
Mikko


* Re: Trouble with TTM patches w/nouveau in linux-next
From: Christian König @ 2021-06-09 14:17 UTC
  To: Mikko Perttunen; +Cc: matthew.auld, ray.huang, dri-devel, nouveau, linux-tegra

Hi Mikko,

Strange, it sounds like Nouveau was somehow also using the GEM workaround
for VMWGFX.

But -12 means -ENOMEM, which doesn't fit into the picture.
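
For reference, the two codes reported in this thread are negated Linux
errno values. A small sketch that decodes them with Python's standard
errno module follows; the helper name decode is made up for illustration,
and the message text printed by os.strerror depends on the platform.

```python
import errno
import os


def decode(ret: int) -> str:
    """Turn a negative kernel-style return code into 'NAME (message)'."""
    code = -ret
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name} ({os.strerror(code)})"


for ret in (-12, -28):
    print(ret, decode(ret))
```

On Linux, -12 decodes to ENOMEM and -28 to ENOSPC.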

I will try with a G710, but if that doesn't yield anything I will need some
more input from you.

Thanks for the report,
Christian.




* Re: [Nouveau] Trouble with TTM patches w/nouveau in linux-next
From: Ilia Mirkin @ 2021-06-09 14:52 UTC
  To: Christian König
  Cc: Mikko Perttunen, linux-tegra, nouveau, ray.huang, matthew.auld,
	dri-devel

Christian - potentially relevant is that Tegra doesn't have VRAM at
all -- all GTT (or GART or whatever it's called nowadays). No
fake/stolen VRAM.

Cheers,

  -ilia


* Re: [Nouveau] Trouble with TTM patches w/nouveau in linux-next
From: Christian König @ 2021-06-09 14:58 UTC
  To: Ilia Mirkin
  Cc: Mikko Perttunen, linux-tegra, nouveau, ray.huang, matthew.auld,
	dri-devel

Good point, but I think that is unrelated.

My suspicion is rather that nouveau is not initializing the underlying 
GEM object for internal allocations.

So what happens is the same as on VMWGFX: TTM doesn't know anything
about the size of the BO, resulting in a kmalloc() with a random value
and eventually -ENOMEM.
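
As a rough illustration of that failure mode, the toy model below shows how
a size field that was never set can turn into an absurd allocation request
that is rejected with -ENOMEM. Everything in it is hypothetical: toy_kmalloc,
MAX_ALLOC, and pages_array_bytes are invented names, the cap is arbitrary,
and none of it is the actual TTM or nouveau code.

```python
import errno

PAGE_SIZE = 4096          # assumed page size for the illustration
MAX_ALLOC = 4 * 1024**2   # arbitrary cap standing in for the allocator's limit


def toy_kmalloc(nbytes: int) -> int:
    """Pretend allocator: 0 on success, -ENOMEM if the request is too large."""
    return 0 if 0 < nbytes <= MAX_ALLOC else -errno.ENOMEM


def pages_array_bytes(bo_size: int) -> int:
    """Bytes needed for one 8-byte pointer per page of the buffer object."""
    num_pages = (bo_size + PAGE_SIZE - 1) // PAGE_SIZE
    return num_pages * 8


# A properly initialised 64 KiB buffer needs only a tiny bookkeeping array.
print(toy_kmalloc(pages_array_bytes(64 * 1024)))

# A stale, effectively random size asks for far more than the cap allows.
garbage_size = 0xDEADBEEF00000000
print(toy_kmalloc(pages_array_bytes(garbage_size)))
```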

The good news is that I can reproduce it, so I am going to look into it
later today.

Regards,
Christian.



* Re: [Nouveau] Trouble with TTM patches w/nouveau in linux-next
From: Ilia Mirkin @ 2021-06-09 15:13 UTC
  To: Christian König
  Cc: Mikko Perttunen, linux-tegra, nouveau, ray.huang, matthew.auld,
	dri-devel

GEM init happens here:

https://cgit.freedesktop.org/drm/drm/tree/drivers/gpu/drm/nouveau/nouveau_gem.c#n221

Note the bo alloc / gem init / bo init dance.

I don't think there is a GEM object for internal allocations at all --
we just allocate BOs directly and that's it. Perhaps you meant
something else? I thought GEM was meant for externally available
objects.

Cheers,

  -ilia


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Nouveau] Trouble with TTM patches w/nouveau in linux-next
@ 2021-06-09 15:13         ` Ilia Mirkin
  0 siblings, 0 replies; 18+ messages in thread
From: Ilia Mirkin @ 2021-06-09 15:13 UTC (permalink / raw)
  To: Christian König
  Cc: nouveau, dri-devel, ray.huang, matthew.auld, linux-tegra

GEM init happens here:

https://cgit.freedesktop.org/drm/drm/tree/drivers/gpu/drm/nouveau/nouveau_gem.c#n221

Note the bo alloc / gem init / bo init dance.

I don't think there is a GEM object for internal allocations at all --
we just allocate bo's directly and that's it. Perhaps you meant
something else? I thought GEM was meant for externally-available
objects.

Cheers,

  -ilia

On Wed, Jun 9, 2021 at 10:58 AM Christian König
<christian.koenig@amd.com> wrote:
>
> Good point, but I think that is unrelated.
>
> My suspicion is rather that nouveau is not initializing the underlying
> GEM object for internal allocations.
>
> So what happens is the same as on VMWGFX that TTM doesn't know anything
> about the size to of the BO resulting in a kmalloc() with a random value
> and eventually -ENOMEM.
>
> Good news is that I can reproduce it, so going to look into that later
> today.
>
> Regards,
> Christian.
>
> Am 09.06.21 um 16:52 schrieb Ilia Mirkin:
> > Christian - potentially relevant is that Tegra doesn't have VRAM at
> > all -- all GTT (or GART or whatever it's called nowadays). No
> > fake/stolen VRAM.
> >
> > Cheers,
> >
> >    -ilia
> >
> > On Wed, Jun 9, 2021 at 10:18 AM Christian König
> > <christian.koenig@amd.com> wrote:
> >> Hi Mikko,
> >>
> >> strange sounds like Nouveau was somehow also using the GEM workaround
> >> for VMWGFX as well.
> >>
> >> But -12 means -ENOMEM which doesn't fits into the picture.
> >>
> >> I will try with a G710, but if that doesn't yields anything I need some
> >> more input from you.
> >>
> >> Thanks for the report,
> >> Christian.
> >>
> >>
> >> Am 09.06.21 um 15:47 schrieb Mikko Perttunen:
> >>> Hi,
> >>>
> >>> I'm observing nouveau not initializing recently on linux-next on my
> >>> Tegra186 Jetson TX2 board. Specifically it looks like BO allocation is
> >>> failing when initializing the sync subsystem:
> >>>
> >>> [   21.858149] nouveau 17000000.gpu: DRM: failed to initialise sync
> >>> subsystem, -28
> >>>
> >>> I have been bisecting and I have found two patches that affect this.
> >>> Firstly, things first break on
> >>>
> >>> d02117f8efaa drm/ttm: remove special handling for non GEM drivers
> >>>
> >>> starting to return error code -12. Then, at
> >>>
> >>> d79025c7f5e3 drm/ttm: always initialize the full ttm_resource v2
> >>>
> >>> the error code changes to the above -28.
> >>>
> >>> If I checkout one commit prior to d79025c7f5e3 and revert
> >>> d02117f8efaa, things work again. There are a bunch of other TTM
> >>> commits between this and HEAD, so reverting these on top of HEAD
> >>> doesn't work. However, I checked that both yesterday's and today's
> >>> nexts are also broken.
> >>>
> >>> Thank you,
> >>> Mikko
> >>>
> >> _______________________________________________
> >> Nouveau mailing list
> >> Nouveau@lists.freedesktop.org
> >> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fnouveau&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7Caaf09cbea0b04d8dc01208d92b5637ba%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637588472445308290%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=ePoWVtHPXeK5RThkRuQSykKrfWCgPOzG5CLTzfw9%2Fuw%3D&amp;reserved=0
>

* Re: [Nouveau] Trouble with TTM patches w/nouveau in linux-next
@ 2021-06-09 15:13         ` Ilia Mirkin
  0 siblings, 0 replies; 18+ messages in thread
From: Ilia Mirkin @ 2021-06-09 15:13 UTC (permalink / raw)
  To: Christian König
  Cc: Mikko Perttunen, nouveau, dri-devel, ray.huang, matthew.auld,
	linux-tegra

GEM init happens here:

https://cgit.freedesktop.org/drm/drm/tree/drivers/gpu/drm/nouveau/nouveau_gem.c#n221

Note the bo alloc / gem init / bo init dance.

I don't think there is a GEM object for internal allocations at all --
we just allocate bo's directly and that's it. Perhaps you meant
something else? I thought GEM was meant for externally-available
objects.
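
For anyone not familiar with that dance, here is a rough model of the
ordering. This is only a sketch, not the nouveau code: every struct and
function name below (model_bo, model_gem_init, ...) is a made-up
stand-in. The GEM-facing path does alloc, then GEM init, then BO init;
the internal path, as described above, skips the middle step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the kernel's structs. */
struct model_gem { size_t size; int initialized; };
struct model_bo  { struct model_gem base; int ready; };

/* Step 1: allocate a zeroed BO. */
static struct model_bo *model_bo_alloc(void)
{
	return calloc(1, sizeof(struct model_bo));
}

/* Step 2: initialise the embedded GEM object; this records the size. */
static void model_gem_init(struct model_bo *bo, size_t size)
{
	assert(bo != NULL);
	bo->base.size = size;
	bo->base.initialized = 1;
}

/* Step 3: initialise the BO itself. */
static void model_bo_init(struct model_bo *bo)
{
	assert(bo != NULL);
	bo->ready = 1;
}

/* GEM-facing path: alloc, gem init, bo init. */
static struct model_bo *model_gem_new(size_t size)
{
	struct model_bo *bo = model_bo_alloc();

	if (!bo)
		return NULL;
	model_gem_init(bo, size);
	model_bo_init(bo);
	return bo;
}

/* Internal path in this model: alloc and bo init only, no gem init. */
static struct model_bo *model_bo_new(size_t size)
{
	struct model_bo *bo = model_bo_alloc();

	(void)size; /* never stored in base.size on this path */
	if (!bo)
		return NULL;
	model_bo_init(bo);
	return bo;
}
```

In this model a BO from model_bo_new() is marked ready, yet its
base.size was never set.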

Cheers,

  -ilia

On Wed, Jun 9, 2021 at 10:58 AM Christian König
<christian.koenig@amd.com> wrote:
>
> Good point, but I think that is unrelated.
>
> My suspicion is rather that nouveau is not initializing the underlying
> GEM object for internal allocations.
>
> So what happens is the same as on VMWGFX: TTM doesn't know anything
> about the size of the BO, resulting in a kmalloc() with a random value
> and eventually -ENOMEM.
>
> Good news is that I can reproduce it, so going to look into that later
> today.
>
> Regards,
> Christian.
>
> Am 09.06.21 um 16:52 schrieb Ilia Mirkin:
> > Christian - potentially relevant is that Tegra doesn't have VRAM at
> > all -- all GTT (or GART or whatever it's called nowadays). No
> > fake/stolen VRAM.
> >
> > Cheers,
> >
> >    -ilia
> >
> > On Wed, Jun 9, 2021 at 10:18 AM Christian König
> > <christian.koenig@amd.com> wrote:
> >> Hi Mikko,
> >>
> >> Strange, it sounds like Nouveau was somehow also using the GEM workaround
> >> for VMWGFX.
> >>
> >> But -12 means -ENOMEM, which doesn't fit into the picture.
> >>
> >> I will try with a G710, but if that doesn't yield anything I will need
> >> some more input from you.
> >>
> >> Thanks for the report,
> >> Christian.
> >>
> >>
> >> Am 09.06.21 um 15:47 schrieb Mikko Perttunen:
> >>> Hi,
> >>>
> >>> I'm observing nouveau not initializing recently on linux-next on my
> >>> Tegra186 Jetson TX2 board. Specifically it looks like BO allocation is
> >>> failing when initializing the sync subsystem:
> >>>
> >>> [   21.858149] nouveau 17000000.gpu: DRM: failed to initialise sync
> >>> subsystem, -28
> >>>
> >>> I have been bisecting and I have found two patches that affect this.
> >>> Firstly, things first break on
> >>>
> >>> d02117f8efaa drm/ttm: remove special handling for non GEM drivers
> >>>
> >>> starting to return error code -12. Then, at
> >>>
> >>> d79025c7f5e3 drm/ttm: always initialize the full ttm_resource v2
> >>>
> >>> the error code changes to the above -28.
> >>>
> >>> If I checkout one commit prior to d79025c7f5e3 and revert
> >>> d02117f8efaa, things work again. There are a bunch of other TTM
> >>> commits between this and HEAD, so reverting these on top of HEAD
> >>> doesn't work. However, I checked that both yesterday's and today's
> >>> nexts are also broken.
> >>>
> >>> Thank you,
> >>> Mikko
> >>>
>

* Re: [Nouveau] Trouble with TTM patches w/nouveau in linux-next
  2021-06-09 15:13         ` Ilia Mirkin
  (?)
@ 2021-06-09 15:21           ` Christian König
  -1 siblings, 0 replies; 18+ messages in thread
From: Christian König @ 2021-06-09 15:21 UTC (permalink / raw)
  To: Ilia Mirkin
  Cc: Mikko Perttunen, linux-tegra, nouveau, ray.huang, matthew.auld,
	dri-devel

Yeah, that's exactly the root cause of the problem.

GEM objects have been the base class for TTM BOs for quite a while now.

So we need to at least initialize them; otherwise the size of the
object is not available.
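
To spell out what "base class" means in C terms, here is a minimal
userspace sketch. All names (sketch_gem_object, sketch_ttm_bo,
sketch_populate, the page limit) are hypothetical and only model the
idea; this is not the TTM code. The 0xAA fill stands in for whatever
happens to be in uninitialized memory, so the example stays well
defined.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_MAX_PAGES (1u << 16) /* arbitrary limit for the sketch */

/* Hypothetical stand-ins; the "base class" is an embedded struct. */
struct sketch_gem_object { size_t size; };
struct sketch_ttm_bo { struct sketch_gem_object base; };

static void sketch_gem_init(struct sketch_gem_object *obj, size_t size)
{
	assert(obj != NULL);
	obj->size = size;
}

/* Sizing code reads the size from the embedded base object. */
static int sketch_populate(const struct sketch_ttm_bo *bo)
{
	size_t pages;

	assert(bo != NULL);
	/* Round up to whole pages without risking overflow. */
	pages = bo->base.size / SKETCH_PAGE_SIZE +
		(bo->base.size % SKETCH_PAGE_SIZE != 0);
	if (pages == 0 || pages > SKETCH_MAX_PAGES)
		return -ENOMEM; /* models the oversized allocation failing */
	return 0;
}

/* Allocate a BO, optionally skipping the base object init. */
static int sketch_alloc(size_t size, int init_gem)
{
	struct sketch_ttm_bo bo;

	memset(&bo, 0xAA, sizeof(bo)); /* poison: models stale memory */
	if (init_gem)
		sketch_gem_init(&bo.base, size);
	return sketch_populate(&bo);
}
```

With the base object initialized, sketch_alloc(4096, 1) succeeds; with
the init skipped, the poisoned size yields an absurd page count and the
call returns -ENOMEM.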

Going to send a fix in a few minutes.

Thanks,
Christian.

Am 09.06.21 um 17:13 schrieb Ilia Mirkin:
> GEM init happens here:
>
> https://cgit.freedesktop.org/drm/drm/tree/drivers/gpu/drm/nouveau/nouveau_gem.c#n221
>
> Note the bo alloc / gem init / bo init dance.
>
> I don't think there is a GEM object for internal allocations at all --
> we just allocate bo's directly and that's it. Perhaps you meant
> something else? I thought GEM was meant for externally-available
> objects.
>
> Cheers,
>
>    -ilia
>
> On Wed, Jun 9, 2021 at 10:58 AM Christian König
> <christian.koenig@amd.com> wrote:
>> Good point, but I think that is unrelated.
>>
>> My suspicion is rather that nouveau is not initializing the underlying
>> GEM object for internal allocations.
>>
>> So what happens is the same as on VMWGFX: TTM doesn't know anything
>> about the size of the BO, resulting in a kmalloc() with a random value
>> and eventually -ENOMEM.
>>
>> Good news is that I can reproduce it, so going to look into that later
>> today.
>>
>> Regards,
>> Christian.
>>
>> Am 09.06.21 um 16:52 schrieb Ilia Mirkin:
>>> Christian - potentially relevant is that Tegra doesn't have VRAM at
>>> all -- all GTT (or GART or whatever it's called nowadays). No
>>> fake/stolen VRAM.
>>>
>>> Cheers,
>>>
>>>     -ilia
>>>
>>> On Wed, Jun 9, 2021 at 10:18 AM Christian König
>>> <christian.koenig@amd.com> wrote:
>>>> Hi Mikko,
>>>>
>>>> Strange, it sounds like Nouveau was somehow also using the GEM workaround
>>>> for VMWGFX.
>>>>
>>>> But -12 means -ENOMEM, which doesn't fit into the picture.
>>>>
>>>> I will try with a G710, but if that doesn't yield anything I will need
>>>> some more input from you.
>>>>
>>>> Thanks for the report,
>>>> Christian.
>>>>
>>>>
>>>> Am 09.06.21 um 15:47 schrieb Mikko Perttunen:
>>>>> Hi,
>>>>>
>>>>> I'm observing nouveau not initializing recently on linux-next on my
>>>>> Tegra186 Jetson TX2 board. Specifically it looks like BO allocation is
>>>>> failing when initializing the sync subsystem:
>>>>>
>>>>> [   21.858149] nouveau 17000000.gpu: DRM: failed to initialise sync
>>>>> subsystem, -28
>>>>>
>>>>> I have been bisecting and I have found two patches that affect this.
>>>>> Firstly, things first break on
>>>>>
>>>>> d02117f8efaa drm/ttm: remove special handling for non GEM drivers
>>>>>
>>>>> starting to return error code -12. Then, at
>>>>>
>>>>> d79025c7f5e3 drm/ttm: always initialize the full ttm_resource v2
>>>>>
>>>>> the error code changes to the above -28.
>>>>>
>>>>> If I checkout one commit prior to d79025c7f5e3 and revert
>>>>> d02117f8efaa, things work again. There are a bunch of other TTM
>>>>> commits between this and HEAD, so reverting these on top of HEAD
>>>>> doesn't work. However, I checked that both yesterday's and today's
>>>>> nexts are also broken.
>>>>>
>>>>> Thank you,
>>>>> Mikko
>>>>>


end of thread, other threads:[~2021-06-09 15:22 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
2021-06-09 13:47 Trouble with TTM patches w/nouveau in linux-next Mikko Perttunen
2021-06-09 13:47 ` Mikko Perttunen
2021-06-09 13:47 ` [Nouveau] " Mikko Perttunen
2021-06-09 14:17 ` Christian König
2021-06-09 14:17   ` Christian König
2021-06-09 14:17   ` [Nouveau] " Christian König
2021-06-09 14:52   ` Ilia Mirkin
2021-06-09 14:52     ` Ilia Mirkin
2021-06-09 14:52     ` Ilia Mirkin
2021-06-09 14:58     ` Christian König
2021-06-09 14:58       ` Christian König
2021-06-09 14:58       ` Christian König
2021-06-09 15:13       ` Ilia Mirkin
2021-06-09 15:13         ` Ilia Mirkin
2021-06-09 15:13         ` Ilia Mirkin
2021-06-09 15:21         ` Christian König
2021-06-09 15:21           ` Christian König
2021-06-09 15:21           ` Christian König
