All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Christian König" <ckoenig.leichtzumerken@gmail.com>
To: Karol Herbst <kherbst@redhat.com>,
	"Koenig, Christian" <Christian.Koenig@amd.com>
Cc: "Deucher, Alexander" <Alexander.Deucher@amd.com>,
	dri-devel <dri-devel@lists.freedesktop.org>
Subject: Re: [PATCH 0/6] Fix crash after reloading a driver using ttm
Date: Tue, 16 Apr 2019 13:25:49 +0200	[thread overview]
Message-ID: <bcb28473-acb4-50c3-f83d-cf6141ce7dd2@gmail.com> (raw)
In-Reply-To: <CACO55tuWZxMg3J5ZUDEreiijFGJjsc+qMVYg_enzA0LkiVd36A@mail.gmail.com>

Am 16.04.19 um 13:20 schrieb Karol Herbst:
> On Tue, Apr 16, 2019 at 1:07 PM Koenig, Christian
> <Christian.Koenig@amd.com> wrote:
>> Am 16.04.19 um 12:54 schrieb Karol Herbst:
>>> On Tue, Apr 16, 2019 at 11:12 AM Koenig, Christian
>>> <Christian.Koenig@amd.com> wrote:
>>>> Am 16.04.19 um 11:10 schrieb Karol Herbst:
>>>>> On Tue, Apr 16, 2019 at 8:38 AM Christian König
>>>>> <ckoenig.leichtzumerken@gmail.com> wrote:
>>>>>> Am 16.04.19 um 02:35 schrieb Karol Herbst:
>>>>>>> Kobjects are supposed to be dynamically allocated, but with recent changes
>>>>>>> this rule was violated. Reverting those commits fixes crashes when a drm
>>>>>>> driver using TTM gets loaded again.
>>>>>>>
>>>>>>> The object in question is "ttm_mem_glob" declared inside
>>>>>>> "include/drm/ttm/ttm_memory.h" and instatiated inside
>>>>>>> "drivers/gpu/drm/ttm/ttm_memory.c".
>>>>>>>
>>>>>>> from "Documentation/kobject.txt":
>>>>>>> "Because kobjects are dynamic, they must not be declared statically or on
>>>>>>> the stack, but instead, always allocated dynamically.  Future versions of
>>>>>>> the kernel will contain a run-time check for kobjects that are created
>>>>>>> statically and will warn the developer of this improper usage."
>>>>>>>
>>>>>>> Unloading ttm before reloading the driver workarounds that crash, because
>>>>>>> the memory backing the kobject member "kobj" is cleaned up. The kobject_del
>>>>>>> and kobject_put function never free or clean up the kobject object leaving
>>>>>>> it in an undefined state.
>>>>>>>
>>>>>>> I reverted a few more commits to make it less painful for me to rever this
>>>>>>> rather big change.
>>>>>> Well, NAK. By reverting those change you also re-introduced the problems
>>>>>> we originally fixed with those patches.
>>>>>>
>>>>>> Please work on a proper fix instead,
>>>>>> Christian.
>>>>> And which problem was that besides duplicated code (or maybe even a
>>>>> bit of memory consumption if multiple ttm driver were used)? If I had
>>>>> to choose between duplicated code and a crash, I choose the former.
>>>>>
>>>>> Maybe I missed the real reason why those changes are made, but the
>>>>> commit messages don't really seem to tell me.
>>>> The old implementation crashed because different drivers tried to
>>>> allocate the same kobj.
>>>>
>>>> Crashing in one way is not better than crashing in another way.
>>>>
>>>> Christian.
>>>>
>>> how can that old crash be triggered? By loading two ttm based drivers
>>> at the same time or by other means?
>> Yes, exactly. Even worse it could be triggered by one driver
>> instantiating multiple times at the same time, e.g two AMD GPUs in the
>> same system.
>>
>> On the other hand I completely agree that using kobj static is
>> completely nuts. The problem is that using a kobj was the wrong approach
>> in the first place.
>>
>> In other words when you have something which controls a global behavior
>> of a module, what do you use? Right, a module parameter.
>>
>> Point is that we can't get away from those kobj without breaking the
>> uapi, so that is something which can't be done easily.
>>
>> Regards,
>> Christian.
> okay, thanks for the explanation!
>
> Currently I will be testing your first patch you sent out on top of
> 5.0.7, so that we could propose that for stable. The second didn't
> apply, but it didn't look relevant for the actual fix. Will reply to
> the patch with my results then.

The second is just a further clean, please ignore that one for backporting.

Let me know if the first one works on 5.0.7, if yes I'm going to add a 
CC stable tag before pushing.

Thanks,
Christian.

>
> Thanks
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

  reply	other threads:[~2019-04-16 11:25 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-16  0:35 [PATCH 0/6] Fix crash after reloading a driver using ttm Karol Herbst
2019-04-16  0:35 ` [PATCH 1/6] Revert "drm: Remove drm_global.{c,h} v2" Karol Herbst
2019-04-16  0:35 ` [PATCH 2/6] Revert "drm/ttm: initialize globals during device init (v2)" Karol Herbst
2019-04-16  0:35 ` [PATCH 3/6] Revert "drm/ttm: Fix bo_global and mem_global kfree error" Karol Herbst
2019-04-16  0:35 ` [PATCH 4/6] Revert "drm/ttm: use a static ttm_bo_global instance" Karol Herbst
2019-04-16  0:35 ` [PATCH 5/6] Revert "drm/ttm: make the device list mutex static" Karol Herbst
2019-04-16  0:35 ` [PATCH 6/6] Revert "drm/ttm: use a static ttm_mem_global instance" Karol Herbst
2019-04-16  6:38 ` [PATCH 0/6] Fix crash after reloading a driver using ttm Christian König
2019-04-16  9:10   ` Karol Herbst
2019-04-16  9:12     ` Koenig, Christian
2019-04-16 10:54       ` Karol Herbst
2019-04-16 11:06         ` Koenig, Christian
2019-04-16 11:20           ` Karol Herbst
2019-04-16 11:25             ` Christian König [this message]
2019-04-16 12:18           ` Daniel Vetter
2019-04-16 12:29             ` Koenig, Christian
2019-04-16 23:09   ` Eric Anholt
2019-04-17  0:10     ` Karol Herbst

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bcb28473-acb4-50c3-f83d-cf6141ce7dd2@gmail.com \
    --to=ckoenig.leichtzumerken@gmail.com \
    --cc=Alexander.Deucher@amd.com \
    --cc=Christian.Koenig@amd.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=kherbst@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.