All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Christian König" <ckoenig.leichtzumerken@gmail.com>
To: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>,
	christian.koenig@amd.com, amd-gfx@lists.freedesktop.org,
	dri-devel@lists.freedesktop.org, daniel.vetter@ffwll.ch,
	robh@kernel.org, l.stach@pengutronix.de, yuq825@gmail.com,
	eric@anholt.net
Cc: Alexander.Deucher@amd.com, gregkh@linuxfoundation.org
Subject: Re: [PATCH v3 05/12] drm/ttm: Expose ttm_tt_unpopulate for driver use
Date: Wed, 16 Dec 2020 09:04:48 +0100	[thread overview]
Message-ID: <a818be68-518c-754d-f63b-3754ce882fdc@gmail.com> (raw)
In-Reply-To: <1fa4dd77-deec-aa7b-7499-0537e9a01919@amd.com>

Am 15.12.20 um 21:18 schrieb Andrey Grodzovsky:
> [SNIP]
>>>
>>> While we can't control user application accesses to the mapped 
>>> buffers explicitly and hence we use page fault rerouting
>>> I am thinking that in this  case we may be able to sprinkle 
>>> drm_dev_enter/exit in any such sensitive place were we might
>>> CPU access a DMA buffer from the kernel ?
>>
>> Yes, I fear we are going to need that.
>>
>>> Things like CPU page table updates, ring buffer accesses and FW 
>>> memcpy ? Is there other places ?
>>
>> Puh, good question. I have no idea.
>>
>>> Another point is that at this point the driver shouldn't access any 
>>> such buffers as we are at the process finishing the device.
>>> AFAIK there is no page fault mechanism for kernel mappings so I 
>>> don't think there is anything else to do ?
>>
>> Well there is a page fault handler for kernel mappings, but that one 
>> just prints the stack trace into the system log and calls BUG(); :)
>>
>> Long story short we need to avoid any access to released pages after 
>> unplug. No matter if it's from the kernel or userspace.
>
>
> I was just about to start guarding with drm_dev_enter/exit CPU 
> accesses from kernel to GTT ot VRAM buffers but then i looked more in 
> the code
> and seems like ttm_tt_unpopulate just deletes DMA mappings (for the 
> sake of device to main memory access). Kernel page table is not touched
> until last bo refcount is dropped and the bo is released 
> (ttm_bo_release->destroy->amdgpu_bo_destroy->amdgpu_bo_kunmap). This 
> is both
> for GTT BOs maped to kernel by kmap (or vmap) and for VRAM BOs mapped 
> by ioremap. So as i see it, nothing will bad will happen after we
> unpopulate a BO while we still try to use a kernel mapping for it, 
> system memory pages backing GTT BOs are still mapped and not freed and 
> for
> VRAM BOs same is for the IO physical ranges mapped into the kernel 
> page table since iounmap wasn't called yet.

The problem is the system pages would be freed and if we kernel driver 
still happily write to them we are pretty much busted because we write 
to freed up memory.

Christian.

> I loaded the driver with vm_update_mode=3
> meaning all VM updates done using CPU and hasn't seen any OOPs after 
> removing the device. I guess i can test it more by allocating GTT and 
> VRAM BOs
> and trying to read/write to them after device is removed.
>
> Andrey
>
>
>>
>> Regards,
>> Christian.
>>
>>>
>>> Andrey
>>
>>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

WARNING: multiple messages have this Message-ID (diff)
From: "Christian König" <ckoenig.leichtzumerken@gmail.com>
To: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>,
	christian.koenig@amd.com, amd-gfx@lists.freedesktop.org,
	dri-devel@lists.freedesktop.org, daniel.vetter@ffwll.ch,
	robh@kernel.org, l.stach@pengutronix.de, yuq825@gmail.com,
	eric@anholt.net
Cc: Alexander.Deucher@amd.com, gregkh@linuxfoundation.org,
	ppaalanen@gmail.com, Harry.Wentland@amd.com
Subject: Re: [PATCH v3 05/12] drm/ttm: Expose ttm_tt_unpopulate for driver use
Date: Wed, 16 Dec 2020 09:04:48 +0100	[thread overview]
Message-ID: <a818be68-518c-754d-f63b-3754ce882fdc@gmail.com> (raw)
In-Reply-To: <1fa4dd77-deec-aa7b-7499-0537e9a01919@amd.com>

Am 15.12.20 um 21:18 schrieb Andrey Grodzovsky:
> [SNIP]
>>>
>>> While we can't control user application accesses to the mapped 
>>> buffers explicitly and hence we use page fault rerouting
>>> I am thinking that in this  case we may be able to sprinkle 
>>> drm_dev_enter/exit in any such sensitive place were we might
>>> CPU access a DMA buffer from the kernel ?
>>
>> Yes, I fear we are going to need that.
>>
>>> Things like CPU page table updates, ring buffer accesses and FW 
>>> memcpy ? Is there other places ?
>>
>> Puh, good question. I have no idea.
>>
>>> Another point is that at this point the driver shouldn't access any 
>>> such buffers as we are at the process finishing the device.
>>> AFAIK there is no page fault mechanism for kernel mappings so I 
>>> don't think there is anything else to do ?
>>
>> Well there is a page fault handler for kernel mappings, but that one 
>> just prints the stack trace into the system log and calls BUG(); :)
>>
>> Long story short we need to avoid any access to released pages after 
>> unplug. No matter if it's from the kernel or userspace.
>
>
> I was just about to start guarding with drm_dev_enter/exit CPU 
> accesses from kernel to GTT ot VRAM buffers but then i looked more in 
> the code
> and seems like ttm_tt_unpopulate just deletes DMA mappings (for the 
> sake of device to main memory access). Kernel page table is not touched
> until last bo refcount is dropped and the bo is released 
> (ttm_bo_release->destroy->amdgpu_bo_destroy->amdgpu_bo_kunmap). This 
> is both
> for GTT BOs maped to kernel by kmap (or vmap) and for VRAM BOs mapped 
> by ioremap. So as i see it, nothing will bad will happen after we
> unpopulate a BO while we still try to use a kernel mapping for it, 
> system memory pages backing GTT BOs are still mapped and not freed and 
> for
> VRAM BOs same is for the IO physical ranges mapped into the kernel 
> page table since iounmap wasn't called yet.

The problem is the system pages would be freed and if we kernel driver 
still happily write to them we are pretty much busted because we write 
to freed up memory.

Christian.

> I loaded the driver with vm_update_mode=3
> meaning all VM updates done using CPU and hasn't seen any OOPs after 
> removing the device. I guess i can test it more by allocating GTT and 
> VRAM BOs
> and trying to read/write to them after device is removed.
>
> Andrey
>
>
>>
>> Regards,
>> Christian.
>>
>>>
>>> Andrey
>>
>>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

  reply	other threads:[~2020-12-16  8:04 UTC|newest]

Thread overview: 212+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-21  5:21 [PATCH v3 00/12] RFC Support hot device unplug in amdgpu Andrey Grodzovsky
2020-11-21  5:21 ` Andrey Grodzovsky
2020-11-21  5:21 ` [PATCH v3 01/12] drm: Add dummy page per device or GEM object Andrey Grodzovsky
2020-11-21  5:21   ` Andrey Grodzovsky
2020-11-21 14:15   ` Christian König
2020-11-21 14:15     ` Christian König
2020-11-23  4:54     ` Andrey Grodzovsky
2020-11-23  4:54       ` Andrey Grodzovsky
2020-11-23  8:01       ` Christian König
2020-11-23  8:01         ` Christian König
2021-01-05 21:04         ` Andrey Grodzovsky
2021-01-05 21:04           ` Andrey Grodzovsky
2021-01-07 16:21           ` Daniel Vetter
2021-01-07 16:21             ` Daniel Vetter
2021-01-07 16:26             ` Andrey Grodzovsky
2021-01-07 16:26               ` Andrey Grodzovsky
2021-01-07 16:28               ` Andrey Grodzovsky
2021-01-07 16:28                 ` Andrey Grodzovsky
2021-01-07 16:30               ` Daniel Vetter
2021-01-07 16:30                 ` Daniel Vetter
2021-01-07 16:37                 ` Andrey Grodzovsky
2021-01-07 16:37                   ` Andrey Grodzovsky
2021-01-08 14:26                   ` Andrey Grodzovsky
2021-01-08 14:26                     ` Andrey Grodzovsky
2021-01-08 14:33                     ` Christian König
2021-01-08 14:33                       ` Christian König
2021-01-08 14:46                       ` Andrey Grodzovsky
2021-01-08 14:46                         ` Andrey Grodzovsky
2021-01-08 14:52                         ` Christian König
2021-01-08 14:52                           ` Christian König
2021-01-08 16:49                           ` Grodzovsky, Andrey
2021-01-08 16:49                             ` Grodzovsky, Andrey
2021-01-11 16:13                             ` Daniel Vetter
2021-01-11 16:13                               ` Daniel Vetter
2021-01-11 16:15                               ` Daniel Vetter
2021-01-11 16:15                                 ` Daniel Vetter
2021-01-11 17:41                                 ` Andrey Grodzovsky
2021-01-11 17:41                                   ` Andrey Grodzovsky
2021-01-11 18:31                                   ` Andrey Grodzovsky
2021-01-12  9:07                                     ` Daniel Vetter
2021-01-11 20:45                                 ` Andrey Grodzovsky
2021-01-11 20:45                                   ` Andrey Grodzovsky
2021-01-12  9:10                                   ` Daniel Vetter
2021-01-12  9:10                                     ` Daniel Vetter
2021-01-12 12:32                                     ` Christian König
2021-01-12 12:32                                       ` Christian König
2021-01-12 15:59                                       ` Andrey Grodzovsky
2021-01-12 15:59                                         ` Andrey Grodzovsky
2021-01-13  9:14                                         ` Christian König
2021-01-13  9:14                                           ` Christian König
2021-01-13 14:40                                           ` Andrey Grodzovsky
2021-01-13 14:40                                             ` Andrey Grodzovsky
2021-01-12 15:54                                     ` Andrey Grodzovsky
2021-01-12 15:54                                       ` Andrey Grodzovsky
2021-01-12  8:12                               ` Christian König
2021-01-12  8:12                                 ` Christian König
2021-01-12  9:13                                 ` Daniel Vetter
2021-01-12  9:13                                   ` Daniel Vetter
2020-11-21  5:21 ` [PATCH v3 02/12] drm: Unamp the entire device address space on device unplug Andrey Grodzovsky
2020-11-21  5:21   ` Andrey Grodzovsky
2020-11-21 14:16   ` Christian König
2020-11-21 14:16     ` Christian König
2020-11-24 14:44     ` Daniel Vetter
2020-11-24 14:44       ` Daniel Vetter
2020-11-21  5:21 ` [PATCH v3 03/12] drm/ttm: Remap all page faults to per process dummy page Andrey Grodzovsky
2020-11-21  5:21   ` Andrey Grodzovsky
2020-11-21  5:21 ` [PATCH v3 04/12] drm/ttm: Set dma addr to null after freee Andrey Grodzovsky
2020-11-21  5:21   ` Andrey Grodzovsky
2020-11-21 14:13   ` Christian König
2020-11-21 14:13     ` Christian König
2020-11-23  5:15     ` Andrey Grodzovsky
2020-11-23  5:15       ` Andrey Grodzovsky
2020-11-23  8:04       ` Christian König
2020-11-23  8:04         ` Christian König
2020-11-21  5:21 ` [PATCH v3 05/12] drm/ttm: Expose ttm_tt_unpopulate for driver use Andrey Grodzovsky
2020-11-21  5:21   ` Andrey Grodzovsky
2020-11-25 10:42   ` Christian König
2020-11-25 10:42     ` Christian König
2020-11-23 20:05     ` Andrey Grodzovsky
2020-11-23 20:05       ` Andrey Grodzovsky
2020-11-23 20:20       ` Christian König
2020-11-23 20:20         ` Christian König
2020-11-23 20:38         ` Andrey Grodzovsky
2020-11-23 20:38           ` Andrey Grodzovsky
2020-11-23 20:41           ` Christian König
2020-11-23 20:41             ` Christian König
2020-11-23 21:08             ` Andrey Grodzovsky
2020-11-23 21:08               ` Andrey Grodzovsky
2020-11-24  7:41               ` Christian König
2020-11-24  7:41                 ` Christian König
2020-11-24 16:22                 ` Andrey Grodzovsky
2020-11-24 16:22                   ` Andrey Grodzovsky
2020-11-24 16:44                   ` Christian König
2020-11-24 16:44                     ` Christian König
2020-11-25 10:40                     ` Daniel Vetter
2020-11-25 10:40                       ` Daniel Vetter
2020-11-25 12:57                       ` Christian König
2020-11-25 12:57                         ` Christian König
2020-11-25 16:36                         ` Daniel Vetter
2020-11-25 16:36                           ` Daniel Vetter
2020-11-25 19:34                           ` Andrey Grodzovsky
2020-11-25 19:34                             ` Andrey Grodzovsky
2020-11-27 13:10                             ` Grodzovsky, Andrey
2020-11-27 13:10                               ` Grodzovsky, Andrey
2020-11-27 14:59                             ` Daniel Vetter
2020-11-27 14:59                               ` Daniel Vetter
2020-11-27 16:04                               ` Andrey Grodzovsky
2020-11-27 16:04                                 ` Andrey Grodzovsky
2020-11-30 14:15                                 ` Daniel Vetter
2020-11-30 14:15                                   ` Daniel Vetter
2020-11-25 16:56                         ` Michel Dänzer
2020-11-25 16:56                           ` Michel Dänzer
2020-11-25 17:02                           ` Daniel Vetter
2020-11-25 17:02                             ` Daniel Vetter
2020-12-15 20:18                     ` Andrey Grodzovsky
2020-12-15 20:18                       ` Andrey Grodzovsky
2020-12-16  8:04                       ` Christian König [this message]
2020-12-16  8:04                         ` Christian König
2020-12-16 14:21                         ` Daniel Vetter
2020-12-16 14:21                           ` Daniel Vetter
2020-12-16 16:13                           ` Andrey Grodzovsky
2020-12-16 16:13                             ` Andrey Grodzovsky
2020-12-16 16:18                             ` Christian König
2020-12-16 16:18                               ` Christian König
2020-12-16 17:12                               ` Daniel Vetter
2020-12-16 17:12                                 ` Daniel Vetter
2020-12-16 17:20                                 ` Daniel Vetter
2020-12-16 17:20                                   ` Daniel Vetter
2020-12-16 18:26                                 ` Andrey Grodzovsky
2020-12-16 18:26                                   ` Andrey Grodzovsky
2020-12-16 23:15                                   ` Daniel Vetter
2020-12-16 23:15                                     ` Daniel Vetter
2020-12-17  0:20                                     ` Andrey Grodzovsky
2020-12-17  0:20                                       ` Andrey Grodzovsky
2020-12-17 12:01                                       ` Daniel Vetter
2020-12-17 12:01                                         ` Daniel Vetter
2020-12-17 19:19                                         ` Andrey Grodzovsky
2020-12-17 19:19                                           ` Andrey Grodzovsky
2020-12-17 20:10                                           ` Christian König
2020-12-17 20:10                                             ` Christian König
2020-12-17 20:38                                             ` Andrey Grodzovsky
2020-12-17 20:38                                               ` Andrey Grodzovsky
2020-12-17 20:48                                               ` Daniel Vetter
2020-12-17 20:48                                                 ` Daniel Vetter
2020-12-17 21:06                                                 ` Andrey Grodzovsky
2020-12-17 21:06                                                   ` Andrey Grodzovsky
2020-12-18 14:30                                                   ` Daniel Vetter
2020-12-18 14:30                                                     ` Daniel Vetter
2020-12-17 20:42                                           ` Daniel Vetter
2020-12-17 20:42                                             ` Daniel Vetter
2020-12-17 21:13                                             ` Andrey Grodzovsky
2020-12-17 21:13                                               ` Andrey Grodzovsky
2021-01-04 16:33                                               ` Andrey Grodzovsky
2021-01-04 16:33                                                 ` Andrey Grodzovsky
2020-11-21  5:21 ` [PATCH v3 06/12] drm/sched: Cancel and flush all oustatdning jobs before finish Andrey Grodzovsky
2020-11-21  5:21   ` Andrey Grodzovsky
2020-11-22 11:56   ` Christian König
2020-11-22 11:56     ` Christian König
2020-11-21  5:21 ` [PATCH v3 07/12] drm/sched: Prevent any job recoveries after device is unplugged Andrey Grodzovsky
2020-11-21  5:21   ` Andrey Grodzovsky
2020-11-22 11:57   ` Christian König
2020-11-22 11:57     ` Christian König
2020-11-23  5:37     ` Andrey Grodzovsky
2020-11-23  5:37       ` Andrey Grodzovsky
2020-11-23  8:06       ` Christian König
2020-11-23  8:06         ` Christian König
2020-11-24  1:12         ` Luben Tuikov
2020-11-24  1:12           ` Luben Tuikov
2020-11-24  7:50           ` Christian König
2020-11-24  7:50             ` Christian König
2020-11-24 17:11             ` Luben Tuikov
2020-11-24 17:11               ` Luben Tuikov
2020-11-24 17:17               ` Andrey Grodzovsky
2020-11-24 17:17                 ` Andrey Grodzovsky
2020-11-24 17:41                 ` Luben Tuikov
2020-11-24 17:41                   ` Luben Tuikov
2020-11-24 17:40               ` Christian König
2020-11-24 17:40                 ` Christian König
2020-11-24 17:44                 ` Luben Tuikov
2020-11-24 17:44                   ` Luben Tuikov
2020-11-21  5:21 ` [PATCH v3 08/12] drm/amdgpu: Split amdgpu_device_fini into early and late Andrey Grodzovsky
2020-11-21  5:21   ` Andrey Grodzovsky
2020-11-24 14:53   ` Daniel Vetter
2020-11-24 14:53     ` Daniel Vetter
2020-11-24 15:51     ` Andrey Grodzovsky
2020-11-24 15:51       ` Andrey Grodzovsky
2020-11-25 10:41       ` Daniel Vetter
2020-11-25 10:41         ` Daniel Vetter
2020-11-25 17:41         ` Andrey Grodzovsky
2020-11-25 17:41           ` Andrey Grodzovsky
2020-11-21  5:21 ` [PATCH v3 09/12] drm/amdgpu: Add early fini callback Andrey Grodzovsky
2020-11-21  5:21   ` Andrey Grodzovsky
2020-11-21  5:21 ` [PATCH v3 10/12] drm/amdgpu: Avoid sysfs dirs removal post device unplug Andrey Grodzovsky
2020-11-21  5:21   ` Andrey Grodzovsky
2020-11-24 14:49   ` Daniel Vetter
2020-11-24 14:49     ` Daniel Vetter
2020-11-24 22:27     ` Andrey Grodzovsky
2020-11-24 22:27       ` Andrey Grodzovsky
2020-11-25  9:04       ` Daniel Vetter
2020-11-25  9:04         ` Daniel Vetter
2020-11-25 17:39         ` Andrey Grodzovsky
2020-11-25 17:39           ` Andrey Grodzovsky
2020-11-27 13:12           ` Grodzovsky, Andrey
2020-11-27 13:12             ` Grodzovsky, Andrey
2020-11-27 15:04           ` Daniel Vetter
2020-11-27 15:04             ` Daniel Vetter
2020-11-27 15:34             ` Andrey Grodzovsky
2020-11-27 15:34               ` Andrey Grodzovsky
2020-11-21  5:21 ` [PATCH v3 11/12] drm/amdgpu: Register IOMMU topology notifier per device Andrey Grodzovsky
2020-11-21  5:21   ` Andrey Grodzovsky
2020-11-21  5:21 ` [PATCH v3 12/12] drm/amdgpu: Fix a bunch of sdma code crash post device unplug Andrey Grodzovsky
2020-11-21  5:21   ` Andrey Grodzovsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a818be68-518c-754d-f63b-3754ce882fdc@gmail.com \
    --to=ckoenig.leichtzumerken@gmail.com \
    --cc=Alexander.Deucher@amd.com \
    --cc=Andrey.Grodzovsky@amd.com \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=christian.koenig@amd.com \
    --cc=daniel.vetter@ffwll.ch \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=eric@anholt.net \
    --cc=gregkh@linuxfoundation.org \
    --cc=l.stach@pengutronix.de \
    --cc=robh@kernel.org \
    --cc=yuq825@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.