* [BUG] AMDKFD: criu_checkpoint() error path treats userspace pointer as kernel pointer
From: Jann Horn @ 2022-10-31 14:20 UTC (permalink / raw)
  To: Rajneesh Bhardwaj, Felix Kuehling
  Cc: David Yat Sin, Alex Deucher, kernel list, amd-gfx, Pan, Xinhui,
	Christian König

be072b06c73970 ("drm/amdkfd: CRIU export BOs as prime dmabuf objects")
added an error path in criu_checkpoint() that (unless I'm completely
misreading this) treats the userspace-supplied args->bos (which was
previously used as a userspace pointer when passed to
criu_checkpoint_bos()) as a kernel pointer:

  ret = criu_checkpoint_bos(p, num_bos, (uint8_t __user *)args->bos,
      (uint8_t __user *)args->priv_data, &priv_offset);
  if (ret)
    goto exit_unlock;
  [...]
close_bo_fds:
  if (ret) {
    /* If IOCTL returns err, user assumes all FDs opened in criu_dump_bos are closed */
    uint32_t i;
    struct kfd_criu_bo_bucket *bo_buckets = (struct kfd_criu_bo_bucket *) args->bos;

    for (i = 0; i < num_bos; i++) {
      if (bo_buckets[i].alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_VRAM)
        close_fd(bo_buckets[i].dmabuf_fd);
    }
  }

This seems very wrong, and also like it's guaranteed to blow up as
soon as it runs on a machine with SMAP, which makes me think that this
codepath was probably never exercised?

(Also note that just changing this to copy_from_user() instead would
still be wrong, because malicious/bogus userspace could change the FD
number to the KFD device's FD, and the VFS assumes that an FD can't be
closed while it's being accessed in a single-threaded process.)


* Re: [BUG] AMDKFD: criu_checkpoint() error path treats userspace pointer as kernel pointer
From: Felix Kuehling @ 2022-10-31 17:53 UTC (permalink / raw)
  To: Jann Horn, Rajneesh Bhardwaj
  Cc: David Yat Sin, Alex Deucher, kernel list, amd-gfx, Pan, Xinhui,
	Christian König

On 2022-10-31 at 10:20, Jann Horn wrote:
> be072b06c73970 ("drm/amdkfd: CRIU export BOs as prime dmabuf objects")
> added an error path in criu_checkpoint() that (unless I'm completely
> misreading this) treats the userspace-supplied args->bos (which was
> previously used as a userspace pointer when passed to
> criu_checkpoint_bos()) as a kernel pointer:
>
>    ret = criu_checkpoint_bos(p, num_bos, (uint8_t __user *)args->bos,
>        (uint8_t __user *)args->priv_data, &priv_offset);
>    if (ret)
>      goto exit_unlock;
>    [...]
> close_bo_fds:
>    if (ret) {
>      /* If IOCTL returns err, user assumes all FDs opened in criu_dump_bos are closed */
>      uint32_t i;
>      struct kfd_criu_bo_bucket *bo_buckets = (struct kfd_criu_bo_bucket *) args->bos;
>
>      for (i = 0; i < num_bos; i++) {
>        if (bo_buckets[i].alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_VRAM)
>          close_fd(bo_buckets[i].dmabuf_fd);
>      }
>    }
>
> This seems very wrong, and also like it's guaranteed to blow up as
> soon as it runs on a machine with SMAP, which makes me think that this
> codepath was probably never exercised?
>
> (Also note that just changing this to copy_from_user() instead would
> still be wrong, because malicious/bogus userspace could change the FD
> number to the KFD device's FD, and the VFS assumes that an FD can't be
> closed while it's being accessed in a single-threaded process.)

Thank you for catching this, and thank you for the advice. In other 
words, we need to store a copy of the FDs in a kernel mode buffer that 
is not accessible by usermode, so we can reliably close the correct FDs
in the error handling code path. Rajneesh and I will fix this ASAP.
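
A minimal userspace sketch of that idea, not the actual amdkfd fix: every
name here (demo_bo_bucket, DEMO_FLAG_VRAM, demo_export_fds,
demo_close_private_fds, the fail_at parameter) is hypothetical, and
/dev/null stands in for a dmabuf export. The point is only that the
cleanup path reads FD numbers from a private array and never from the
buffer that the other side can rewrite.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical stand-in for KFD_IOC_ALLOC_MEM_FLAGS_VRAM. */
#define DEMO_FLAG_VRAM 0x1u

/* Hypothetical stand-in for the bucket array that userspace can modify. */
struct demo_bo_bucket {
    uint32_t alloc_flags;
    int dmabuf_fd;
};

/*
 * Open one FD per VRAM bucket. Each FD is published to the shared bucket
 * and also recorded in private_fds, which the untrusted side never sees.
 * fail_at injects an error at that index so the cleanup path can be shown.
 * Returns 0 on success, -1 on failure.
 */
static int demo_export_fds(struct demo_bo_bucket *shared, int *private_fds,
                           uint32_t num_bos, uint32_t fail_at)
{
    uint32_t i;

    assert(shared && private_fds);

    /* Mark every slot as "nothing to close" before any FD is created. */
    for (i = 0; i < num_bos; i++)
        private_fds[i] = -1;

    for (i = 0; i < num_bos; i++) {
        int fd;

        if (i == fail_at)
            return -1; /* injected failure */
        if (!(shared[i].alloc_flags & DEMO_FLAG_VRAM))
            continue;

        fd = open("/dev/null", O_RDONLY);
        if (fd < 0)
            return -1;

        private_fds[i] = fd;      /* trusted copy, used for cleanup */
        shared[i].dmabuf_fd = fd; /* untrusted copy, may be overwritten */
    }
    return 0;
}

/*
 * Error path: close only the FDs recorded in the private array.
 * Returns the number of FDs that were closed.
 */
static uint32_t demo_close_private_fds(int *private_fds, uint32_t num_bos)
{
    uint32_t i, closed = 0;

    for (i = 0; i < num_bos; i++) {
        if (private_fds[i] >= 0) {
            close(private_fds[i]);
            private_fds[i] = -1;
            closed++;
        }
    }
    return closed;
}
```

With this shape, overwriting shared[i].dmabuf_fd after the export has no
effect on which FDs the error path closes.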

Do you think we should also avoid copying the FDs to usermode before 
we're sure that we'll return success? I don't think it would make a big 
difference because user mode could try to guess the FDs and use them 
before we return from the ioctl either way.

Regards,
   Felix


* Re: [BUG] AMDKFD: criu_checkpoint() error path treats userspace pointer as kernel pointer
From: Jann Horn @ 2022-10-31 17:57 UTC (permalink / raw)
  To: Felix Kuehling
  Cc: Rajneesh Bhardwaj, David Yat Sin, Alex Deucher, kernel list,
	amd-gfx, Pan, Xinhui, Christian König

On Mon, Oct 31, 2022 at 6:54 PM Felix Kuehling <felix.kuehling@amd.com> wrote:
> Thank you for catching this, and thank you for the advice. In other
> words, we need to store a copy of the FDs in a kernel mode buffer that
> is not accessible by usermode, so we can reliably close the correct FDs
> in the error handling code path.

Sounds good to me.

> Rajneesh and I will fix this ASAP.
>
> Do you think we should also avoid copying the FDs to usermode before
> we're sure that we'll return success? I don't think it would make a big
> difference because user mode could try to guess the FDs and use them
> before we return from the ioctl either way.

Yeah, that shouldn't matter - as you said, userspace can guess the FDs.
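
To illustrate why guessing is easy: POSIX requires open() to return the
lowest-numbered unused descriptor, so a process can usually predict the
next FD number. The sketch below assumes the dmabuf FDs are handed out the
same way; demo_guess_next_fd is a hypothetical helper, and the guess only
holds while no other thread opens or closes descriptors in between.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Predict the number of the next FD this process will be given by
 * opening a throwaway descriptor, noting its number, and closing it.
 * Returns the predicted number, or -1 if the probe could not be opened.
 */
static int demo_guess_next_fd(void)
{
    int probe = open("/dev/null", O_RDONLY);

    if (probe < 0)
        return -1;
    close(probe);
    return probe; /* lowest free slot, so the next open() reuses it */
}
```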

