intel-gfx.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [Intel-gfx] [PATCH] file, i915: fix file reference for mmap_singleton()
@ 2023-10-25 12:01 Christian Brauner
  2023-10-25 15:23 ` Jann Horn
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Christian Brauner @ 2023-10-25 12:01 UTC (permalink / raw)
  To: Jann Horn, Linus Torvalds
  Cc: Christian Brauner, intel-gfx, Kurmi, Suresh Kumar

Today we got a report at [1] for rcu stalls on the i915 testsuite in [2]
due to the conversion of files to SLAB_TYPSSAFE_BY_RCU. Afaict,
get_file_rcu() goes into an infinite loop trying to carefully verify
that i915->gem.mmap_singleton hasn't changed - see the splat below.

So I stared at this code to figure out what it actually does. It seems
that the i915->gem.mmap_singleton pointer itself never had rcu semantics.

The i915->gem.mmap_singleton is replaced in
file->f_op->release::singleton_release():

        static int singleton_release(struct inode *inode, struct file *file)
        {
                struct drm_i915_private *i915 = file->private_data;

                cmpxchg(&i915->gem.mmap_singleton, file, NULL);
                drm_dev_put(&i915->drm);

                return 0;
        }

The cmpxchg() is ordered against a concurrent update of
i915->gem.mmap_singleton from mmap_singleton(). IOW, when
mmap_singleton() fails to get a reference on i915->gem.mmap_singleton
via mmap_singleton():

        rcu_read_lock();
        file = get_file_rcu(&i915->gem.mmap_singleton);
        rcu_read_unlock();

mmap_singleton() allocates a new file via anon_inode_getfile() and does

        smp_store_mb(i915->gem.mmap_singleton, file);

So, afaiu, then what happens in the case of this bug is that at some
point fput() is called and drops the file->f_count to zero but obviously
leaving the pointer in i915->gem.mmap_singleton in tact until
file->f_op->release::singleton_release() is called.

Now, there might be possibly larger delays until
file->f_op->release::singleton_release() is called and
i915->gem.mmap_singleton is set to NULL?

Say concurrently another task hits mmap_singleton() and does:

        rcu_read_lock();
        file = get_file_rcu(&i915->gem.mmap_singleton);
        rcu_read_unlock();

When get_file_rcu() fails to get a reference via atomic_inc_not_zero()
it will try the reload from i915->gem.mmap_singleton assuming it has
comparable semantics as __fget_files_rcu() that also reloads on
atomic_inc_not_zero() failure.

But since i915->gem.mmap_singleton doesn't have proper rcu semantics it
reloads the same pointer again, trying the same atomic_inc_not_zero()
again and doing so until file->f_op->release::singleton_release() of the
old file has been called.

So, in contrast to __fget_files_rcu() here we want to not retry when
atomic_inc_not_zero() has failed. We only want to retry in case we
managed to get a reference but the pointer did change on reload.

<3> [511.395679] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
<3> [511.395716] rcu:   Tasks blocked on level-1 rcu_node (CPUs 0-9): P6238
<3> [511.395934] rcu:   (detected by 16, t=65002 jiffies, g=123977, q=439 ncpus=20)
<6> [511.395944] task:i915_selftest   state:R  running task     stack:10568 pid:6238  tgid:6238  ppid:1001   flags:0x00004002
<6> [511.395962] Call Trace:
<6> [511.395966]  <TASK>
<6> [511.395974]  ? __schedule+0x3a8/0xd70
<6> [511.395995]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
<6> [511.396003]  ? lockdep_hardirqs_on+0xc3/0x140
<6> [511.396013]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
<6> [511.396029]  ? get_file_rcu+0x10/0x30
<6> [511.396039]  ? get_file_rcu+0x10/0x30
<6> [511.396046]  ? i915_gem_object_mmap+0xbc/0x450 [i915]
<6> [511.396509]  ? i915_gem_mmap+0x272/0x480 [i915]
<6> [511.396903]  ? mmap_region+0x253/0xb60
<6> [511.396925]  ? do_mmap+0x334/0x5c0
<6> [511.396939]  ? vm_mmap_pgoff+0x9f/0x1c0
<6> [511.396949]  ? rcu_is_watching+0x11/0x50
<6> [511.396962]  ? igt_mmap_offset+0xfc/0x110 [i915]
<6> [511.397376]  ? __igt_mmap+0xb3/0x570 [i915]
<6> [511.397762]  ? igt_mmap+0x11e/0x150 [i915]
<6> [511.398139]  ? __trace_bprintk+0x76/0x90
<6> [511.398156]  ? __i915_subtests+0xbf/0x240 [i915]
<6> [511.398586]  ? __pfx___i915_live_setup+0x10/0x10 [i915]
<6> [511.399001]  ? __pfx___i915_live_teardown+0x10/0x10 [i915]
<6> [511.399433]  ? __run_selftests+0xbc/0x1a0 [i915]
<6> [511.399875]  ? i915_live_selftests+0x4b/0x90 [i915]
<6> [511.400308]  ? i915_pci_probe+0x106/0x200 [i915]
<6> [511.400692]  ? pci_device_probe+0x95/0x120
<6> [511.400704]  ? really_probe+0x164/0x3c0
<6> [511.400715]  ? __pfx___driver_attach+0x10/0x10
<6> [511.400722]  ? __driver_probe_device+0x73/0x160
<6> [511.400731]  ? driver_probe_device+0x19/0xa0
<6> [511.400741]  ? __driver_attach+0xb6/0x180
<6> [511.400749]  ? __pfx___driver_attach+0x10/0x10
<6> [511.400756]  ? bus_for_each_dev+0x77/0xd0
<6> [511.400770]  ? bus_add_driver+0x114/0x210
<6> [511.400781]  ? driver_register+0x5b/0x110
<6> [511.400791]  ? i915_init+0x23/0xc0 [i915]
<6> [511.401153]  ? __pfx_i915_init+0x10/0x10 [i915]
<6> [511.401503]  ? do_one_initcall+0x57/0x270
<6> [511.401515]  ? rcu_is_watching+0x11/0x50
<6> [511.401521]  ? kmalloc_trace+0xa3/0xb0
<6> [511.401532]  ? do_init_module+0x5f/0x210
<6> [511.401544]  ? load_module+0x1d00/0x1f60
<6> [511.401581]  ? init_module_from_file+0x86/0xd0
<6> [511.401590]  ? init_module_from_file+0x86/0xd0
<6> [511.401613]  ? idempotent_init_module+0x17c/0x230
<6> [511.401639]  ? __x64_sys_finit_module+0x56/0xb0
<6> [511.401650]  ? do_syscall_64+0x3c/0x90
<6> [511.401659]  ? entry_SYSCALL_64_after_hwframe+0x6e/0xd8
<6> [511.401684]  </TASK>

Link: [1]: https://lore.kernel.org/intel-gfx/SJ1PR11MB6129CB39EED831784C331BAFB9DEA@SJ1PR11MB6129.namprd11.prod.outlook.com
Link: [2]: https://intel-gfx-ci.01.org/tree/linux-next/next-20231013/bat-dg2-11/igt@i915_selftest@live@mman.html#dmesg-warnings10963
Signed-off-by: Christian Brauner <brauner@kernel.org>
---

Jann, Linus,

Since you've been quite involved, can you check that what I'm babbling
here makes sense? If this isn't the fix then I would have to drop the
SLAB_TYPESAFE_BY_RCU conversion patch for now since I'd like to send PRs
by the end of the week.

This is on top of
https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git/log/?h=vfs.misc

Appreciate the help, thanks!
Christian

---
 drivers/gpu/drm/i915/gem/i915_gem_mman.c |  2 +-
 fs/file.c                                | 17 ++++++++++++++++-
 include/linux/fs.h                       |  1 +
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index 97bf10861cad..da97e61b18d4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -916,7 +916,7 @@ static struct file *mmap_singleton(struct drm_i915_private *i915)
 	struct file *file;
 
 	rcu_read_lock();
-	file = get_file_rcu(&i915->gem.mmap_singleton);
+	file = get_file_rcu_once(&i915->gem.mmap_singleton);
 	rcu_read_unlock();
 	if (file)
 		return file;
diff --git a/fs/file.c b/fs/file.c
index 1a475d7d636e..2c64d6836f0c 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -867,7 +867,7 @@ static struct file *__get_file_rcu(struct file __rcu **f)
 		return NULL;
 
 	if (unlikely(!atomic_long_inc_not_zero(&file->f_count)))
-		return ERR_PTR(-EAGAIN);
+		return ERR_PTR(-EINVAL);
 
 	file_reloaded = rcu_dereference_raw(*f);
 
@@ -927,6 +927,21 @@ struct file *get_file_rcu(struct file __rcu **f)
 }
 EXPORT_SYMBOL_GPL(get_file_rcu);
 
+struct file *get_file_rcu_once(struct file __rcu **f)
+{
+	for (;;) {
+		struct file __rcu *file;
+
+		file = __get_file_rcu(f);
+		if (!IS_ERR(file))
+			return file;
+
+		if (PTR_ERR(file) == -EINVAL)
+			return NULL;
+	}
+}
+EXPORT_SYMBOL_GPL(get_file_rcu_once);
+
 static inline struct file *__fget_files_rcu(struct files_struct *files,
        unsigned int fd, fmode_t mask)
 {
diff --git a/include/linux/fs.h b/include/linux/fs.h
index cb8bfa1f8ecb..657bbd824490 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1044,6 +1044,7 @@ static inline struct file *get_file(struct file *f)
 }
 
 struct file *get_file_rcu(struct file __rcu **f);
+struct file *get_file_rcu_once(struct file __rcu **f);
 
 #define file_count(x)	atomic_long_read(&(x)->f_count)
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [Intel-gfx] [PATCH] file, i915: fix file reference for mmap_singleton()
  2023-10-25 12:01 [Intel-gfx] [PATCH] file, i915: fix file reference for mmap_singleton() Christian Brauner
@ 2023-10-25 15:23 ` Jann Horn
  2023-10-26  9:34   ` Tvrtko Ursulin
  2023-10-25 18:52 ` Linus Torvalds
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 10+ messages in thread
From: Jann Horn @ 2023-10-25 15:23 UTC (permalink / raw)
  To: Christian Brauner, Jani Nikula, Joonas Lahtinen, Rodrigo Vivi,
	Tvrtko Ursulin, intel-gfx
  Cc: Linus Torvalds, Kurmi, Suresh Kumar

On Wed, Oct 25, 2023 at 2:01 PM Christian Brauner <brauner@kernel.org> wrote:
> Today we got a report at [1] for rcu stalls on the i915 testsuite in [2]
> due to the conversion of files to SLAB_TYPSSAFE_BY_RCU. Afaict,
> get_file_rcu() goes into an infinite loop trying to carefully verify
> that i915->gem.mmap_singleton hasn't changed - see the splat below.
>
> So I stared at this code to figure out what it actually does. It seems
> that the i915->gem.mmap_singleton pointer itself never had rcu semantics.
>
> The i915->gem.mmap_singleton is replaced in
> file->f_op->release::singleton_release():
>
>         static int singleton_release(struct inode *inode, struct file *file)
>         {
>                 struct drm_i915_private *i915 = file->private_data;
>
>                 cmpxchg(&i915->gem.mmap_singleton, file, NULL);
>                 drm_dev_put(&i915->drm);
>
>                 return 0;
>         }
>
> The cmpxchg() is ordered against a concurrent update of
> i915->gem.mmap_singleton from mmap_singleton(). IOW, when
> mmap_singleton() fails to get a reference on i915->gem.mmap_singleton
> via mmap_singleton():
>
>         rcu_read_lock();
>         file = get_file_rcu(&i915->gem.mmap_singleton);
>         rcu_read_unlock();
>
> mmap_singleton() allocates a new file via anon_inode_getfile() and does
>
>         smp_store_mb(i915->gem.mmap_singleton, file);
>
> So, afaiu, then what happens in the case of this bug is that at some
> point fput() is called and drops the file->f_count to zero but obviously
> leaving the pointer in i915->gem.mmap_singleton in tact until
> file->f_op->release::singleton_release() is called.
>
> Now, there might be possibly larger delays until
> file->f_op->release::singleton_release() is called and
> i915->gem.mmap_singleton is set to NULL?
>
> Say concurrently another task hits mmap_singleton() and does:
>
>         rcu_read_lock();
>         file = get_file_rcu(&i915->gem.mmap_singleton);
>         rcu_read_unlock();
>
> When get_file_rcu() fails to get a reference via atomic_inc_not_zero()
> it will try the reload from i915->gem.mmap_singleton assuming it has
> comparable semantics as __fget_files_rcu() that also reloads on
> atomic_inc_not_zero() failure.
>
> But since i915->gem.mmap_singleton doesn't have proper rcu semantics it
> reloads the same pointer again, trying the same atomic_inc_not_zero()
> again and doing so until file->f_op->release::singleton_release() of the
> old file has been called.
>
> So, in contrast to __fget_files_rcu() here we want to not retry when
> atomic_inc_not_zero() has failed. We only want to retry in case we
> managed to get a reference but the pointer did change on reload.
[...]
> Link: [1]: https://lore.kernel.org/intel-gfx/SJ1PR11MB6129CB39EED831784C331BAFB9DEA@SJ1PR11MB6129.namprd11.prod.outlook.com
> Link: [2]: https://intel-gfx-ci.01.org/tree/linux-next/next-20231013/bat-dg2-11/igt@i915_selftest@live@mman.html#dmesg-warnings10963
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Patch makes sense to me. I'm not sure why you changed EAGAIN to
EINVAL, and it's obviously a bit ugly, but it looks like a valid fix
for what the SLUB_TYPESAFE_BY_RCU conversion broke.

Reviewed-by: Jann Horn <jannh@google.com>


But as a side note, I am a bit confused about how the concurrency of
the existing code is supposed to work... it looks like two parallel
calls to mmap_singleton() can end up returning different pointers, and
then the singleton is not actually a singleton anymore? If that part
of the existing code is wrong even before the SLAB_TYPESAFE_BY_RCU
conversion, we might later have to open-code get_file_rcu() in
mmap_singleton(), so that we can do a cmpxchg at the end that checks
whether the i915->gem.mmap_singleton pointer is still the same?

Like:

static struct file *mmap_singleton(struct drm_i915_private *i915)
{
        struct file *file, *new_file;

        rcu_read_lock();
        file = READ_ONCE(i915->gem.mmap_singleton);
        if (file && atomic_long_inc_not_zero(&file->f_count)) {
                rcu_read_unlock();
                return file;
        }
        rcu_read_unlock();

        new_file = anon_inode_getfile("i915.gem", &singleton_fops,
i915, O_RDWR);
        if (IS_ERR(new_file))
                return new_file;

        /* Everyone shares a single global address space */
        new_file->f_mapping = i915->drm.anon_inode->i_mapping;

        if (try_cmpxchg(&i915->gem.mmap_singleton, &file, new_file)) {
                // something with drm_dev refcount ?
                return new_file;
        }

        // something with drm_dev refcount ?
        fput(new_file);

        return file;
}

It would be nice to get some i915 maintainer's comment on how the
singleton stuff is supposed to work.

> ---
>
> Jann, Linus,
>
> Since you've been quite involved, can you check that what I'm babbling
> here makes sense? If this isn't the fix then I would have to drop the
> SLAB_TYPESAFE_BY_RCU conversion patch for now since I'd like to send PRs
> by the end of the week.
>
> This is on top of
> https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git/log/?h=vfs.misc
>
> Appreciate the help, thanks!
> Christian
>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_mman.c |  2 +-
>  fs/file.c                                | 17 ++++++++++++++++-
>  include/linux/fs.h                       |  1 +
>  3 files changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> index 97bf10861cad..da97e61b18d4 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> @@ -916,7 +916,7 @@ static struct file *mmap_singleton(struct drm_i915_private *i915)
>         struct file *file;
>
>         rcu_read_lock();
> -       file = get_file_rcu(&i915->gem.mmap_singleton);
> +       file = get_file_rcu_once(&i915->gem.mmap_singleton);
>         rcu_read_unlock();
>         if (file)
>                 return file;
> diff --git a/fs/file.c b/fs/file.c
> index 1a475d7d636e..2c64d6836f0c 100644
> --- a/fs/file.c
> +++ b/fs/file.c
> @@ -867,7 +867,7 @@ static struct file *__get_file_rcu(struct file __rcu **f)
>                 return NULL;
>
>         if (unlikely(!atomic_long_inc_not_zero(&file->f_count)))
> -               return ERR_PTR(-EAGAIN);
> +               return ERR_PTR(-EINVAL);
>
>         file_reloaded = rcu_dereference_raw(*f);
>
> @@ -927,6 +927,21 @@ struct file *get_file_rcu(struct file __rcu **f)
>  }
>  EXPORT_SYMBOL_GPL(get_file_rcu);
>
> +struct file *get_file_rcu_once(struct file __rcu **f)
> +{
> +       for (;;) {
> +               struct file __rcu *file;
> +
> +               file = __get_file_rcu(f);
> +               if (!IS_ERR(file))
> +                       return file;
> +
> +               if (PTR_ERR(file) == -EINVAL)
> +                       return NULL;
> +       }
> +}
> +EXPORT_SYMBOL_GPL(get_file_rcu_once);
> +
>  static inline struct file *__fget_files_rcu(struct files_struct *files,
>         unsigned int fd, fmode_t mask)
>  {
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index cb8bfa1f8ecb..657bbd824490 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1044,6 +1044,7 @@ static inline struct file *get_file(struct file *f)
>  }
>
>  struct file *get_file_rcu(struct file __rcu **f);
> +struct file *get_file_rcu_once(struct file __rcu **f);
>
>  #define file_count(x)  atomic_long_read(&(x)->f_count)
>
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Intel-gfx] [PATCH] file, i915: fix file reference for mmap_singleton()
  2023-10-25 12:01 [Intel-gfx] [PATCH] file, i915: fix file reference for mmap_singleton() Christian Brauner
  2023-10-25 15:23 ` Jann Horn
@ 2023-10-25 18:52 ` Linus Torvalds
  2023-10-25 20:36   ` Christian Brauner
  2023-10-25 20:16 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for " Patchwork
  2023-10-25 23:48 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for file, i915: fix file reference for mmap_singleton() (rev2) Patchwork
  3 siblings, 1 reply; 10+ messages in thread
From: Linus Torvalds @ 2023-10-25 18:52 UTC (permalink / raw)
  To: Christian Brauner; +Cc: Jann Horn, intel-gfx, Kurmi, Suresh Kumar

On Wed, 25 Oct 2023 at 02:01, Christian Brauner <brauner@kernel.org> wrote:
>
>         rcu_read_lock();
> -       file = get_file_rcu(&i915->gem.mmap_singleton);
> +       file = get_file_rcu_once(&i915->gem.mmap_singleton);
>         rcu_read_unlock();
>         if (file)
>                 return file;

Honestly, the get_file_rcu_once() function seems a bit pointless.

The above doesn't want a loop at all. Yet your "once" still does loop,
because "even get_file_rcu_once() is trying to deal with the whole
"the pointer itself changed".

And the i915 code is actually designed to just depend on the atomicity
of the mmap_singleton access itself, in how it uses

        cmpxchg(&i915->gem.mmap_singleton, file, NULL);
...
        file = READ_ONCE(i915->gem.mmap_singleton);

and literally says "I'll remove my singleton as it is released". The
important part there is that the 'map_singleton' pointer itself isn't
actually reference-counted - it's the reverse, where
reference-counting *other* instances will then auto-clear it.

And that's what then makes that get_file_rcu() model not work with it,
because get_file_rcu() kind of assumes that the argument it gets is
*part* of the reference counting, not a cached *result* of the
reference counting that gets cleared as a result of the ref going down
to zero.

I may explain my objections badly, but hopefully you get what I mean.

And I think that also means that using that new get_file_rcu_once()
may be hiding the actual problem, but it's still conceptually wrong,
because it still has that conceptual model of "the pointer I'm getting
is part of the reference counting", when it really isn't.

So I think we'd actually be better off with something that is more
clearly different from get_file_rcu() in naming, so make that
conceptual difference clearer. Make it be something like
"get_active_file(struct file **)", and make the implementation be just
exactly what your current __get_file_rcu() is with no loops at all.

Then thew i915 code ends up being

        rcu_read_lock();
        file = get_active_file(&i915->gem.mmap_singleton);
        rcu_read_unlock();

        if (!IS_ERR_OR_NULL(file))
                return file;

       .. create new mmap_singleton ..

and that's it.

If you don't want to expose __get_file_rcu() as-is, you could maybe just do

  struct file *get_active_file(struct file **fp)
  {
        struct file *file;
        rcu_read_lock();
        file = __get_file_rcu(fp);
        rcu_read_unlock();
        return file;
  }

and then the i916 code would just drop the RCU locking that it has no
business even knowing about.

I realize that my complaints are a bit conceptual, and that the
practical end result is pretty much the same, but I do think that it
is worth noting this conceptual difference between "file pointer is
ref-counted" and "file counter is *controlled* by ref-counting".

Add a comment to the effect at get_active_file() users.

The old i915 code is already racy, in that it's called a "singleton",
but if you have multiple concurrent callers to mmap_singleton(), they
all might see a NULL file at first, and then they all create
*separate* new "singleton" files, and they *all* do that

        smp_store_mb(i915->gem.mmap_singleton, file);

and one random case of them happens to win the race and set *its* file
as "THE singleton" file.

So your "let's loop if it changes" is not fixing anything as-is, and
it's just actually hiding what is going on.

If the i915 code wants to be consistent and really have just one
singleton, it needs to do the looping with some cmpxchg or whatever
itself. Doing the loop in some get_file_rcu_once() function for when
the file pointer changed isn't going to fix the race.

Am I missing something?

              Linus

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Intel-gfx] ✗ Fi.CI.BUILD: failure for file, i915: fix file reference for mmap_singleton()
  2023-10-25 12:01 [Intel-gfx] [PATCH] file, i915: fix file reference for mmap_singleton() Christian Brauner
  2023-10-25 15:23 ` Jann Horn
  2023-10-25 18:52 ` Linus Torvalds
@ 2023-10-25 20:16 ` Patchwork
  2023-10-25 20:38   ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for file, i915: fix file reference for mmap_singleton()y Christian Brauner
  2023-10-25 23:48 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for file, i915: fix file reference for mmap_singleton() (rev2) Patchwork
  3 siblings, 1 reply; 10+ messages in thread
From: Patchwork @ 2023-10-25 20:16 UTC (permalink / raw)
  To: Christian Brauner; +Cc: intel-gfx

== Series Details ==

Series: file, i915: fix file reference for mmap_singleton()
URL   : https://patchwork.freedesktop.org/series/125570/
State : failure

== Summary ==

Error: patch https://patchwork.freedesktop.org/api/1.0/series/125570/revisions/1/mbox/ not applied
Applying: file, i915: fix file reference for mmap_singleton()
Using index info to reconstruct a base tree...
M	drivers/gpu/drm/i915/gem/i915_gem_mman.c
M	fs/file.c
M	include/linux/fs.h
Falling back to patching base and 3-way merge...
Auto-merging include/linux/fs.h
CONFLICT (content): Merge conflict in include/linux/fs.h
Auto-merging fs/file.c
CONFLICT (content): Merge conflict in fs/file.c
Auto-merging drivers/gpu/drm/i915/gem/i915_gem_mman.c
CONFLICT (content): Merge conflict in drivers/gpu/drm/i915/gem/i915_gem_mman.c
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 file, i915: fix file reference for mmap_singleton()
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".
Build failed, no error log produced



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Intel-gfx] [PATCH] file, i915: fix file reference for mmap_singleton()
  2023-10-25 18:52 ` Linus Torvalds
@ 2023-10-25 20:36   ` Christian Brauner
  2023-10-25 23:44     ` Linus Torvalds
  0 siblings, 1 reply; 10+ messages in thread
From: Christian Brauner @ 2023-10-25 20:36 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jann Horn, intel-gfx, Kurmi, Suresh Kumar

[-- Attachment #1: Type: text/plain, Size: 4138 bytes --]

On Wed, Oct 25, 2023 at 08:52:57AM -1000, Linus Torvalds wrote:
> On Wed, 25 Oct 2023 at 02:01, Christian Brauner <brauner@kernel.org> wrote:
> >
> >         rcu_read_lock();
> > -       file = get_file_rcu(&i915->gem.mmap_singleton);
> > +       file = get_file_rcu_once(&i915->gem.mmap_singleton);
> >         rcu_read_unlock();
> >         if (file)
> >                 return file;
> 
> Honestly, the get_file_rcu_once() function seems a bit pointless.
> 
> The above doesn't want a loop at all. Yet your "once" still does loop,
> because "even get_file_rcu_once() is trying to deal with the whole
> "the pointer itself changed".
> 
> And the i915 code is actually designed to just depend on the atomicity
> of the mmap_singleton access itself, in how it uses
> 
>         cmpxchg(&i915->gem.mmap_singleton, file, NULL);
> ...
>         file = READ_ONCE(i915->gem.mmap_singleton);
> 
> and literally says "I'll remove my singleton as it is released". The
> important part there is that the 'map_singleton' pointer itself isn't
> actually reference-counted - it's the reverse, where
> reference-counting *other* instances will then auto-clear it.
> 
> And that's what then makes that get_file_rcu() model not work with it,
> because get_file_rcu() kind of assumes that the argument it gets is
> *part* of the reference counting, not a cached *result* of the
> reference counting that gets cleared as a result of the ref going down
> to zero.
> 
> I may explain my objections badly, but hopefully you get what I mean.

No, I get it. I was on the fence how to deal with this because I
honestly find this model here extremely weird and very unintuitive to
begin with with the pointer being NULLed during release and replaced
that way.

> 
> And I think that also means that using that new get_file_rcu_once()
> may be hiding the actual problem, but it's still conceptually wrong,
> because it still has that conceptual model of "the pointer I'm getting
> is part of the reference counting", when it really isn't.
> 
> So I think we'd actually be better off with something that is more
> clearly different from get_file_rcu() in naming, so make that
> conceptual difference clearer. Make it be something like
> "get_active_file(struct file **)", and make the implementation be just
> exactly what your current __get_file_rcu() is with no loops at all.
> 
> Then thew i915 code ends up being
> 
>         rcu_read_lock();
>         file = get_active_file(&i915->gem.mmap_singleton);
>         rcu_read_unlock();
> 
>         if (!IS_ERR_OR_NULL(file))
>                 return file;
> 
>        .. create new mmap_singleton ..
> 
> and that's it.
> 
> If you don't want to expose __get_file_rcu() as-is, you could maybe just do
> 
>   struct file *get_active_file(struct file **fp)
>   {
>         struct file *file;
>         rcu_read_lock();
>         file = __get_file_rcu(fp);
>         rcu_read_unlock();
>         return file;
>   }
> 
> and then the i916 code would just drop the RCU locking that it has no
> business even knowing about.

Yeah, fair idea. I just want this fixed so we don't have to drop. Pushed
to vfs.misc with your suggested changes for get_file_active() with rcu
hidden in that function. Double-check and yell if something looks wrong:

> The old i915 code is already racy, in that it's called a "singleton",
> but if you have multiple concurrent callers to mmap_singleton(), they
> all might see a NULL file at first, and then they all create
> *separate* new "singleton" files, and they *all* do that
> 
>         smp_store_mb(i915->gem.mmap_singleton, file);
> 
> and one random case of them happens to win the race and set *its* file
> as "THE singleton" file.

Yeah, I think that's all really weird but whatever.

> Am I missing something?

No, I don't think so. I thought you might have a good opinion here.

I did think that the loop didn't really matter for this case but since
it seemingly does paper over the weird semantics here let's drop it.
Anyway, pushed to:

https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git/commit/?h=vfs.misc

and appended here. Please yell, if something's still off.

[-- Attachment #2: 0001-file-i915-fix-file-reference-for-mmap_singleton.patch --]
[-- Type: text/x-diff, Size: 7876 bytes --]

From 61d4fb0b349ec1b33119913c3b0bd109de30142c Mon Sep 17 00:00:00 2001
From: Christian Brauner <brauner@kernel.org>
Date: Wed, 25 Oct 2023 12:14:37 +0200
Subject: [PATCH] file, i915: fix file reference for mmap_singleton()

Today we got a report at [1] for rcu stalls on the i915 testsuite in [2]
due to the conversion of files to SLAB_TYPSSAFE_BY_RCU. Afaict,
get_file_rcu() goes into an infinite loop trying to carefully verify
that i915->gem.mmap_singleton hasn't changed - see the splat below.

So I stared at this code to figure out what it actually does. It seems
that the i915->gem.mmap_singleton pointer itself never had rcu semantics.

The i915->gem.mmap_singleton is replaced in
file->f_op->release::singleton_release():

        static int singleton_release(struct inode *inode, struct file *file)
        {
                struct drm_i915_private *i915 = file->private_data;

                cmpxchg(&i915->gem.mmap_singleton, file, NULL);
                drm_dev_put(&i915->drm);

                return 0;
        }

The cmpxchg() is ordered against a concurrent update of
i915->gem.mmap_singleton from mmap_singleton(). IOW, when
mmap_singleton() fails to get a reference on i915->gem.mmap_singleton:

While mmap_singleton() does

        rcu_read_lock();
        file = get_file_rcu(&i915->gem.mmap_singleton);
        rcu_read_unlock();

it allocates a new file via anon_inode_getfile() and does

        smp_store_mb(i915->gem.mmap_singleton, file);

So, then what happens in the case of this bug is that at some point
fput() is called and drops the file->f_count to zero leaving the pointer
in i915->gem.mmap_singleton in tact.

Now, there might be delays until
file->f_op->release::singleton_release() is called and
i915->gem.mmap_singleton is set to NULL.

Say concurrently another task hits mmap_singleton() and does:

        rcu_read_lock();
        file = get_file_rcu(&i915->gem.mmap_singleton);
        rcu_read_unlock();

When get_file_rcu() fails to get a reference via atomic_inc_not_zero()
it will try the reload from i915->gem.mmap_singleton expecting it to be
NULL, assuming it has comparable semantics as we expect in
__fget_files_rcu().

But it hasn't so it reloads the same pointer again, trying the same
atomic_inc_not_zero() again and doing so until
file->f_op->release::singleton_release() of the old file has been
called.

So, in contrast to __fget_files_rcu() here we want to not retry when
atomic_inc_not_zero() has failed. We only want to retry in case we
managed to get a reference but the pointer did change on reload.

<3> [511.395679] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
<3> [511.395716] rcu:   Tasks blocked on level-1 rcu_node (CPUs 0-9): P6238
<3> [511.395934] rcu:   (detected by 16, t=65002 jiffies, g=123977, q=439 ncpus=20)
<6> [511.395944] task:i915_selftest   state:R  running task     stack:10568 pid:6238  tgid:6238  ppid:1001   flags:0x00004002
<6> [511.395962] Call Trace:
<6> [511.395966]  <TASK>
<6> [511.395974]  ? __schedule+0x3a8/0xd70
<6> [511.395995]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
<6> [511.396003]  ? lockdep_hardirqs_on+0xc3/0x140
<6> [511.396013]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
<6> [511.396029]  ? get_file_rcu+0x10/0x30
<6> [511.396039]  ? get_file_rcu+0x10/0x30
<6> [511.396046]  ? i915_gem_object_mmap+0xbc/0x450 [i915]
<6> [511.396509]  ? i915_gem_mmap+0x272/0x480 [i915]
<6> [511.396903]  ? mmap_region+0x253/0xb60
<6> [511.396925]  ? do_mmap+0x334/0x5c0
<6> [511.396939]  ? vm_mmap_pgoff+0x9f/0x1c0
<6> [511.396949]  ? rcu_is_watching+0x11/0x50
<6> [511.396962]  ? igt_mmap_offset+0xfc/0x110 [i915]
<6> [511.397376]  ? __igt_mmap+0xb3/0x570 [i915]
<6> [511.397762]  ? igt_mmap+0x11e/0x150 [i915]
<6> [511.398139]  ? __trace_bprintk+0x76/0x90
<6> [511.398156]  ? __i915_subtests+0xbf/0x240 [i915]
<6> [511.398586]  ? __pfx___i915_live_setup+0x10/0x10 [i915]
<6> [511.399001]  ? __pfx___i915_live_teardown+0x10/0x10 [i915]
<6> [511.399433]  ? __run_selftests+0xbc/0x1a0 [i915]
<6> [511.399875]  ? i915_live_selftests+0x4b/0x90 [i915]
<6> [511.400308]  ? i915_pci_probe+0x106/0x200 [i915]
<6> [511.400692]  ? pci_device_probe+0x95/0x120
<6> [511.400704]  ? really_probe+0x164/0x3c0
<6> [511.400715]  ? __pfx___driver_attach+0x10/0x10
<6> [511.400722]  ? __driver_probe_device+0x73/0x160
<6> [511.400731]  ? driver_probe_device+0x19/0xa0
<6> [511.400741]  ? __driver_attach+0xb6/0x180
<6> [511.400749]  ? __pfx___driver_attach+0x10/0x10
<6> [511.400756]  ? bus_for_each_dev+0x77/0xd0
<6> [511.400770]  ? bus_add_driver+0x114/0x210
<6> [511.400781]  ? driver_register+0x5b/0x110
<6> [511.400791]  ? i915_init+0x23/0xc0 [i915]
<6> [511.401153]  ? __pfx_i915_init+0x10/0x10 [i915]
<6> [511.401503]  ? do_one_initcall+0x57/0x270
<6> [511.401515]  ? rcu_is_watching+0x11/0x50
<6> [511.401521]  ? kmalloc_trace+0xa3/0xb0
<6> [511.401532]  ? do_init_module+0x5f/0x210
<6> [511.401544]  ? load_module+0x1d00/0x1f60
<6> [511.401581]  ? init_module_from_file+0x86/0xd0
<6> [511.401590]  ? init_module_from_file+0x86/0xd0
<6> [511.401613]  ? idempotent_init_module+0x17c/0x230
<6> [511.401639]  ? __x64_sys_finit_module+0x56/0xb0
<6> [511.401650]  ? do_syscall_64+0x3c/0x90
<6> [511.401659]  ? entry_SYSCALL_64_after_hwframe+0x6e/0xd8
<6> [511.401684]  </TASK>

Link: [1]: https://lore.kernel.org/intel-gfx/SJ1PR11MB6129CB39EED831784C331BAFB9DEA@SJ1PR11MB6129.namprd11.prod.outlook.com
Link: [2]: https://intel-gfx-ci.01.org/tree/linux-next/next-20231013/bat-dg2-11/igt@i915_selftest@live@mman.html#dmesg-warnings10963
Cc: Jann Horn <jannh@google.com>,
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20231025-formfrage-watscheln-84526cd3bd7d@brauner
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 drivers/gpu/drm/i915/gem/i915_gem_mman.c |  4 +---
 fs/file.c                                | 25 ++++++++++++++++++++++++
 include/linux/fs.h                       |  1 +
 3 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index 97bf10861cad..72b353737334 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -915,9 +915,7 @@ static struct file *mmap_singleton(struct drm_i915_private *i915)
 {
 	struct file *file;
 
-	rcu_read_lock();
-	file = get_file_rcu(&i915->gem.mmap_singleton);
-	rcu_read_unlock();
+	file = get_file_active(&i915->gem.mmap_singleton);
 	if (file)
 		return file;
 
diff --git a/fs/file.c b/fs/file.c
index 1a475d7d636e..5fb0b146e79e 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -927,6 +927,31 @@ struct file *get_file_rcu(struct file __rcu **f)
 }
 EXPORT_SYMBOL_GPL(get_file_rcu);
 
+/**
+ * get_file_active - try go get a reference to a file
+ * @f: the file to get a reference on
+ *
+ * In contast to get_file_rcu() the pointer itself isn't part of the
+ * reference counting.
+ *
+ * This function should rarely have to be used and only by users who
+ * understand the implications of SLAB_TYPESAFE_BY_RCU. Try to avoid it.
+ *
+ * Return: Returns @f with the reference count increased or NULL.
+ */
+struct file *get_file_active(struct file **f)
+{
+	struct file __rcu *file;
+
+	rcu_read_lock();
+	file = __get_file_rcu(f);
+	rcu_read_unlock();
+	if (IS_ERR(file))
+		file = NULL;
+	return file;
+}
+EXPORT_SYMBOL_GPL(get_file_active);
+
 static inline struct file *__fget_files_rcu(struct files_struct *files,
        unsigned int fd, fmode_t mask)
 {
diff --git a/include/linux/fs.h b/include/linux/fs.h
index cb8bfa1f8ecb..b35815277cc6 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1044,6 +1044,7 @@ static inline struct file *get_file(struct file *f)
 }
 
 struct file *get_file_rcu(struct file __rcu **f);
+struct file *get_file_active(struct file **f);
 
 #define file_count(x)	atomic_long_read(&(x)->f_count)
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [Intel-gfx]  ✗ Fi.CI.BUILD: failure for file, i915: fix file reference for mmap_singleton()y
  2023-10-25 20:16 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for " Patchwork
@ 2023-10-25 20:38   ` Christian Brauner
  2023-10-26  5:52     ` Saarinen, Jani
  0 siblings, 1 reply; 10+ messages in thread
From: Christian Brauner @ 2023-10-25 20:38 UTC (permalink / raw)
  To: intel-gfx

On Wed, Oct 25, 2023 at 08:16:23PM -0000, Patchwork wrote:
> == Series Details ==
> 
> Series: file, i915: fix file reference for mmap_singleton()
> URL   : https://patchwork.freedesktop.org/series/125570/
> State : failure
> 
> == Summary ==
> 
> Error: patch https://patchwork.freedesktop.org/api/1.0/series/125570/revisions/1/mbox/ not applied
> Applying: file, i915: fix file reference for mmap_singleton()
> Using index info to reconstruct a base tree...
> M	drivers/gpu/drm/i915/gem/i915_gem_mman.c
> M	fs/file.c
> M	include/linux/fs.h
> Falling back to patching base and 3-way merge...
> Auto-merging include/linux/fs.h
> CONFLICT (content): Merge conflict in include/linux/fs.h
> Auto-merging fs/file.c
> CONFLICT (content): Merge conflict in fs/file.c
> Auto-merging drivers/gpu/drm/i915/gem/i915_gem_mman.c
> CONFLICT (content): Merge conflict in drivers/gpu/drm/i915/gem/i915_gem_mman.c
> error: Failed to merge in the changes.
> hint: Use 'git am --show-current-patch=diff' to see the failed patch
> Patch failed at 0001 file, i915: fix file reference for mmap_singleton()
> When you have resolved this problem, run "git am --continue".
> If you prefer to skip this patch, run "git am --skip" instead.
> To restore the original branch and stop patching, run "git am --abort".
> Build failed, no error log produced

I'm not sure what tree you're testing on but please test on whatever is
in vfs.misc, I guess which has that patch.
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Intel-gfx] [PATCH] file, i915: fix file reference for mmap_singleton()
  2023-10-25 20:36   ` Christian Brauner
@ 2023-10-25 23:44     ` Linus Torvalds
  0 siblings, 0 replies; 10+ messages in thread
From: Linus Torvalds @ 2023-10-25 23:44 UTC (permalink / raw)
  To: Christian Brauner; +Cc: Jann Horn, intel-gfx, Kurmi, Suresh Kumar

On Wed, 25 Oct 2023 at 10:36, Christian Brauner <brauner@kernel.org> wrote:
>
> I did think that the loop didn't really matter for this case but since
> it seemingly does paper over the weird semantics here let's drop it.
> Anyway, pushed to:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git/commit/?h=vfs.misc

LGTM.

We could then make the i915 mmap_singleton() do the proper loop over
the whole thing. Not quite the way Jann suggested - which would not
increment the file refcount properly, but instead of the current

        smp_store_mb(i915->gem.mmap_singleton, file);
        drm_dev_get(&i915->drm);
        return file;

it could do something much more complicated like

        drm_dev_get(&i915->drm);
        if (cmpxchg(&i915->gem.mmap_singleton, NULL, file) == NULL)
                return file;

        // mmap_singleton wasn't NULL: it might be an old one in the
        // process of being torn down (with a zero refcount), or a new
        // one that was just installed that we should use instead
        fput(file);
        file = READ_ONCE(i915->gem.mmap_singleton);
        if (!file)
                goto repeat;
        // Is it valid? Just try again
        if (atomic_read(&file->f_count))
                goto repeat;

        // We have a file pointer, but it's in the process of being torn
        // down but gem.mmap_singleton hasn't been cleared yet. Yield to
        // make progress.
        sched_yield();
        goto repeat;

which is disgusting, but would probably work.

Note the "probably work". I'm handwaving: "something like the above".

Presumably not even worth doing - I assume a correct client always
just does a single mmap() before starting work.

               Linus

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Intel-gfx] ✗ Fi.CI.BUILD: failure for file, i915: fix file reference for mmap_singleton() (rev2)
  2023-10-25 12:01 [Intel-gfx] [PATCH] file, i915: fix file reference for mmap_singleton() Christian Brauner
                   ` (2 preceding siblings ...)
  2023-10-25 20:16 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for " Patchwork
@ 2023-10-25 23:48 ` Patchwork
  3 siblings, 0 replies; 10+ messages in thread
From: Patchwork @ 2023-10-25 23:48 UTC (permalink / raw)
  To: Christian Brauner; +Cc: intel-gfx

== Series Details ==

Series: file, i915: fix file reference for mmap_singleton() (rev2)
URL   : https://patchwork.freedesktop.org/series/125570/
State : failure

== Summary ==

Error: patch https://patchwork.freedesktop.org/api/1.0/series/125570/revisions/2/mbox/ not applied
Patch is empty.
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To record the empty patch as an empty commit, run "git am --allow-empty".
To restore the original branch and stop patching, run "git am --abort".
Build failed, no error log produced



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Intel-gfx]  ✗ Fi.CI.BUILD: failure for file, i915: fix file reference for mmap_singleton()y
  2023-10-25 20:38   ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for file, i915: fix file reference for mmap_singleton()y Christian Brauner
@ 2023-10-26  5:52     ` Saarinen, Jani
  0 siblings, 0 replies; 10+ messages in thread
From: Saarinen, Jani @ 2023-10-26  5:52 UTC (permalink / raw)
  To: Christian Brauner, intel-gfx

Hi, 
CI is always testing against https://cgit.freedesktop.org/drm-tip 

Br
Jani

> -----Original Message-----
> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Christian
> Brauner
> Sent: Wednesday, October 25, 2023 11:39 PM
> To: intel-gfx@lists.freedesktop.org
> Subject: Re: [Intel-gfx] ✗ Fi.CI.BUILD: failure for file, i915: fix file reference for
> mmap_singleton()y
> 
> On Wed, Oct 25, 2023 at 08:16:23PM -0000, Patchwork wrote:
> > == Series Details ==
> >
> > Series: file, i915: fix file reference for mmap_singleton()
> > URL   : https://patchwork.freedesktop.org/series/125570/
> > State : failure
> >
> > == Summary ==
> >
> > Error: patch
> > https://patchwork.freedesktop.org/api/1.0/series/125570/revisions/1/mb
> > ox/ not applied
> > Applying: file, i915: fix file reference for mmap_singleton() Using
> > index info to reconstruct a base tree...
> > M	drivers/gpu/drm/i915/gem/i915_gem_mman.c
> > M	fs/file.c
> > M	include/linux/fs.h
> > Falling back to patching base and 3-way merge...
> > Auto-merging include/linux/fs.h
> > CONFLICT (content): Merge conflict in include/linux/fs.h Auto-merging
> > fs/file.c CONFLICT (content): Merge conflict in fs/file.c Auto-merging
> > drivers/gpu/drm/i915/gem/i915_gem_mman.c
> > CONFLICT (content): Merge conflict in
> > drivers/gpu/drm/i915/gem/i915_gem_mman.c
> > error: Failed to merge in the changes.
> > hint: Use 'git am --show-current-patch=diff' to see the failed patch
> > Patch failed at 0001 file, i915: fix file reference for
> > mmap_singleton() When you have resolved this problem, run "git am --
> continue".
> > If you prefer to skip this patch, run "git am --skip" instead.
> > To restore the original branch and stop patching, run "git am --abort".
> > Build failed, no error log produced
> 
> I'm not sure what tree you're testing on but please test on whatever is in
> vfs.misc, I guess which has that patch.
> git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Intel-gfx] [PATCH] file, i915: fix file reference for mmap_singleton()
  2023-10-25 15:23 ` Jann Horn
@ 2023-10-26  9:34   ` Tvrtko Ursulin
  0 siblings, 0 replies; 10+ messages in thread
From: Tvrtko Ursulin @ 2023-10-26  9:34 UTC (permalink / raw)
  To: Jann Horn, Christian Brauner, Jani Nikula, Joonas Lahtinen,
	Rodrigo Vivi, intel-gfx
  Cc: Linus Torvalds, Kurmi, Suresh Kumar


On 25/10/2023 16:23, Jann Horn wrote:
> On Wed, Oct 25, 2023 at 2:01 PM Christian Brauner <brauner@kernel.org> wrote:
>> Today we got a report at [1] for rcu stalls on the i915 testsuite in [2]
>> due to the conversion of files to SLAB_TYPSSAFE_BY_RCU. Afaict,
>> get_file_rcu() goes into an infinite loop trying to carefully verify
>> that i915->gem.mmap_singleton hasn't changed - see the splat below.
>>
>> So I stared at this code to figure out what it actually does. It seems
>> that the i915->gem.mmap_singleton pointer itself never had rcu semantics.
>>
>> The i915->gem.mmap_singleton is replaced in
>> file->f_op->release::singleton_release():
>>
>>          static int singleton_release(struct inode *inode, struct file *file)
>>          {
>>                  struct drm_i915_private *i915 = file->private_data;
>>
>>                  cmpxchg(&i915->gem.mmap_singleton, file, NULL);
>>                  drm_dev_put(&i915->drm);
>>
>>                  return 0;
>>          }
>>
>> The cmpxchg() is ordered against a concurrent update of
>> i915->gem.mmap_singleton from mmap_singleton(). IOW, when
>> mmap_singleton() fails to get a reference on i915->gem.mmap_singleton
>> via mmap_singleton():
>>
>>          rcu_read_lock();
>>          file = get_file_rcu(&i915->gem.mmap_singleton);
>>          rcu_read_unlock();
>>
>> mmap_singleton() allocates a new file via anon_inode_getfile() and does
>>
>>          smp_store_mb(i915->gem.mmap_singleton, file);
>>
>> So, afaiu, then what happens in the case of this bug is that at some
>> point fput() is called and drops the file->f_count to zero but obviously
>> leaving the pointer in i915->gem.mmap_singleton in tact until
>> file->f_op->release::singleton_release() is called.
>>
>> Now, there might be possibly larger delays until
>> file->f_op->release::singleton_release() is called and
>> i915->gem.mmap_singleton is set to NULL?
>>
>> Say concurrently another task hits mmap_singleton() and does:
>>
>>          rcu_read_lock();
>>          file = get_file_rcu(&i915->gem.mmap_singleton);
>>          rcu_read_unlock();
>>
>> When get_file_rcu() fails to get a reference via atomic_inc_not_zero()
>> it will try the reload from i915->gem.mmap_singleton assuming it has
>> comparable semantics as __fget_files_rcu() that also reloads on
>> atomic_inc_not_zero() failure.
>>
>> But since i915->gem.mmap_singleton doesn't have proper rcu semantics it
>> reloads the same pointer again, trying the same atomic_inc_not_zero()
>> again and doing so until file->f_op->release::singleton_release() of the
>> old file has been called.
>>
>> So, in contrast to __fget_files_rcu() here we want to not retry when
>> atomic_inc_not_zero() has failed. We only want to retry in case we
>> managed to get a reference but the pointer did change on reload.
> [...]
>> Link: [1]: https://lore.kernel.org/intel-gfx/SJ1PR11MB6129CB39EED831784C331BAFB9DEA@SJ1PR11MB6129.namprd11.prod.outlook.com
>> Link: [2]: https://intel-gfx-ci.01.org/tree/linux-next/next-20231013/bat-dg2-11/igt@i915_selftest@live@mman.html#dmesg-warnings10963
>> Signed-off-by: Christian Brauner <brauner@kernel.org>
> 
> Patch makes sense to me. I'm not sure why you changed EAGAIN to
> EINVAL, and it's obviously a bit ugly, but it looks like a valid fix
> for what the SLUB_TYPESAFE_BY_RCU conversion broke.
> 
> Reviewed-by: Jann Horn <jannh@google.com>
> 
> 
> But as a side note, I am a bit confused about how the concurrency of
> the existing code is supposed to work... it looks like two parallel
> calls to mmap_singleton() can end up returning different pointers, and
> then the singleton is not actually a singleton anymore? If that part
> of the existing code is wrong even before the SLAB_TYPESAFE_BY_RCU
> conversion, we might later have to open-code get_file_rcu() in
> mmap_singleton(), so that we can do a cmpxchg at the end that checks
> whether the i915->gem.mmap_singleton pointer is still the same?
> 
> Like:
> 
> static struct file *mmap_singleton(struct drm_i915_private *i915)
> {
>          struct file *file, *new_file;
> 
>          rcu_read_lock();
>          file = READ_ONCE(i915->gem.mmap_singleton);
>          if (file && atomic_long_inc_not_zero(&file->f_count)) {
>                  rcu_read_unlock();
>                  return file;
>          }
>          rcu_read_unlock();
> 
>          new_file = anon_inode_getfile("i915.gem", &singleton_fops,
> i915, O_RDWR);
>          if (IS_ERR(new_file))
>                  return new_file;
> 
>          /* Everyone shares a single global address space */
>          new_file->f_mapping = i915->drm.anon_inode->i_mapping;
> 
>          if (try_cmpxchg(&i915->gem.mmap_singleton, &file, new_file)) {
>                  // something with drm_dev refcount ?
>                  return new_file;
>          }
> 
>          // something with drm_dev refcount ?
>          fput(new_file);
> 
>          return file;
> }
> 
> It would be nice to get some i915 maintainer's comment on how the
> singleton stuff is supposed to work.

I see that the story has mostly been unraveled by now elsewhere in the 
thread, and yes, upon some historical digging and looking at the code I 
agree that the name singleton is confusing/misleading.

And although race can happen and we can end up with more than one 
anonymous inode, I don't think there is any real harm which would 
warrant complicating this code further.

Perhaps a good enough solution would be to rename into something like 
i915->gem.cached_mmap_anon and put a comment above the smp_store_mb 
explaining that the race is unlikely but harmless. Because all i915 
needs is to decouple the file refcount from the DRM file, and whether it 
manages to do it with one anon file, or rarely slightly more than one, 
it doesn't really matter.

So if people agree, plan could be for you to proceed with the 
get_active_file fix and I can follow up with the rename and comment? 
Possibly at some later date so to avoid conflicts between the trees.

Regards,

Tvrtko

> 
>> ---
>>
>> Jann, Linus,
>>
>> Since you've been quite involved, can you check that what I'm babbling
>> here makes sense? If this isn't the fix then I would have to drop the
>> SLAB_TYPESAFE_BY_RCU conversion patch for now since I'd like to send PRs
>> by the end of the week.
>>
>> This is on top of
>> https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git/log/?h=vfs.misc
>>
>> Appreciate the help, thanks!
>> Christian
>>
>> ---
>>   drivers/gpu/drm/i915/gem/i915_gem_mman.c |  2 +-
>>   fs/file.c                                | 17 ++++++++++++++++-
>>   include/linux/fs.h                       |  1 +
>>   3 files changed, 18 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>> index 97bf10861cad..da97e61b18d4 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>> @@ -916,7 +916,7 @@ static struct file *mmap_singleton(struct drm_i915_private *i915)
>>          struct file *file;
>>
>>          rcu_read_lock();
>> -       file = get_file_rcu(&i915->gem.mmap_singleton);
>> +       file = get_file_rcu_once(&i915->gem.mmap_singleton);
>>          rcu_read_unlock();
>>          if (file)
>>                  return file;
>> diff --git a/fs/file.c b/fs/file.c
>> index 1a475d7d636e..2c64d6836f0c 100644
>> --- a/fs/file.c
>> +++ b/fs/file.c
>> @@ -867,7 +867,7 @@ static struct file *__get_file_rcu(struct file __rcu **f)
>>                  return NULL;
>>
>>          if (unlikely(!atomic_long_inc_not_zero(&file->f_count)))
>> -               return ERR_PTR(-EAGAIN);
>> +               return ERR_PTR(-EINVAL);
>>
>>          file_reloaded = rcu_dereference_raw(*f);
>>
>> @@ -927,6 +927,21 @@ struct file *get_file_rcu(struct file __rcu **f)
>>   }
>>   EXPORT_SYMBOL_GPL(get_file_rcu);
>>
>> +struct file *get_file_rcu_once(struct file __rcu **f)
>> +{
>> +       for (;;) {
>> +               struct file __rcu *file;
>> +
>> +               file = __get_file_rcu(f);
>> +               if (!IS_ERR(file))
>> +                       return file;
>> +
>> +               if (PTR_ERR(file) == -EINVAL)
>> +                       return NULL;
>> +       }
>> +}
>> +EXPORT_SYMBOL_GPL(get_file_rcu_once);
>> +
>>   static inline struct file *__fget_files_rcu(struct files_struct *files,
>>          unsigned int fd, fmode_t mask)
>>   {
>> diff --git a/include/linux/fs.h b/include/linux/fs.h
>> index cb8bfa1f8ecb..657bbd824490 100644
>> --- a/include/linux/fs.h
>> +++ b/include/linux/fs.h
>> @@ -1044,6 +1044,7 @@ static inline struct file *get_file(struct file *f)
>>   }
>>
>>   struct file *get_file_rcu(struct file __rcu **f);
>> +struct file *get_file_rcu_once(struct file __rcu **f);
>>
>>   #define file_count(x)  atomic_long_read(&(x)->f_count)
>>
>> --
>> 2.34.1
>>

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2023-10-26 12:36 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-25 12:01 [Intel-gfx] [PATCH] file, i915: fix file reference for mmap_singleton() Christian Brauner
2023-10-25 15:23 ` Jann Horn
2023-10-26  9:34   ` Tvrtko Ursulin
2023-10-25 18:52 ` Linus Torvalds
2023-10-25 20:36   ` Christian Brauner
2023-10-25 23:44     ` Linus Torvalds
2023-10-25 20:16 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for " Patchwork
2023-10-25 20:38   ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for file, i915: fix file reference for mmap_singleton()y Christian Brauner
2023-10-26  5:52     ` Saarinen, Jani
2023-10-25 23:48 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for file, i915: fix file reference for mmap_singleton() (rev2) Patchwork

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).