All of lore.kernel.org
 help / color / mirror / Atom feed
* deadlock in debugfs synchronize_srcu() when unplugging USB
@ 2017-10-12 20:01 Tyler Hall
  2017-10-12 21:04 ` Paul E. McKenney
  2017-10-16  7:51 ` Greg Kroah-Hartman
  0 siblings, 2 replies; 18+ messages in thread
From: Tyler Hall @ 2017-10-12 20:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Nicolai Stange, Johannes Berg, Paul E.McKenney, Greg Kroah-Hartman

Hi,

I have a reproducible scenario wherein removing a USB device while
reading /sys/kernel/debug/usb/devices causes a deadlock. This should
not be specific to any USB device. Any USB device removal that causes
a call to debugfs_remove() has inverted lock ordering with respect to
the read() of debug/usb/devices.

e.g.
read thread: srcu_read_lock(&debugfs_srcu);
-- usb unplug --
  remove thread: mutex_lock(&usb_bus_idr_lock);
  remove thread: synchronize_srcu(&debugfs_srcu); <- blocked
read thread: mutex_lock(&usb_bus_idr_lock); <- blocked
read thread: srcu_read_unlock(&debugfs_srcu, ...);

This seems to be another flavor of what Johannes Berg reported:
deadlock in synchronize_srcu() in debugfs?
https://lkml.org/lkml/2017/3/23/415

I applied this patch set from Nicolai Stange and can no longer
reproduce the hang.
[RFC PATCH v2 0/9] debugfs: per-file removal protection
https://lkml.org/lkml/2017/5/3/292

As patch 2/9 in the series indicates, commit 49d200deaa68 ("debugfs:
prevent access to removed files' private data") is where this was
first introduced, and it is reproducible on v4.14-rc4.

How should we move forward with the resolution of this debugfs change?
It seems to me that the USB locking is reasonable but the debugfs
global srcu is overly restrictive. This could lead to unexpected lock
inversion any time a driver shares a mutex between its debugfs read
and removal paths.

Backtrace below. Thanks!

-Tyler Hall

This is easier to reproduce by adding a sleep before the
usb_bus_idr_lock, but I've seen it on an unmodified kernel.

diff --git a/drivers/usb/core/devices.c b/drivers/usb/core/devices.c
index 55dea2e7828f..534650cd0950 100644
--- a/drivers/usb/core/devices.c
+++ b/drivers/usb/core/devices.c
@@ -614,6 +614,7 @@ static ssize_t usb_device_read(struct file *file,
char __user *buf,
     if (!access_ok(VERIFY_WRITE, buf, nbytes))
         return -EFAULT;

+    msleep(1000);
     mutex_lock(&usb_bus_idr_lock);
     /* print devices for all busses */
     idr_for_each_entry(&usb_bus_idr, bus, id) {


[   24.240542] sysrq: SysRq : Show Blocked State
[   24.240765]   task                        PC stack   pid father
[   24.240975] kworker/0:2     D13840   881      2 0x80000000
[   24.241525] Workqueue: usb_hub_wq hub_event
[   24.241682] Call Trace:
[   24.242273]  __schedule+0x317/0x6d0
[   24.242442]  schedule+0x31/0x80
[   24.242514]  schedule_timeout+0x1d0/0x320
[   24.242603]  ? __queue_work+0x135/0x400
[   24.242689]  wait_for_completion+0x92/0xf0
[   24.242765]  ? wait_for_completion+0x92/0xf0
[   24.242841]  ? wake_up_q+0x70/0x70
[   24.242907]  __synchronize_srcu.part.14+0x71/0x90
[   24.242985]  ? trace_event_raw_event_rcu_torture_read+0xe0/0xe0
[   24.243169]  synchronize_srcu_expedited+0x22/0x30
[   24.243265]  ? synchronize_srcu_expedited+0x22/0x30
[   24.243347]  synchronize_srcu+0x9a/0xc0
[   24.243418]  debugfs_remove+0x6d/0xa0
[   24.243490]  bdi_unregister+0x8b/0x170
[   24.243558]  del_gendisk+0x139/0x220
[   24.243624]  sd_remove+0x5c/0xc0
[   24.243685]  device_release_driver_internal+0x150/0x210
[   24.243769]  device_release_driver+0xd/0x10
[   24.243841]  bus_remove_device+0xdb/0x120
[   24.243915]  device_del+0x1c3/0x2e0
[   24.243977]  __scsi_remove_device+0xff/0x130
[   24.244122]  scsi_forget_host+0x5b/0x60
[   24.244203]  scsi_remove_host+0x74/0x140
[   24.244281]  usb_stor_disconnect+0x54/0xc0
[   24.244357]  usb_unbind_interface+0x6d/0x260
[   24.244437]  device_release_driver_internal+0x150/0x210
[   24.244520]  device_release_driver+0xd/0x10
[   24.244591]  bus_remove_device+0xdb/0x120
[   24.244659]  device_del+0x1c3/0x2e0
[   24.244722]  usb_disable_device+0x97/0x1f0
[   24.244792]  usb_disconnect+0x88/0x230
[   24.244853]  hub_event+0x5b9/0x11e0
[   24.244915]  ? add_timer+0x10e/0x230
[   24.244984]  process_one_work+0x146/0x3e0
[   24.245124]  worker_thread+0x43/0x3e0
[   24.245204]  kthread+0x104/0x140
[   24.245266]  ? create_worker+0x190/0x190
[   24.245333]  ? kthread_create_on_node+0x40/0x40
[   24.245406]  ret_from_fork+0x22/0x30

[   24.245542] cat             D13712  1029   1018 0x00000000
[   24.245652] Call Trace:
[   24.245705]  __schedule+0x317/0x6d0
[   24.245770]  schedule+0x31/0x80
[   24.245830]  schedule_preempt_disabled+0x9/0x10
[   24.245903]  __mutex_lock.isra.2+0x225/0x470
[   24.245975]  __mutex_lock_slowpath+0xe/0x10
[   24.246110]  ? __mutex_lock_slowpath+0xe/0x10
[   24.246199]  mutex_lock+0x2a/0x30
[   24.246261]  usb_device_read+0xb6/0x140
[   24.246325]  full_proxy_read+0x4f/0x90
[   24.246394]  __vfs_read+0x23/0x120
[   24.246456]  ? security_file_permission+0x96/0xb0
[   24.246533]  ? rw_verify_area+0x49/0xb0
[   24.246593]  vfs_read+0x8e/0x130
[   24.246646]  SyS_read+0x41/0xa0
[   24.246698]  entry_SYSCALL_64_fastpath+0x13/0x94

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: deadlock in debugfs synchronize_srcu() when unplugging USB
  2017-10-12 20:01 deadlock in debugfs synchronize_srcu() when unplugging USB Tyler Hall
@ 2017-10-12 21:04 ` Paul E. McKenney
  2017-10-16  7:51 ` Greg Kroah-Hartman
  1 sibling, 0 replies; 18+ messages in thread
From: Paul E. McKenney @ 2017-10-12 21:04 UTC (permalink / raw)
  To: Tyler Hall
  Cc: Linux Kernel Mailing List, Nicolai Stange, Johannes Berg,
	Greg Kroah-Hartman

On Thu, Oct 12, 2017 at 04:01:48PM -0400, Tyler Hall wrote:
> Hi,
> 
> I have a reproducible scenario wherein removing a USB device while
> reading /sys/kernel/debug/usb/devices causes a deadlock. This should
> not be specific to any USB device. Any USB device removal that causes
> a call to debugfs_remove() has inverted lock ordering with respect to
> the read() of debug/usb/devices.
> 
> e.g.
> read thread: srcu_read_lock(&debugfs_srcu);
> -- usb unplug --
>   remove thread: mutex_lock(&usb_bus_idr_lock);
>   remove thread: synchronize_srcu(&debugfs_srcu); <- blocked
> read thread: mutex_lock(&usb_bus_idr_lock); <- blocked
> read thread: srcu_read_unlock(&debugfs_srcu, ...);

The reader cannot exit its SRCU read-side critical section until it
acquires usb_bus_idr_lock.  The updater's synchronize_srcu() is not
permitted to return until all pre-existing readers complete, and it won't
release usb_bus_idr_lock until that happens.  So you have a deadlock,
pure and simple.

Use of RCU and SRCU can greatly reduce the possibility of deadlock,
but as you can see, sufficiently clever code can still manage to get
into a deadlock state.

The rule is "Within a read-side critical section, never wait on anything
that directly or indirectly waits on a grace period."  The above code
violates that rule, and so the above code needs to be fixed.

> This seems to be another flavor of what Johannes Berg reported:
> deadlock in synchronize_srcu() in debugfs?
> https://lkml.org/lkml/2017/3/23/415

It does look quite similar.

> I applied this patch set from Nicolai Stange and can no longer
> reproduce the hang.
> [RFC PATCH v2 0/9] debugfs: per-file removal protection
> https://lkml.org/lkml/2017/5/3/292
> 
> As patch 2/9 in the series indicates, commit 49d200deaa68 ("debugfs:
> prevent access to removed files' private data") is where this was
> first introduced, and it is reproducible on v4.14-rc4.
> 
> How should we move forward with the resolution of this debugfs change?
> It seems to me that the USB locking is reasonable but the debugfs
> global srcu is overly restrictive. This could lead to unexpected lock
> inversion any time a driver shares a mutex between its debugfs read
> and removal paths.

It looks like no one took Nicolai's series and that Nicolai never
reposted it.  Would you like to forward-port and repost?

						Thanx, Paul

> Backtrace below. Thanks!
> 
> -Tyler Hall
> 
> This is easier to reproduce by adding a sleep before the
> usb_bus_idr_lock, but I've seen it on an unmodified kernel.
> 
> diff --git a/drivers/usb/core/devices.c b/drivers/usb/core/devices.c
> index 55dea2e7828f..534650cd0950 100644
> --- a/drivers/usb/core/devices.c
> +++ b/drivers/usb/core/devices.c
> @@ -614,6 +614,7 @@ static ssize_t usb_device_read(struct file *file,
> char __user *buf,
>      if (!access_ok(VERIFY_WRITE, buf, nbytes))
>          return -EFAULT;
> 
> +    msleep(1000);
>      mutex_lock(&usb_bus_idr_lock);
>      /* print devices for all busses */
>      idr_for_each_entry(&usb_bus_idr, bus, id) {
> 
> 
> [   24.240542] sysrq: SysRq : Show Blocked State
> [   24.240765]   task                        PC stack   pid father
> [   24.240975] kworker/0:2     D13840   881      2 0x80000000
> [   24.241525] Workqueue: usb_hub_wq hub_event
> [   24.241682] Call Trace:
> [   24.242273]  __schedule+0x317/0x6d0
> [   24.242442]  schedule+0x31/0x80
> [   24.242514]  schedule_timeout+0x1d0/0x320
> [   24.242603]  ? __queue_work+0x135/0x400
> [   24.242689]  wait_for_completion+0x92/0xf0
> [   24.242765]  ? wait_for_completion+0x92/0xf0
> [   24.242841]  ? wake_up_q+0x70/0x70
> [   24.242907]  __synchronize_srcu.part.14+0x71/0x90
> [   24.242985]  ? trace_event_raw_event_rcu_torture_read+0xe0/0xe0
> [   24.243169]  synchronize_srcu_expedited+0x22/0x30
> [   24.243265]  ? synchronize_srcu_expedited+0x22/0x30
> [   24.243347]  synchronize_srcu+0x9a/0xc0
> [   24.243418]  debugfs_remove+0x6d/0xa0
> [   24.243490]  bdi_unregister+0x8b/0x170
> [   24.243558]  del_gendisk+0x139/0x220
> [   24.243624]  sd_remove+0x5c/0xc0
> [   24.243685]  device_release_driver_internal+0x150/0x210
> [   24.243769]  device_release_driver+0xd/0x10
> [   24.243841]  bus_remove_device+0xdb/0x120
> [   24.243915]  device_del+0x1c3/0x2e0
> [   24.243977]  __scsi_remove_device+0xff/0x130
> [   24.244122]  scsi_forget_host+0x5b/0x60
> [   24.244203]  scsi_remove_host+0x74/0x140
> [   24.244281]  usb_stor_disconnect+0x54/0xc0
> [   24.244357]  usb_unbind_interface+0x6d/0x260
> [   24.244437]  device_release_driver_internal+0x150/0x210
> [   24.244520]  device_release_driver+0xd/0x10
> [   24.244591]  bus_remove_device+0xdb/0x120
> [   24.244659]  device_del+0x1c3/0x2e0
> [   24.244722]  usb_disable_device+0x97/0x1f0
> [   24.244792]  usb_disconnect+0x88/0x230
> [   24.244853]  hub_event+0x5b9/0x11e0
> [   24.244915]  ? add_timer+0x10e/0x230
> [   24.244984]  process_one_work+0x146/0x3e0
> [   24.245124]  worker_thread+0x43/0x3e0
> [   24.245204]  kthread+0x104/0x140
> [   24.245266]  ? create_worker+0x190/0x190
> [   24.245333]  ? kthread_create_on_node+0x40/0x40
> [   24.245406]  ret_from_fork+0x22/0x30
> 
> [   24.245542] cat             D13712  1029   1018 0x00000000
> [   24.245652] Call Trace:
> [   24.245705]  __schedule+0x317/0x6d0
> [   24.245770]  schedule+0x31/0x80
> [   24.245830]  schedule_preempt_disabled+0x9/0x10
> [   24.245903]  __mutex_lock.isra.2+0x225/0x470
> [   24.245975]  __mutex_lock_slowpath+0xe/0x10
> [   24.246110]  ? __mutex_lock_slowpath+0xe/0x10
> [   24.246199]  mutex_lock+0x2a/0x30
> [   24.246261]  usb_device_read+0xb6/0x140
> [   24.246325]  full_proxy_read+0x4f/0x90
> [   24.246394]  __vfs_read+0x23/0x120
> [   24.246456]  ? security_file_permission+0x96/0xb0
> [   24.246533]  ? rw_verify_area+0x49/0xb0
> [   24.246593]  vfs_read+0x8e/0x130
> [   24.246646]  SyS_read+0x41/0xa0
> [   24.246698]  entry_SYSCALL_64_fastpath+0x13/0x94
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: deadlock in debugfs synchronize_srcu() when unplugging USB
  2017-10-12 20:01 deadlock in debugfs synchronize_srcu() when unplugging USB Tyler Hall
  2017-10-12 21:04 ` Paul E. McKenney
@ 2017-10-16  7:51 ` Greg Kroah-Hartman
  2017-10-16  8:15   ` Johannes Berg
  1 sibling, 1 reply; 18+ messages in thread
From: Greg Kroah-Hartman @ 2017-10-16  7:51 UTC (permalink / raw)
  To: Tyler Hall
  Cc: Linux Kernel Mailing List, Nicolai Stange, Johannes Berg,
	Paul E.McKenney

On Thu, Oct 12, 2017 at 04:01:48PM -0400, Tyler Hall wrote:
> Hi,
> 
> I have a reproducible scenario wherein removing a USB device while
> reading /sys/kernel/debug/usb/devices causes a deadlock. This should
> not be specific to any USB device. Any USB device removal that causes
> a call to debugfs_remove() has inverted lock ordering with respect to
> the read() of debug/usb/devices.
> 
> e.g.
> read thread: srcu_read_lock(&debugfs_srcu);
> -- usb unplug --
>   remove thread: mutex_lock(&usb_bus_idr_lock);
>   remove thread: synchronize_srcu(&debugfs_srcu); <- blocked
> read thread: mutex_lock(&usb_bus_idr_lock); <- blocked
> read thread: srcu_read_unlock(&debugfs_srcu, ...);
> 
> This seems to be another flavor of what Johannes Berg reported:
> deadlock in synchronize_srcu() in debugfs?
> https://lkml.org/lkml/2017/3/23/415
> 
> I applied this patch set from Nicolai Stange and can no longer
> reproduce the hang.
> [RFC PATCH v2 0/9] debugfs: per-file removal protection
> https://lkml.org/lkml/2017/5/3/292
> 
> As patch 2/9 in the series indicates, commit 49d200deaa68 ("debugfs:
> prevent access to removed files' private data") is where this was
> first introduced, and it is reproducible on v4.14-rc4.
> 
> How should we move forward with the resolution of this debugfs change?
> It seems to me that the USB locking is reasonable but the debugfs
> global srcu is overly restrictive. This could lead to unexpected lock
> inversion any time a driver shares a mutex between its debugfs read
> and removal paths.

As Paul stated, fixing up the patches and sending them in is the best
solution, can you help out with that?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: deadlock in debugfs synchronize_srcu() when unplugging USB
  2017-10-16  7:51 ` Greg Kroah-Hartman
@ 2017-10-16  8:15   ` Johannes Berg
  2017-10-16  8:31     ` Greg Kroah-Hartman
  0 siblings, 1 reply; 18+ messages in thread
From: Johannes Berg @ 2017-10-16  8:15 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Tyler Hall
  Cc: Linux Kernel Mailing List, Nicolai Stange, Paul E.McKenney

On Mon, 2017-10-16 at 09:51 +0200, Greg Kroah-Hartman wrote:

> As Paul stated, fixing up the patches and sending them in is the best
> solution, can you help out with that?

Is there anything to fix though? I'm not really aware of anything, and
there were no comments on his v2 patchset.

johannes

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: deadlock in debugfs synchronize_srcu() when unplugging USB
  2017-10-16  8:15   ` Johannes Berg
@ 2017-10-16  8:31     ` Greg Kroah-Hartman
  2017-10-16 15:08       ` Tyler Hall
  0 siblings, 1 reply; 18+ messages in thread
From: Greg Kroah-Hartman @ 2017-10-16  8:31 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Tyler Hall, Linux Kernel Mailing List, Nicolai Stange, Paul E.McKenney

On Mon, Oct 16, 2017 at 10:15:55AM +0200, Johannes Berg wrote:
> On Mon, 2017-10-16 at 09:51 +0200, Greg Kroah-Hartman wrote:
> 
> > As Paul stated, fixing up the patches and sending them in is the best
> > solution, can you help out with that?
> 
> Is there anything to fix though? I'm not really aware of anything, and
> there were no comments on his v2 patchset.

I don't remember, maybe someone needs to just resend them without the
"RFC" marking, as I of course can't apply a patch series like that :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: deadlock in debugfs synchronize_srcu() when unplugging USB
  2017-10-16  8:31     ` Greg Kroah-Hartman
@ 2017-10-16 15:08       ` Tyler Hall
       [not found]         ` <CAOjnSCazvybpH5ZhHn=iTLm_dVGEdr=W1pUQv6zowRisB1MWTQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 18+ messages in thread
From: Tyler Hall @ 2017-10-16 15:08 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Johannes Berg, Linux Kernel Mailing List, Nicolai Stange,
	Paul E.McKenney

On Mon, Oct 16, 2017 at 4:31 AM, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
> I don't remember, maybe someone needs to just resend them without the
> "RFC" marking, as I of course can't apply a patch series like that :)

I can resend if Nicolai doesn't get to it before me. I tested that
they fix this problem but I would want to review them in a bit more
detail first.

Thanks,
Tyler

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v3 0/8] debugfs: per-file removal protection
  2017-10-16 15:08       ` Tyler Hall
@ 2017-10-30 23:15             ` Nicolai Stange
  0 siblings, 0 replies; 18+ messages in thread
From: Nicolai Stange @ 2017-10-30 23:15 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Johannes Berg, Paul E.McKenney, Tyler Hall, Mike Marciniszyn,
	Dennis Dalessandro, Doug Ledford, Sean Hefty, Hal Rosenstock,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Nicolai Stange

Hi,

this is v3 of the per-file removal protection with the main change to v2
being that it's sent in non-RFC mode now.

For this, I dropped the questionable former [9/9] ("debugfs: free
debugfs_fsdata instances") for now. Perhaps I'll resend it later on
its own as a RFC again.

The problem this series attempts to address is that any indefinitely
blocking fops blocks _unrelated_ debugfs_remove()ers. This will be resolved
by introducing a per-file removal protection mechanism in place of the
former global debugfs_srcu shared among all debugfs files.

This problem has first been spotted and debugged by Johannes Berg [1]
and Tyler Hall is now facing the same issue again [2].

There's one patch to non-debugfs code,
[5/8] ("IB/hfi1: convert to debugfs_file_get() and -put()"),
which should be taken together with the rest of the series because
it depends on prior patches and later patches depend on it.

Applies to next-20171018.

I reviewed the patches once again and did an allmodconfig as well as an 
allnoconfig build. I did more thorough testing on v2, including whether
the removal protection still works.

Thanks,

Nicolai


Changes to v2:
  [9/9] ("debugfs: free debugfs_fsdata instances")
   - dropped for now.

Changes to v1:
  [2/9] ("debugfs: implement per-file removal protection")
  - In an attempt to resolve the issue reported by the kernel test robot
    for v1, restrict the "extended removal logic" to regular files in
    __debugfs_remove().

  [8/9] ("debugfs: defer debugfs_fsdata allocation to first usage")
  - Following review from Johannes Berg, replace the WARN_ON in
    debugfs_real_fops() by a WARN + 'return NULL'. The return NULL is
    expected to crash current soon and serves as an alternative for a
    BUG_ON here.
  - Mention the change in debugfs_real_fops() in the commit message.

  [9/9] ("debugfs: free debugfs_fsdata instances")
  - Following advice from Paul E. McKenney, make debugfs_file_get()
    release the RCU read section inbetween retry loop iterations.
  - Fix a race in debugfs_file_get()'s path handling a concurrent
    debugfs_file_put(): the former must not "help out resetting ->d_fsdata"
    because this can wipe out another debugfs_file_get()'s achievements.


[1] http://lkml.kernel.org/r/1490280886.2766.4.camel-cdvu00un1VgdHxzADdlk8Q@public.gmane.org
[2] https://lkml.kernel.org/r/CAOjnSCYGprej+vEEsSXwr=wO+eWLe2d6sHQYTpp-DFpQ3pmguw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org

Nicolai Stange (8):
  debugfs: add support for more elaborate ->d_fsdata
  debugfs: implement per-file removal protection
  debugfs: debugfs_real_fops(): drop __must_hold sparse annotation
  debugfs: convert to debugfs_file_get() and -put()
  IB/hfi1: convert to debugfs_file_get() and -put()
  debugfs: purge obsolete SRCU based removal protection
  debugfs: call debugfs_real_fops() only after debugfs_file_get()
  debugfs: defer debugfs_fsdata allocation to first usage

 drivers/infiniband/hw/hfi1/debugfs.c |  20 ++--
 fs/debugfs/file.c                    | 210 +++++++++++++++++++++--------------
 fs/debugfs/inode.c                   |  56 +++++++---
 fs/debugfs/internal.h                |  14 +++
 include/linux/debugfs.h              |  33 +-----
 lib/Kconfig.debug                    |   1 -
 6 files changed, 196 insertions(+), 138 deletions(-)

-- 
2.13.6

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v3 0/8] debugfs: per-file removal protection
@ 2017-10-30 23:15             ` Nicolai Stange
  0 siblings, 0 replies; 18+ messages in thread
From: Nicolai Stange @ 2017-10-30 23:15 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Johannes Berg, Paul E.McKenney, Tyler Hall, Mike Marciniszyn,
	Dennis Dalessandro, Doug Ledford, Sean Hefty, Hal Rosenstock,
	linux-rdma, linux-kernel, Nicolai Stange

Hi,

this is v3 of the per-file removal protection with the main change to v2
being that it's sent in non-RFC mode now.

For this, I dropped the questionable former [9/9] ("debugfs: free
debugfs_fsdata instances") for now. Perhaps I'll resend it later on
its own as a RFC again.

The problem this series attempts to address is that any indefinitely
blocking fops blocks _unrelated_ debugfs_remove()ers. This will be resolved
by introducing a per-file removal protection mechanism in place of the
former global debugfs_srcu shared among all debugfs files.

This problem has first been spotted and debugged by Johannes Berg [1]
and Tyler Hall is now facing the same issue again [2].

There's one patch to non-debugfs code,
[5/8] ("IB/hfi1: convert to debugfs_file_get() and -put()"),
which should be taken together with the rest of the series because
it depends on prior patches and later patches depend on it.

Applies to next-20171018.

I reviewed the patches once again and did an allmodconfig as well as an 
allnoconfig build. I did more thorough testing on v2, including whether
the removal protection still works.

Thanks,

Nicolai


Changes to v2:
  [9/9] ("debugfs: free debugfs_fsdata instances")
   - dropped for now.

Changes to v1:
  [2/9] ("debugfs: implement per-file removal protection")
  - In an attempt to resolve the issue reported by the kernel test robot
    for v1, restrict the "extended removal logic" to regular files in
    __debugfs_remove().

  [8/9] ("debugfs: defer debugfs_fsdata allocation to first usage")
  - Following review from Johannes Berg, replace the WARN_ON in
    debugfs_real_fops() by a WARN + 'return NULL'. The return NULL is
    expected to crash current soon and serves as an alternative for a
    BUG_ON here.
  - Mention the change in debugfs_real_fops() in the commit message.

  [9/9] ("debugfs: free debugfs_fsdata instances")
  - Following advice from Paul E. McKenney, make debugfs_file_get()
    release the RCU read section inbetween retry loop iterations.
  - Fix a race in debugfs_file_get()'s path handling a concurrent
    debugfs_file_put(): the former must not "help out resetting ->d_fsdata"
    because this can wipe out another debugfs_file_get()'s achievements.


[1] http://lkml.kernel.org/r/1490280886.2766.4.camel@sipsolutions.net
[2] https://lkml.kernel.org/r/CAOjnSCYGprej+vEEsSXwr=wO+eWLe2d6sHQYTpp-DFpQ3pmguw@mail.gmail.com

Nicolai Stange (8):
  debugfs: add support for more elaborate ->d_fsdata
  debugfs: implement per-file removal protection
  debugfs: debugfs_real_fops(): drop __must_hold sparse annotation
  debugfs: convert to debugfs_file_get() and -put()
  IB/hfi1: convert to debugfs_file_get() and -put()
  debugfs: purge obsolete SRCU based removal protection
  debugfs: call debugfs_real_fops() only after debugfs_file_get()
  debugfs: defer debugfs_fsdata allocation to first usage

 drivers/infiniband/hw/hfi1/debugfs.c |  20 ++--
 fs/debugfs/file.c                    | 210 +++++++++++++++++++++--------------
 fs/debugfs/inode.c                   |  56 +++++++---
 fs/debugfs/internal.h                |  14 +++
 include/linux/debugfs.h              |  33 +-----
 lib/Kconfig.debug                    |   1 -
 6 files changed, 196 insertions(+), 138 deletions(-)

-- 
2.13.6

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v3 1/8] debugfs: add support for more elaborate ->d_fsdata
  2017-10-30 23:15             ` Nicolai Stange
  (?)
@ 2017-10-30 23:15             ` Nicolai Stange
  -1 siblings, 0 replies; 18+ messages in thread
From: Nicolai Stange @ 2017-10-30 23:15 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Johannes Berg, Paul E.McKenney, Tyler Hall, Mike Marciniszyn,
	Dennis Dalessandro, Doug Ledford, Sean Hefty, Hal Rosenstock,
	linux-rdma, linux-kernel, Nicolai Stange

Currently, the user provided fops, "real_fops", are stored directly into
->d_fsdata.

In order to be able to store more per-file state and thus prepare for more
granular file removal protection, wrap the real_fops into a dynamically
allocated container struct, debugfs_fsdata.

A struct debugfs_fsdata gets allocated at file creation and freed from the
newly intoduced ->d_release().

Finally, move the implementation of debugfs_real_fops() out of the public
debugfs header such that struct debugfs_fsdata's declaration can be kept
private.

Signed-off-by: Nicolai Stange <nicstange@gmail.com>
---
 fs/debugfs/file.c       | 12 ++++++++++++
 fs/debugfs/inode.c      | 22 +++++++++++++++++++---
 fs/debugfs/internal.h   |  4 ++++
 include/linux/debugfs.h | 20 +++-----------------
 4 files changed, 38 insertions(+), 20 deletions(-)

diff --git a/fs/debugfs/file.c b/fs/debugfs/file.c
index 6dabc4a10396..b6f5ddab66bf 100644
--- a/fs/debugfs/file.c
+++ b/fs/debugfs/file.c
@@ -97,6 +97,18 @@ EXPORT_SYMBOL_GPL(debugfs_use_file_finish);
 
 #define F_DENTRY(filp) ((filp)->f_path.dentry)
 
+const struct file_operations *debugfs_real_fops(const struct file *filp)
+	__must_hold(&debugfs_srcu)
+{
+	struct debugfs_fsdata *fsd = F_DENTRY(filp)->d_fsdata;
+	/*
+	 * Neither the pointer to the struct file_operations, nor its
+	 * contents ever change -- srcu_dereference() is not needed here.
+	 */
+	return fsd->real_fops;
+}
+EXPORT_SYMBOL_GPL(debugfs_real_fops);
+
 static int open_proxy_open(struct inode *inode, struct file *filp)
 {
 	const struct dentry *dentry = F_DENTRY(filp);
diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c
index c59f015f386e..a9c3d3e9af39 100644
--- a/fs/debugfs/inode.c
+++ b/fs/debugfs/inode.c
@@ -185,6 +185,11 @@ static const struct super_operations debugfs_super_operations = {
 	.evict_inode	= debugfs_evict_inode,
 };
 
+static void debugfs_release_dentry(struct dentry *dentry)
+{
+	kfree(dentry->d_fsdata);
+}
+
 static struct vfsmount *debugfs_automount(struct path *path)
 {
 	debugfs_automount_t f;
@@ -194,6 +199,7 @@ static struct vfsmount *debugfs_automount(struct path *path)
 
 static const struct dentry_operations debugfs_dops = {
 	.d_delete = always_delete_dentry,
+	.d_release = debugfs_release_dentry,
 	.d_automount = debugfs_automount,
 };
 
@@ -341,24 +347,34 @@ static struct dentry *__debugfs_create_file(const char *name, umode_t mode,
 {
 	struct dentry *dentry;
 	struct inode *inode;
+	struct debugfs_fsdata *fsd;
+
+	fsd = kmalloc(sizeof(*fsd), GFP_KERNEL);
+	if (!fsd)
+		return NULL;
 
 	if (!(mode & S_IFMT))
 		mode |= S_IFREG;
 	BUG_ON(!S_ISREG(mode));
 	dentry = start_creating(name, parent);
 
-	if (IS_ERR(dentry))
+	if (IS_ERR(dentry)) {
+		kfree(fsd);
 		return NULL;
+	}
 
 	inode = debugfs_get_inode(dentry->d_sb);
-	if (unlikely(!inode))
+	if (unlikely(!inode)) {
+		kfree(fsd);
 		return failed_creating(dentry);
+	}
 
 	inode->i_mode = mode;
 	inode->i_private = data;
 
 	inode->i_fop = proxy_fops;
-	dentry->d_fsdata = (void *)real_fops;
+	fsd->real_fops = real_fops;
+	dentry->d_fsdata = fsd;
 
 	d_instantiate(dentry, inode);
 	fsnotify_create(d_inode(dentry->d_parent), dentry);
diff --git a/fs/debugfs/internal.h b/fs/debugfs/internal.h
index b3e8443a1f47..512601eed3ce 100644
--- a/fs/debugfs/internal.h
+++ b/fs/debugfs/internal.h
@@ -19,4 +19,8 @@ extern const struct file_operations debugfs_noop_file_operations;
 extern const struct file_operations debugfs_open_proxy_file_operations;
 extern const struct file_operations debugfs_full_proxy_file_operations;
 
+struct debugfs_fsdata {
+	const struct file_operations *real_fops;
+};
+
 #endif /* _DEBUGFS_INTERNAL_H_ */
diff --git a/include/linux/debugfs.h b/include/linux/debugfs.h
index b93efc8feecd..cbee5f4a02a3 100644
--- a/include/linux/debugfs.h
+++ b/include/linux/debugfs.h
@@ -45,23 +45,6 @@ extern struct dentry *arch_debugfs_dir;
 
 extern struct srcu_struct debugfs_srcu;
 
-/**
- * debugfs_real_fops - getter for the real file operation
- * @filp: a pointer to a struct file
- *
- * Must only be called under the protection established by
- * debugfs_use_file_start().
- */
-static inline const struct file_operations *debugfs_real_fops(const struct file *filp)
-	__must_hold(&debugfs_srcu)
-{
-	/*
-	 * Neither the pointer to the struct file_operations, nor its
-	 * contents ever change -- srcu_dereference() is not needed here.
-	 */
-	return filp->f_path.dentry->d_fsdata;
-}
-
 #define DEFINE_DEBUGFS_ATTRIBUTE(__fops, __get, __set, __fmt)		\
 static int __fops ## _open(struct inode *inode, struct file *file)	\
 {									\
@@ -112,6 +95,9 @@ int debugfs_use_file_start(const struct dentry *dentry, int *srcu_idx)
 
 void debugfs_use_file_finish(int srcu_idx) __releases(&debugfs_srcu);
 
+const struct file_operations *debugfs_real_fops(const struct file *filp)
+	__must_hold(&debugfs_srcu);
+
 ssize_t debugfs_attr_read(struct file *file, char __user *buf,
 			size_t len, loff_t *ppos);
 ssize_t debugfs_attr_write(struct file *file, const char __user *buf,
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 2/8] debugfs: implement per-file removal protection
  2017-10-30 23:15             ` Nicolai Stange
  (?)
  (?)
@ 2017-10-30 23:15             ` Nicolai Stange
  -1 siblings, 0 replies; 18+ messages in thread
From: Nicolai Stange @ 2017-10-30 23:15 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Johannes Berg, Paul E.McKenney, Tyler Hall, Mike Marciniszyn,
	Dennis Dalessandro, Doug Ledford, Sean Hefty, Hal Rosenstock,
	linux-rdma, linux-kernel, Nicolai Stange

Since commit 49d200deaa68 ("debugfs: prevent access to removed files'
private data"), accesses to a file's private data are protected from
concurrent removal by covering all file_operations with a SRCU read section
and sychronizing with those before returning from debugfs_remove() by means
of synchronize_srcu().

As pointed out by Johannes Berg, there are debugfs files with forever
blocking file_operations. Their corresponding SRCU read side sections would
block any debugfs_remove() forever as well, even unrelated ones. This
results in a livelock. Because a remover can't cancel any indefinite
blocking within foreign files, this is a problem.

Resolve this by introducing support for more granular protection on a
per-file basis.

This is implemented by introducing an  'active_users' refcount_t to the
per-file struct debugfs_fsdata state. At file creation time, it is set to
one and a debugfs_remove() will drop that initial reference. The new
debugfs_file_get() and debugfs_file_put(), intended to be used in place of
former debugfs_use_file_start() and debugfs_use_file_finish(), increment
and decrement it respectively. Once the count drops to zero,
debugfs_file_put() will signal a completion which is possibly being waited
for from debugfs_remove().
Thus, as long as there is a debugfs_file_get() not yet matched by a
corresponding debugfs_file_put() around, debugfs_remove() will block.

Actual users of debugfs_use_file_start() and -finish() will get converted
to the new debugfs_file_get() and debugfs_file_put() by followup patches.

Fixes: 49d200deaa68 ("debugfs: prevent access to removed files' private
                      data")
Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
---
 fs/debugfs/file.c       | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 fs/debugfs/inode.c      | 29 +++++++++++++++++++++++------
 fs/debugfs/internal.h   |  2 ++
 include/linux/debugfs.h | 11 +++++++++++
 4 files changed, 84 insertions(+), 6 deletions(-)

diff --git a/fs/debugfs/file.c b/fs/debugfs/file.c
index b6f5ddab66bf..6644bfdea2f8 100644
--- a/fs/debugfs/file.c
+++ b/fs/debugfs/file.c
@@ -109,6 +109,54 @@ const struct file_operations *debugfs_real_fops(const struct file *filp)
 }
 EXPORT_SYMBOL_GPL(debugfs_real_fops);
 
+/**
+ * debugfs_file_get - mark the beginning of file data access
+ * @dentry: the dentry object whose data is being accessed.
+ *
+ * Up to a matching call to debugfs_file_put(), any successive call
+ * into the file removing functions debugfs_remove() and
+ * debugfs_remove_recursive() will block. Since associated private
+ * file data may only get freed after a successful return of any of
+ * the removal functions, you may safely access it after a successful
+ * call to debugfs_file_get() without worrying about lifetime issues.
+ *
+ * If -%EIO is returned, the file has already been removed and thus,
+ * it is not safe to access any of its data. If, on the other hand,
+ * it is allowed to access the file data, zero is returned.
+ */
+int debugfs_file_get(struct dentry *dentry)
+{
+	struct debugfs_fsdata *fsd = dentry->d_fsdata;
+
+	/* Avoid starvation of removers. */
+	if (d_unlinked(dentry))
+		return -EIO;
+
+	if (!refcount_inc_not_zero(&fsd->active_users))
+		return -EIO;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(debugfs_file_get);
+
+/**
+ * debugfs_file_put - mark the end of file data access
+ * @dentry: the dentry object formerly passed to
+ *          debugfs_file_get().
+ *
+ * Allow any ongoing concurrent call into debugfs_remove() or
+ * debugfs_remove_recursive() blocked by a former call to
+ * debugfs_file_get() to proceed and return to its caller.
+ */
+void debugfs_file_put(struct dentry *dentry)
+{
+	struct debugfs_fsdata *fsd = dentry->d_fsdata;
+
+	if (refcount_dec_and_test(&fsd->active_users))
+		complete(&fsd->active_users_drained);
+}
+EXPORT_SYMBOL_GPL(debugfs_file_put);
+
 static int open_proxy_open(struct inode *inode, struct file *filp)
 {
 	const struct dentry *dentry = F_DENTRY(filp);
diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c
index a9c3d3e9af39..6449eb935540 100644
--- a/fs/debugfs/inode.c
+++ b/fs/debugfs/inode.c
@@ -374,6 +374,7 @@ static struct dentry *__debugfs_create_file(const char *name, umode_t mode,
 
 	inode->i_fop = proxy_fops;
 	fsd->real_fops = real_fops;
+	refcount_set(&fsd->active_users, 1);
 	dentry->d_fsdata = fsd;
 
 	d_instantiate(dentry, inode);
@@ -631,18 +632,34 @@ struct dentry *debugfs_create_symlink(const char *name, struct dentry *parent,
 }
 EXPORT_SYMBOL_GPL(debugfs_create_symlink);
 
+static void __debugfs_remove_file(struct dentry *dentry, struct dentry *parent)
+{
+	struct debugfs_fsdata *fsd;
+
+	simple_unlink(d_inode(parent), dentry);
+	d_delete(dentry);
+	fsd = dentry->d_fsdata;
+	init_completion(&fsd->active_users_drained);
+	if (!refcount_dec_and_test(&fsd->active_users))
+		wait_for_completion(&fsd->active_users_drained);
+}
+
 static int __debugfs_remove(struct dentry *dentry, struct dentry *parent)
 {
 	int ret = 0;
 
 	if (simple_positive(dentry)) {
 		dget(dentry);
-		if (d_is_dir(dentry))
-			ret = simple_rmdir(d_inode(parent), dentry);
-		else
-			simple_unlink(d_inode(parent), dentry);
-		if (!ret)
-			d_delete(dentry);
+		if (!d_is_reg(dentry)) {
+			if (d_is_dir(dentry))
+				ret = simple_rmdir(d_inode(parent), dentry);
+			else
+				simple_unlink(d_inode(parent), dentry);
+			if (!ret)
+				d_delete(dentry);
+		} else {
+			__debugfs_remove_file(dentry, parent);
+		}
 		dput(dentry);
 	}
 	return ret;
diff --git a/fs/debugfs/internal.h b/fs/debugfs/internal.h
index 512601eed3ce..0eea99432840 100644
--- a/fs/debugfs/internal.h
+++ b/fs/debugfs/internal.h
@@ -21,6 +21,8 @@ extern const struct file_operations debugfs_full_proxy_file_operations;
 
 struct debugfs_fsdata {
 	const struct file_operations *real_fops;
+	refcount_t active_users;
+	struct completion active_users_drained;
 };
 
 #endif /* _DEBUGFS_INTERNAL_H_ */
diff --git a/include/linux/debugfs.h b/include/linux/debugfs.h
index cbee5f4a02a3..3b914d588148 100644
--- a/include/linux/debugfs.h
+++ b/include/linux/debugfs.h
@@ -98,6 +98,9 @@ void debugfs_use_file_finish(int srcu_idx) __releases(&debugfs_srcu);
 const struct file_operations *debugfs_real_fops(const struct file *filp)
 	__must_hold(&debugfs_srcu);
 
+int debugfs_file_get(struct dentry *dentry);
+void debugfs_file_put(struct dentry *dentry);
+
 ssize_t debugfs_attr_read(struct file *file, char __user *buf,
 			size_t len, loff_t *ppos);
 ssize_t debugfs_attr_write(struct file *file, const char __user *buf,
@@ -236,6 +239,14 @@ static inline void debugfs_use_file_finish(int srcu_idx)
 	__releases(&debugfs_srcu)
 { }
 
+static inline int debugfs_file_get(struct dentry *dentry)
+{
+	return 0;
+}
+
+static inline void debugfs_file_put(struct dentry *dentry)
+{ }
+
 static inline ssize_t debugfs_attr_read(struct file *file, char __user *buf,
 					size_t len, loff_t *ppos)
 {
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 3/8] debugfs: debugfs_real_fops(): drop __must_hold sparse annotation
  2017-10-30 23:15             ` Nicolai Stange
                               ` (2 preceding siblings ...)
  (?)
@ 2017-10-30 23:15             ` Nicolai Stange
  -1 siblings, 0 replies; 18+ messages in thread
From: Nicolai Stange @ 2017-10-30 23:15 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Johannes Berg, Paul E.McKenney, Tyler Hall, Mike Marciniszyn,
	Dennis Dalessandro, Doug Ledford, Sean Hefty, Hal Rosenstock,
	linux-rdma, linux-kernel, Nicolai Stange

Currently, debugfs_real_fops() is annotated with a
__must_hold(&debugfs_srcu) sparse annotation.

With the conversion of the SRCU based protection of users against
concurrent file removals to a per-file refcount based scheme, this becomes
wrong.

Drop this annotation.

Signed-off-by: Nicolai Stange <nicstange@gmail.com>
---
 fs/debugfs/file.c       | 6 +-----
 include/linux/debugfs.h | 3 +--
 2 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/fs/debugfs/file.c b/fs/debugfs/file.c
index 6644bfdea2f8..08511678b782 100644
--- a/fs/debugfs/file.c
+++ b/fs/debugfs/file.c
@@ -98,13 +98,9 @@ EXPORT_SYMBOL_GPL(debugfs_use_file_finish);
 #define F_DENTRY(filp) ((filp)->f_path.dentry)
 
 const struct file_operations *debugfs_real_fops(const struct file *filp)
-	__must_hold(&debugfs_srcu)
 {
 	struct debugfs_fsdata *fsd = F_DENTRY(filp)->d_fsdata;
-	/*
-	 * Neither the pointer to the struct file_operations, nor its
-	 * contents ever change -- srcu_dereference() is not needed here.
-	 */
+
 	return fsd->real_fops;
 }
 EXPORT_SYMBOL_GPL(debugfs_real_fops);
diff --git a/include/linux/debugfs.h b/include/linux/debugfs.h
index 3b914d588148..c5eda259b9d6 100644
--- a/include/linux/debugfs.h
+++ b/include/linux/debugfs.h
@@ -95,8 +95,7 @@ int debugfs_use_file_start(const struct dentry *dentry, int *srcu_idx)
 
 void debugfs_use_file_finish(int srcu_idx) __releases(&debugfs_srcu);
 
-const struct file_operations *debugfs_real_fops(const struct file *filp)
-	__must_hold(&debugfs_srcu);
+const struct file_operations *debugfs_real_fops(const struct file *filp);
 
 int debugfs_file_get(struct dentry *dentry);
 void debugfs_file_put(struct dentry *dentry);
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 4/8] debugfs: convert to debugfs_file_get() and -put()
  2017-10-30 23:15             ` Nicolai Stange
                               ` (3 preceding siblings ...)
  (?)
@ 2017-10-30 23:15             ` Nicolai Stange
  -1 siblings, 0 replies; 18+ messages in thread
From: Nicolai Stange @ 2017-10-30 23:15 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Johannes Berg, Paul E.McKenney, Tyler Hall, Mike Marciniszyn,
	Dennis Dalessandro, Doug Ledford, Sean Hefty, Hal Rosenstock,
	linux-rdma, linux-kernel, Nicolai Stange

Convert all calls to the now obsolete debugfs_use_file_start() and
debugfs_use_file_finish() from the debugfs core itself to the new
debugfs_file_get() and debugfs_file_put() API.

Fixes: 49d200deaa68 ("debugfs: prevent access to removed files' private
                      data")
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
---
 fs/debugfs/file.c | 106 ++++++++++++++++++++++++++----------------------------
 1 file changed, 50 insertions(+), 56 deletions(-)

diff --git a/fs/debugfs/file.c b/fs/debugfs/file.c
index 08511678b782..d3a972b45ff0 100644
--- a/fs/debugfs/file.c
+++ b/fs/debugfs/file.c
@@ -155,15 +155,12 @@ EXPORT_SYMBOL_GPL(debugfs_file_put);
 
 static int open_proxy_open(struct inode *inode, struct file *filp)
 {
-	const struct dentry *dentry = F_DENTRY(filp);
+	struct dentry *dentry = F_DENTRY(filp);
 	const struct file_operations *real_fops = NULL;
-	int srcu_idx, r;
+	int r = 0;
 
-	r = debugfs_use_file_start(dentry, &srcu_idx);
-	if (r) {
-		r = -ENOENT;
-		goto out;
-	}
+	if (debugfs_file_get(dentry))
+		return -ENOENT;
 
 	real_fops = debugfs_real_fops(filp);
 	real_fops = fops_get(real_fops);
@@ -180,7 +177,7 @@ static int open_proxy_open(struct inode *inode, struct file *filp)
 		r = real_fops->open(inode, filp);
 
 out:
-	debugfs_use_file_finish(srcu_idx);
+	debugfs_file_put(dentry);
 	return r;
 }
 
@@ -194,16 +191,16 @@ const struct file_operations debugfs_open_proxy_file_operations = {
 #define FULL_PROXY_FUNC(name, ret_type, filp, proto, args)		\
 static ret_type full_proxy_ ## name(proto)				\
 {									\
-	const struct dentry *dentry = F_DENTRY(filp);			\
+	struct dentry *dentry = F_DENTRY(filp);			\
 	const struct file_operations *real_fops =			\
 		debugfs_real_fops(filp);				\
-	int srcu_idx;							\
 	ret_type r;							\
 									\
-	r = debugfs_use_file_start(dentry, &srcu_idx);			\
-	if (likely(!r))						\
-		r = real_fops->name(args);				\
-	debugfs_use_file_finish(srcu_idx);				\
+	r = debugfs_file_get(dentry);					\
+	if (unlikely(r))						\
+		return r;						\
+	r = real_fops->name(args);					\
+	debugfs_file_put(dentry);					\
 	return r;							\
 }
 
@@ -228,18 +225,15 @@ FULL_PROXY_FUNC(unlocked_ioctl, long, filp,
 static unsigned int full_proxy_poll(struct file *filp,
 				struct poll_table_struct *wait)
 {
-	const struct dentry *dentry = F_DENTRY(filp);
 	const struct file_operations *real_fops = debugfs_real_fops(filp);
-	int srcu_idx;
+	struct dentry *dentry = F_DENTRY(filp);
 	unsigned int r = 0;
 
-	if (debugfs_use_file_start(dentry, &srcu_idx)) {
-		debugfs_use_file_finish(srcu_idx);
+	if (debugfs_file_get(dentry))
 		return POLLHUP;
-	}
 
 	r = real_fops->poll(filp, wait);
-	debugfs_use_file_finish(srcu_idx);
+	debugfs_file_put(dentry);
 	return r;
 }
 
@@ -283,16 +277,13 @@ static void __full_proxy_fops_init(struct file_operations *proxy_fops,
 
 static int full_proxy_open(struct inode *inode, struct file *filp)
 {
-	const struct dentry *dentry = F_DENTRY(filp);
+	struct dentry *dentry = F_DENTRY(filp);
 	const struct file_operations *real_fops = NULL;
 	struct file_operations *proxy_fops = NULL;
-	int srcu_idx, r;
+	int r = 0;
 
-	r = debugfs_use_file_start(dentry, &srcu_idx);
-	if (r) {
-		r = -ENOENT;
-		goto out;
-	}
+	if (debugfs_file_get(dentry))
+		return -ENOENT;
 
 	real_fops = debugfs_real_fops(filp);
 	real_fops = fops_get(real_fops);
@@ -330,7 +321,7 @@ static int full_proxy_open(struct inode *inode, struct file *filp)
 	kfree(proxy_fops);
 	fops_put(real_fops);
 out:
-	debugfs_use_file_finish(srcu_idx);
+	debugfs_file_put(dentry);
 	return r;
 }
 
@@ -341,13 +332,14 @@ const struct file_operations debugfs_full_proxy_file_operations = {
 ssize_t debugfs_attr_read(struct file *file, char __user *buf,
 			size_t len, loff_t *ppos)
 {
+	struct dentry *dentry = F_DENTRY(file);
 	ssize_t ret;
-	int srcu_idx;
 
-	ret = debugfs_use_file_start(F_DENTRY(file), &srcu_idx);
-	if (likely(!ret))
-		ret = simple_attr_read(file, buf, len, ppos);
-	debugfs_use_file_finish(srcu_idx);
+	ret = debugfs_file_get(dentry);
+	if (unlikely(ret))
+		return ret;
+	ret = simple_attr_read(file, buf, len, ppos);
+	debugfs_file_put(dentry);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(debugfs_attr_read);
@@ -355,13 +347,14 @@ EXPORT_SYMBOL_GPL(debugfs_attr_read);
 ssize_t debugfs_attr_write(struct file *file, const char __user *buf,
 			 size_t len, loff_t *ppos)
 {
+	struct dentry *dentry = F_DENTRY(file);
 	ssize_t ret;
-	int srcu_idx;
 
-	ret = debugfs_use_file_start(F_DENTRY(file), &srcu_idx);
-	if (likely(!ret))
-		ret = simple_attr_write(file, buf, len, ppos);
-	debugfs_use_file_finish(srcu_idx);
+	ret = debugfs_file_get(dentry);
+	if (unlikely(ret))
+		return ret;
+	ret = simple_attr_write(file, buf, len, ppos);
+	debugfs_file_put(dentry);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(debugfs_attr_write);
@@ -795,14 +788,14 @@ ssize_t debugfs_read_file_bool(struct file *file, char __user *user_buf,
 {
 	char buf[3];
 	bool val;
-	int r, srcu_idx;
+	int r;
+	struct dentry *dentry = F_DENTRY(file);
 
-	r = debugfs_use_file_start(F_DENTRY(file), &srcu_idx);
-	if (likely(!r))
-		val = *(bool *)file->private_data;
-	debugfs_use_file_finish(srcu_idx);
-	if (r)
+	r = debugfs_file_get(dentry);
+	if (unlikely(r))
 		return r;
+	val = *(bool *)file->private_data;
+	debugfs_file_put(dentry);
 
 	if (val)
 		buf[0] = 'Y';
@@ -820,8 +813,9 @@ ssize_t debugfs_write_file_bool(struct file *file, const char __user *user_buf,
 	char buf[32];
 	size_t buf_size;
 	bool bv;
-	int r, srcu_idx;
+	int r;
 	bool *val = file->private_data;
+	struct dentry *dentry = F_DENTRY(file);
 
 	buf_size = min(count, (sizeof(buf)-1));
 	if (copy_from_user(buf, user_buf, buf_size))
@@ -829,12 +823,11 @@ ssize_t debugfs_write_file_bool(struct file *file, const char __user *user_buf,
 
 	buf[buf_size] = '\0';
 	if (strtobool(buf, &bv) == 0) {
-		r = debugfs_use_file_start(F_DENTRY(file), &srcu_idx);
-		if (likely(!r))
-			*val = bv;
-		debugfs_use_file_finish(srcu_idx);
-		if (r)
+		r = debugfs_file_get(dentry);
+		if (unlikely(r))
 			return r;
+		*val = bv;
+		debugfs_file_put(dentry);
 	}
 
 	return count;
@@ -896,14 +889,15 @@ static ssize_t read_file_blob(struct file *file, char __user *user_buf,
 			      size_t count, loff_t *ppos)
 {
 	struct debugfs_blob_wrapper *blob = file->private_data;
+	struct dentry *dentry = F_DENTRY(file);
 	ssize_t r;
-	int srcu_idx;
 
-	r = debugfs_use_file_start(F_DENTRY(file), &srcu_idx);
-	if (likely(!r))
-		r = simple_read_from_buffer(user_buf, count, ppos, blob->data,
-					blob->size);
-	debugfs_use_file_finish(srcu_idx);
+	r = debugfs_file_get(dentry);
+	if (unlikely(r))
+		return r;
+	r = simple_read_from_buffer(user_buf, count, ppos, blob->data,
+				blob->size);
+	debugfs_file_put(dentry);
 	return r;
 }
 
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 5/8] IB/hfi1: convert to debugfs_file_get() and -put()
  2017-10-30 23:15             ` Nicolai Stange
                               ` (4 preceding siblings ...)
  (?)
@ 2017-10-30 23:15             ` Nicolai Stange
       [not found]               ` <20171030231554.6200-6-nicstange-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  -1 siblings, 1 reply; 18+ messages in thread
From: Nicolai Stange @ 2017-10-30 23:15 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Johannes Berg, Paul E.McKenney, Tyler Hall, Mike Marciniszyn,
	Dennis Dalessandro, Doug Ledford, Sean Hefty, Hal Rosenstock,
	linux-rdma, linux-kernel, Nicolai Stange

Convert all calls to the now obsolete debugfs_use_file_start() and
debugfs_use_file_finish() to the new debugfs_file_get() and
debugfs_file_put() API.

Fixes: 49d200deaa68 ("debugfs: prevent access to removed files' private
                      data")
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
---
 drivers/infiniband/hw/hfi1/debugfs.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/debugfs.c b/drivers/infiniband/hw/hfi1/debugfs.c
index 24128f4e5748..ca03d8b23851 100644
--- a/drivers/infiniband/hw/hfi1/debugfs.c
+++ b/drivers/infiniband/hw/hfi1/debugfs.c
@@ -71,13 +71,13 @@ static ssize_t hfi1_seq_read(
 	loff_t *ppos)
 {
 	struct dentry *d = file->f_path.dentry;
-	int srcu_idx;
 	ssize_t r;
 
-	r = debugfs_use_file_start(d, &srcu_idx);
-	if (likely(!r))
-		r = seq_read(file, buf, size, ppos);
-	debugfs_use_file_finish(srcu_idx);
+	r = debugfs_file_get(d);
+	if (unlikely(r))
+		return r;
+	r = seq_read(file, buf, size, ppos);
+	debugfs_file_put(d);
 	return r;
 }
 
@@ -87,13 +87,13 @@ static loff_t hfi1_seq_lseek(
 	int whence)
 {
 	struct dentry *d = file->f_path.dentry;
-	int srcu_idx;
 	loff_t r;
 
-	r = debugfs_use_file_start(d, &srcu_idx);
-	if (likely(!r))
-		r = seq_lseek(file, offset, whence);
-	debugfs_use_file_finish(srcu_idx);
+	r = debugfs_file_get(d);
+	if (unlikely(r))
+		return r;
+	r = seq_lseek(file, offset, whence);
+	debugfs_file_put(d);
 	return r;
 }
 
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 6/8] debugfs: purge obsolete SRCU based removal protection
  2017-10-30 23:15             ` Nicolai Stange
                               ` (5 preceding siblings ...)
  (?)
@ 2017-10-30 23:15             ` Nicolai Stange
  -1 siblings, 0 replies; 18+ messages in thread
From: Nicolai Stange @ 2017-10-30 23:15 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Johannes Berg, Paul E.McKenney, Tyler Hall, Mike Marciniszyn,
	Dennis Dalessandro, Doug Ledford, Sean Hefty, Hal Rosenstock,
	linux-rdma, linux-kernel, Nicolai Stange

Purge the SRCU based file removal race protection in favour of the new,
refcount based debugfs_file_get()/debugfs_file_put() API.

Fixes: 49d200deaa68 ("debugfs: prevent access to removed files' private
                      data")
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
---
 fs/debugfs/file.c       | 48 ------------------------------------------------
 fs/debugfs/inode.c      |  7 -------
 include/linux/debugfs.h | 19 -------------------
 lib/Kconfig.debug       |  1 -
 4 files changed, 75 deletions(-)

diff --git a/fs/debugfs/file.c b/fs/debugfs/file.c
index d3a972b45ff0..53f5c9a2af88 100644
--- a/fs/debugfs/file.c
+++ b/fs/debugfs/file.c
@@ -22,7 +22,6 @@
 #include <linux/slab.h>
 #include <linux/atomic.h>
 #include <linux/device.h>
-#include <linux/srcu.h>
 #include <asm/poll.h>
 
 #include "internal.h"
@@ -48,53 +47,6 @@ const struct file_operations debugfs_noop_file_operations = {
 	.llseek =	noop_llseek,
 };
 
-/**
- * debugfs_use_file_start - mark the beginning of file data access
- * @dentry: the dentry object whose data is being accessed.
- * @srcu_idx: a pointer to some memory to store a SRCU index in.
- *
- * Up to a matching call to debugfs_use_file_finish(), any
- * successive call into the file removing functions debugfs_remove()
- * and debugfs_remove_recursive() will block. Since associated private
- * file data may only get freed after a successful return of any of
- * the removal functions, you may safely access it after a successful
- * call to debugfs_use_file_start() without worrying about
- * lifetime issues.
- *
- * If -%EIO is returned, the file has already been removed and thus,
- * it is not safe to access any of its data. If, on the other hand,
- * it is allowed to access the file data, zero is returned.
- *
- * Regardless of the return code, any call to
- * debugfs_use_file_start() must be followed by a matching call
- * to debugfs_use_file_finish().
- */
-int debugfs_use_file_start(const struct dentry *dentry, int *srcu_idx)
-	__acquires(&debugfs_srcu)
-{
-	*srcu_idx = srcu_read_lock(&debugfs_srcu);
-	barrier();
-	if (d_unlinked(dentry))
-		return -EIO;
-	return 0;
-}
-EXPORT_SYMBOL_GPL(debugfs_use_file_start);
-
-/**
- * debugfs_use_file_finish - mark the end of file data access
- * @srcu_idx: the SRCU index "created" by a former call to
- *            debugfs_use_file_start().
- *
- * Allow any ongoing concurrent call into debugfs_remove() or
- * debugfs_remove_recursive() blocked by a former call to
- * debugfs_use_file_start() to proceed and return to its caller.
- */
-void debugfs_use_file_finish(int srcu_idx) __releases(&debugfs_srcu)
-{
-	srcu_read_unlock(&debugfs_srcu, srcu_idx);
-}
-EXPORT_SYMBOL_GPL(debugfs_use_file_finish);
-
 #define F_DENTRY(filp) ((filp)->f_path.dentry)
 
 const struct file_operations *debugfs_real_fops(const struct file *filp)
diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c
index 6449eb935540..f587aded46b5 100644
--- a/fs/debugfs/inode.c
+++ b/fs/debugfs/inode.c
@@ -27,14 +27,11 @@
 #include <linux/parser.h>
 #include <linux/magic.h>
 #include <linux/slab.h>
-#include <linux/srcu.h>
 
 #include "internal.h"
 
 #define DEBUGFS_DEFAULT_MODE	0700
 
-DEFINE_SRCU(debugfs_srcu);
-
 static struct vfsmount *debugfs_mount;
 static int debugfs_mount_count;
 static bool debugfs_registered;
@@ -693,8 +690,6 @@ void debugfs_remove(struct dentry *dentry)
 	inode_unlock(d_inode(parent));
 	if (!ret)
 		simple_release_fs(&debugfs_mount, &debugfs_mount_count);
-
-	synchronize_srcu(&debugfs_srcu);
 }
 EXPORT_SYMBOL_GPL(debugfs_remove);
 
@@ -768,8 +763,6 @@ void debugfs_remove_recursive(struct dentry *dentry)
 	if (!__debugfs_remove(child, parent))
 		simple_release_fs(&debugfs_mount, &debugfs_mount_count);
 	inode_unlock(d_inode(parent));
-
-	synchronize_srcu(&debugfs_srcu);
 }
 EXPORT_SYMBOL_GPL(debugfs_remove_recursive);
 
diff --git a/include/linux/debugfs.h b/include/linux/debugfs.h
index c5eda259b9d6..9f821a43bb0d 100644
--- a/include/linux/debugfs.h
+++ b/include/linux/debugfs.h
@@ -23,7 +23,6 @@
 
 struct device;
 struct file_operations;
-struct srcu_struct;
 
 struct debugfs_blob_wrapper {
 	void *data;
@@ -43,8 +42,6 @@ struct debugfs_regset32 {
 
 extern struct dentry *arch_debugfs_dir;
 
-extern struct srcu_struct debugfs_srcu;
-
 #define DEFINE_DEBUGFS_ATTRIBUTE(__fops, __get, __set, __fmt)		\
 static int __fops ## _open(struct inode *inode, struct file *file)	\
 {									\
@@ -90,11 +87,6 @@ struct dentry *debugfs_create_automount(const char *name,
 void debugfs_remove(struct dentry *dentry);
 void debugfs_remove_recursive(struct dentry *dentry);
 
-int debugfs_use_file_start(const struct dentry *dentry, int *srcu_idx)
-	__acquires(&debugfs_srcu);
-
-void debugfs_use_file_finish(int srcu_idx) __releases(&debugfs_srcu);
-
 const struct file_operations *debugfs_real_fops(const struct file *filp);
 
 int debugfs_file_get(struct dentry *dentry);
@@ -227,17 +219,6 @@ static inline void debugfs_remove(struct dentry *dentry)
 static inline void debugfs_remove_recursive(struct dentry *dentry)
 { }
 
-static inline int debugfs_use_file_start(const struct dentry *dentry,
-					int *srcu_idx)
-	__acquires(&debugfs_srcu)
-{
-	return 0;
-}
-
-static inline void debugfs_use_file_finish(int srcu_idx)
-	__releases(&debugfs_srcu)
-{ }
-
 static inline int debugfs_file_get(struct dentry *dentry)
 {
 	return 0;
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 03e2ac7cceb6..f6140d5a50f1 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -326,7 +326,6 @@ config PAGE_OWNER
 
 config DEBUG_FS
 	bool "Debug Filesystem"
-	select SRCU
 	help
 	  debugfs is a virtual file system that kernel developers use to put
 	  debugging files into.  Enable this option to be able to read and
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 7/8] debugfs: call debugfs_real_fops() only after debugfs_file_get()
  2017-10-30 23:15             ` Nicolai Stange
                               ` (6 preceding siblings ...)
  (?)
@ 2017-10-30 23:15             ` Nicolai Stange
  -1 siblings, 0 replies; 18+ messages in thread
From: Nicolai Stange @ 2017-10-30 23:15 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Johannes Berg, Paul E.McKenney, Tyler Hall, Mike Marciniszyn,
	Dennis Dalessandro, Doug Ledford, Sean Hefty, Hal Rosenstock,
	linux-rdma, linux-kernel, Nicolai Stange

The current implementation of debugfs_real_fops() relies on a
debugfs_fsdata instance to be installed at ->d_fsdata.

With future patches introducing lazy allocation of these, this requirement
will be guaranteed to be fullfilled only inbetween a
debugfs_file_get()/debugfs_file_put() pair.

The full proxies' fops implemented by debugfs happen to be the only
offenders. Fix them up by moving their debugfs_real_fops() calls past those
to debugfs_file_get().

full_proxy_release() is special as it doesn't invoke debugfs_file_get() at
all. Leave it alone for now.

Signed-off-by: Nicolai Stange <nicstange@gmail.com>
---
 fs/debugfs/file.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/debugfs/file.c b/fs/debugfs/file.c
index 53f5c9a2af88..bc3549c95574 100644
--- a/fs/debugfs/file.c
+++ b/fs/debugfs/file.c
@@ -144,13 +144,13 @@ const struct file_operations debugfs_open_proxy_file_operations = {
 static ret_type full_proxy_ ## name(proto)				\
 {									\
 	struct dentry *dentry = F_DENTRY(filp);			\
-	const struct file_operations *real_fops =			\
-		debugfs_real_fops(filp);				\
+	const struct file_operations *real_fops;			\
 	ret_type r;							\
 									\
 	r = debugfs_file_get(dentry);					\
 	if (unlikely(r))						\
 		return r;						\
+	real_fops = debugfs_real_fops(filp);				\
 	r = real_fops->name(args);					\
 	debugfs_file_put(dentry);					\
 	return r;							\
@@ -177,13 +177,14 @@ FULL_PROXY_FUNC(unlocked_ioctl, long, filp,
 static unsigned int full_proxy_poll(struct file *filp,
 				struct poll_table_struct *wait)
 {
-	const struct file_operations *real_fops = debugfs_real_fops(filp);
 	struct dentry *dentry = F_DENTRY(filp);
 	unsigned int r = 0;
+	const struct file_operations *real_fops;
 
 	if (debugfs_file_get(dentry))
 		return POLLHUP;
 
+	real_fops = debugfs_real_fops(filp);
 	r = real_fops->poll(filp, wait);
 	debugfs_file_put(dentry);
 	return r;
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 8/8] debugfs: defer debugfs_fsdata allocation to first usage
  2017-10-30 23:15             ` Nicolai Stange
                               ` (7 preceding siblings ...)
  (?)
@ 2017-10-30 23:15             ` Nicolai Stange
  -1 siblings, 0 replies; 18+ messages in thread
From: Nicolai Stange @ 2017-10-30 23:15 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Johannes Berg, Paul E.McKenney, Tyler Hall, Mike Marciniszyn,
	Dennis Dalessandro, Doug Ledford, Sean Hefty, Hal Rosenstock,
	linux-rdma, linux-kernel, Nicolai Stange

Currently, __debugfs_create_file allocates one struct debugfs_fsdata
instance for every file created. However, there are potentially many
debugfs file around, most of which are never touched by userspace.

Thus, defer the allocations to the first usage, i.e. to the first
debugfs_file_get().

A dentry's ->d_fsdata starts out to point to the "real", user provided
fops. After a debugfs_fsdata instance has been allocated (and the real
fops pointer has been moved over into its ->real_fops member),
->d_fsdata is changed to point to it from then on. The two cases are
distinguished by setting BIT(0) for the real fops case.

struct debugfs_fsdata's foremost purpose is to track active users and to
make debugfs_remove() block until they are done. Since no debugfs_fsdata
instance means no active users, make debugfs_remove() return immediately
in this case.

Take care of possible races between debugfs_file_get() and
debugfs_remove(): either debugfs_remove() must see a debugfs_fsdata
instance and thus wait for possible active users or debugfs_file_get() must
see a dead dentry and return immediately.

Make a dentry's ->d_release(), i.e. debugfs_release_dentry(), check whether
->d_fsdata is actually a debugfs_fsdata instance before kfree()ing it.

Similarly, make debugfs_real_fops() check whether ->d_fsdata is actually
a debugfs_fsdata instance before returning it, otherwise emit a warning.

The set of possible error codes returned from debugfs_file_get() has grown
from -EIO to -EIO and -ENOMEM. Make open_proxy_open() and full_proxy_open()
pass the -ENOMEM onwards to their callers.

Signed-off-by: Nicolai Stange <nicstange@gmail.com>
---
 fs/debugfs/file.c     | 55 ++++++++++++++++++++++++++++++++++++++++++---------
 fs/debugfs/inode.c    | 36 +++++++++++++++++----------------
 fs/debugfs/internal.h |  8 ++++++++
 3 files changed, 73 insertions(+), 26 deletions(-)

diff --git a/fs/debugfs/file.c b/fs/debugfs/file.c
index bc3549c95574..65872340e301 100644
--- a/fs/debugfs/file.c
+++ b/fs/debugfs/file.c
@@ -53,6 +53,15 @@ const struct file_operations *debugfs_real_fops(const struct file *filp)
 {
 	struct debugfs_fsdata *fsd = F_DENTRY(filp)->d_fsdata;
 
+	if ((unsigned long)fsd & DEBUGFS_FSDATA_IS_REAL_FOPS_BIT) {
+		/*
+		 * Urgh, we've been called w/o a protecting
+		 * debugfs_file_get().
+		 */
+		WARN_ON(1);
+		return NULL;
+	}
+
 	return fsd->real_fops;
 }
 EXPORT_SYMBOL_GPL(debugfs_real_fops);
@@ -74,9 +83,35 @@ EXPORT_SYMBOL_GPL(debugfs_real_fops);
  */
 int debugfs_file_get(struct dentry *dentry)
 {
-	struct debugfs_fsdata *fsd = dentry->d_fsdata;
+	struct debugfs_fsdata *fsd;
+	void *d_fsd;
+
+	d_fsd = READ_ONCE(dentry->d_fsdata);
+	if (!((unsigned long)d_fsd & DEBUGFS_FSDATA_IS_REAL_FOPS_BIT)) {
+		fsd = d_fsd;
+	} else {
+		fsd = kmalloc(sizeof(*fsd), GFP_KERNEL);
+		if (!fsd)
+			return -ENOMEM;
+
+		fsd->real_fops = (void *)((unsigned long)d_fsd &
+					~DEBUGFS_FSDATA_IS_REAL_FOPS_BIT);
+		refcount_set(&fsd->active_users, 1);
+		init_completion(&fsd->active_users_drained);
+		if (cmpxchg(&dentry->d_fsdata, d_fsd, fsd) != d_fsd) {
+			kfree(fsd);
+			fsd = READ_ONCE(dentry->d_fsdata);
+		}
+	}
 
-	/* Avoid starvation of removers. */
+	/*
+	 * In case of a successful cmpxchg() above, this check is
+	 * strictly necessary and must follow it, see the comment in
+	 * __debugfs_remove_file().
+	 * OTOH, if the cmpxchg() hasn't been executed or wasn't
+	 * successful, this serves the purpose of not starving
+	 * removers.
+	 */
 	if (d_unlinked(dentry))
 		return -EIO;
 
@@ -98,7 +133,7 @@ EXPORT_SYMBOL_GPL(debugfs_file_get);
  */
 void debugfs_file_put(struct dentry *dentry)
 {
-	struct debugfs_fsdata *fsd = dentry->d_fsdata;
+	struct debugfs_fsdata *fsd = READ_ONCE(dentry->d_fsdata);
 
 	if (refcount_dec_and_test(&fsd->active_users))
 		complete(&fsd->active_users_drained);
@@ -109,10 +144,11 @@ static int open_proxy_open(struct inode *inode, struct file *filp)
 {
 	struct dentry *dentry = F_DENTRY(filp);
 	const struct file_operations *real_fops = NULL;
-	int r = 0;
+	int r;
 
-	if (debugfs_file_get(dentry))
-		return -ENOENT;
+	r = debugfs_file_get(dentry);
+	if (r)
+		return r == -EIO ? -ENOENT : r;
 
 	real_fops = debugfs_real_fops(filp);
 	real_fops = fops_get(real_fops);
@@ -233,10 +269,11 @@ static int full_proxy_open(struct inode *inode, struct file *filp)
 	struct dentry *dentry = F_DENTRY(filp);
 	const struct file_operations *real_fops = NULL;
 	struct file_operations *proxy_fops = NULL;
-	int r = 0;
+	int r;
 
-	if (debugfs_file_get(dentry))
-		return -ENOENT;
+	r = debugfs_file_get(dentry);
+	if (r)
+		return r == -EIO ? -ENOENT : r;
 
 	real_fops = debugfs_real_fops(filp);
 	real_fops = fops_get(real_fops);
diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c
index f587aded46b5..9dca4da059b3 100644
--- a/fs/debugfs/inode.c
+++ b/fs/debugfs/inode.c
@@ -184,7 +184,10 @@ static const struct super_operations debugfs_super_operations = {
 
 static void debugfs_release_dentry(struct dentry *dentry)
 {
-	kfree(dentry->d_fsdata);
+	void *fsd = dentry->d_fsdata;
+
+	if (!((unsigned long)fsd & DEBUGFS_FSDATA_IS_REAL_FOPS_BIT))
+		kfree(dentry->d_fsdata);
 }
 
 static struct vfsmount *debugfs_automount(struct path *path)
@@ -344,35 +347,25 @@ static struct dentry *__debugfs_create_file(const char *name, umode_t mode,
 {
 	struct dentry *dentry;
 	struct inode *inode;
-	struct debugfs_fsdata *fsd;
-
-	fsd = kmalloc(sizeof(*fsd), GFP_KERNEL);
-	if (!fsd)
-		return NULL;
 
 	if (!(mode & S_IFMT))
 		mode |= S_IFREG;
 	BUG_ON(!S_ISREG(mode));
 	dentry = start_creating(name, parent);
 
-	if (IS_ERR(dentry)) {
-		kfree(fsd);
+	if (IS_ERR(dentry))
 		return NULL;
-	}
 
 	inode = debugfs_get_inode(dentry->d_sb);
-	if (unlikely(!inode)) {
-		kfree(fsd);
+	if (unlikely(!inode))
 		return failed_creating(dentry);
-	}
 
 	inode->i_mode = mode;
 	inode->i_private = data;
 
 	inode->i_fop = proxy_fops;
-	fsd->real_fops = real_fops;
-	refcount_set(&fsd->active_users, 1);
-	dentry->d_fsdata = fsd;
+	dentry->d_fsdata = (void *)((unsigned long)real_fops |
+				DEBUGFS_FSDATA_IS_REAL_FOPS_BIT);
 
 	d_instantiate(dentry, inode);
 	fsnotify_create(d_inode(dentry->d_parent), dentry);
@@ -635,8 +628,17 @@ static void __debugfs_remove_file(struct dentry *dentry, struct dentry *parent)
 
 	simple_unlink(d_inode(parent), dentry);
 	d_delete(dentry);
-	fsd = dentry->d_fsdata;
-	init_completion(&fsd->active_users_drained);
+
+	/*
+	 * Paired with the closing smp_mb() implied by a successful
+	 * cmpxchg() in debugfs_file_get(): either
+	 * debugfs_file_get() must see a dead dentry or we must see a
+	 * debugfs_fsdata instance at ->d_fsdata here (or both).
+	 */
+	smp_mb();
+	fsd = READ_ONCE(dentry->d_fsdata);
+	if ((unsigned long)fsd & DEBUGFS_FSDATA_IS_REAL_FOPS_BIT)
+		return;
 	if (!refcount_dec_and_test(&fsd->active_users))
 		wait_for_completion(&fsd->active_users_drained);
 }
diff --git a/fs/debugfs/internal.h b/fs/debugfs/internal.h
index 0eea99432840..cb1e8139c398 100644
--- a/fs/debugfs/internal.h
+++ b/fs/debugfs/internal.h
@@ -25,4 +25,12 @@ struct debugfs_fsdata {
 	struct completion active_users_drained;
 };
 
+/*
+ * A dentry's ->d_fsdata either points to the real fops or to a
+ * dynamically allocated debugfs_fsdata instance.
+ * In order to distinguish between these two cases, a real fops
+ * pointer gets its lowest bit set.
+ */
+#define DEBUGFS_FSDATA_IS_REAL_FOPS_BIT BIT(0)
+
 #endif /* _DEBUGFS_INTERNAL_H_ */
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 5/8] IB/hfi1: convert to debugfs_file_get() and -put()
  2017-10-30 23:15             ` [PATCH v3 5/8] IB/hfi1: " Nicolai Stange
@ 2017-11-07 14:29                   ` Dennis Dalessandro
  0 siblings, 0 replies; 18+ messages in thread
From: Dennis Dalessandro @ 2017-11-07 14:29 UTC (permalink / raw)
  To: Nicolai Stange, Greg Kroah-Hartman
  Cc: Johannes Berg, Paul E.McKenney, Tyler Hall, Mike Marciniszyn,
	Doug Ledford, Sean Hefty, Hal Rosenstock,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On 10/30/2017 7:15 PM, Nicolai Stange wrote:
> Convert all calls to the now obsolete debugfs_use_file_start() and
> debugfs_use_file_finish() to the new debugfs_file_get() and
> debugfs_file_put() API.
> 
> Fixes: 49d200deaa68 ("debugfs: prevent access to removed files' private
>                        data")
> Signed-off-by: Nicolai Stange <nicstange-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

Looks OK to me.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 5/8] IB/hfi1: convert to debugfs_file_get() and -put()
@ 2017-11-07 14:29                   ` Dennis Dalessandro
  0 siblings, 0 replies; 18+ messages in thread
From: Dennis Dalessandro @ 2017-11-07 14:29 UTC (permalink / raw)
  To: Nicolai Stange, Greg Kroah-Hartman
  Cc: Johannes Berg, Paul E.McKenney, Tyler Hall, Mike Marciniszyn,
	Doug Ledford, Sean Hefty, Hal Rosenstock, linux-rdma,
	linux-kernel

On 10/30/2017 7:15 PM, Nicolai Stange wrote:
> Convert all calls to the now obsolete debugfs_use_file_start() and
> debugfs_use_file_finish() to the new debugfs_file_get() and
> debugfs_file_put() API.
> 
> Fixes: 49d200deaa68 ("debugfs: prevent access to removed files' private
>                        data")
> Signed-off-by: Nicolai Stange <nicstange@gmail.com>

Looks OK to me.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2017-11-07 14:29 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-12 20:01 deadlock in debugfs synchronize_srcu() when unplugging USB Tyler Hall
2017-10-12 21:04 ` Paul E. McKenney
2017-10-16  7:51 ` Greg Kroah-Hartman
2017-10-16  8:15   ` Johannes Berg
2017-10-16  8:31     ` Greg Kroah-Hartman
2017-10-16 15:08       ` Tyler Hall
     [not found]         ` <CAOjnSCazvybpH5ZhHn=iTLm_dVGEdr=W1pUQv6zowRisB1MWTQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-10-30 23:15           ` [PATCH v3 0/8] debugfs: per-file removal protection Nicolai Stange
2017-10-30 23:15             ` Nicolai Stange
2017-10-30 23:15             ` [PATCH v3 1/8] debugfs: add support for more elaborate ->d_fsdata Nicolai Stange
2017-10-30 23:15             ` [PATCH v3 2/8] debugfs: implement per-file removal protection Nicolai Stange
2017-10-30 23:15             ` [PATCH v3 3/8] debugfs: debugfs_real_fops(): drop __must_hold sparse annotation Nicolai Stange
2017-10-30 23:15             ` [PATCH v3 4/8] debugfs: convert to debugfs_file_get() and -put() Nicolai Stange
2017-10-30 23:15             ` [PATCH v3 5/8] IB/hfi1: " Nicolai Stange
     [not found]               ` <20171030231554.6200-6-nicstange-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-11-07 14:29                 ` Dennis Dalessandro
2017-11-07 14:29                   ` Dennis Dalessandro
2017-10-30 23:15             ` [PATCH v3 6/8] debugfs: purge obsolete SRCU based removal protection Nicolai Stange
2017-10-30 23:15             ` [PATCH v3 7/8] debugfs: call debugfs_real_fops() only after debugfs_file_get() Nicolai Stange
2017-10-30 23:15             ` [PATCH v3 8/8] debugfs: defer debugfs_fsdata allocation to first usage Nicolai Stange

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.