Linux-Fsdevel Archive on lore.kernel.org
* [PATCH] fput: Allow calling __fput_sync() from !PF_KTHREAD thread.
@ 2020-07-08 14:24 Tetsuo Handa
  2020-07-29 13:04 ` [PATCH v2] " Tetsuo Handa
  0 siblings, 1 reply; 8+ messages in thread
From: Tetsuo Handa @ 2020-07-08 14:24 UTC (permalink / raw)
  To: Al Viro; +Cc: Eric W . Biederman, linux-fsdevel, Tetsuo Handa

__fput_sync() was introduced by commit 4a9d4b024a3102fc ("switch fput to
task_work_add") with BUG_ON(!(current->flags & PF_KTHREAD)) check, and
the only user of __fput_sync() was introduced by commit 17c0a5aaffa63da6
("make acct_kill() wait for file closing."). However, the latter commit is
effectively calling __fput_sync() from !PF_KTHREAD thread because of
schedule_work() call followed by immediate wait_for_completion() call.
That is, there is no need to defer close_work() to a WQ context. I guess
that the reason to defer was nothing but to bypass this BUG_ON() check.
While we need to remain careful about calling __fput_sync(), we can remove
bypassable BUG_ON() check from __fput_sync().

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
Al, is this change acceptable?

Eric is trying to use fput()/flush_delayed_fput()/task_work_run() from
blob_to_mnt() which is going to be introduced by
https://lkml.kernel.org/r/20200702164140.4468-8-ebiederm@xmission.com
in order to make sure that a file (which was opened for writing and is
intended to be execve()d shortly) is closed by current thread before
leaving blob_to_mnt().

But when multiple threads call flush_delayed_fput() concurrently, the current
thread might fail to find the file it is interested in (the one which was
opened for writing and is intended to be execve()d shortly) and/or might
process files it has no interest in. I therefore think that we should use
__fput_sync() in order to make sure that exactly the file of interest is
closed by the current thread.

Therefore, I propose this change.

 fs/file_table.c | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/fs/file_table.c b/fs/file_table.c
index 656647f9575a..7c4125179469 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -359,20 +359,15 @@ void fput(struct file *file)
 }
 
 /*
- * synchronous analog of fput(); for kernel threads that might be needed
- * in some umount() (and thus can't use flush_delayed_fput() without
- * risking deadlocks), need to wait for completion of __fput() and know
- * for this specific struct file it won't involve anything that would
- * need them.  Use only if you really need it - at the very least,
- * don't blindly convert fput() by kernel thread to that.
+ * synchronous analog of fput(); for threads that need to wait for completion
+ * of __fput() and know for this specific struct file it won't involve anything
+ * that would need them.  Use only if you really need it - at the very least,
+ * don't blindly convert fput() to __fput_sync().
  */
 void __fput_sync(struct file *file)
 {
-	if (atomic_long_dec_and_test(&file->f_count)) {
-		struct task_struct *task = current;
-		BUG_ON(!(task->flags & PF_KTHREAD));
+	if (atomic_long_dec_and_test(&file->f_count))
 		__fput(file);
-	}
 }
 
 EXPORT_SYMBOL(fput);
-- 
2.18.4
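
For readers following the thread, the difference between a deferred fput()
and a synchronous __fput_sync() can be sketched in userspace C. This is a
simplified model under stated assumptions, not kernel code: struct model_file,
model_fput(), model_run_deferred(), model_fput_sync() and model_demo() are
hypothetical names, and a plain singly linked list stands in for whatever
deferral mechanism the kernel actually uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct file; not the kernel layout. */
struct model_file {
	atomic_long f_count;       /* reference count */
	bool released;             /* set once the final release has run */
	struct model_file *next;   /* link in the deferred list */
};

/* Models the list of files whose final release has been deferred. */
static struct model_file *deferred_list;

/* Models __fput(): the final release step. */
static void model_release(struct model_file *f)
{
	f->released = true;
}

/* Models fput(): drop a reference and defer the final release. */
static void model_fput(struct model_file *f)
{
	/* atomic_fetch_sub() returns the old value, so 1 means "last ref". */
	if (atomic_fetch_sub(&f->f_count, 1) == 1) {
		f->next = deferred_list;
		deferred_list = f;
	}
}

/* Models the deferred worker: release everything queued so far. */
static void model_run_deferred(void)
{
	struct model_file *f = deferred_list;

	deferred_list = NULL;
	while (f) {
		struct model_file *next = f->next;

		model_release(f);
		f = next;
	}
}

/* Models __fput_sync(): drop a reference and release before returning. */
static void model_fput_sync(struct model_file *f)
{
	if (atomic_fetch_sub(&f->f_count, 1) == 1)
		model_release(f);
}

/* Walks through both paths; returns 0 if the model behaves as described. */
static int model_demo(void)
{
	struct model_file a, b;

	atomic_init(&a.f_count, 1); a.released = false; a.next = NULL;
	atomic_init(&b.f_count, 1); b.released = false; b.next = NULL;

	model_fput(&a);
	assert(!a.released);       /* still open until the deferred work runs */
	model_run_deferred();
	assert(a.released);

	model_fput_sync(&b);
	assert(b.released);        /* closed by the calling thread itself */
	return 0;
}
```

In this model the caller of model_fput() cannot tell when the file is really
closed, while the caller of model_fput_sync() can. The model says nothing
about stack depth or locking, which is what the "use only if you really need
it" comment in the patch is concerned with.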



* [PATCH v2] fput: Allow calling __fput_sync() from !PF_KTHREAD thread.
  2020-07-08 14:24 [PATCH] fput: Allow calling __fput_sync() from !PF_KTHREAD thread Tetsuo Handa
@ 2020-07-29 13:04 ` Tetsuo Handa
  2020-08-19 12:42   ` [PATCH v2 (resend)] " Tetsuo Handa
  2020-09-10  3:57   ` [PATCH v2] " Al Viro
  0 siblings, 2 replies; 8+ messages in thread
From: Tetsuo Handa @ 2020-07-29 13:04 UTC (permalink / raw)
  To: Al Viro; +Cc: Eric W. Biederman, linux-fsdevel, Tetsuo Handa

__fput_sync() was introduced by commit 4a9d4b024a3102fc ("switch fput to
task_work_add") with BUG_ON(!(current->flags & PF_KTHREAD)) check, and
the only user of __fput_sync() was introduced by commit 17c0a5aaffa63da6
("make acct_kill() wait for file closing."). However, the latter commit is
effectively calling __fput_sync() from !PF_KTHREAD thread because of
schedule_work() call followed by immediate wait_for_completion() call.
That is, there is no need to defer close_work() to a WQ context. I guess
that the reason to defer was nothing but to bypass this BUG_ON() check.
While we need to remain careful about calling __fput_sync(), we can remove
bypassable BUG_ON() check from __fput_sync().

If this change is accepted, racy fput()+flush_delayed_fput() introduced
by commit e2dc9bf3f5275ca3 ("umd: Transform fork_usermode_blob into
fork_usermode_driver") will be replaced by this raceless __fput_sync().

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 fs/file_table.c | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/fs/file_table.c b/fs/file_table.c
index 656647f..7c41251 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -359,20 +359,15 @@ void fput(struct file *file)
 }
 
 /*
- * synchronous analog of fput(); for kernel threads that might be needed
- * in some umount() (and thus can't use flush_delayed_fput() without
- * risking deadlocks), need to wait for completion of __fput() and know
- * for this specific struct file it won't involve anything that would
- * need them.  Use only if you really need it - at the very least,
- * don't blindly convert fput() by kernel thread to that.
+ * synchronous analog of fput(); for threads that need to wait for completion
+ * of __fput() and know for this specific struct file it won't involve anything
+ * that would need them.  Use only if you really need it - at the very least,
+ * don't blindly convert fput() to __fput_sync().
  */
 void __fput_sync(struct file *file)
 {
-	if (atomic_long_dec_and_test(&file->f_count)) {
-		struct task_struct *task = current;
-		BUG_ON(!(task->flags & PF_KTHREAD));
+	if (atomic_long_dec_and_test(&file->f_count))
 		__fput(file);
-	}
 }
 
 EXPORT_SYMBOL(fput);
-- 
1.8.3.1



* [PATCH v2 (resend)] fput: Allow calling __fput_sync() from !PF_KTHREAD thread.
  2020-07-29 13:04 ` [PATCH v2] " Tetsuo Handa
@ 2020-08-19 12:42   ` Tetsuo Handa
  2020-09-09 21:59     ` Tetsuo Handa
  2020-09-10  3:57   ` [PATCH v2] " Al Viro
  1 sibling, 1 reply; 8+ messages in thread
From: Tetsuo Handa @ 2020-08-19 12:42 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Al Viro, Eric W. Biederman, linux-fsdevel

__fput_sync() was introduced by commit 4a9d4b024a3102fc ("switch fput to
task_work_add") with BUG_ON(!(current->flags & PF_KTHREAD)) check, and
the only user of __fput_sync() was introduced by commit 17c0a5aaffa63da6
("make acct_kill() wait for file closing."). However, the latter commit is
effectively calling __fput_sync() from !PF_KTHREAD thread because of
schedule_work() call followed by immediate wait_for_completion() call.
That is, there is no need to defer close_work() to a WQ context. I guess
that the reason to defer was nothing but to bypass this BUG_ON() check.
While we need to remain careful about calling __fput_sync(), we can remove
bypassable BUG_ON() check from __fput_sync().

If this change is accepted, racy fput()+flush_delayed_fput() introduced
by commit e2dc9bf3f5275ca3 ("umd: Transform fork_usermode_blob into
fork_usermode_driver") will be replaced by this raceless __fput_sync().

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 fs/file_table.c | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/fs/file_table.c b/fs/file_table.c
index 656647f..7c41251 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -359,20 +359,15 @@ void fput(struct file *file)
 }
 
 /*
- * synchronous analog of fput(); for kernel threads that might be needed
- * in some umount() (and thus can't use flush_delayed_fput() without
- * risking deadlocks), need to wait for completion of __fput() and know
- * for this specific struct file it won't involve anything that would
- * need them.  Use only if you really need it - at the very least,
- * don't blindly convert fput() by kernel thread to that.
+ * synchronous analog of fput(); for threads that need to wait for completion
+ * of __fput() and know for this specific struct file it won't involve anything
+ * that would need them.  Use only if you really need it - at the very least,
+ * don't blindly convert fput() to __fput_sync().
  */
 void __fput_sync(struct file *file)
 {
-	if (atomic_long_dec_and_test(&file->f_count)) {
-		struct task_struct *task = current;
-		BUG_ON(!(task->flags & PF_KTHREAD));
+	if (atomic_long_dec_and_test(&file->f_count))
 		__fput(file);
-	}
 }
 
 EXPORT_SYMBOL(fput);
-- 
1.8.3.1



* Re: [PATCH v2 (resend)] fput: Allow calling __fput_sync() from !PF_KTHREAD thread.
  2020-08-19 12:42   ` [PATCH v2 (resend)] " Tetsuo Handa
@ 2020-09-09 21:59     ` Tetsuo Handa
  0 siblings, 0 replies; 8+ messages in thread
From: Tetsuo Handa @ 2020-09-09 21:59 UTC (permalink / raw)
  To: Andrew Morton, Al Viro; +Cc: Eric W. Biederman, linux-fsdevel

Ping?

On 2020/08/19 21:42, Tetsuo Handa wrote:
> __fput_sync() was introduced by commit 4a9d4b024a3102fc ("switch fput to
> task_work_add") with BUG_ON(!(current->flags & PF_KTHREAD)) check, and
> the only user of __fput_sync() was introduced by commit 17c0a5aaffa63da6
> ("make acct_kill() wait for file closing."). However, the latter commit is
> effectively calling __fput_sync() from !PF_KTHREAD thread because of
> schedule_work() call followed by immediate wait_for_completion() call.
> That is, there is no need to defer close_work() to a WQ context. I guess
> that the reason to defer was nothing but to bypass this BUG_ON() check.
> While we need to remain careful about calling __fput_sync(), we can remove
> bypassable BUG_ON() check from __fput_sync().
> 
> If this change is accepted, racy fput()+flush_delayed_fput() introduced
> by commit e2dc9bf3f5275ca3 ("umd: Transform fork_usermode_blob into
> fork_usermode_driver") will be replaced by this raceless __fput_sync().
> 
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> ---
>  fs/file_table.c | 15 +++++----------
>  1 file changed, 5 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/file_table.c b/fs/file_table.c
> index 656647f..7c41251 100644
> --- a/fs/file_table.c
> +++ b/fs/file_table.c
> @@ -359,20 +359,15 @@ void fput(struct file *file)
>  }
>  
>  /*
> - * synchronous analog of fput(); for kernel threads that might be needed
> - * in some umount() (and thus can't use flush_delayed_fput() without
> - * risking deadlocks), need to wait for completion of __fput() and know
> - * for this specific struct file it won't involve anything that would
> - * need them.  Use only if you really need it - at the very least,
> - * don't blindly convert fput() by kernel thread to that.
> + * synchronous analog of fput(); for threads that need to wait for completion
> + * of __fput() and know for this specific struct file it won't involve anything
> + * that would need them.  Use only if you really need it - at the very least,
> + * don't blindly convert fput() to __fput_sync().
>   */
>  void __fput_sync(struct file *file)
>  {
> -	if (atomic_long_dec_and_test(&file->f_count)) {
> -		struct task_struct *task = current;
> -		BUG_ON(!(task->flags & PF_KTHREAD));
> +	if (atomic_long_dec_and_test(&file->f_count))
>  		__fput(file);
> -	}
>  }
>  
>  EXPORT_SYMBOL(fput);
> 



* Re: [PATCH v2] fput: Allow calling __fput_sync() from !PF_KTHREAD thread.
  2020-07-29 13:04 ` [PATCH v2] " Tetsuo Handa
  2020-08-19 12:42   ` [PATCH v2 (resend)] " Tetsuo Handa
@ 2020-09-10  3:57   ` Al Viro
  2020-09-10  5:26     ` Tetsuo Handa
  1 sibling, 1 reply; 8+ messages in thread
From: Al Viro @ 2020-09-10  3:57 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: Eric W. Biederman, linux-fsdevel

On Wed, Jul 29, 2020 at 10:04:45PM +0900, Tetsuo Handa wrote:
> __fput_sync() was introduced by commit 4a9d4b024a3102fc ("switch fput to
> task_work_add") with BUG_ON(!(current->flags & PF_KTHREAD)) check, and
> the only user of __fput_sync() was introduced by commit 17c0a5aaffa63da6
> ("make acct_kill() wait for file closing."). However, the latter commit is
> effectively calling __fput_sync() from !PF_KTHREAD thread because of
> schedule_work() call followed by immediate wait_for_completion() call.
> That is, there is no need to defer close_work() to a WQ context. I guess
> that the reason to defer was nothing but to bypass this BUG_ON() check.
> While we need to remain careful about calling __fput_sync(), we can remove
> bypassable BUG_ON() check from __fput_sync().
> 
> If this change is accepted, racy fput()+flush_delayed_fput() introduced
> by commit e2dc9bf3f5275ca3 ("umd: Transform fork_usermode_blob into
> fork_usermode_driver") will be replaced by this raceless __fput_sync().

NAK.  The reason to defer is *NOT* to bypass that BUG_ON() - we really do not
want that thing done on anything other than extremely shallow stack.
Incidentally, why is that thing ever done _not_ in a kernel thread context?


* Re: [PATCH v2] fput: Allow calling __fput_sync() from !PF_KTHREAD thread.
  2020-09-10  3:57   ` [PATCH v2] " Al Viro
@ 2020-09-10  5:26     ` Tetsuo Handa
  2020-09-10 11:25       ` Al Viro
  0 siblings, 1 reply; 8+ messages in thread
From: Tetsuo Handa @ 2020-09-10  5:26 UTC (permalink / raw)
  To: Al Viro; +Cc: Eric W. Biederman, linux-fsdevel

On 2020/09/10 12:57, Al Viro wrote:
> On Wed, Jul 29, 2020 at 10:04:45PM +0900, Tetsuo Handa wrote:
>> __fput_sync() was introduced by commit 4a9d4b024a3102fc ("switch fput to
>> task_work_add") with BUG_ON(!(current->flags & PF_KTHREAD)) check, and
>> the only user of __fput_sync() was introduced by commit 17c0a5aaffa63da6
>> ("make acct_kill() wait for file closing."). However, the latter commit is
>> effectively calling __fput_sync() from !PF_KTHREAD thread because of
>> schedule_work() call followed by immediate wait_for_completion() call.
>> That is, there is no need to defer close_work() to a WQ context. I guess
>> that the reason to defer was nothing but to bypass this BUG_ON() check.
>> While we need to remain careful about calling __fput_sync(), we can remove
>> bypassable BUG_ON() check from __fput_sync().
>>
>> If this change is accepted, racy fput()+flush_delayed_fput() introduced
>> by commit e2dc9bf3f5275ca3 ("umd: Transform fork_usermode_blob into
>> fork_usermode_driver") will be replaced by this raceless __fput_sync().

Thank you for responding. I'm also waiting for your response on
"[RFC PATCH] pipe: make pipe_release() deferrable." at 
https://lore.kernel.org/linux-fsdevel/7ba35ca4-13c1-caa3-0655-50d328304462@i-love.sakura.ne.jp/
and "[PATCH] splice: fix premature end of input detection" at 
https://lore.kernel.org/linux-block/cf26a57e-01f4-32a9-0b2c-9102bffe76b2@i-love.sakura.ne.jp/ .

> 
> NAK.  The reason to defer is *NOT* to bypass that BUG_ON() - we really do not
> want that thing done on anything other than extremely shallow stack.
> Incidentally, why is that thing ever done _not_ in a kernel thread context?

What does "that thing" refer to? acct_pin_kill() ? blob_to_mnt() ?
I don't know the reason because I'm not the author of these functions.


* Re: [PATCH v2] fput: Allow calling __fput_sync() from !PF_KTHREAD thread.
  2020-09-10  5:26     ` Tetsuo Handa
@ 2020-09-10 11:25       ` Al Viro
  2020-09-10 20:06         ` Eric W. Biederman
  0 siblings, 1 reply; 8+ messages in thread
From: Al Viro @ 2020-09-10 11:25 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: Eric W. Biederman, linux-fsdevel

On Thu, Sep 10, 2020 at 02:26:46PM +0900, Tetsuo Handa wrote:
> Thank you for responding. I'm also waiting for your response on
> "[RFC PATCH] pipe: make pipe_release() deferrable." at 
> https://lore.kernel.org/linux-fsdevel/7ba35ca4-13c1-caa3-0655-50d328304462@i-love.sakura.ne.jp/
> and "[PATCH] splice: fix premature end of input detection" at 
> https://lore.kernel.org/linux-block/cf26a57e-01f4-32a9-0b2c-9102bffe76b2@i-love.sakura.ne.jp/ .
> 
> > 
> > NAK.  The reason to defer is *NOT* to bypass that BUG_ON() - we really do not
> > want that thing done on anything other than extremely shallow stack.
> > Incidentally, why is that thing ever done _not_ in a kernel thread context?
> 
> What does "that thing" refer to? acct_pin_kill() ? blob_to_mnt() ?
> I don't know the reason because I'm not the author of these functions.

	The latter.  What I mean, why not simply do that from inside of
fork_usermode_driver()?  umd_setup is stored in sub_info->init and
eventually called from call_usermodehelper_exec_async(), right before
the created kernel thread is about to call kernel_execve() and stop
being a kernel thread...


* Re: [PATCH v2] fput: Allow calling __fput_sync() from !PF_KTHREAD thread.
  2020-09-10 11:25       ` Al Viro
@ 2020-09-10 20:06         ` Eric W. Biederman
  0 siblings, 0 replies; 8+ messages in thread
From: Eric W. Biederman @ 2020-09-10 20:06 UTC (permalink / raw)
  To: Al Viro; +Cc: Tetsuo Handa, linux-fsdevel

Al Viro <viro@zeniv.linux.org.uk> writes:

> On Thu, Sep 10, 2020 at 02:26:46PM +0900, Tetsuo Handa wrote:
>> Thank you for responding. I'm also waiting for your response on
>> "[RFC PATCH] pipe: make pipe_release() deferrable." at 
>> https://lore.kernel.org/linux-fsdevel/7ba35ca4-13c1-caa3-0655-50d328304462@i-love.sakura.ne.jp/
>> and "[PATCH] splice: fix premature end of input detection" at 
>> https://lore.kernel.org/linux-block/cf26a57e-01f4-32a9-0b2c-9102bffe76b2@i-love.sakura.ne.jp/ .
>> 
>> > 
>> > NAK.  The reason to defer is *NOT* to bypass that BUG_ON() - we really do not
>> > want that thing done on anything other than extremely shallow stack.
>> > Incidentally, why is that thing ever done _not_ in a kernel thread context?
>> 
>> What does "that thing" refer to? acct_pin_kill() ? blob_to_mnt() ?
>> I don't know the reason because I'm not the author of these functions.
>
> 	The latter.  What I mean, why not simply do that from inside of
> fork_usermode_driver()?

Because that is a stupid place to do the work.  The usermode driver is
currently allowed to die and be respawned by the kernel when needed, which
means there is not a 1 to 1 relationship between blob_to_mnt and
fork_usermode_driver.

As for the current code being racy, it is approximately as racy as the
current code to load files into an initrd.  AKA no one has ever observed
any problems in practice but if you squint you can see where maybe
something could happen.

I think there is a stronger argument for finding a way to guarantee
that flush_delayed_fput waits until any scheduled delayed_fput_work
has completed, as that is the race Tetsuo is complaining about,
and it also appears to be present in populate_rootfs.


Flushing the fput is needed to ensure the writable struct file is
completely gone before an exec opens the file and calls
deny_write_access.

> umd_setup is stored in sub_info->init and
> eventually called from call_usermodehelper_exec_async(), right before
> the created kernel thread is about to call kernel_execve() and stop
> being a kernel thread...

I think you are suggesting calling __fput_sync in umd_setup instead
of calling fput from blob_to_mnt.

To have a special case that only applies the first time a function is
called is possible but it is awkward, and likely more error prone.



I moved all of the user mode driver code out of exec and out of the user
mode helper code as the user mode driver code is essentially unused at
present.  The bpf folks really want to try and make it work so I wrote
something that is not completely insane so they can have their chance to
try.  I really suspect it will go the way of all of the migration of
the early kernel init code to userspace with klibc, with the practical
details overwhelming things and making it not work, or not worth it, in
practice.  Time will tell.


I hope that is enough context to understand what is going on there.

Eric
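
The race described in this thread (one thread's flush finding nothing to do
while another thread still holds the detached work) can be sketched in
userspace C. This is a deterministic, single-threaded model under stated
assumptions, not kernel code: struct race_inode, struct race_file and the
race_*() helpers are hypothetical names, the "writers" counter only loosely
stands in for the state that deny_write_access() checks, and splitting the
flush into a detach step and a run step is a simplification made so that the
interleaving can be written out by hand.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical inode model: counts files still open for writing. */
struct race_inode {
	int writers;
};

/* Hypothetical file model queued for a delayed final release. */
struct race_file {
	struct race_inode *inode;
	struct race_file *next;
};

/* Models the global list of delayed releases. */
static struct race_file *delayed_list;

/* Models the last fput() of a writable file being queued for later. */
static void race_delayed_fput(struct race_file *f)
{
	f->next = delayed_list;
	delayed_list = f;
}

/* Flush, step 1: detach everything queued so far. */
static struct race_file *race_flush_detach(void)
{
	struct race_file *batch = delayed_list;

	delayed_list = NULL;
	return batch;
}

/* Flush, step 2: perform the final release of each detached file. */
static void race_flush_run(struct race_file *batch)
{
	while (batch) {
		struct race_file *next = batch->next;

		batch->inode->writers--;
		batch = next;
	}
}

/* Models deny_write_access(): 0 on success, -1 while writers remain. */
static int race_deny_write_access(const struct race_inode *inode)
{
	return inode->writers > 0 ? -1 : 0;
}

/* Returns what the exec attempt of "thread B" sees in this interleaving. */
static int race_demo(void)
{
	struct race_inode blob = { .writers = 1 };
	struct race_file f = { .inode = &blob, .next = NULL };
	struct race_file *batch_a, *batch_b;
	int exec_result;

	race_delayed_fput(&f);

	/* Thread A detaches the list, then stalls before releasing. */
	batch_a = race_flush_detach();

	/* Thread B flushes, finds an empty list, and proceeds to exec. */
	batch_b = race_flush_detach();
	race_flush_run(batch_b);
	exec_result = race_deny_write_access(&blob);

	/* Thread A resumes; only now is the writer really gone. */
	race_flush_run(batch_a);
	assert(race_deny_write_access(&blob) == 0);

	return exec_result;
}
```

In this interleaving race_demo() returns -1: the flush by "thread B" returns
although the file of interest is still open for writing. Either a flush that
waits for already detached work, or a synchronous release of the one file
involved, would close that window in the model.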


Thread overview: 8+ messages
2020-07-08 14:24 [PATCH] fput: Allow calling __fput_sync() from !PF_KTHREAD thread Tetsuo Handa
2020-07-29 13:04 ` [PATCH v2] " Tetsuo Handa
2020-08-19 12:42   ` [PATCH v2 (resend)] " Tetsuo Handa
2020-09-09 21:59     ` Tetsuo Handa
2020-09-10  3:57   ` [PATCH v2] " Al Viro
2020-09-10  5:26     ` Tetsuo Handa
2020-09-10 11:25       ` Al Viro
2020-09-10 20:06         ` Eric W. Biederman
