IO-Uring Archive on lore.kernel.org
* [RFC][BUG] io_uring: fix work corruption for poll_add
@ 2020-07-23 18:12 Pavel Begunkov
  2020-07-23 18:15 ` Pavel Begunkov
  2020-07-23 22:16 ` Jens Axboe
  0 siblings, 2 replies; 10+ messages in thread
From: Pavel Begunkov @ 2020-07-23 18:12 UTC (permalink / raw)
  To: Jens Axboe, io-uring

poll_add can have req->work initialised, which will then be overwritten in
__io_arm_poll_handler() because of the union. Luckily, hash_node is
zeroed at the end, so the damage is limited to a lost put for work.creds
and a probably corrupted work.list.

This is the easiest and admittedly dirty fix: it rearranges the members of
the union so that arm_poll*() modifies and zeroes only work.files and
work.mm, which are never taken for poll_add.
Note: io_kiocb is exactly 4 cachelines now.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 32b0064f806e..58e6f7d938b6 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -669,12 +669,12 @@ struct io_kiocb {
 		 * restore the work, if needed.
 		 */
 		struct {
-			struct callback_head	task_work;
-			struct hlist_node	hash_node;
 			struct async_poll	*apoll;
+			struct hlist_node	hash_node;
 		};
 		struct io_wq_work	work;
 	};
+	struct callback_head	task_work;
 };
 
 #define IO_PLUG_THRESHOLD		2
-- 
2.24.0
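
The overlap described in the patch can be reproduced in userspace. The following is a minimal sketch, not kernel code: all `demo_*` types are hypothetical stand-ins, and the field order of `demo_work` (list, files, mm, creds, fs) is an assumption inferred from the description above. It shows that zeroing `hash_node` wipes `work.creds` in the old layout, while in the rearranged layout `hash_node` only covers `work.files` and `work.mm`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for callback_head, hlist_node and io_wq_work. */
struct demo_cb_head { void *next; void (*func)(void *); };
struct demo_hnode   { void *next; void **pprev; };
struct demo_work {              /* field order is an assumption */
	void *list;
	void *files;
	void *mm;
	const void *creds;
	void *fs;
};

/* Layout before the patch: task_work and hash_node alias the work fields. */
struct demo_req_old {
	union {
		struct {
			struct demo_cb_head task_work;
			struct demo_hnode   hash_node;
			void               *apoll;
		};
		struct demo_work work;
	};
};

/* Layout after the patch: task_work moved out of the union. */
struct demo_req_new {
	union {
		struct {
			void              *apoll;
			struct demo_hnode  hash_node;
		};
		struct demo_work work;
	};
	struct demo_cb_head task_work;
};

/* In the new layout task_work can no longer alias any work field. */
static_assert(offsetof(struct demo_req_new, task_work) >= sizeof(struct demo_work),
	      "task_work must live outside the union");

/* Returns 1 if zeroing hash_node wipes work.creds in the old layout. */
static int old_layout_loses_creds(void)
{
	static int token;
	struct demo_req_old req;

	memset(&req, 0, sizeof(req));
	req.work.creds = &token;                          /* "grabbed" creds */
	memset(&req.hash_node, 0, sizeof(req.hash_node)); /* arming the poll */
	return req.work.creds == NULL;
}

/* Returns 1 if work.creds survives arming the poll in the new layout. */
static int new_layout_keeps_creds(void)
{
	static int token, fake_apoll;
	struct demo_req_new req;

	memset(&req, 0, sizeof(req));
	req.work.creds = &token;
	req.apoll = &fake_apoll;                          /* aliases work.list */
	memset(&req.hash_node, 0, sizeof(req.hash_node)); /* files and mm only */
	return req.work.creds == &token;
}
```

Both helpers are expected to return 1 under the assumed field order. Note that `apoll` still aliases `work.list` in the new layout, which is the part the patch description calls "really dirty".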



* Re: [RFC][BUG] io_uring: fix work corruption for poll_add
  2020-07-23 18:12 [RFC][BUG] io_uring: fix work corruption for poll_add Pavel Begunkov
@ 2020-07-23 18:15 ` Pavel Begunkov
  2020-07-23 18:19   ` Jens Axboe
  2020-07-23 22:16 ` Jens Axboe
  1 sibling, 1 reply; 10+ messages in thread
From: Pavel Begunkov @ 2020-07-23 18:15 UTC (permalink / raw)
  To: Jens Axboe, io-uring

On 23/07/2020 21:12, Pavel Begunkov wrote:
> poll_add can have req->work initialised, which will be overwritten in
> __io_arm_poll_handler() because of the union. Luckily, hash_node is
> zeroed in the end, so the damage is limited to lost put for work.creds,
> and probably corrupted work.list.
> 
> That's the easiest and really dirty fix, which rearranges members in the
> union, arm_poll*() modifies and zeroes only work.files and work.mm,
> which are never taken for poll add.
> note: io_kiocb is exactly 4 cachelines now.

Please tell me if anybody has a good, lean solution, because I'm a bit
too tired at the moment to fix it properly.
BTW, this is for 5.8; for-5.9 needs a different fix because of the
io_kiocb compaction.


> 
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> ---
>  fs/io_uring.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 32b0064f806e..58e6f7d938b6 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -669,12 +669,12 @@ struct io_kiocb {
>  		 * restore the work, if needed.
>  		 */
>  		struct {
> -			struct callback_head	task_work;
> -			struct hlist_node	hash_node;
>  			struct async_poll	*apoll;
> +			struct hlist_node	hash_node;
>  		};
>  		struct io_wq_work	work;
>  	};
> +	struct callback_head	task_work;
>  };
>  
>  #define IO_PLUG_THRESHOLD		2
> 

-- 
Pavel Begunkov


* Re: [RFC][BUG] io_uring: fix work corruption for poll_add
  2020-07-23 18:15 ` Pavel Begunkov
@ 2020-07-23 18:19   ` Jens Axboe
  2020-07-23 19:10     ` Pavel Begunkov
  0 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2020-07-23 18:19 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring

On 7/23/20 12:15 PM, Pavel Begunkov wrote:
> On 23/07/2020 21:12, Pavel Begunkov wrote:
>> poll_add can have req->work initialised, which will be overwritten in
>> __io_arm_poll_handler() because of the union. Luckily, hash_node is
>> zeroed in the end, so the damage is limited to lost put for work.creds,
>> and probably corrupted work.list.
>>
>> That's the easiest and really dirty fix, which rearranges members in the
>> union, arm_poll*() modifies and zeroes only work.files and work.mm,
>> which are never taken for poll add.
>> note: io_kiocb is exactly 4 cachelines now.
> 
> Please, tell me if anybody has a good lean solution, because I'm a bit
> too tired at the moment to fix it properly.
> BTW, that's for 5.8, for-5.9 it should be done differently because of
> io_kiocb compaction. 

Do you have a test case that leaks the reference?

-- 
Jens Axboe



* Re: [RFC][BUG] io_uring: fix work corruption for poll_add
  2020-07-23 18:19   ` Jens Axboe
@ 2020-07-23 19:10     ` Pavel Begunkov
  0 siblings, 0 replies; 10+ messages in thread
From: Pavel Begunkov @ 2020-07-23 19:10 UTC (permalink / raw)
  To: Jens Axboe, io-uring

On 23/07/2020 21:19, Jens Axboe wrote:
> On 7/23/20 12:15 PM, Pavel Begunkov wrote:
>> On 23/07/2020 21:12, Pavel Begunkov wrote:
>>> poll_add can have req->work initialised, which will be overwritten in
>>> __io_arm_poll_handler() because of the union. Luckily, hash_node is
>>> zeroed in the end, so the damage is limited to lost put for work.creds,
>>> and probably corrupted work.list.
>>>
>>> That's the easiest and really dirty fix, which rearranges members in the
>>> union, arm_poll*() modifies and zeroes only work.files and work.mm,
>>> which are never taken for poll add.
>>> note: io_kiocb is exactly 4 cachelines now.
>>
>> Please, tell me if anybody has a good lean solution, because I'm a bit
>> too tired at the moment to fix it properly.
>> BTW, that's for 5.8, for-5.9 it should be done differently because of
>> io_kiocb compaction. 
> 
> Do you have a test case that leaks the reference?

link-timeout.c::test_timeout_link_chain2()
- add IOSQE_ASYNC after poll_add_prep() (probably not even needed)
- close() the pipe fds at the end
- while(1) test_timeout_link_chain2()

That's what I did to test it. I confirmed the leak with printk, and it killed
the system in 10-30 minutes. I can come up with something faster later.

-- 
Pavel Begunkov


* Re: [RFC][BUG] io_uring: fix work corruption for poll_add
  2020-07-23 18:12 [RFC][BUG] io_uring: fix work corruption for poll_add Pavel Begunkov
  2020-07-23 18:15 ` Pavel Begunkov
@ 2020-07-23 22:16 ` Jens Axboe
  2020-07-23 22:24   ` Jens Axboe
  1 sibling, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2020-07-23 22:16 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring

On 7/23/20 12:12 PM, Pavel Begunkov wrote:
> poll_add can have req->work initialised, which will be overwritten in
> __io_arm_poll_handler() because of the union. Luckily, hash_node is
> zeroed in the end, so the damage is limited to lost put for work.creds,
> and probably corrupted work.list.
> 
> That's the easiest and really dirty fix, which rearranges members in the
> union, arm_poll*() modifies and zeroes only work.files and work.mm,
> which are never taken for poll add.
> note: io_kiocb is exactly 4 cachelines now.

I don't think there's a way around moving task_work out, just like it
was done on 5.9. The problem is that we could put the environment bits
before doing task_work_add(), but we might need them if the subsequent
queue ends up having to go async. So there's really no way to know when we can
put them, outside of when the request finishes. Hence, we are kind of
SOL here.

-- 
Jens Axboe



* Re: [RFC][BUG] io_uring: fix work corruption for poll_add
  2020-07-23 22:16 ` Jens Axboe
@ 2020-07-23 22:24   ` Jens Axboe
  2020-07-24 12:46     ` Pavel Begunkov
  0 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2020-07-23 22:24 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring

On 7/23/20 4:16 PM, Jens Axboe wrote:
> On 7/23/20 12:12 PM, Pavel Begunkov wrote:
>> poll_add can have req->work initialised, which will be overwritten in
>> __io_arm_poll_handler() because of the union. Luckily, hash_node is
>> zeroed in the end, so the damage is limited to lost put for work.creds,
>> and probably corrupted work.list.
>>
>> That's the easiest and really dirty fix, which rearranges members in the
>> union, arm_poll*() modifies and zeroes only work.files and work.mm,
>> which are never taken for poll add.
>> note: io_kiocb is exactly 4 cachelines now.
> 
> I don't think there's a way around moving task_work out, just like it
> was done on 5.9. The problem is that we could put the environment bits
> before doing task_work_add(), but we might need them if the subsequent
> queue ends up having to go async. So there's really no way to know when we can
> put them, outside of when the request finishes. Hence, we are kind of
> SOL here.

Actually, if we do go async, then we can just grab the environment
again. We're in the same task at that point. So maybe it'd be better to
work on ensuring that the request is either in the valid work state, or
empty work if using task_work.

The only potential complication with that is calling io_req_work_drop_env()
from the waitqueue handler; at least the ->needs_fs part won't like that
too much.

-- 
Jens Axboe
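
The idea above (keep the request either in a valid work state or with empty work while task_work is in use, and grab the environment again if it goes async) can be modelled in userspace. This is only a sketch under stated assumptions: every `demo_*` name is hypothetical, the reference count is a plain integer, and the union layout mirrors the 5.8 one as described earlier in the thread rather than the real kernel structures.

```c
#include <string.h>

/* Hypothetical stand-ins; none of these are kernel definitions. */
struct demo_cred    { int refs; };
struct demo_cb_head { void *next; void (*func)(void *); };
struct demo_hnode   { void *next; void **pprev; };
struct demo_work    { void *list, *files, *mm; struct demo_cred *creds; void *fs; };

struct demo_req {
	int work_initialized;   /* hypothetical "work is valid" flag */
	union {
		struct {
			struct demo_cb_head task_work;
			struct demo_hnode   hash_node;
			void               *apoll;
		};
		struct demo_work work;
	};
};

/* Take a creds reference unless the work state is already valid. */
static void demo_grab_env(struct demo_req *req, struct demo_cred *cred)
{
	if (req->work_initialized)
		return;
	memset(&req->work, 0, sizeof(req->work));
	req->work.creds = cred;
	cred->refs++;
	req->work_initialized = 1;
}

/* Put whatever the work state still points at and mark it empty. */
static void demo_drop_env(struct demo_req *req)
{
	if (!req->work_initialized)
		return;
	if (req->work.creds)
		req->work.creds->refs--;
	memset(&req->work, 0, sizeof(req->work));
	req->work_initialized = 0;
}

/* Arming the poll overwrites the union; optionally empty the work first. */
static void demo_arm_poll(struct demo_req *req, int drop_first)
{
	static int fake_apoll;

	if (drop_first)
		demo_drop_env(req);
	req->apoll = &fake_apoll;
	memset(&req->hash_node, 0, sizeof(req->hash_node)); /* covers creds */
}

/* Returns the final reference count; 1 means every grab was put again. */
static int demo_run_scenario(int drop_first)
{
	struct demo_cred cred = { .refs = 1 };  /* the task's own reference */
	struct demo_req req;

	memset(&req, 0, sizeof(req));
	demo_grab_env(&req, &cred);       /* request set up with valid work */
	demo_arm_poll(&req, drop_first);  /* poll path reuses the union     */
	demo_grab_env(&req, &cred);       /* request ends up going async    */
	demo_drop_env(&req);              /* request finishes               */
	return cred.refs;
}
```

In this model `demo_run_scenario(1)` ends with a balanced count, while `demo_run_scenario(0)` leaves one reference behind because the pointer needed for the put was zeroed while the flag still claimed the work was valid. The model deliberately ignores the context problem mentioned above: it says nothing about whether dropping the environment is safe from a waitqueue handler.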



* Re: [RFC][BUG] io_uring: fix work corruption for poll_add
  2020-07-23 22:24   ` Jens Axboe
@ 2020-07-24 12:46     ` Pavel Begunkov
  2020-07-24 12:52       ` Pavel Begunkov
  0 siblings, 1 reply; 10+ messages in thread
From: Pavel Begunkov @ 2020-07-24 12:46 UTC (permalink / raw)
  To: Jens Axboe, io-uring

On 24/07/2020 01:24, Jens Axboe wrote:
> On 7/23/20 4:16 PM, Jens Axboe wrote:
>> On 7/23/20 12:12 PM, Pavel Begunkov wrote:
>>> poll_add can have req->work initialised, which will be overwritten in
>>> __io_arm_poll_handler() because of the union. Luckily, hash_node is
>>> zeroed in the end, so the damage is limited to lost put for work.creds,
>>> and probably corrupted work.list.
>>>
>>> That's the easiest and really dirty fix, which rearranges members in the
>>> union, arm_poll*() modifies and zeroes only work.files and work.mm,
>>> which are never taken for poll add.
>>> note: io_kiocb is exactly 4 cachelines now.
>>
>> I don't think there's a way around moving task_work out, just like it

+hash_node. I was thinking of doing an apoll alloc+memcpy, as is done for rw,
but that approach is ugly.

>> was done on 5.9. The problem is that we could put the environment bits
>> before doing task_work_add(), but we might need them if the subsequent
>> queue ends up having to go async. So there's really no way to know when we can
>> put them, outside of when the request finishes. Hence, we are kind of
>> SOL here.
> 
> Actually, if we do go async, then we can just grab the environment
> again. We're in the same task at that point. So maybe it'd be better to
> work on ensuring that the request is either in the valid work state, or
> empty work if using task_work.
> 
> Only potential complication with that is doing io_req_work_drop_env()
> from the waitqueue handler, at least the ->needs_fs part won't like that
> too much.

Considering that work->list is removed before executing io_wq_work, it
should work. And if done only for poll_add, which needs nothing and ends up
with creds, there shouldn't be any problems. I'll try this out.

-- 
Pavel Begunkov


* Re: [RFC][BUG] io_uring: fix work corruption for poll_add
  2020-07-24 12:46     ` Pavel Begunkov
@ 2020-07-24 12:52       ` Pavel Begunkov
  2020-07-24 14:12         ` Jens Axboe
  0 siblings, 1 reply; 10+ messages in thread
From: Pavel Begunkov @ 2020-07-24 12:52 UTC (permalink / raw)
  To: Jens Axboe, io-uring

On 24/07/2020 15:46, Pavel Begunkov wrote:
> On 24/07/2020 01:24, Jens Axboe wrote:
>> On 7/23/20 4:16 PM, Jens Axboe wrote:
>>> On 7/23/20 12:12 PM, Pavel Begunkov wrote:
>>>> poll_add can have req->work initialised, which will be overwritten in
>>>> __io_arm_poll_handler() because of the union. Luckily, hash_node is
>>>> zeroed in the end, so the damage is limited to lost put for work.creds,
>>>> and probably corrupted work.list.
>>>>
>>>> That's the easiest and really dirty fix, which rearranges members in the
>>>> union, arm_poll*() modifies and zeroes only work.files and work.mm,
>>>> which are never taken for poll add.
>>>> note: io_kiocb is exactly 4 cachelines now.
>>>
>>> I don't think there's a way around moving task_work out, just like it
> 
> +hash_node. I was thinking to do apoll alloc+memcpy as for rw, but this
> one is ugly.
> 
>>> was done on 5.9. The problem is that we could put the environment bits
>>> before doing task_work_add(), but we might need them if the subsequent
>>> queue ends up having to go async. So there's really no way to know when we can
>>> put them, outside of when the request finishes. Hence, we are kind of
>>> SOL here.
>>
>> Actually, if we do go async, then we can just grab the environment
>> again. We're in the same task at that point. So maybe it'd be better to
>> work on ensuring that the request is either in the valid work state, or
>> empty work if using task_work.
>>
>> Only potential complication with that is doing io_req_work_drop_env()
>> from the waitqueue handler, at least the ->needs_fs part won't like that
>> too much.
> 
> Considering that work->list is removed before executing io_wq_work, it
> should work. And if done only for poll_add, which needs nothing and ends up
> with creds, there shouldn't be any problems. I'll try this out

Except for custom ->creds assigned at the beginning with the personality
feature. Does poll ever use it?

-- 
Pavel Begunkov


* Re: [RFC][BUG] io_uring: fix work corruption for poll_add
  2020-07-24 12:52       ` Pavel Begunkov
@ 2020-07-24 14:12         ` Jens Axboe
  2020-07-24 14:23           ` Pavel Begunkov
  0 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2020-07-24 14:12 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring

On 7/24/20 6:52 AM, Pavel Begunkov wrote:
> On 24/07/2020 15:46, Pavel Begunkov wrote:
>> On 24/07/2020 01:24, Jens Axboe wrote:
>>> On 7/23/20 4:16 PM, Jens Axboe wrote:
>>>> On 7/23/20 12:12 PM, Pavel Begunkov wrote:
>>>>> poll_add can have req->work initialised, which will be overwritten in
>>>>> __io_arm_poll_handler() because of the union. Luckily, hash_node is
>>>>> zeroed in the end, so the damage is limited to lost put for work.creds,
>>>>> and probably corrupted work.list.
>>>>>
>>>>> That's the easiest and really dirty fix, which rearranges members in the
>>>>> union, arm_poll*() modifies and zeroes only work.files and work.mm,
>>>>> which are never taken for poll add.
>>>>> note: io_kiocb is exactly 4 cachelines now.
>>>>
>>>> I don't think there's a way around moving task_work out, just like it
>>
>> +hash_node. I was thinking to do apoll alloc+memcpy as for rw, but this
>> one is ugly.
>>
>>>> was done on 5.9. The problem is that we could put the environment bits
>>>> before doing task_work_add(), but we might need them if the subsequent
>>>> queue ends up having to go async. So there's really no way to know when we can
>>>> put them, outside of when the request finishes. Hence, we are kind of
>>>> SOL here.
>>>
>>> Actually, if we do go async, then we can just grab the environment
>>> again. We're in the same task at that point. So maybe it'd be better to
>>> work on ensuring that the request is either in the valid work state, or
>>> empty work if using task_work.
>>>
>>> Only potential complication with that is doing io_req_work_drop_env()
>>> from the waitqueue handler, at least the ->needs_fs part won't like that
>>> too much.
>>
>> Considering that work->list is removed before executing io_wq_work, it
>> should work. And if done only for poll_add, which needs nothing and ends up
>> with creds, there shouldn't be any problems. I'll try this out
> 
> Except for custom ->creds assigned at the beginning with the personality
> feature. Does poll ever use it?

It's kind of annoying how we don't have a def->needs_creds, because lots
of things would never use it. For poll, it wouldn't be used at all,
which makes this issue doubly annoying.

-- 
Jens Axboe
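
To illustrate what a per-opcode creds flag could look like, here is a small sketch. It is purely hypothetical: `needs_creds` does not exist according to the message above, the `demo_*` names are invented, and the choice of which opcodes set the flag is an assumption made only for the example.

```c
/* Hypothetical opcode list and per-opcode definition table. */
enum demo_opcode {
	DEMO_OP_READV,
	DEMO_OP_POLL_ADD,
	DEMO_OP_OPENAT,
	DEMO_OP_LAST,
};

struct demo_op_def {
	unsigned needs_mm    : 1;
	unsigned needs_fs    : 1;
	unsigned needs_creds : 1;   /* hypothetical flag discussed above */
};

/* Which opcodes set needs_creds here is an assumption for illustration. */
static const struct demo_op_def demo_op_defs[DEMO_OP_LAST] = {
	[DEMO_OP_READV]    = { .needs_mm = 1, .needs_creds = 1 },
	[DEMO_OP_POLL_ADD] = { 0 },  /* poll would never use creds */
	[DEMO_OP_OPENAT]   = { .needs_fs = 1, .needs_creds = 1 },
};

struct demo_cred { int refs; };

/*
 * Take a reference only when the opcode declares it needs creds.
 * Returns the pointer to store in the work state, or NULL (0) if none.
 */
static struct demo_cred *demo_grab_creds(enum demo_opcode op,
					 struct demo_cred *cur)
{
	if (!demo_op_defs[op].needs_creds)
		return 0;
	cur->refs++;
	return cur;
}
```

With a table like this, a poll request would never hold a creds reference, so there would be nothing to lose when the union is overwritten. A personality-assigned ->creds would still need separate handling.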



* Re: [RFC][BUG] io_uring: fix work corruption for poll_add
  2020-07-24 14:12         ` Jens Axboe
@ 2020-07-24 14:23           ` Pavel Begunkov
  0 siblings, 0 replies; 10+ messages in thread
From: Pavel Begunkov @ 2020-07-24 14:23 UTC (permalink / raw)
  To: Jens Axboe, io-uring

On 24/07/2020 17:12, Jens Axboe wrote:
> On 7/24/20 6:52 AM, Pavel Begunkov wrote:
>> On 24/07/2020 15:46, Pavel Begunkov wrote:
>>> On 24/07/2020 01:24, Jens Axboe wrote:
>>>> On 7/23/20 4:16 PM, Jens Axboe wrote:
>>>>> On 7/23/20 12:12 PM, Pavel Begunkov wrote:
>>>>>> poll_add can have req->work initialised, which will be overwritten in
>>>>>> __io_arm_poll_handler() because of the union. Luckily, hash_node is
>>>>>> zeroed in the end, so the damage is limited to lost put for work.creds,
>>>>>> and probably corrupted work.list.
>>>>>>
>>>>>> That's the easiest and really dirty fix, which rearranges members in the
>>>>>> union, arm_poll*() modifies and zeroes only work.files and work.mm,
>>>>>> which are never taken for poll add.
>>>>>> note: io_kiocb is exactly 4 cachelines now.
>>>>>
>>>>> I don't think there's a way around moving task_work out, just like it
>>>
>>> +hash_node. I was thinking to do apoll alloc+memcpy as for rw, but this
>>> one is ugly.
>>>
>>>>> was done on 5.9. The problem is that we could put the environment bits
>>>>> before doing task_work_add(), but we might need them if the subsequent
>>>>> queue ends up having to go async. So there's really no way to know when we can
>>>>> put them, outside of when the request finishes. Hence, we are kind of
>>>>> SOL here.
>>>>
>>>> Actually, if we do go async, then we can just grab the environment
>>>> again. We're in the same task at that point. So maybe it'd be better to
>>>> work on ensuring that the request is either in the valid work state, or
>>>> empty work if using task_work.
>>>>
>>>> Only potential complication with that is doing io_req_work_drop_env()
>>>> from the waitqueue handler, at least the ->needs_fs part won't like that
>>>> too much.
>>>
>>> Considering that work->list is removed before executing io_wq_work, it
>>> should work. And if done only for poll_add, which needs nothing and ends up
>>> with creds, there shouldn't be any problems. I'll try this out
>>
>> Except for custom ->creds assigned at the beginning with the personality
>> feature. Does poll ever use it?
> 
> It's kind of annoying how we don't have a def->needs_creds, because lots
> of things would never use it. For poll, it wouldn't be used at all,
> which makes this issue doubly annoying.

Then we don't have to care which one it has, and the scheme should work
well enough for a quick fix.
I still don't like overwriting work.list until it leaves io-wq, but that's
something to think about for 5.9.

-- 
Pavel Begunkov


