* loop: are parallel requests serialized by the single workqueue?
@ 2022-03-17 2:26 Eric Wheeler
From: Eric Wheeler @ 2022-03-17 2:26 UTC (permalink / raw)
To: Ming Lei; +Cc: linux-block
Hi Ming,
I was studying the loop.c DIO & AIO changes you made back in 2015 that
increased loop performance and reduced the memory footprint
(bc07c10a3603a5ab3ef01ba42b3d41f9ac63d1b6).
I have a few questions if you are able to comment; here is a quick
summary:
The direct IO path starts by queuing the work:
.queue_rq = loop_queue_rq:
-> loop_queue_work(lo, cmd);
-> INIT_WORK(&worker->work, loop_workfn);
... queue_work(lo->workqueue, work);
Then from within the workqueue:
-> loop_workfn()
-> loop_process_work(worker, &worker->cmd_list, worker->lo);
-> loop_handle_cmd(cmd);
-> do_req_filebacked(lo, blk_mq_rq_from_pdu(cmd));
-> lo_rw_aio(lo, cmd, pos, READ) // (or WRITE)
From here the kiocb is set up; this is the 5.17-rc8 code at the
bottom of lo_rw_aio() where it sets up the dispatch to the filesystem:
cmd->iocb.ki_pos = pos;
cmd->iocb.ki_filp = file;
cmd->iocb.ki_complete = lo_rw_aio_complete;
cmd->iocb.ki_flags = IOCB_DIRECT;
cmd->iocb.ki_ioprio = IOPRIO_PRIO_VALUE(IOPRIO_CLASS_NONE, 0);
if (rw == WRITE)
ret = call_write_iter(file, &cmd->iocb, &iter);
else
ret = call_read_iter(file, &cmd->iocb, &iter);
lo_rw_aio_do_completion(cmd);
if (ret != -EIOCBQUEUED)
lo_rw_aio_complete(&cmd->iocb, ret);
After `call_read_iter()` is invoked, execution is in the filesystem's
handler.
Since ki_complete is defined, does that mean the filesystem will _always_
accept these, queue them internally, and return -EIOCBQUEUED from
call_read_iter()? Another way to ask: if ki_complete != NULL, can a
filesystem ever behave synchronously? (Is there documentation about this
somewhere? I couldn't find anything definitive.)
About the cleanup after dispatch at the bottom of lo_rw_aio() from this
code (also shown above):
lo_rw_aio_do_completion(cmd);
if (ret != -EIOCBQUEUED)
lo_rw_aio_complete(&cmd->iocb, ret);
* It appears that lo_rw_aio_do_completion() will `kfree(cmd->bvec)`. If
the filesystem queued the cmd->iocb for internal use, would it have made
a copy of cmd->bvec so that this is safe?
* If ret != -EIOCBQUEUED, then lo_rw_aio_complete() is called which calls
lo_rw_aio_do_completion() a second time. Now lo_rw_aio_do_completion
does do this ref check, so it _is_ safe:
if (!atomic_dec_and_test(&cmd->ref))
return;
For my own understanding, is this equivalent?
- lo_rw_aio_do_completion(cmd);
if (ret != -EIOCBQUEUED)
lo_rw_aio_complete(&cmd->iocb, ret);
+ else
+ lo_rw_aio_do_completion(cmd);
Related questions:
* For REQ_OP_FLUSH, lo_req_flush() is called: it does not appear that
lo_req_flush() does anything to make sure ki_complete has been called
for pending work; it just calls vfs_fsync().
Is this a consistency problem?
For example, if loop has queued async kiocbs to the filesystem via
.read_iter/.write_iter, is it the filesystem's responsibility to
complete that work before returning from vfs_fsync(), or is it possible
that vfs_fsync() completes before ->ki_complete() is called?
* Would there be any benefit to parallelizing the do_req_filebacked()
calls with (for example) multiple work queues?
* If so, then besides flushing the parallel work and doing vfs_fsync (from
lo_req_flush), are there any other consistency issues to consider on
REQ_OP_FLUSH?
* Can READs and WRITEs be out-of-order between flushes?
Thanks for your help!
-Eric
--
Eric Wheeler
* Re: loop: are parallel requests serialized by the single workqueue?
From: Chaitanya Kulkarni @ 2022-03-17 10:20 UTC (permalink / raw)
To: Eric Wheeler; +Cc: linux-block, Ming Lei
Eric,
On 3/16/22 7:26 PM, Eric Wheeler wrote:
> [...]
> After having called `call_read_iter` it is in the filesystem's
> handler.
>
> Since ki_complete is defined, does that mean the filesystem will _always_
> take these in and always queue these internally and return -EIOCBQUEUED
> from call_read_iter()? Another way to ask: if ki_complete!=NULL, can a
> filesystem ever behave synchronously? (Is there documentation about this
> somewhere? I couldn't find anything definitive.)
>
A non-NULL ki_complete asks for async I/O, and that is what we need to
get the higher performance.
>
> About the cleanup after dispatch at the bottom of lo_rw_aio() from this
> code (also shown above):
>
> lo_rw_aio_do_completion(cmd);
>
> if (ret != -EIOCBQUEUED)
> lo_rw_aio_complete(&cmd->iocb, ret);
>
> * It appears that lo_rw_aio_do_completion() will `kfree(cmd->bvec)`. If
> the filesystem queued the cmd->iocb for internal use, would it have made
> a copy of cmd->bvec so that this is safe?
>
> * If ret != -EIOCBQUEUED, then lo_rw_aio_complete() is called which calls
> lo_rw_aio_do_completion() a second time. Now lo_rw_aio_do_completion
> does do this ref check, so it _is_ safe:
>
> if (!atomic_dec_and_test(&cmd->ref))
> return;
>
> For my own understanding, is this equivalent?
>
> - lo_rw_aio_do_completion(cmd);
>
> if (ret != -EIOCBQUEUED)
> lo_rw_aio_complete(&cmd->iocb, ret);
> + else
> + lo_rw_aio_do_completion(cmd);
I think the purpose of the refcount is to make sure the request is freed
in lo_rw_aio_do_completion() by whichever context finishes last: either
the submission thread, or the fs completion ctx calling ki_complete() ->
lo_rw_aio_complete().
So there are actually three cases:
Case 1: I/O successfully queued (call_read/write_iter() ret == -EIOCBQUEUED)
and fs completion happens after we exit lo_rw_aio():
a. submission thread lo_rw_aio()                              refcnt = 2
b. submission thread lo_rw_aio_do_completion()                refcnt = 1
c. fs completion ctx ->ki_complete() -> lo_rw_aio_complete()  refcnt = 0
Case 2: I/O successfully queued and fs completion happens before we exit
lo_rw_aio():
a. submission thread lo_rw_aio()                              refcnt = 2
b. fs completion ctx ->ki_complete() -> lo_rw_aio_complete()  refcnt = 1
c. submission thread lo_rw_aio_do_completion()                refcnt = 0
Case 3: I/O not successfully queued (call_read/write_iter() ret != -EIOCBQUEUED):
a. submission thread lo_rw_aio()                              refcnt = 2
b. submission thread lo_rw_aio_do_completion()                refcnt = 1
c. submission thread lo_rw_aio_complete()                     refcnt = 0
So if you change the position of the lo_rw_aio_do_completion() call,
it might not work, since the refcount would be decremented only once.
Hope this helps; if it creates more confusion then please ignore this
and follow Ming's reply.
-ck
* Re: loop: are parallel requests serialized by the single workqueue?
From: Chaitanya Kulkarni @ 2022-03-17 10:27 UTC (permalink / raw)
To: Eric Wheeler; +Cc: linux-block, Ming Lei
On 3/17/22 3:20 AM, Chaitanya Kulkarni wrote:
> [...]
> I think the purpose of refcount is to make sure we free the request in
> lo_rw_aio_do_completion() whoever finishes last either submission thread
> or fs completion ctx calling ki_complete() -> lo_rw_aio_complete().
>
What I also meant here is that if the fs completion happens before we
exit lo_rw_aio(), then that completion should not proceed with
kfree(cmd->bvec) but only decrement the refcount, leaving lo_rw_aio() to
drop the final reference and do the kfree(cmd->bvec) before it exits.
-ck