* loop: are parallel requests serialized by the single workqueue?
From: Eric Wheeler @ 2022-03-17  2:26 UTC
  To: Ming Lei; +Cc: linux-block

Hi Ming,

I was studying the loop.c DIO & AIO changes you made back in 2015 that 
increased loop performance and reduced the memory footprint 
(bc07c10a3603a5ab3ef01ba42b3d41f9ac63d1b6).  

I have a few questions if you are able to comment; here is a quick 
summary:

The direct IO path starts by queuing the work:

  .queue_rq       = loop_queue_rq:

	-> loop_queue_work(lo, cmd);
	-> INIT_WORK(&worker->work, loop_workfn);
		... queue_work(lo->workqueue, work);

Then from within the workqueue:

	-> loop_workfn()
	-> loop_process_work(worker, &worker->cmd_list, worker->lo);
	-> loop_handle_cmd(cmd);
	-> do_req_filebacked(lo, blk_mq_rq_from_pdu(cmd) );
	-> lo_rw_aio(lo, cmd, pos, READ) // (or WRITE)

From here the kiocb is set up; this is the 5.17-rc8 code at the 
bottom of lo_rw_aio() where it dispatches to the filesystem:

	cmd->iocb.ki_pos = pos;
	cmd->iocb.ki_filp = file;
	cmd->iocb.ki_complete = lo_rw_aio_complete;
	cmd->iocb.ki_flags = IOCB_DIRECT;
	cmd->iocb.ki_ioprio = IOPRIO_PRIO_VALUE(IOPRIO_CLASS_NONE, 0);

	if (rw == WRITE)
		ret = call_write_iter(file, &cmd->iocb, &iter);
	else
		ret = call_read_iter(file, &cmd->iocb, &iter);

	lo_rw_aio_do_completion(cmd);

	if (ret != -EIOCBQUEUED)
		lo_rw_aio_complete(&cmd->iocb, ret);


Calling call_read_iter() hands the kiocb to the filesystem's
->read_iter handler.

Since ki_complete is defined, does that mean the filesystem will _always_ 
accept the kiocb, queue it internally, and return -EIOCBQUEUED from 
call_read_iter()?  Another way to ask: if ki_complete != NULL, can a 
filesystem ever behave synchronously?  (Is there documentation about this 
somewhere?  I couldn't find anything definitive.)
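
To make the question concrete: a synchronous path would look something 
like this hypothetical ->read_iter() (all helper names invented for 
illustration):

	/* Hypothetical ->read_iter(); helper names are invented. */
	static ssize_t fs_read_iter(struct kiocb *iocb, struct iov_iter *to)
	{
		if (fs_prefers_sync_path(iocb))
			/* Synchronous completion: return the byte count
			 * (or error) and let the caller invoke ki_complete
			 * itself, as lo_rw_aio() does when
			 * ret != -EIOCBQUEUED. */
			return fs_do_sync_read(iocb, to);

		/* Async completion: the fs calls iocb->ki_complete() later. */
		fs_queue_async_read(iocb, to);
		return -EIOCBQUEUED;
	}

My question is whether the first branch is ever legal when
ki_complete != NULL.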



About the cleanup after dispatch at the bottom of lo_rw_aio() from this 
code (also shown above):

	lo_rw_aio_do_completion(cmd);

	if (ret != -EIOCBQUEUED)
		lo_rw_aio_complete(&cmd->iocb, ret);

* It appears that lo_rw_aio_do_completion() will `kfree(cmd->bvec)`.  If 
  the filesystem queued the cmd->iocb for internal use, would it have made 
  a copy of cmd->bvec so that this is safe?

* If ret != -EIOCBQUEUED, then lo_rw_aio_complete() is called which calls 
  lo_rw_aio_do_completion() a second time.  Now lo_rw_aio_do_completion 
  does do this ref check, so it _is_ safe:

	if (!atomic_dec_and_test(&cmd->ref))
		return;

For my own understanding, is this equivalent?

-	lo_rw_aio_do_completion(cmd);

	if (ret != -EIOCBQUEUED)
		lo_rw_aio_complete(&cmd->iocb, ret);
+	else
+		lo_rw_aio_do_completion(cmd);



Related questions: 

* For REQ_OP_FLUSH, lo_req_flush() is called: it does not appear that 
  lo_req_flush() does anything to make sure ki_complete has been called 
  for pending work; it just calls vfs_fsync().  (For reference, 
  lo_req_flush() is sketched after this list.)

  Is this a consistency problem?

  For example, if loop has queued async kiocbs to the filesystem via
  .read_iter/.write_iter, is it the filesystem's responsibility to 
  complete that work before returning from vfs_fsync(), or is it possible 
  that the vfs_fsync() completes before ->ki_complete() is called?

* Would there be any benefit to parallelizing the do_req_filebacked()
  calls with (for example) multiple work queues?

* If so, then besides flushing the parallel work and doing vfs_fsync (from 
  lo_req_flush), are there any other consistency issues to consider on 
  REQ_OP_FLUSH?  
     * Can READs and WRITEs be out-of-order between flushes?
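
For reference, lo_req_flush() in 5.17 is essentially just the following 
(paraphrased from memory, so the error mapping may differ slightly from 
the actual source):

	static int lo_req_flush(struct loop_device *lo, struct request *rq)
	{
		/* This only fsync()s the backing file; nothing here waits
		 * for previously queued kiocbs to reach ->ki_complete(). */
		int ret = vfs_fsync(lo->lo_backing_file, 0);

		if (unlikely(ret && ret != -EINVAL))
			ret = -EIO;
		return ret;
	}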

Thanks for your help!

-Eric

--
Eric Wheeler


* Re: loop: are parallel requests serialized by the single workqueue?
From: Chaitanya Kulkarni @ 2022-03-17 10:20 UTC
  To: Eric Wheeler; +Cc: linux-block, Ming Lei

Eric,

On 3/16/22 7:26 PM, Eric Wheeler wrote:
> 
> Hi Ming,
> 
> I was studying the loop.c DIO & AIO changes you made back in 2015 that
> increased loop performance and reduced the memory footprint
> (bc07c10a3603a5ab3ef01ba42b3d41f9ac63d1b6).
> 
> I have a few questions if you are able to comment; here is a quick
> summary:
> 
> The direct IO path starts by queuing the work:
> 
>    .queue_rq       = loop_queue_rq:
> 
>          -> loop_queue_work(lo, cmd);
>          -> INIT_WORK(&worker->work, loop_workfn);
>                  ... queue_work(lo->workqueue, work);
> 
> Then from within the workqueue:
> 
>          -> loop_workfn()
>          -> loop_process_work(worker, &worker->cmd_list, worker->lo);
>          -> loop_handle_cmd(cmd);
>          -> do_req_filebacked(lo, blk_mq_rq_from_pdu(cmd) );
>          -> lo_rw_aio(lo, cmd, pos, READ) // (or WRITE)
> 
>  From here the kiocb is set up; this is the 5.17-rc8 code at the
> bottom of lo_rw_aio() where it dispatches to the filesystem:
> 
>          cmd->iocb.ki_pos = pos;
>          cmd->iocb.ki_filp = file;
>          cmd->iocb.ki_complete = lo_rw_aio_complete;
>          cmd->iocb.ki_flags = IOCB_DIRECT;
>          cmd->iocb.ki_ioprio = IOPRIO_PRIO_VALUE(IOPRIO_CLASS_NONE, 0);
> 
>          if (rw == WRITE)
>                  ret = call_write_iter(file, &cmd->iocb, &iter);
>          else
>                  ret = call_read_iter(file, &cmd->iocb, &iter);
> 
>          lo_rw_aio_do_completion(cmd);
> 
>          if (ret != -EIOCBQUEUED)
>                  lo_rw_aio_complete(&cmd->iocb, ret);
> 
> 
> Calling call_read_iter() hands the kiocb to the filesystem's
> ->read_iter handler.
> 
> Since ki_complete is defined, does that mean the filesystem will _always_
> accept the kiocb, queue it internally, and return -EIOCBQUEUED from
> call_read_iter()?  Another way to ask: if ki_complete != NULL, can a
> filesystem ever behave synchronously?  (Is there documentation about this
> somewhere?  I couldn't find anything definitive.)
> 

A non-NULL ki_complete asks for async I/O, and that is what we need to
get higher performance.

> 
> About the cleanup after dispatch at the bottom of lo_rw_aio() from this
> code (also shown above):
> 
>          lo_rw_aio_do_completion(cmd);
> 
>          if (ret != -EIOCBQUEUED)
>                  lo_rw_aio_complete(&cmd->iocb, ret);
> 
> * It appears that lo_rw_aio_do_completion() will `kfree(cmd->bvec)`.  If
>    the filesystem queued the cmd->iocb for internal use, would it have made
>    a copy of cmd->bvec so that this is safe?
> 
> * If ret != -EIOCBQUEUED, then lo_rw_aio_complete() is called which calls
>    lo_rw_aio_do_completion() a second time.  Now lo_rw_aio_do_completion
>    does do this ref check, so it _is_ safe:
> 
>          if (!atomic_dec_and_test(&cmd->ref))
>                  return;
> 
> For my own understanding, is this equivalent?
> 
> -       lo_rw_aio_do_completion(cmd);
> 
>          if (ret != -EIOCBQUEUED)
>                  lo_rw_aio_complete(&cmd->iocb, ret);
> +       else
> +               lo_rw_aio_do_completion(cmd);
> 
> 
> 
>

I think the purpose of the refcount is to make sure that the request is
freed in lo_rw_aio_do_completion() by whoever finishes last, either the
submission thread or the fs completion context calling
ki_complete() -> lo_rw_aio_complete().

So there are actually three cases :-

1. I/O is successfully queued, i.e. call_iter() ret == -EIOCBQUEUED.

Case 1: fs completion happens after we exit from lo_rw_aio()
  a. submission thread lo_rw_aio()                             refcnt = 2
  b. submission thread lo_rw_aio_do_completion()               refcnt = 1
  c. fs completion ctx fs->ki_complete()->lo_rw_aio_complete() refcnt = 0

Case 2: fs completion happens before we exit from lo_rw_aio()
  a. submission thread lo_rw_aio()                             refcnt = 2
  b. fs completion ctx fs->ki_complete()->lo_rw_aio_complete() refcnt = 1
  c. submission thread lo_rw_aio_do_completion()               refcnt = 0

2. I/O is not successfully queued, i.e. call_iter() ret != -EIOCBQUEUED.

Case 3:
  a. submission thread lo_rw_aio()                             refcnt = 2
  b. submission thread lo_rw_aio_do_completion()               refcnt = 1
  c. submission thread lo_rw_aio_complete()                    refcnt = 0

So if you move the lo_rw_aio_do_completion() call into an else branch as
above, it will not work: in Case 3 the refcount is decremented only once
(inside lo_rw_aio_complete()), so it never reaches zero and the request
is never freed.
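
To make the pattern concrete, here is a minimal sketch of the
two-reference scheme (my own illustration with made-up names, not the
actual loop.c code):

	struct cmd {
		atomic_t ref;		/* set to 2 at submission */
		void *bvec;
	};

	static void do_completion(struct cmd *cmd)
	{
		/* Only the last of the two reference holders frees. */
		if (!atomic_dec_and_test(&cmd->ref))
			return;
		kfree(cmd->bvec);
		finish_blk_mq_request(cmd);	/* made-up helper */
	}

	static void submit(struct cmd *cmd)
	{
		int ret;

		atomic_set(&cmd->ref, 2);	/* submitter + completion */
		ret = queue_async_io(cmd);	/* made-up helper; on
						 * -EIOCBQUEUED the fs later
						 * calls do_completion() from
						 * its completion context */
		do_completion(cmd);		/* drop the submitter's ref */
		if (ret != -EIOCBQUEUED)
			do_completion(cmd);	/* sync path: also drop the
						 * completion reference */
	}

Whichever order the two do_completion() calls happen in, only the second
one frees, which matches the three cases above.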

Hope this helps; if it creates more confusion then please ignore this
and follow Ming's reply.

-ck



* Re: loop: are parallel requests serialized by the single workqueue?
From: Chaitanya Kulkarni @ 2022-03-17 10:27 UTC
  To: Eric Wheeler; +Cc: linux-block, Ming Lei

On 3/17/22 3:20 AM, Chaitanya Kulkarni wrote:

[...]

> I think the purpose of the refcount is to make sure that the request is
> freed in lo_rw_aio_do_completion() by whoever finishes last, either the
> submission thread or the fs completion context calling
> ki_complete() -> lo_rw_aio_complete().
> 

What I also meant here is that if fs completion happens before we exit
lo_rw_aio(), then the completion context must not proceed with
kfree(cmd->bvec); it only decrements the refcount, and lo_rw_aio()
decrements the refcount one more time and does the kfree(cmd->bvec)
before it exits.
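
In other words, the whole decision lives in lo_rw_aio_do_completion(),
which is roughly the following in 5.17 (paraphrased from memory; minor
details omitted):

	static void lo_rw_aio_do_completion(struct loop_cmd *cmd)
	{
		struct request *rq = blk_mq_rq_from_pdu(cmd);

		/* Whichever context gets here second drops the last
		 * reference and owns the cleanup. */
		if (!atomic_dec_and_test(&cmd->ref))
			return;
		kfree(cmd->bvec);
		cmd->bvec = NULL;
		blk_mq_complete_request(rq);
	}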

-ck


