linux-fsdevel.vger.kernel.org archive mirror
* Revert "aio: block exit_aio() until all context requests are completed"
       [not found] <1431675417-30464-1-git-send-email-borntraeger@de.ibm.com>
@ 2015-05-15  7:41 ` Christian Borntraeger
  2015-05-15 13:42   ` Jeff Moyer
  0 siblings, 1 reply; 4+ messages in thread
From: Christian Borntraeger @ 2015-05-15  7:41 UTC
  To: Gu Zheng, Benjamin LaHaise; +Cc: linux-aio, linux-fsdevel

I see significant latency (it can be minutes with 2000 disks and HZ=100)
when exiting a QEMU process that has lots of disk devices using aio. The
process sits idle as a zombie in exit_aio(), waiting for the
completion.

It turns out that
commit 6098b45b32 ("aio: block exit_aio() until all context requests are
completed") caused the delay.
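
A rough way to see how a per-context wait adds up to minutes is the small
model below. It is only a sketch: the struct and function names are
hypothetical, and the number of ticks each wait costs is an assumption
chosen for illustration, not a measurement.

```c
#include <assert.h>

/* Hypothetical model, not kernel code. */
struct exit_model {
	unsigned int nr_ctx;        /* number of aio contexts, e.g. one per disk */
	unsigned int hz;            /* timer ticks per second (HZ) */
	unsigned int ticks_per_ctx; /* ASSUMED ticks until one context is drained */
};

/* Serial: wait for context i to drain before killing context i + 1. */
static double serial_exit_seconds(const struct exit_model *m)
{
	assert(m->hz > 0);
	return (double)m->nr_ctx * m->ticks_per_ctx / m->hz;
}

/* Overlapped: kill every context first, so the waits run concurrently. */
static double overlapped_exit_seconds(const struct exit_model *m)
{
	assert(m->hz > 0);
	return m->nr_ctx ? (double)m->ticks_per_ctx / m->hz : 0.0;
}
```

With 2000 contexts, HZ=100 and an assumed 3 ticks per wait, the serial
variant comes out at 60 seconds, while the overlapped variant stays at a
few hundredths of a second.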

The patch description was:

It seems that exit_aio() also needs to wait for all iocbs to complete (like
io_destroy), but we missed the wait step in current implemention, so fix
it in the same way as we did in io_destroy.

Now, io_destroy() is required to block until everything is cleaned up,
according to its interface description in the man page:
DESCRIPTION
The  io_destroy()  system call will attempt to cancel all outstanding
asynchronous I/O operations against ctx_id, will block on the completion
of all operations that could not be canceled, and will destroy the ctx_id.

Does process exit require the same full blocking? We might be able to
clean up the process and let the aio data structures be freed lazily.
Opinions or better ideas?

Christian

diff --git a/fs/aio.c b/fs/aio.c
index a793f70..1e6bcdb 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -820,8 +820,6 @@ void exit_aio(struct mm_struct *mm)

 	for (i = 0; i < table->nr; ++i) {
 		struct kioctx *ctx = table->table[i];
-		struct completion requests_done =
-			COMPLETION_INITIALIZER_ONSTACK(requests_done);

 		if (!ctx)
 			continue;
@@ -833,10 +831,7 @@ void exit_aio(struct mm_struct *mm)
 		 * that it needs to unmap the area, just set it to 0.
 		 */
 		ctx->mmap_size = 0;
-		kill_ioctx(mm, ctx, &requests_done);
-
-		/* Wait until all IO for the context are done. */
-		wait_for_completion(&requests_done);
+		kill_ioctx(mm, ctx, NULL);
 	}

 	RCU_INIT_POINTER(mm->ioctx_table, NULL);




* Re: Revert "aio: block exit_aio() until all context requests are completed"
  2015-05-15  7:41 ` Revert "aio: block exit_aio() until all context requests are completed" Christian Borntraeger
@ 2015-05-15 13:42   ` Jeff Moyer
  2015-05-15 15:26     ` Christian Borntraeger
  0 siblings, 1 reply; 4+ messages in thread
From: Jeff Moyer @ 2015-05-15 13:42 UTC
  To: Christian Borntraeger
  Cc: Gu Zheng, Benjamin LaHaise, linux-aio, linux-fsdevel

Christian Borntraeger <borntraeger@de.ibm.com> writes:

> I see a significant latency (can be minutes with 2000 disks and HZ=100)
> when exiting a QEMU process that has lots of disk devices via aio. The
> process sits idle doing nothing as zombie in exit_aio waiting for the
> completion.
>
> Turns out that 
> commit 6098b45b32 ("aio: block exit_aio() until all context requests are
> completed") caused the delay.
>
> Patch description was:
>
> It seems that exit_aio() also needs to wait for all iocbs to complete (like
> io_destroy), but we missed the wait step in current implemention, so fix
> it in the same way as we did in io_destroy.
>
> Now: io_destroy requires to block until everything is cleaned up from its
> interface description in the manpage:
> DESCRIPTION
> The  io_destroy()  system call will attempt to cancel all outstanding
> asynchronous I/O operations against ctx_id, will block on the completion
> of all operations that could not be canceled, and will destroy the ctx_id.
>
> Does process exit require the same full blocking? We might be able to
> cleanup the process and let the aio data structures be freed lazily.
> Opinions or better ideas?

This has already been fixed:

commit dc48e56d761610da4ea1088d1bea0a030b8e3e43
Author: Jens Axboe <axboe@fb.com>
Date:   Wed Apr 15 11:17:23 2015 -0600

    aio: fix serial draining in exit_aio()
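
The commit title points at the serial draining as the problem. As a rough
illustration of the general alternative (start teardown of every context
first, then block once on a shared counter), here is a single-threaded
userspace sketch. It is not the code from that commit: all names
(drain_wait, fake_ctx and the fake_* helpers) are made up, and real
kernel code would need an atomic counter and a proper completion rather
than plain ints.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names are kernel API. */
struct drain_wait {
	int pending; /* contexts whose requests are still in flight */
	int done;    /* set to 1 once pending drops to zero */
};

struct fake_ctx {
	struct drain_wait *wait; /* shared waiter, NULL if nobody waits */
	int killed;
};

/* Mark a context as dying and remember who wants to hear about it. */
static void fake_kill_ctx(struct fake_ctx *ctx, struct drain_wait *wait)
{
	ctx->killed = 1;
	ctx->wait = wait;
}

/* Called when the last outstanding request of a context has finished. */
static void fake_ctx_requests_done(struct fake_ctx *ctx)
{
	struct drain_wait *w = ctx->wait;

	if (w && --w->pending == 0)
		w->done = 1;
}

/* Kill all contexts up front; the caller then waits once on w->done. */
static void fake_exit_all(struct fake_ctx *ctxs, size_t n,
			  struct drain_wait *w)
{
	size_t i;

	w->pending = (int)n;
	w->done = (n == 0);
	for (i = 0; i < n; i++)
		fake_kill_ctx(&ctxs[i], w);
}
```

The point of the shared counter is that the exiting task sleeps once for
the slowest context instead of once per context.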

Cheers,
Jeff


* Re: Revert "aio: block exit_aio() until all context requests are completed"
  2015-05-15 13:42   ` Jeff Moyer
@ 2015-05-15 15:26     ` Christian Borntraeger
  2015-05-16 15:16       ` Jens Axboe
  0 siblings, 1 reply; 4+ messages in thread
From: Christian Borntraeger @ 2015-05-15 15:26 UTC
  To: Jeff Moyer; +Cc: Gu Zheng, Benjamin LaHaise, linux-aio, linux-fsdevel, stable

On 15.05.2015 at 15:42, Jeff Moyer wrote:
> This has already been fixed:
> 
> commit dc48e56d761610da4ea1088d1bea0a030b8e3e43
> Author: Jens Axboe <axboe@fb.com>
> Date:   Wed Apr 15 11:17:23 2015 -0600
> 
>     aio: fix serial draining in exit_aio()
> 
> Cheers,
> Jeff
> 
Cool, thanks. As the original patch had Cc: stable, shouldn't the fix also be backported?

Christian


* Re: Revert "aio: block exit_aio() until all context requests are completed"
  2015-05-15 15:26     ` Christian Borntraeger
@ 2015-05-16 15:16       ` Jens Axboe
  0 siblings, 0 replies; 4+ messages in thread
From: Jens Axboe @ 2015-05-16 15:16 UTC
  To: Christian Borntraeger, Jeff Moyer
  Cc: Gu Zheng, Benjamin LaHaise, linux-aio, linux-fsdevel, stable

On 05/15/2015 09:26 AM, Christian Borntraeger wrote:
> Cool, thanks. As the original patch had Cc: stable, shouldn't the fix also be backported?

I'll email stable.

-- 
Jens Axboe


