linux-block.vger.kernel.org archive mirror
* [RFC 0/2] io_uring: examine request result only after completion
@ 2019-10-24  9:18 Bijan Mottahedeh
  2019-10-24  9:18 ` [RFC 1/2] io_uring: create io_queue_async() function Bijan Mottahedeh
                   ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Bijan Mottahedeh @ 2019-10-24  9:18 UTC (permalink / raw)
  To: axboe; +Cc: linux-block

Running an fio test consistently crashes the kernel with the trace included
below.  The root cause seems to be the code in __io_submit_sqe() that
checks the result of a request for -EAGAIN in polled mode, without
ensuring first that the request has completed:

	if (ctx->flags & IORING_SETUP_IOPOLL) {
		if (req->result == -EAGAIN)
			return -EAGAIN;

The request will be immediately resubmitted in io_sq_wq_submit_work(),
potentially before the first submission has completed.  This creates
a race where the original completion may set REQ_F_IOPOLL_COMPLETED in
a freed submission request, overwriting the poisoned bits and causing
the panic below.

	do {
		ret = __io_submit_sqe(ctx, req, s, false);
		/*
		 * We can get EAGAIN for polled IO even though
		 * we're forcing a sync submission from here,
		 * since we can't wait for request slots on the
		 * block side.
		 */
		if (ret != -EAGAIN)
			break;
		cond_resched();
	} while (1);

The suggested fix is to move a submitted request to the poll list
unconditionally in polled mode.  The request can then be retried if
necessary once the original submission has indeed completed.
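
As a rough illustration of the idea only (not the code in the patches),
the following userspace sketch parks every submitted request on a poll
list and inspects its result only after the completion side has marked
it done.  All sketch_* names, the flag value, and the simulated
"first attempt returns -EAGAIN" device are assumptions made for this
example.  Unlike this sketch, the real completion handler may treat
-EAGAIN differently when setting the completed flag.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the real io_uring types. */
#define SKETCH_F_IOPOLL_COMPLETED 0x1u

struct sketch_req {
	unsigned int flags;      /* completion side sets SKETCH_F_IOPOLL_COMPLETED */
	int result;              /* meaningful only once the flag above is set */
	int attempts;            /* number of submissions so far */
	struct sketch_req *next; /* poll list linkage */
};

struct sketch_ctx {
	struct sketch_req *poll_list;
};

/* Submit: never read req->result here, just park the request. */
static void sketch_submit(struct sketch_ctx *ctx, struct sketch_req *req)
{
	assert(ctx != NULL && req != NULL);
	req->flags &= ~SKETCH_F_IOPOLL_COMPLETED;
	req->attempts++;
	req->next = ctx->poll_list;
	ctx->poll_list = req;
}

/* Simulated completion: the first attempt fails with -EAGAIN. */
static void sketch_complete(struct sketch_req *req)
{
	req->result = (req->attempts < 2) ? -EAGAIN : 4096;
	req->flags |= SKETCH_F_IOPOLL_COMPLETED;
}

/*
 * Reap: look at the result only for completed requests.  Requests
 * still in flight go back on the list; completed -EAGAIN requests
 * are resubmitted.  Returns the number of requests that finished.
 */
static int sketch_reap(struct sketch_ctx *ctx)
{
	struct sketch_req *req = ctx->poll_list, *next;
	int done = 0;

	ctx->poll_list = NULL;
	for (; req; req = next) {
		next = req->next;
		if (!(req->flags & SKETCH_F_IOPOLL_COMPLETED)) {
			req->next = ctx->poll_list;
			ctx->poll_list = req;
		} else if (req->result == -EAGAIN) {
			sketch_submit(ctx, req);
		} else {
			done++;
		}
	}
	return done;
}
```

In this sketch a retry can only happen from sketch_reap(), after the
completed flag has been seen, so a late completion never writes into a
request that has already been resubmitted or freed.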

This bug raises a further question, however, since
REQ_F_IOPOLL_COMPLETED is set in io_complete_rw_iopoll() from interrupt
context.  NVMe polled queues are not supposed to generate interrupts,
so the reason for this apparent inconsistency is not clear.

fio job
-------
[global]
filename=/dev/nvme0n1
rw=randread
bs=4k
size=4G
direct=1
time_based=1
runtime=60
randrepeat=1
gtod_reduce=1

fio test
--------
fio nvme.fio --readonly --ioengine=io_uring --iodepth 1024 --fixedbufs --hipri --numjobs=8

panic trace
-----------
[  450.395076] BUG io_kiocb (Not tainted): Poison overwritten
[  450.537797] -----------------------------------------------------------------------------
[  450.537799] INFO: 0x00000000cb333e0b-0x00000000cb333e0b. First byte 0x7b instead of 0x6b
[  450.656496] RIP: 0010:blkdev_bio_end_io+0x71/0xe0
[  450.772066] INFO: Allocated in io_submit_sqe+0x84/0x3d0 age=555 cpu=9 pid=3665
[  450.772070]  __slab_alloc+0x40/0x62
[  450.868914] Code: 75 3c 0f b6 43 32 4c 8b 2b 84 c0 75 0a 48 8b 73 08 49 01 75 08 eb 0b 0f b6 f8 e8 aa 9c 0e 00 48 63 f0 48 8b 03 31 d2 4c 89 ef <ff> 50 10 f6 43 14 01 74 32 48 8d 7b 18 e8 0d 56 0e 00 eb 27 48 8b
[  450.925197]  kmem_cache_alloc+0xa3/0x260
[  450.925198]  io_submit_sqe+0x84/0x3d0
[  451.011642] RSP: 0018:ffffc90006908e28 EFLAGS: 00010046
[  451.053353]  io_ring_submit+0xd5/0x150
[  451.053355]  __x64_sys_io_uring_enter+0x14e/0x290
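
A small aside on the INFO line above: the reported byte changed from
the free-object poison value 0x6b to 0x7b.  The sketch below only does
the arithmetic to show that the two values differ in a single bit,
which fits a small in-place update of a freed object rather than a bulk
overwrite.  The trace alone does not identify which field or flag was
written, and the sketch_* names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Byte value reported as the expected poison in the trace. */
#define SKETCH_POISON_FREE 0x6b

/* Bits that differ between the expected poison byte and what was found. */
static uint8_t sketch_poison_diff(uint8_t observed)
{
	return (uint8_t)(observed ^ SKETCH_POISON_FREE);
}

/* Nonzero if exactly one bit is set in v. */
static int sketch_is_single_bit(uint8_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/* Check the byte from the trace: 0x7b differs from 0x6b in one bit. */
static void sketch_check_trace_byte(void)
{
	assert(sketch_is_single_bit(sketch_poison_diff(0x7b)));
}
```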

Bijan Mottahedeh (2):
  io_uring: create io_queue_async() function
  io_uring: examine request result only after completion

 fs/io_uring.c | 122 ++++++++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 89 insertions(+), 33 deletions(-)

-- 
1.8.3.1



end of thread, other threads:[~2019-10-30 19:26 UTC | newest]

Thread overview: 25+ messages
2019-10-24  9:18 [RFC 0/2] io_uring: examine request result only after completion Bijan Mottahedeh
2019-10-24  9:18 ` [RFC 1/2] io_uring: create io_queue_async() function Bijan Mottahedeh
2019-10-24  9:18 ` [RFC 2/2] io_uring: examine request result only after completion Bijan Mottahedeh
2019-10-24 17:09 ` [RFC 0/2] " Jens Axboe
2019-10-24 19:18   ` Bijan Mottahedeh
2019-10-24 22:31     ` Jens Axboe
     [not found]       ` <fa82e9fc-caf7-a94a-ebff-536413e9ecce@oracle.com>
2019-10-25 14:07         ` Jens Axboe
2019-10-25 14:18           ` Jens Axboe
2019-10-25 14:21             ` Jens Axboe
2019-10-29 19:17               ` Bijan Mottahedeh
2019-10-29 19:23                 ` Bijan Mottahedeh
2019-10-29 19:27                   ` Jens Axboe
2019-10-29 19:31                     ` Bijan Mottahedeh
2019-10-29 19:33                       ` Jens Axboe
2019-10-29 19:40                         ` Bijan Mottahedeh
2019-10-29 19:46                           ` Jens Axboe
2019-10-29 19:51                             ` Bijan Mottahedeh
2019-10-29 19:52                               ` Jens Axboe
2019-10-30  1:02                                 ` Jens Axboe
2019-10-30 14:02                                   ` Bijan Mottahedeh
2019-10-30 14:18                                     ` Jens Axboe
2019-10-30 17:32                                       ` Jens Axboe
2019-10-30 19:21                                         ` Bijan Mottahedeh
2019-10-30 19:26                                           ` Jens Axboe
2019-10-25 14:42             ` Bijan Mottahedeh
