io-uring.vger.kernel.org archive mirror
* WRITEV with IOSQE_ASYNC broken?
@ 2020-09-05  3:22 nick
  2020-09-05  3:53 ` Jens Axboe
  0 siblings, 1 reply; 11+ messages in thread
From: nick @ 2020-09-05  3:22 UTC (permalink / raw)
  To: io-uring

Hi,

I am helping out with the netty io_uring integration, and came across 
some strange behaviour which seems like it might be a bug related to 
async offload of read/write iovecs.

Basically a WRITEV SQE seems to fail reliably with -EFAULT ("Bad address")
when the IOSQE_ASYNC flag is set but works fine otherwise (everything else
the same). This is with 5.9.0-rc3.

Sorry if I've made a mistake somehow, and thanks for all the great work 
on this game-changing feature!

Nick


* Re: WRITEV with IOSQE_ASYNC broken?
  2020-09-05  3:22 WRITEV with IOSQE_ASYNC broken? nick
@ 2020-09-05  3:53 ` Jens Axboe
  2020-09-05  3:57   ` Jens Axboe
  0 siblings, 1 reply; 11+ messages in thread
From: Jens Axboe @ 2020-09-05  3:53 UTC (permalink / raw)
  To: nick, io-uring

On 9/4/20 9:22 PM, nick@nickhill.org wrote:
> Hi,
> 
> I am helping out with the netty io_uring integration, and came across 
> some strange behaviour which seems like it might be a bug related to 
> async offload of read/write iovecs.
> 
> Basically a WRITEV SQE seems to fail reliably with -EFAULT ("Bad address")
> when the IOSQE_ASYNC flag is set but works fine otherwise (everything else
> the same). This is with 5.9.0-rc3.

Do you see it just on 5.9-rc3, or also 5.8? Just curious... But that is
very odd in any case, ASYNC writev is even part of the regular tests.
Any sort of deferral, be it explicit via ASYNC or implicit through
needing to retry, saves all the needed details to retry without
needing any of the original context.

Can you narrow down what exactly is being written - like file type,
buffered/O_DIRECT, etc. What file system, what device is hosting it.
The more details the better, will help me narrow down what is going on.
 
> Sorry if I've made a mistake somehow, and thanks for all the great work 
> on this game-changing feature!

Thanks! Let's get to the bottom of this.

-- 
Jens Axboe



* Re: WRITEV with IOSQE_ASYNC broken?
  2020-09-05  3:53 ` Jens Axboe
@ 2020-09-05  3:57   ` Jens Axboe
  2020-09-05  4:35     ` Jens Axboe
  2020-09-05  5:04     ` nick
  0 siblings, 2 replies; 11+ messages in thread
From: Jens Axboe @ 2020-09-05  3:57 UTC (permalink / raw)
  To: nick, io-uring

On 9/4/20 9:53 PM, Jens Axboe wrote:
> Can you narrow down what exactly is being written - like file type,
> buffered/O_DIRECT, etc. What file system, what device is hosting it.
> The more details the better, will help me narrow down what is going on.

Forgot, also the size of the IO (both the total and the number of iovecs
in that particular request).

Essentially all the details that I would need to recreate what you're
seeing.

-- 
Jens Axboe



* Re: WRITEV with IOSQE_ASYNC broken?
  2020-09-05  3:57   ` Jens Axboe
@ 2020-09-05  4:35     ` Jens Axboe
  2020-09-05  5:50       ` Pavel Begunkov
  2020-09-05  5:04     ` nick
  1 sibling, 1 reply; 11+ messages in thread
From: Jens Axboe @ 2020-09-05  4:35 UTC (permalink / raw)
  To: nick, io-uring

On 9/4/20 9:57 PM, Jens Axboe wrote:
> Forgot, also the size of the IO (both the total and the number of iovecs
> in that particular request).
> 
> Essentially all the details that I would need to recreate what you're
> seeing.

Turns out there was a bug in the explicit handling, new in the current
-rc series. Can you try and add the below?

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 0d7be2e9d005..000ae2acfd58 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2980,14 +2980,15 @@ static inline int io_rw_prep_async(struct io_kiocb *req, int rw,
 				   bool force_nonblock)
 {
 	struct io_async_rw *iorw = &req->io->rw;
+	struct iovec *iov;
 	ssize_t ret;
 
-	iorw->iter.iov = iorw->fast_iov;
-	ret = __io_import_iovec(rw, req, (struct iovec **) &iorw->iter.iov,
-				&iorw->iter, !force_nonblock);
+	iorw->iter.iov = iov = iorw->fast_iov;
+	ret = __io_import_iovec(rw, req, &iov, &iorw->iter, !force_nonblock);
 	if (unlikely(ret < 0))
 		return ret;
 
+	iorw->iter.iov = iov;
 	io_req_map_rw(req, iorw->iter.iov, iorw->fast_iov, &iorw->iter);
 	return 0;
 }

-- 
Jens Axboe
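
The aliasing hazard that the patch above removes can be modelled in
userspace. This is a simplified sketch, not kernel code: model_iter,
model_import, model_map, prep_aliased, prep_local and FAST_IOV are
hypothetical names, and the behaviour of the import and mapping helpers
is an assumption based on reading the diff. The real iterator field is
const-qualified; the model drops the const so that no cast is needed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>

#define FAST_IOV 8 /* assumed stand-in for the kernel's inline iovec count */

/* Toy iterator; only the fields this model needs. */
struct model_iter {
	struct iovec *iov;
	size_t nr_segs;
};

/*
 * Hypothetical import helper: copies nr segments into the inline array
 * that *iovp points at, or into a heap array when they do not fit.
 * On return *iovp is NULL (inline array used) or the heap array to free.
 */
static int model_import(const struct iovec *src, size_t nr,
			struct iovec **iovp, struct model_iter *iter)
{
	struct iovec *p = *iovp;

	if (nr > FAST_IOV) {
		p = malloc(nr * sizeof(*p));
		if (!p)
			return -1;
	}
	memcpy(p, src, nr * sizeof(*p));
	iter->iov = p;
	iter->nr_segs = nr;
	/* If iovp aliases &iter->iov, the store above already changed *iovp. */
	*iovp = (p == *iovp) ? NULL : p;
	return 0;
}

/* Hypothetical mapping step: a NULL to_free means "inline array in use". */
static struct iovec *model_map(struct model_iter *iter, struct iovec *to_free,
			       struct iovec *fast)
{
	if (!to_free)
		iter->iov = fast;
	return to_free;
}

/* Pre-patch shape: the out-parameter aliases the iterator's own pointer. */
static struct iovec *prep_aliased(const struct iovec *src, size_t nr,
				  struct iovec *fast, struct model_iter *iter)
{
	iter->iov = fast;
	if (model_import(src, nr, &iter->iov, iter) < 0)
		return NULL;
	/* iter->iov is always NULL here, so a heap copy is lost (leaked). */
	return model_map(iter, iter->iov, fast);
}

/* Post-patch shape: a local pointer receives the helper's result. */
static struct iovec *prep_local(const struct iovec *src, size_t nr,
				struct iovec *fast, struct model_iter *iter)
{
	struct iovec *iov = fast;

	iter->iov = fast;
	if (model_import(src, nr, &iov, iter) < 0)
		return NULL;
	iter->iov = iov;
	/* Returns the heap array the caller must free, or NULL if inline. */
	return model_map(iter, iter->iov, fast);
}
```

In this model, prep_aliased with 4 segments leaves the iterator on the
inline array, which happens to hold the copied data. With 30 segments it
leaves the iterator on the unfilled inline array and loses the heap copy,
while prep_local keeps the iterator on the heap array and returns it.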



* Re: WRITEV with IOSQE_ASYNC broken?
  2020-09-05  3:57   ` Jens Axboe
  2020-09-05  4:35     ` Jens Axboe
@ 2020-09-05  5:04     ` nick
  1 sibling, 0 replies; 11+ messages in thread
From: nick @ 2020-09-05  5:04 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring

On 2020-09-04 20:57, Jens Axboe wrote:
> On 9/4/20 9:53 PM, Jens Axboe wrote:
>> Do you see it just on 5.9-rc3, or also 5.8? Just curious... But that 
>> is
>> very odd in any case, ASYNC writev is even part of the regular tests.
>> Any sort of deferral, be it explicit via ASYNC or implicit through
>> needing to retry, saves all the needed details to retry without
>> needing any of the original context.
>> 
>> Can you narrow down what exactly is being written - like file type,
>> buffered/O_DIRECT, etc. What file system, what device is hosting it.
>> The more details the better, will help me narrow down what is going 
>> on.
> 
> Forgot, also the size of the IO (both the total and the number of iovecs
> in that particular request).
> 
> Essentially all the details that I would need to recreate what you're
> seeing.

I only started testing on 5.9-rc3 so not sure about earlier versions, 
but I'll try and report back.

It's a socket with O_NONBLOCK, iovec array length is ~30 and sum of 
buffer sizes ~1MB.

If it's not easy to recreate then please don't waste time since it could 
be my mistake - I'll try to make a standalone reproducer in that case.
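
The request shape described here (a non-blocking socket, about 30 iovecs,
about 1 MB in total) can be sketched with plain writev(2) as a synchronous
baseline. This is not the netty code and it does not use io_uring: the
names build_iovecs and writev_baseline, the AF_UNIX socket pair and the
even split of buffer sizes are assumptions made for illustration. An
io_uring reproducer would submit the same array through a WRITEV SQE with
IOSQE_ASYNC set.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define NR_VECS 30
#define TOTAL_BYTES (1 << 20)

/* Fill iov[0..NR_VECS) with zeroed heap buffers summing to TOTAL_BYTES. */
static int build_iovecs(struct iovec *iov)
{
	size_t base = TOTAL_BYTES / NR_VECS;
	size_t rem = TOTAL_BYTES % NR_VECS;

	for (size_t i = 0; i < NR_VECS; i++) {
		/* Spread the remainder over the first few segments. */
		size_t len = base + (i < rem ? 1 : 0);

		iov[i].iov_base = calloc(1, len);
		if (!iov[i].iov_base)
			return -1;
		iov[i].iov_len = len;
	}
	return 0;
}

/*
 * One synchronous writev() on a non-blocking socket pair.
 * Returns the number of bytes accepted, or -1 on error. A short count is
 * expected because the socket buffer is smaller than TOTAL_BYTES.
 */
static ssize_t writev_baseline(void)
{
	struct iovec iov[NR_VECS] = {0};
	int sv[2];
	ssize_t n = -1;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;
	if (fcntl(sv[0], F_SETFL, fcntl(sv[0], F_GETFL) | O_NONBLOCK) == 0 &&
	    build_iovecs(iov) == 0)
		n = writev(sv[0], iov, NR_VECS);

	for (size_t i = 0; i < NR_VECS; i++)
		free(iov[i].iov_base); /* free(NULL) is a no-op */
	close(sv[0]);
	close(sv[1]);
	return n;
}
```

Because 30 segments exceed a small inline iovec array, a request of this
shape would take the heap-allocated path in any implementation that keeps
only a few iovecs inline.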


* Re: WRITEV with IOSQE_ASYNC broken?
  2020-09-05  4:35     ` Jens Axboe
@ 2020-09-05  5:50       ` Pavel Begunkov
  2020-09-05  8:24         ` nick
  2020-09-05 15:10         ` Jens Axboe
  0 siblings, 2 replies; 11+ messages in thread
From: Pavel Begunkov @ 2020-09-05  5:50 UTC (permalink / raw)
  To: Jens Axboe, nick, io-uring

On 05/09/2020 07:35, Jens Axboe wrote:
> Turns out there was a bug in the explicit handling, new in the current
> -rc series. Can you try and add the below?

Hah, absolutely the same patch was in a series I was going to send
today, but with a note that it works by luck so not a bug. Apparently,
it is :)

BTW, const in iter->iov is guarding from such cases, yet another proof
that const casts are evil.

-- 
Pavel Begunkov


* Re: WRITEV with IOSQE_ASYNC broken?
  2020-09-05  5:50       ` Pavel Begunkov
@ 2020-09-05  8:24         ` nick
  2020-09-05  8:26           ` Norman Maurer
  2020-09-05 15:10         ` Jens Axboe
  1 sibling, 1 reply; 11+ messages in thread
From: nick @ 2020-09-05  8:24 UTC (permalink / raw)
  To: Pavel Begunkov; +Cc: Jens Axboe, io-uring

On 2020-09-04 22:50, Pavel Begunkov wrote:
> Hah, absolutely the same patch was in a series I was going to send
> today, but with a note that it works by luck so not a bug. Apparently,
> it is :)
> 
> BTW, const in iter->iov is guarding from such cases, yet another proof
> that const casts are evil.

Thanks for the speedy replies and finding/fixing this so fast! I'm new 
to kernel dev and haven't built my own yet but I think Norman is going 
to try out your patch soon.

Nick


* Re: WRITEV with IOSQE_ASYNC broken?
  2020-09-05  8:24         ` nick
@ 2020-09-05  8:26           ` Norman Maurer
  2020-09-05 14:28             ` Norman Maurer
  0 siblings, 1 reply; 11+ messages in thread
From: Norman Maurer @ 2020-09-05  8:26 UTC (permalink / raw)
  To: Nick Hill; +Cc: Pavel Begunkov, Jens Axboe, io-uring

Yes … I will :) I am already compiling the kernel as we speak with the patch applied. Will report back later today. 



> On 5. Sep 2020, at 10:24, nick@nickhill.org wrote:
> 
> Thanks for the speedy replies and finding/fixing this so fast! I'm new to kernel dev and haven't built my own yet but I think Norman is going to try out your patch soon.
> 
> Nick



* Re: WRITEV with IOSQE_ASYNC broken?
  2020-09-05  8:26           ` Norman Maurer
@ 2020-09-05 14:28             ` Norman Maurer
  2020-09-05 15:02               ` Jens Axboe
  0 siblings, 1 reply; 11+ messages in thread
From: Norman Maurer @ 2020-09-05 14:28 UTC (permalink / raw)
  To: Nick Hill, Jens Axboe; +Cc: Pavel Begunkov, io-uring

I can confirm this fixed the problem for us.

Thanks a lot for the quick turnaround (as always!).

Bye
Norman


> On 5. Sep 2020, at 10:26, Norman Maurer <norman.maurer@googlemail.com> wrote:
> 
> Yes … I will :) I am already compiling the kernel as we speak with the patch applied. Will report back later today. 



* Re: WRITEV with IOSQE_ASYNC broken?
  2020-09-05 14:28             ` Norman Maurer
@ 2020-09-05 15:02               ` Jens Axboe
  0 siblings, 0 replies; 11+ messages in thread
From: Jens Axboe @ 2020-09-05 15:02 UTC (permalink / raw)
  To: Norman Maurer, Nick Hill; +Cc: Pavel Begunkov, io-uring

On 9/5/20 8:28 AM, Norman Maurer wrote:
> I can confirm this fixed the problem for us.
> 
> Thanks a lot for the quick turnaround (as always!).

No problem, thanks for reporting! I've queued up a regression test
as well so this won't ever happen again.

-- 
Jens Axboe



* Re: WRITEV with IOSQE_ASYNC broken?
  2020-09-05  5:50       ` Pavel Begunkov
  2020-09-05  8:24         ` nick
@ 2020-09-05 15:10         ` Jens Axboe
  1 sibling, 0 replies; 11+ messages in thread
From: Jens Axboe @ 2020-09-05 15:10 UTC (permalink / raw)
  To: Pavel Begunkov, nick, io-uring

On 9/4/20 11:50 PM, Pavel Begunkov wrote:
> Hah, absolutely the same patch was in a series I was going to send
> today, but with a note that it works by luck so not a bug. Apparently,
> it is :)
> 
> BTW, const in iter->iov is guarding from such cases, yet another proof
> that const casts are evil.

Definitely, not a great idea to begin with...

-- 
Jens Axboe



