From: Marcelo Diop-Gonzalez <marcelo827@gmail.com>
To: Pavel Begunkov <asml.silence@gmail.com>
Cc: axboe@kernel.dk, io-uring@vger.kernel.org
Subject: Re: [PATCH v2 2/2] io_uring: flush timeouts that should already have expired
Date: Mon, 11 Jan 2021 10:28:00 -0500
Message-ID: <20210111152800.GB2998@marcelo-debian.domain>
In-Reply-To: <2fc9e651-d786-7c2d-0d2c-47ed454f06be@gmail.com>

On Mon, Jan 11, 2021 at 04:57:21AM +0000, Pavel Begunkov wrote:
> On 08/01/2021 15:57, Marcelo Diop-Gonzalez wrote:
> > On Sat, Jan 02, 2021 at 08:26:26PM +0000, Pavel Begunkov wrote:
> >> On 02/01/2021 19:54, Pavel Begunkov wrote:
> >>> On 19/12/2020 19:15, Marcelo Diop-Gonzalez wrote:
> >>>> Right now io_flush_timeouts() checks if the current number of events
> >>>> is equal to ->timeout.target_seq, but this will miss some timeouts if
> >>>> there have been more than 1 event added since the last time they were
> >>>> flushed (possible in io_submit_flush_completions(), for example). Fix
> >>>> it by recording the starting value of ->cached_cq_overflow -
> >>>> ->cq_timeouts instead of the target value, so that we can safely
> >>>> (without overflow problems) compare the number of events that have
> >>>> happened with the number of events needed to trigger the timeout.
> >>
> >> https://www.spinics.net/lists/kernel/msg3475160.html
> >>
> >> The idea was to replace u32 cached_cq_tail with u64 while keeping
> >> timeout offsets u32. Assuming that we won't ever hit ~2^62 inflight
> >> requests, complete all requests falling into some large enough window
> >> behind that u64 cached_cq_tail.
> >>
> >> simplifying:
> >>
> >> i64 d = target_off - ctx->u64_cq_tail
> >> if (d <= 0 && d > -2^32)
> >> 	complete_it()
> >>
> >> I'm not fond of it, but at least it worked at that time. You can try
> >> out this approach if you want, but it would be perfect if you could
> >> find something more elegant :)
> >>
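
For reference, the simplified check quoted above can be written out as
plain C. This is only a user-space sketch: timeout_due_u64() is a
hypothetical name, not a kernel function, and it assumes target_off has
already been widened to the same 64-bit sequence space as the tail
(the quoted pseudocode does not say how). The cast from u64 to s64
assumes two's-complement conversion, which the kernel's compilers provide.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from fs/io_uring.c.
 * Returns true when the 64-bit tail has reached target_off and the
 * target still lies inside a 2^32-wide window behind the tail.
 */
static inline bool timeout_due_u64(uint64_t cq_tail, uint64_t target_off)
{
	/* Unsigned subtraction wraps; the signed view gives the distance. */
	int64_t d = (int64_t)(target_off - cq_tail);

	/* d <= 0: target reached. d > -2^32: not older than the window. */
	return d <= 0 && d > -((int64_t)1 << 32);
}
```

With a tail of 101 and a target of 100 this returns true; with a tail of
99 it returns false because the target has not been reached yet.
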
> > 
> > What do you think about something like this? I think it's not totally
> > correct, because it relies on ->completion_lock being held in
> > io_timeout() so that ->cq_last_tm_flush is updated. With
> > IORING_SETUP_IOPOLL, io_iopoll_complete() doesn't take that lock, and
> > ->uring_lock will not be held if io_timeout() is called from
> > io_wq_submit_work(). Maybe it could still be worth it, since that was
> > already possibly a problem?
> 
> I'll take a look later, but IOPOLL doesn't support timeouts (see
> the first if in io_timeout_prep()), so that's not a problem. It would
> be better to leave a comment, though.
>

Ah, right! Never mind about that, then.

> > 
> > diff --git a/fs/io_uring.c b/fs/io_uring.c
> > index cb57e0360fcb..50984709879c 100644
> > --- a/fs/io_uring.c
> > +++ b/fs/io_uring.c
> > @@ -353,6 +353,7 @@ struct io_ring_ctx {
> >  		unsigned		cq_entries;
> >  		unsigned		cq_mask;
> >  		atomic_t		cq_timeouts;
> > +		unsigned		cq_last_tm_flush;
> >  		unsigned long		cq_check_overflow;
> >  		struct wait_queue_head	cq_wait;
> >  		struct fasync_struct	*cq_fasync;
> > @@ -1633,19 +1634,26 @@ static void __io_queue_deferred(struct io_ring_ctx *ctx)
> >  
> >  static void io_flush_timeouts(struct io_ring_ctx *ctx)
> >  {
> > +	u32 seq = ctx->cached_cq_tail - atomic_read(&ctx->cq_timeouts);
> > +
> >  	while (!list_empty(&ctx->timeout_list)) {
> > +		u32 events_needed, events_got;
> >  		struct io_kiocb *req = list_first_entry(&ctx->timeout_list,
> >  						struct io_kiocb, timeout.list);
> >  
> >  		if (io_is_timeout_noseq(req))
> >  			break;
> > -		if (req->timeout.target_seq != ctx->cached_cq_tail
> > -					- atomic_read(&ctx->cq_timeouts))
> > +
> > +		events_needed = req->timeout.target_seq - ctx->cq_last_tm_flush;
> > +		events_got = seq - ctx->cq_last_tm_flush;
> > +		if (events_got < events_needed)
> >  			break;
> >  
> >  		list_del_init(&req->timeout.list);
> >  		io_kill_timeout(req);
> >  	}
> > +
> > +	ctx->cq_last_tm_flush = seq;
> >  }
> >  
> >  static void io_commit_cqring(struct io_ring_ctx *ctx)
> > 
> 
> -- 
> Pavel Begunkov
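
To make the comparison in the quoted diff easier to check by hand, here
is a small user-space sketch of the two conditions. The helper names are
hypothetical, and uint32_t stands in for the kernel's u32. It assumes
the timeout's target is no more than 2^32 - 1 events ahead of the last
flush point, which is what makes the unsigned differences meaningful.

```c
#include <stdbool.h>
#include <stdint.h>

/* Old check: fires only if the sequence lands exactly on the target. */
static inline bool timeout_due_exact(uint32_t seq, uint32_t target_seq)
{
	return seq == target_seq;
}

/*
 * Check from the diff: measure both the target and the current sequence
 * as distances from the last flush point. Unsigned subtraction wraps
 * modulo 2^32, so the distances stay correct across counter overflow.
 */
static inline bool timeout_due_delta(uint32_t seq, uint32_t target_seq,
				     uint32_t last_flush)
{
	uint32_t events_needed = target_seq - last_flush;
	uint32_t events_got = seq - last_flush;

	return events_got >= events_needed;
}
```

For example, with last_flush = 10, target_seq = 12 and seq = 13 (two
events completed in one batch past the target), the exact check misses
the timeout while the delta check reports it as due. The same holds when
the counters wrap, e.g. last_flush = 0xFFFFFFFE, target_seq = 1, seq = 2.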
