linux-kernel.vger.kernel.org archive mirror
From: Stefano Stabellini <sstabellini@kernel.org>
To: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Stefano Stabellini <sstabellini@kernel.org>,
	v9fs-developer@lists.sourceforge.net, ericvh@gmail.com,
	rminnich@sandia.gov, lucho@ionkov.net,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 4/5] 9p: introduce async read requests
Date: Mon, 12 Dec 2016 17:15:55 -0800 (PST)	[thread overview]
Message-ID: <alpine.DEB.2.10.1612121635020.2777@sstabellini-ThinkPad-X260> (raw)
In-Reply-To: <20161210015059.GD1555@ZenIV.linux.org.uk>

Hi Al,
thanks for looking at the patch.

On Sat, 10 Dec 2016, Al Viro wrote:
> On Thu, Dec 08, 2016 at 12:59:05PM -0800, Stefano Stabellini wrote:
> 
> 
> > +		} else {
> > +			req = p9_client_get_req(clnt, P9_TREAD, "dqd", fid->fid, offset, rsize);
> > +			if (IS_ERR(req)) {
> > +				*err = PTR_ERR(req);
> > +				break;
> > +			}
> > +			req->rsize = iov_iter_get_pages_alloc(to, &req->pagevec, 
> > +					(size_t)rsize, &req->offset);
> > +			req->kiocb = iocb;
> > +			for (i = 0; i < req->rsize; i += PAGE_SIZE)
> > +				page_cache_get_speculative(req->pagevec[i/PAGE_SIZE]);
> > +			req->callback = p9_client_read_complete;
> > +
> > +			*err = clnt->trans_mod->request(clnt, req);
> > +			if (*err < 0) {
> > +				clnt->status = Disconnected;
> > +				release_pages(req->pagevec,
> > +						(req->rsize + PAGE_SIZE - 1) / PAGE_SIZE,
> > +						true);
> > +				kvfree(req->pagevec);
> > +				p9_free_req(clnt, req);
> > +				break;
> > +			}
> > +
> > +			*err = -EIOCBQUEUED;
> 
> IDGI.  AFAICS, your code will result in shitloads of short reads - every
> time when you give it a multi-iovec array, only the first one will be
> issued and the rest won't be even looked at.  Sure, it is technically
> legal, but I very much doubt that aio users will be happy with that.
> 
> What am I missing here?

There is a problem with short reads, but I don't think it is the one
you described, unless I made a mistake somewhere.

The code above issues one request as large as possible (the size is
limited by clnt->msize, which is the size of the channel). No matter how
many segments are in the iov_iter, the request covers the maximum number
of bytes the channel allows, typically 4K but it can be larger. So
multi-iovec arrays should be handled correctly up to msize. Please let
me know if I am wrong; I am not an expert on this. I tried, but couldn't
actually get any multi-iovec arrays in my tests.


That said, reading the code again, there are indeed two possible causes
of short reads. The first is a response that carries a smaller byte
count than requested. Currently this case is not handled: the short
read is forwarded up to the user. But I wrote and successfully tested a
patch that issues a follow-up request from the completion function.
Example:
  p9_client_read, issue request, size 4K
  p9_client_read_complete, receive response, count 2K
  p9_client_read_complete, issue request size 4K-2K = 2K
  p9_client_read_complete, receive response size 2K
  p9_client_read_complete, call ki_complete


The second possible cause of short reads is a request bigger than
msize, for example a 2MB read over a 4K channel. In this patch we
simply issue one request as big as msize, and eventually return to the
caller a smaller byte count. One option is to simply fall back to sync
IO in these cases. Another solution is to use the same technique
described above: we issue the first request as big as msize, then, from
the callback function, we issue a follow-up request, and again, and
again, until we fully complete the large read. This way, although we are
not issuing any simultaneous requests for a specific large read, at
least we can issue other aio requests in parallel for other reads or
writes. It should still be faster than sync IO.

I am thinking of implementing this second option in the next version of
the series.


Thread overview: 13+ messages
2016-12-08 20:58 [PATCH 0/5] async requests support for 9pfs Stefano Stabellini
2016-12-08 20:59 ` [PATCH 1/5] 9p: add iocb parameter to p9_client_read and p9_client_write Stefano Stabellini
2016-12-08 20:59   ` [PATCH 2/5] 9p: store req details and callback in struct p9_req_t Stefano Stabellini
2016-12-09  7:18     ` [V9fs-developer] " Dominique Martinet
2016-12-09 23:24       ` Stefano Stabellini
2016-12-08 20:59   ` [PATCH 3/5] 9p: introduce p9_client_get_req Stefano Stabellini
2016-12-08 20:59   ` [PATCH 4/5] 9p: introduce async read requests Stefano Stabellini
2016-12-09  7:27     ` [V9fs-developer] " Dominique Martinet
2016-12-09 22:22       ` Stefano Stabellini
2016-12-10  1:50     ` Al Viro
2016-12-13  1:15       ` Stefano Stabellini [this message]
2016-12-13 14:29         ` Latchesar Ionkov
2016-12-08 20:59   ` [PATCH 5/5] 9p: introduce async write requests Stefano Stabellini
