From: Ming Lei <ming.lei@canonical.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Dave Kleikamp <dave.kleikamp@oracle.com>,
Jens Axboe <axboe@kernel.dk>, Zach Brown <zab@zabbo.net>,
Maxim Patlasov <mpatlasov@parallels.com>,
Andrew Morton <akpm@linux-foundation.org>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Tejun Heo <tj@kernel.org>, Dave Chinner <david@fromorbit.com>
Subject: Re: [PATCH v5 5/5] block: loop: support DIO & AIO
Date: Tue, 23 Jun 2015 20:43:44 +0800
Message-ID: <CACVXFVPB7UovZOg1DXJRLunBkud2-BBzn6kT8xpk5Rh3ACOFQw@mail.gmail.com>
In-Reply-To: <CACVXFVPrLGG9tsCwEWXzUhKG+KX=iTX_CEh+7ciMBf-8vyuHRw@mail.gmail.com>
On Mon, Jun 22, 2015 at 8:09 PM, Ming Lei <ming.lei@canonical.com> wrote:
> On Wed, Jun 10, 2015 at 3:46 PM, Christoph Hellwig <hch@infradead.org> wrote:
>>> +	int ret;
>>> +
>>> +	/* nomerge for loop request queue */
>>> +	WARN_ON(cmd->rq->bio != cmd->rq->biotail);
>>> +
>>> +	bvec = __bvec_iter_bvec(bio->bi_io_vec, bio->bi_iter);
>>> +	iov_iter_bvec(&iter, ITER_BVEC | rw, bvec,
>>> +		      bio_segments(bio), blk_rq_bytes(cmd->rq));
>>> +
>>> +	cmd->iocb.ki_pos = pos;
>>> +	cmd->iocb.ki_filp = file;
>>> +	cmd->iocb.ki_complete = lo_rw_aio_complete;
>>> +	cmd->iocb.ki_flags = IOCB_DIRECT;
>>> +
>>> +	if (rw == WRITE)
>>> +		ret = file->f_op->write_iter(&cmd->iocb, &iter);
>>> +	else
>>> +		ret = file->f_op->read_iter(&cmd->iocb, &iter);
>>
>> I think we really need a vfs_ wrapper here similar to what I did a while
>> ago, e.g. vfs_iter_read/write_async.
>
> For a general async interface, things are a bit more complicated than
> for the sync interfaces:
>
> - the iocb needs to be a parameter, because it often depends on the
> caller; loop, for example, can preallocate it
> - direct I/O needs to be another parameter (in loop we can use the same
> helper to handle sync requests)
> - the bvec and the segment count are two more parameters
> - not to mention the common parameters (file, offset, pos, complete...)
>
> This kind of interface appeared in V1/V2, but it looks like the AIO guys
> don't care about that, so I moved the helper into loop, where it has
> become quite simple. If we convert it to vfs_iter_read/write_async(),
> I think more source code will be introduced.
>
> So how about considering that if other users show up in the future?
>
>>
>>> +static inline int lo_rw_simple(struct loop_device *lo,
>>> +		struct request *rq, loff_t pos, bool rw)
>>> +{
>>> +	struct loop_cmd *cmd = blk_mq_rq_to_pdu(rq);
>>> +
>>> +	if (cmd->use_aio)
>>> +		return lo_rw_aio(lo, cmd, pos, rw);
>>> +
>>> +	if (rw == WRITE)
>>> +		return lo_write_simple(lo, rq, pos);
>>> +	else
>>> +		return lo_read_simple(lo, rq, pos);
>>> +}
>>
>> And the io_submit style read/write also works for buffered I/O, so no
>> need to keep lo_write_simple/lo_read_simple around.
>
> That is really a good idea.

There is still one issue with converting lo_write/read_simple to the
io_submit style: flush_dcache_page() should be called right after the
kernel has written the page, and it isn't good to do that in a batch
after the request has completed.

But flush_dcache_page() isn't needed in the dio/aio case, which is
another benefit of using dio/aio for loop.
>
>>
>>> @@ -1569,7 +1634,8 @@ static void loop_handle_cmd(struct loop_cmd *cmd)
>>>  failed:
>>>  	if (ret)
>>>  		cmd->rq->errors = -EIO;
>>> -	blk_mq_complete_request(cmd->rq);
>>> +	if (!cmd->use_aio || ret)
>>> +		blk_mq_complete_request(cmd->rq);
>>
>> If you don't complete the request here setting req->error doesn't
>> make sense. I'd suggest to move the blk_mq_complete_request for
>
> The request with ->errors set really is completed here, and the current
> rule is very simple:
>
> - complete sync/submit failed requests in loop_handle_cmd()
> - complete async requests submitted successfully in its .complete
>
>> everything but the trivial error case into the actual I/O handlers
>> to clean this up a bit, too.
>
> That would require copying the error-handling code into the other handlers.
>
> Thanks,
> Ming
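
To make the completion rule quoted above concrete (sync requests and
failed submissions are completed in loop_handle_cmd(), successfully
submitted async requests are completed in the completion callback), here
is a minimal userspace model. It is only a sketch: struct model_rq,
struct model_cmd and every model_* function are hypothetical stand-ins,
not kernel code, and the real lo_rw_aio_complete() may differ in detail.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct request. */
struct model_rq {
	int errors;      /* mirrors rq->errors in the quoted hunk */
	int completions; /* how often the request has been completed */
};

/* Hypothetical stand-in for struct loop_cmd. */
struct model_cmd {
	struct model_rq *rq;
	bool use_aio;
};

/* Stand-in for blk_mq_complete_request(); enforces "exactly once". */
static void model_complete_request(struct model_rq *rq)
{
	assert(rq->completions == 0);
	rq->completions++;
}

/* Stand-in for the async completion callback of a submitted request. */
static void model_aio_complete(struct model_cmd *cmd, long ret)
{
	if (ret < 0)
		cmd->rq->errors = -EIO;
	model_complete_request(cmd->rq);
}

/*
 * Mirrors the tail of loop_handle_cmd() in the quoted hunk; 'ret' is
 * the result of handling (sync) or submitting (async) the request.
 */
static void model_handle_cmd(struct model_cmd *cmd, int ret)
{
	if (ret)
		cmd->rq->errors = -EIO;
	if (!cmd->use_aio || ret)
		model_complete_request(cmd->rq);
}
```

In this model a request is completed exactly once on every path: a sync
request or a failed submission in model_handle_cmd(), and an async
request that was submitted successfully only later, when
model_aio_complete() runs.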