All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chaitanya Kulkarni <Chaitanya.Kulkarni@wdc.com>
To: Damien Le Moal <Damien.LeMoal@wdc.com>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>
Cc: "hch@lst.de" <hch@lst.de>,
	"danil.kipnis@cloud.ionos.com" <danil.kipnis@cloud.ionos.com>,
	"jinpu.wang@cloud.ionos.com" <jinpu.wang@cloud.ionos.com>
Subject: Re: [PATCH 1/2] block: add bio based rw helper for data buffer
Date: Tue, 7 Apr 2020 20:01:43 +0000	[thread overview]
Message-ID: <BYAPR04MB4965D18A3BE7602AFBC4951D86C30@BYAPR04MB4965.namprd04.prod.outlook.com> (raw)
In-Reply-To: BY5PR04MB6900AB25131618BED40BE0E0E7C30@BY5PR04MB6900.namprd04.prod.outlook.com

On 04/06/2020 05:28 PM, Damien Le Moal wrote:
> On 2020/04/07 8:24, Chaitanya Kulkarni wrote:
>> One of the common application for the file systems and drivers to map
>> the buffer to the bio and issue I/Os on the block device.
>
> Drivers generally do not do this. At least not lower ones (scsi, nvme, etc). I
> guess you are referring to DM drivers. If yes, better be clear about this.
> Also, "the buffer" is a little unclear. Better qualify that.
>
Agree most drivers deals with sg_list. Right now there is only one such
an example I know which is rnbd as driver as it is mentioned in the
cover-letter references.
> Another thing: "to map the buffer to the bio" is a little strange. The reverse
> seems more natural: a bio maps a buffer.
>
Make sense, will do.
>>
>> This is a prep patch which adds two helper functions for block layer
>> which allows file-systems and the drivers to map data buffer on to the
>> bios and issue bios synchronously using blkdev_issue_rw() or
>> asynchronously using __blkdev_issue_rw().
>
> The function names are basically the same but do completely different things.
> Better rename that. May be blkdev_issue_rw_sync() and blkdev_issue_rw_async() ?
>
 >

This is exactly I named functions to start with (see the function
documentation which I missed to update), but the pattern in blk-lib.c
uses no such sync and async postfix, which is used is sync and async
context all over the kernel :-

# grep EXPORT block/blk-lib.c
EXPORT_SYMBOL(__blkdev_issue_discard);
EXPORT_SYMBOL(blkdev_issue_discard);, since as it is nr_sects and sector 
calculation doesn't matter after break.
EXPORT_SYMBOL(blkdev_issue_write_same);
EXPORT_SYMBOL(__blkdev_issue_zeroout);
EXPORT_SYMBOL(blkdev_issue_zeroout);
EXPORT_SYMBOL_GPL(__blkdev_issue_rw);
EXPORT_SYMBOL_GPL(blkdev_issue_rw);

In absence of documentation for naming how about we just follow
existing naming convention ?

Which matches the pattern for new API.

>>
>> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
>> ---
>>   block/blk-lib.c        | 105 +++++++++++++++++++++++++++++++++++++++++
>>   include/linux/blkdev.h |   7 +++
>>   2 files changed, 112 insertions(+)
>>
>> diff --git a/block/blk-lib.c b/block/blk-lib.c
>> index 5f2c429d4378..451c367fc0d6 100644
>> --- a/block/blk-lib.c
>> +++ b/block/blk-lib.c
>> @@ -405,3 +405,108 @@ int blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
>>   	return ret;
>>   }
>>   EXPORT_SYMBOL(blkdev_issue_zeroout);
>> +
>> +/**
>> + * __blkdev_ssue_rw - issue read/write bios from buffer asynchronously
>
> s/__blkdev_ssue_rw/__blkdev_issue_rw
Thanks, will fix.
>
>> + * @bdev:	blockdev to read/write
>> + * @buf:	data buffer
>> + * @sector:	start sector
>> + * @nr_sects:	number of sectors
>> + * @op:	REQ_OP_READ or REQ_OP_WRITE
>> + * @opf:	flags
>> + * @gfp_mask:	memory allocation flags (for bio_alloc)
>> + * @biop:	pointer to anchor bio
>> + *
>> + * Description:
>> + *  Generic helper function to map data buffer into bios for read and write ops.
>> + *  Returns pointer to the anchored last bio for caller to submit asynchronously
>> + *  or synchronously.
>> + */
>> +int __blkdev_issue_rw(struct block_device *bdev, char *buf, sector_t sector,
>> +		      sector_t nr_sects, unsigned op, unsigned opf,
>> +		      gfp_t gfp_mask, struct bio **biop)
>> +{
>> +	bool vm = is_vmalloc_addr(buf) ? true : false;
>
> You do not need the clunky "? true : false" here. is_vmalloc_addr() returns a
> bool already.
>
I will remove it.
>> +	struct bio *bio = *biop;
>> +	unsigned int nr_pages;
>> +	struct page *page;
>> +	unsigned int off;
>> +	unsigned int len;
>> +	int bi_size;
>> +
>> +	if (!bdev_get_queue(bdev))
>> +		return -ENXIO;
>> +
>> +	if (bdev_read_only(bdev))
>> +		return -EPERM;
>
> One can read a read-only device. So the check here is not correct. You need a
> "&& op == REQ_OP_WRITE" in the condition.
>
True, this shouldn't be here, also I'm thinking of lifting these checks 
to caller, we can add it later if needed.
>> +
>> +	if (!(op == REQ_OP_READ || op == REQ_OP_WRITE))
>> +		return -EINVAL;
>
> This probably should be checked before read-only.
>
Yes.
>> +
>> +	while (nr_sects != 0) {
>> +		nr_pages = __blkdev_sectors_to_bio_pages(nr_sects);
>> +
>> +		bio = blk_next_bio(bio, nr_pages, gfp_mask);
>> +		bio->bi_iter.bi_sector = sector;
>> +		bio_set_dev(bio, bdev);
>> +		bio_set_op_attrs(bio, op, 0);
>> +
>> +		while (nr_sects != 0) {
>> +			off = offset_in_page(buf);
>> +			page = vm ? vmalloc_to_page(buf) : virt_to_page(buf);
>> +			len = min((sector_t) PAGE_SIZE, nr_sects << 9);
>> +
>> +			bi_size = bio_add_page(bio, page, len, off);
>
> The variable naming is super confusing. bio_add_page() returns 0 if nothing is
> added and len if the page was added. So bi_size as a var name is not the best of
> name in my opinion.
>
Here bio, page, len, off variables are passed to the bio_add_page() 
function, which has the same names in the function parameter list.
(I'll fix the off to offset)

Regarding the bi_size, given that bio_add_page() returns 
bio->bi_iter.bi_size, isn't bi_size also maps to the what function is 
returning and explains what we are dealing with?

Also, bi_size is used in the blk-lib.c: __blkdev_issue_zero_pages(), if 
that is confusing then we need to change that, what is your preference?

I'll send a cleanup patch __blkdev_issue_zero_pages() also.

>> +
>> +			nr_sects -= bi_size >> 9;
>> +			sector += bi_size >> 9;
>> +			buf += bi_size;
>> +
>> +			if (bi_size < len)
>> +				break;
>
> You will get either 0 or len from bio_add_page. So the check here is not ideal.
> I think it is better to move it up under bio_add_page() and make it:
>
> 			len = bioa_add_page(bio, page, len, off);
> 			if (!len)
> 				break;
I'd still like to keep bi_size, I think I had received a comment about 
using same variable for function call and updating as a return value.

Regarding the check, will move it.
>
>> +		}
>> +		cond_resched();
>> +	}
>> +	*biop = bio;
>> +	return 0;
>> +}
>> +EXPORT_SYMBOL_GPL(__blkdev_issue_rw);
>> +
>> +/**
>> + * blkdev_execute_rw_sync - issue read/write bios from buffer synchronously
>> + * @bdev:	blockdev to read/write
>> + * @buf:	data buffer
>> + * @sector:	start sector
>> + * @count:	number of bytes
>> + * @op:	REQ_OP_READ or REQ_OP_WRITE
>> + * @opf:	flags
>> + * @gfp_mask:	memory allocation flags (for bio_alloc)
>> + *
>> + * Description:
>> + *  Generic helper function to map data buffer buffer into bios for read and
>> + *  write requests.
>> + */
>> +int blkdev_issue_rw(struct block_device *b, char *buf, sector_t sector,
>> +		     unsigned count, unsigned op, unsigned opf, gfp_t mask)
>
> function name differs between description and declaration.
> blkdev_execute_rw_sync() is better than blkdev_issue_rw() in my opinion.
>
That was remained from initial version, will fix it, thanks for
pointing it out.
>> +{
>> +	unsigned int is_vmalloc = is_vmalloc_addr(buf);
>> +	sector_t nr_sects = count >> 9;
>> +	struct bio *bio = NULL;
>> +	int error;
>> +
>> +	if (is_vmalloc && op == REQ_OP_WRITE)
>> +		flush_kernel_vmap_range(buf, count);
>> +
>> +	opf |= REQ_SYNC;
>
> You can add this directly in the call below.
>
Breaks calling of __blkdev_issue_rw() to new line. This also adds a new
line which is unavoidable but keeps the function call smooth in one
line.
>> +	error = __blkdev_issue_rw(b, buf, sector, nr_sects, op, opf, mask, &bio);
>> +	if (!error && bio) {
>> +		error = submit_bio_wait(bio);
>> +		bio_put(bio);
>> +	}
>> +
>> +	if (is_vmalloc && op == REQ_OP_READ)
>> +		invalidate_kernel_vmap_range(buf, count);
>> +
>> +	return error;
>> +}
>> +EXPORT_SYMBOL_GPL(blkdev_issue_rw);
>> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
>> index 32868fbedc9e..cb315b301ad9 100644
>> --- a/include/linux/blkdev.h
>> +++ b/include/linux/blkdev.h
>> @@ -1248,6 +1248,13 @@ static inline int sb_issue_zeroout(struct super_block *sb, sector_t block,
>>   				    gfp_mask, 0);
>>   }
>>
>> +extern int __blkdev_issue_rw(struct block_device *bdev, char *buf,
>> +			     sector_t sector, sector_t nr_sects, unsigned op,
>> +			     unsigned opf, gfp_t gfp_mask, struct bio **biop);
>> +extern int blkdev_issue_rw(struct block_device *b, char *buf, sector_t sector,
>> +			   unsigned count, unsigned op, unsigned opf,
>> +			   gfp_t mask);
>
> No need for the extern I think.
>
Again, following the exiting pattern in the file for the header:-

  # grep blkdev_issue include/linux/blkdev.h
extern int blkdev_issue_flush(struct block_device *, gfp_t, sector_t *);

extern int blkdev_issue_write_same(struct block_device *bdev, sector_t
sector,

extern int blkdev_issue_discard(struct block_device *bdev, sector_t
sector,

extern int __blkdev_issue_discard(struct block_device *bdev, sector_t
sector,

extern int __blkdev_issue_zeroout(struct block_device *bdev, sector_t
sector,

extern int blkdev_issue_zeroout(struct block_device *bdev, sector_t
sector,

extern int __blkdev_issue_rw(struct block_device *bdev, char *buf,

extern int blkdev_issue_rw(struct block_device *b, char *buf, sector_t
sector,
>> +
>>   extern int blk_verify_command(unsigned char *cmd, fmode_t mode);
>>
>>   enum blk_default_limits {
>>
>
>


  reply	other threads:[~2020-04-07 20:01 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-06 23:24 [PATCH 0/2] block: add bio based read-write helper Chaitanya Kulkarni
2020-04-06 23:24 ` [PATCH 1/2] block: add bio based rw helper for data buffer Chaitanya Kulkarni
2020-04-07  0:28   ` Damien Le Moal
2020-04-07 20:01     ` Chaitanya Kulkarni [this message]
2020-04-07 10:09   ` Danil Kipnis
2020-04-06 23:24 ` [PATCH 2/2] xfs: use block layer helper for rw Chaitanya Kulkarni
2020-04-07  0:32   ` Damien Le Moal
2020-04-07 20:06     ` Chaitanya Kulkarni
2020-04-07 23:27       ` Dave Chinner
2020-04-08 23:22         ` Chaitanya Kulkarni

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=BYAPR04MB4965D18A3BE7602AFBC4951D86C30@BYAPR04MB4965.namprd04.prod.outlook.com \
    --to=chaitanya.kulkarni@wdc.com \
    --cc=Damien.LeMoal@wdc.com \
    --cc=danil.kipnis@cloud.ionos.com \
    --cc=hch@lst.de \
    --cc=jinpu.wang@cloud.ionos.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.