linux-kernel.vger.kernel.org archive mirror
* Some new bio merging behaviors in __bio_try_merge_page
From: Gao Xiang @ 2019-04-11  5:47 UTC
  To: Ming Lei
  Cc: linux-block, LKML, linux-erofs, Jens Axboe, Chao Yu, Greg Kroah-Hartman

Hi Ming,

I found an erofs issue after commit 07173c3ec276
("block: enable multipage bvecs") was merged. It seems that the block
layer now tries to merge more physically contiguous pages into one iovec.

However, this breaks the current erofs_read_raw_page logic, since erofs
also uses the nr_iovecs argument of bio_alloc to limit the maximum number
of physically contiguous blocks. That worked because the old
__bio_try_merge_page only tried to merge within the same page.
This is a kAPI behavior change which also affects bio_alloc...

...
		err = erofs_map_blocks(inode, &map, EROFS_GET_BLOCKS_RAW);
		if (unlikely(err))
			goto err_out;
...
		/* max # of continuous pages */
		if (nblocks > DIV_ROUND_UP(map.m_plen, PAGE_SIZE))
			nblocks = DIV_ROUND_UP(map.m_plen, PAGE_SIZE);
		if (nblocks > BIO_MAX_PAGES)
			nblocks = BIO_MAX_PAGES;

		bio = erofs_grab_bio(sb, blknr, nblocks, sb,
				     read_endio, false);
		if (IS_ERR(bio)) {
			err = PTR_ERR(bio);
			bio = NULL;
			goto err_out;
		}
	}

	err = bio_add_page(bio, page, PAGE_SIZE, 0);
	/* out of the extent or bio is full */
	if (err < PAGE_SIZE)
		goto submit_bio_retry;
...

After commit 07173c3ec276 ("block: enable multipage bvecs"), erofs can
read beyond the extent that erofs_map_blocks returns, so out-of-bounds
data can be read, which breaks the tail-end inline determination.
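
To make the overshoot concrete, here is a minimal userspace model, not
kernel code. It assumes a 4 KiB page size and whole-page adds; the model_*
names and the multipage flag are hypothetical and exist only for this
sketch. It shows a bio allocated with nr_iovecs = 2 accepting 2 pages when
every page needs its own segment, but every offered page once physically
contiguous pages are merged into the last segment.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PAGE_SIZE 4096u	/* assumed page size for this model */
#define MODEL_MAX_VECS  8u	/* capacity of the toy segment array */

/* One segment: a run of physically contiguous bytes. */
struct model_bvec {
	unsigned long start;	/* physical byte address of the run */
	unsigned int len;	/* length of the run in bytes */
};

struct model_bio {
	struct model_bvec vecs[MODEL_MAX_VECS];
	unsigned int vcnt;	/* segments in use */
	unsigned int max_vecs;	/* the nr_iovecs given at allocation */
	unsigned int size;	/* total bytes queued */
};

static void model_bio_init(struct model_bio *bio, unsigned int nr_iovecs)
{
	assert(nr_iovecs <= MODEL_MAX_VECS);
	bio->vcnt = 0;
	bio->max_vecs = nr_iovecs;
	bio->size = 0;
}

/*
 * Add the page at page frame number pfn. Returns the bytes added, or 0
 * when no segment is left. With multipage == false every page takes its
 * own segment. With multipage == true a page that physically follows the
 * last segment is merged into it and consumes no new segment.
 */
static unsigned int model_bio_add_page(struct model_bio *bio,
				       unsigned long pfn, bool multipage)
{
	unsigned long addr = pfn * MODEL_PAGE_SIZE;

	if (multipage && bio->vcnt > 0) {
		struct model_bvec *last = &bio->vecs[bio->vcnt - 1];

		if (last->start + last->len == addr) {
			last->len += MODEL_PAGE_SIZE;
			bio->size += MODEL_PAGE_SIZE;
			return MODEL_PAGE_SIZE;
		}
	}

	if (bio->vcnt >= bio->max_vecs)
		return 0;

	bio->vecs[bio->vcnt].start = addr;
	bio->vecs[bio->vcnt].len = MODEL_PAGE_SIZE;
	bio->vcnt++;
	bio->size += MODEL_PAGE_SIZE;
	return MODEL_PAGE_SIZE;
}

/* Offer `tries` consecutive page frames and count how many are accepted. */
static unsigned int model_pages_accepted(unsigned int nr_iovecs,
					 unsigned long pfn,
					 unsigned int tries, bool multipage)
{
	struct model_bio bio;
	unsigned int i, accepted = 0;

	model_bio_init(&bio, nr_iovecs);
	for (i = 0; i < tries; i++)
		if (model_bio_add_page(&bio, pfn + i, multipage) == MODEL_PAGE_SIZE)
			accepted++;
	return accepted;
}
```

With nr_iovecs = 2 and five consecutive page frames, model_pages_accepted()
returns 2 without merging and 5 with it, so the segment count no longer
bounds the number of pages.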

I can change the logic in erofs. However, out of curiosity, I wonder
whether other places are designed like this as well.

IMO, it would be better to provide a total count that indicates how many
real pages have been added to this bio. Any thoughts?

Thanks,
Gao Xiang


* Re: Some new bio merging behaviors in __bio_try_merge_page
From: Ming Lei @ 2019-04-11  7:08 UTC
  To: Gao Xiang
  Cc: linux-block, LKML, linux-erofs, Jens Axboe, Chao Yu, Greg Kroah-Hartman

Hi Gao Xiang,

On Thu, Apr 11, 2019 at 01:47:49PM +0800, Gao Xiang wrote:
> Hi Ming,
> 
> I found a erofs issue after commit 07173c3ec276
> ("block: enable multipage bvecs") is merged. It seems that
> it tries to merge more physical continuous pages in one iovec.
> 
> However it breaks the current erofs_read_raw_page logic since it uses
> nr_iovecs of bio_alloc to limit the maximum number of physical

I believe you can easily enforce the limit outside the bio, for example
by checking how many pages have been added to it.

> continuous blocks as well. It was practicable since the old
> __bio_try_merge_page only tries to merge in the same page.
> it is a kAPI behavior change which also affects bio_alloc...
> 
> ...
> 231                 err = erofs_map_blocks(inode, &map, EROFS_GET_BLOCKS_RAW);
> 232                 if (unlikely(err))
> 233                         goto err_out;
> ...
> 284                 /* max # of continuous pages */
> 285                 if (nblocks > DIV_ROUND_UP(map.m_plen, PAGE_SIZE))
> 286                         nblocks = DIV_ROUND_UP(map.m_plen, PAGE_SIZE);
> 287                 if (nblocks > BIO_MAX_PAGES)
> 288                         nblocks = BIO_MAX_PAGES;
> 289
> 290                 bio = erofs_grab_bio(sb, blknr, nblocks, sb,
> 291                                      read_endio, false);

Previously this bio could take at most 'nblocks' pages; now it can take
at most 'nblocks' io vecs instead, and one io vec may cover several pages.

> 292                 if (IS_ERR(bio)) {
> 293                         err = PTR_ERR(bio);
> 294                         bio = NULL;
> 295                         goto err_out;
> 296                 }
> 297         }
> 298
> 299         err = bio_add_page(bio, page, PAGE_SIZE, 0);
> 300         /* out of the extent or bio is full */
> 301         if (err < PAGE_SIZE)
> 302                 goto submit_bio_retry;
> ...
> 
> After commit 07173c3ec276 ("block: enable multipage bvecs"), erofs could
> read more beyond what erofs_map_blocks assigns, and out-of-bound data could
> be read and it breaks tail-end inline determination.

I don't understand. Could you explain why erofs reads more?

The amount depends on how many pages you added to the bio.

> 
> I can change the logic in erofs. However, out of curiosity, I have no idea
> if some other places also are designed like this.
> 
> IMO, it's better to provide a total count which indicates how many real
> pages have been added in this bio. some thoughts?

As I mentioned, you can count how many pages you have added to the bio,
or you can still get the number via bio_segments(bio).
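
The first option, counting on the caller side, could look roughly like
the userspace sketch below. It is not kernel code: struct page_budget and
its helpers are hypothetical names, and the loop in page_budget_fill()
only stands in for the place where a real caller would invoke
bio_add_page().

```c
#include <assert.h>
#include <stdbool.h>

/* Caller-owned page counter, kept next to the bio rather than inside it. */
struct page_budget {
	unsigned int added;	/* pages queued so far */
	unsigned int limit;	/* pages the current extent allows */
};

static void page_budget_init(struct page_budget *b, unsigned int limit)
{
	b->added = 0;
	b->limit = limit;
}

/* True while another page still fits inside the extent. */
static bool page_budget_has_room(const struct page_budget *b)
{
	return b->added < b->limit;
}

/* Record one page that was successfully queued. */
static void page_budget_account(struct page_budget *b)
{
	assert(page_budget_has_room(b));
	b->added++;
}

/*
 * Offer `offered` pages and return how many were taken. A real caller
 * would submit the bio and start a new one when the budget runs out.
 */
static unsigned int page_budget_fill(struct page_budget *b,
				     unsigned int offered)
{
	unsigned int i, taken = 0;

	for (i = 0; i < offered; i++) {
		if (!page_budget_has_room(b))
			break;
		/* the real code would call bio_add_page() here */
		page_budget_account(b);
		taken++;
	}
	return taken;
}
```

The counter is independent of how many segments the block layer uses, so
merging contiguous pages cannot change when the caller stops.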

Thanks,
Ming


* Re: Some new bio merging behaviors in __bio_try_merge_page
From: Gao Xiang @ 2019-04-11  7:43 UTC
  To: Ming Lei
  Cc: linux-block, LKML, linux-erofs, Jens Axboe, Chao Yu, Greg Kroah-Hartman



On 2019/4/11 15:08, Ming Lei wrote:
> Hi Gao Xiang,
> 
> On Thu, Apr 11, 2019 at 01:47:49PM +0800, Gao Xiang wrote:
>> Hi Ming,
>>
>> I found a erofs issue after commit 07173c3ec276
>> ("block: enable multipage bvecs") is merged. It seems that
>> it tries to merge more physical continuous pages in one iovec.
>>
>> However it breaks the current erofs_read_raw_page logic since it uses
>> nr_iovecs of bio_alloc to limit the maximum number of physical
> 
> I believe you can do the limit outside easily, such as by checking
> how many pages have been added to the bio.

Yes, I agree with that. However, I noticed that bio_add_page is exported via
EXPORT_SYMBOL, and I have no idea how many out-of-tree drivers break as well.

Is there no way for the block layer to take the old behavior into consideration as well?

> 
>> continuous blocks as well. It was practicable since the old
>> __bio_try_merge_page only tries to merge in the same page.
>> it is a kAPI behavior change which also affects bio_alloc...
>>
>> ...
>> 231                 err = erofs_map_blocks(inode, &map, EROFS_GET_BLOCKS_RAW);
>> 232                 if (unlikely(err))
>> 233                         goto err_out;
>> ...
>> 284                 /* max # of continuous pages */
>> 285                 if (nblocks > DIV_ROUND_UP(map.m_plen, PAGE_SIZE))
>> 286                         nblocks = DIV_ROUND_UP(map.m_plen, PAGE_SIZE);
>> 287                 if (nblocks > BIO_MAX_PAGES)
>> 288                         nblocks = BIO_MAX_PAGES;
>> 289
>> 290                 bio = erofs_grab_bio(sb, blknr, nblocks, sb,
>> 291                                      read_endio, false);
> 
> Previously this bio is allowed to add at most 'nblocks' pages, however,
> now we are allowed to add at most 'nblocks' io vecs, instead of pages.
> 
>> 292                 if (IS_ERR(bio)) {
>> 293                         err = PTR_ERR(bio);
>> 294                         bio = NULL;
>> 295                         goto err_out;
>> 296                 }
>> 297         }
>> 298
>> 299         err = bio_add_page(bio, page, PAGE_SIZE, 0);
>> 300         /* out of the extent or bio is full */
>> 301         if (err < PAGE_SIZE)
>> 302                 goto submit_bio_retry;
>> ...
>>
>> After commit 07173c3ec276 ("block: enable multipage bvecs"), erofs could
>> read more beyond what erofs_map_blocks assigns, and out-of-bound data could
>> be read and it breaks tail-end inline determination.
> 
> I don't understand why, could you explain a bit why erofs reads more?

In current erofs, a file can be split into two parts (non tail-end and tail-end blocks).
Consider this on-disk layout:
 ________________________________________________________________
|     non tail-end data              |   meta data               |
|____________________________________|_(inode) + non-inline data_|
                          ^          ^
                          |          |        
                          bio start  \-- what nr_iovecs indicates as well:
                                         the end of the non tail-end data.

Before this commit, the bio would stop just before the following meta data.
With the new bio merging behavior, however, more pages than nr_iovecs can be
added, so the meta data is read as if it were normal data...
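
To put numbers on the layout above, the sizing logic quoted earlier in the
thread can be reproduced in userspace. This is only a sketch: the 4 KiB
page size and the value 256 for the BIO_MAX_PAGES cap are assumptions of
this example, and the EX_* macros and clamp_nblocks() are hypothetical
names.

```c
#include <assert.h>

#define EX_PAGE_SIZE     4096u	/* assumed page size */
#define EX_BIO_MAX_PAGES 256u	/* assumed cap for this example */
#define EX_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Clamp the requested page count to the physical extent length plen
 * (in bytes) and to the per-bio cap, mirroring the two checks in the
 * quoted erofs_read_raw_page() excerpt.
 */
static unsigned int clamp_nblocks(unsigned int nblocks,
				  unsigned long long plen)
{
	unsigned long long extent_pages;

	assert(plen > 0);	/* a mapped extent is non-empty in this model */
	extent_pages = EX_DIV_ROUND_UP(plen, (unsigned long long)EX_PAGE_SIZE);

	if (nblocks > extent_pages)
		nblocks = (unsigned int)extent_pages;
	if (nblocks > EX_BIO_MAX_PAGES)
		nblocks = EX_BIO_MAX_PAGES;
	return nblocks;
}
```

For a 3-page extent and a request for 16 pages, the result is 3, which is
the value erofs passed as nr_iovecs. A fourth page accepted by the bio
would come from the meta data area that follows the extent.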


> 
> The amount depends on how many pages you added to the bio.
> 
>>
>> I can change the logic in erofs. However, out of curiosity, I have no idea
>> if some other places also are designed like this.
>>
>> IMO, it's better to provide a total count which indicates how many real
>> pages have been added in this bio. some thoughts?
> 
> As I mentioned, you may count how many pages added to bio, or you still
> can get the number via bio_segments(bio).

Calling bio_segments for each bio_add_page is unsuitable...

Thanks,
Gao Xiang



* Re: Some new bio merging behaviors in __bio_try_merge_page
From: Ming Lei @ 2019-04-11  8:09 UTC
  To: Gao Xiang
  Cc: linux-block, LKML, linux-erofs, Jens Axboe, Chao Yu, Greg Kroah-Hartman

On Thu, Apr 11, 2019 at 03:43:02PM +0800, Gao Xiang wrote:
> 
> 
> On 2019/4/11 15:08, Ming Lei wrote:
> > Hi Gao Xiang,
> > 
> > On Thu, Apr 11, 2019 at 01:47:49PM +0800, Gao Xiang wrote:
> >> Hi Ming,
> >>
> >> I found a erofs issue after commit 07173c3ec276
> >> ("block: enable multipage bvecs") is merged. It seems that
> >> it tries to merge more physical continuous pages in one iovec.
> >>
> >> However it breaks the current erofs_read_raw_page logic since it uses
> >> nr_iovecs of bio_alloc to limit the maximum number of physical
> > 
> > I believe you can do the limit outside easily, such as by checking
> > how many pages have been added to the bio.
> 
> Yes, I agree that. However, I noticed that bio_add_page is exported as
> EXPORT_SYMBOL. I have no idea how many out-of-tree drivers break as well.

I don't think using bio->bi_max_vecs to limit the maximum number of pages
is good practice; the name alone tells you what it is for...

If there are other such drivers, we can fix them easily. The following
patch should fix your issue:

diff --git a/drivers/staging/erofs/data.c b/drivers/staging/erofs/data.c
index 526e0dbea5b5..8878f2f2593e 100644
--- a/drivers/staging/erofs/data.c
+++ b/drivers/staging/erofs/data.c
@@ -298,7 +298,7 @@ static inline struct bio *erofs_read_raw_page(struct bio *bio,
 	*last_block = current_block;
 
 	/* shift in advance in case of it followed by too many gaps */
-	if (unlikely(bio->bi_vcnt >= bio->bi_max_vecs)) {
+	if (unlikely(bio->bi_iter.bi_size >= bio->bi_max_vecs * PAGE_SIZE)) {
 		/* err should reassign to 0 after submitting */
 		err = 0;
 		goto submit_bio_out;
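
The difference between the two conditions in the patch above can be
checked in userspace. The sketch below is not kernel code: struct
sketch_bio_state is a hypothetical stand-in that copies only the three
fields the conditions read, and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size */

/* Only the fields the two checks look at. */
struct sketch_bio_state {
	unsigned int vcnt;		/* segments in use (bi_vcnt) */
	unsigned int max_vecs;		/* segments allocated (bi_max_vecs) */
	unsigned int size_bytes;	/* bytes queued (bi_iter.bi_size) */
};

/* Old check: counts segments, which may each cover several pages. */
static bool full_by_vcnt(const struct sketch_bio_state *s)
{
	assert(s != NULL);
	return s->vcnt >= s->max_vecs;
}

/* New check: counts bytes, so merged pages are still accounted. */
static bool full_by_size(const struct sketch_bio_state *s)
{
	assert(s != NULL);
	return s->size_bytes >= s->max_vecs * SKETCH_PAGE_SIZE;
}
```

Take a bio allocated for 4 pages whose 4 contiguous pages were merged into
one segment: vcnt is 1, so the old check reports room for more, while the
byte-based check reports the bio as full.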
> 
> Is there no way to take old behavior into consideration in the block layer as well?

A new interface, such as bio_add_page_limit(), might help address this
issue, but I'm not sure it is necessary, since it is more reasonable to
put the logic on the user side.

> 
> > 
> >> continuous blocks as well. It was practicable since the old
> >> __bio_try_merge_page only tries to merge in the same page.
> >> it is a kAPI behavior change which also affects bio_alloc...
> >>
> >> ...
> >> 231                 err = erofs_map_blocks(inode, &map, EROFS_GET_BLOCKS_RAW);
> >> 232                 if (unlikely(err))
> >> 233                         goto err_out;
> >> ...
> >> 284                 /* max # of continuous pages */
> >> 285                 if (nblocks > DIV_ROUND_UP(map.m_plen, PAGE_SIZE))
> >> 286                         nblocks = DIV_ROUND_UP(map.m_plen, PAGE_SIZE);
> >> 287                 if (nblocks > BIO_MAX_PAGES)
> >> 288                         nblocks = BIO_MAX_PAGES;
> >> 289
> >> 290                 bio = erofs_grab_bio(sb, blknr, nblocks, sb,
> >> 291                                      read_endio, false);
> > 
> > Previously this bio is allowed to add at most 'nblocks' pages, however,
> > now we are allowed to add at most 'nblocks' io vecs, instead of pages.
> > 
> >> 292                 if (IS_ERR(bio)) {
> >> 293                         err = PTR_ERR(bio);
> >> 294                         bio = NULL;
> >> 295                         goto err_out;
> >> 296                 }
> >> 297         }
> >> 298
> >> 299         err = bio_add_page(bio, page, PAGE_SIZE, 0);
> >> 300         /* out of the extent or bio is full */
> >> 301         if (err < PAGE_SIZE)
> >> 302                 goto submit_bio_retry;
> >> ...
> >>
> >> After commit 07173c3ec276 ("block: enable multipage bvecs"), erofs could
> >> read more beyond what erofs_map_blocks assigns, and out-of-bound data could
> >> be read and it breaks tail-end inline determination.
> > 
> > I don't understand why, could you explain a bit why erofs reads more?
> 
> In current erofs, file could be break in two parts (non tail-end and tail-end blocks).
> Considering this on-disk layout,
>  ________________________________________________________________
> |     non tail-end data              |   meta data               |
> |____________________________________|_(inode) + non-inline data_|
>                           ^          ^
>                           |          |        
>                           bio start  \-- what nr_iovecs indicates as well:
>                                          the end of the non tail-end data.
> 
> Before this commit, it will stop just before the next meta data. However,
> if the new bio merging behavior is introduced, it could add pages more than
> nr_iovecs, thus meta data will be read as the normal data...

I'd suggest not reusing bio->bi_max_vecs for this purpose.

> 
> 
> > 
> > The amount depends on how many pages you added to the bio.
> > 
> >>
> >> I can change the logic in erofs. However, out of curiosity, I have no idea
> >> if some other places also are designed like this.
> >>
> >> IMO, it's better to provide a total count which indicates how many real
> >> pages have been added in this bio. some thoughts?
> > 
> > As I mentioned, you may count how many pages added to bio, or you still
> > can get the number via bio_segments(bio).
> 
> It is unsuitable to introduce bio_segments for each bio_add_page...

It isn't necessary, see the above patch.


Thanks,
Ming


* Re: Some new bio merging behaviors in __bio_try_merge_page
From: Gao Xiang @ 2019-04-11 10:20 UTC
  To: Ming Lei
  Cc: linux-block, LKML, linux-erofs, Jens Axboe, Chao Yu, Greg Kroah-Hartman



On 2019/4/11 16:09, Ming Lei wrote:
> On Thu, Apr 11, 2019 at 03:43:02PM +0800, Gao Xiang wrote:
>>
>>
>> On 2019/4/11 15:08, Ming Lei wrote:
>>> Hi Gao Xiang,
>>>
>>> On Thu, Apr 11, 2019 at 01:47:49PM +0800, Gao Xiang wrote:
>>>> Hi Ming,
>>>>
>>>> I found a erofs issue after commit 07173c3ec276
>>>> ("block: enable multipage bvecs") is merged. It seems that
>>>> it tries to merge more physical continuous pages in one iovec.
>>>>
>>>> However it breaks the current erofs_read_raw_page logic since it uses
>>>> nr_iovecs of bio_alloc to limit the maximum number of physical
>>>
>>> I believe you can do the limit outside easily, such as by checking
>>> how many pages have been added to the bio.
>>
>> Yes, I agree that. However, I noticed that bio_add_page is exported as
>> EXPORT_SYMBOL. I have no idea how many out-of-tree drivers break as well.
> 
> I don't think it is a good behaviour to use bio->bi_max_vecs to limit
> max allowed page, you may see the idea from the naming simply...

I had hoped to fold all these page limits into one number,
the minimum of them all...

> 
> If there were other such drivers, we may fix it easily, and the following
> patch should fix your issue:
> 
> diff --git a/drivers/staging/erofs/data.c b/drivers/staging/erofs/data.c
> index 526e0dbea5b5..8878f2f2593e 100644
> --- a/drivers/staging/erofs/data.c
> +++ b/drivers/staging/erofs/data.c
> @@ -298,7 +298,7 @@ static inline struct bio *erofs_read_raw_page(struct bio *bio,
>  	*last_block = current_block;
>  
>  	/* shift in advance in case of it followed by too many gaps */
> -	if (unlikely(bio->bi_vcnt >= bio->bi_max_vecs)) {
> +	if (unlikely(bio->bi_iter.bi_size >= bio->bi_max_vecs * PAGE_SIZE)) {
>  		/* err should reassign to 0 after submitting */
>  		err = 0;
>  		goto submit_bio_out;

Yeah, it is fine, thanks for your suggestion :)

Thanks,
Gao Xiang



* Re: Some new bio merging behaviors in __bio_try_merge_page
From: Christoph Hellwig @ 2019-04-11 15:34 UTC
  To: Ming Lei
  Cc: Gao Xiang, linux-block, LKML, linux-erofs, Jens Axboe, Chao Yu,
	Greg Kroah-Hartman

On Thu, Apr 11, 2019 at 04:09:54PM +0800, Ming Lei wrote:
> I don't think it is a good behaviour to use bio->bi_max_vecs to limit
> max allowed page, you may see the idea from the naming simply...
> 
> If there were other such drivers, we may fix it easily, and the following
> patch should fix your issue:

Yep.  Consumers of the block layer really have no business at all
looking at bi_vcnt.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Some new bio merging behaviors in __bio_try_merge_page
From: Gao Xiang @ 2019-04-11 16:25 UTC
  To: Christoph Hellwig, Ming Lei
  Cc: Jens Axboe, Greg Kroah-Hartman, LKML, linux-block, linux-erofs



On 2019/4/11 23:34, Christoph Hellwig wrote:
> On Thu, Apr 11, 2019 at 04:09:54PM +0800, Ming Lei wrote:
>> I don't think it is a good behaviour to use bio->bi_max_vecs to limit
>> max allowed page, you may see the idea from the naming simply...
>>
>> If there were other such drivers, we may fix it easily, and the following
>> patch should fix your issue:
> 
> Yep.  Consumers of the block layer really have no business at all
> looking at bi_vcnt.

Thanks for pointing that out. Ming's solution is enough to solve the erofs issue.

Thanks,
Gao Xiang
