* [Patch v2] block: return the correct bvec when checking for gaps
@ 2021-06-04 23:37 longli
  2021-06-07  0:09 ` Ming Lei
  2021-06-07  7:09 ` Christoph Hellwig
  0 siblings, 2 replies; 4+ messages in thread
From: longli @ 2021-06-04 23:37 UTC (permalink / raw)
  To: linux-block
  Cc: Long Li, Jens Axboe, Johannes Thumshirn, Pavel Begunkov,
	Ming Lei, Tejun Heo, Matthew Wilcox (Oracle),
	Jeffle Xu, linux-kernel, stable

From: Long Li <longli@microsoft.com>

After commit 07173c3ec276 ("block: enable multipage bvecs"), a bvec can
have multiple pages. But bio_will_gap() still assumes one page bvec while
checking for merging. If the pages in the bvec go across the
seg_boundary_mask, this check for merging can potentially succeed if only
the 1st page is tested, and can fail if all the pages are tested.

Later, when SCSI builds the SG list the same check for merging is done in
__blk_segment_map_sg_merge() with all the pages in the bvec tested. This
time the check may fail if the pages in bvec go across the
seg_boundary_mask (but tested okay in bio_will_gap() earlier, so those
BIOs were merged). If this check fails, we end up with a broken SG list
for drivers assuming the SG list not having offsets in intermediate pages.
This results in incorrect pages written to the disk.

Fix this by returning the multi-page bvec when testing gaps for merging.

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Jeffle Xu <jefflexu@linux.alibaba.com>
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
Fixes: 07173c3ec276 ("block: enable multipage bvecs")
Signed-off-by: Long Li <longli@microsoft.com>
---
Change from v1: add commit details on how data corruption happens

 include/linux/bio.h | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/include/linux/bio.h b/include/linux/bio.h
index a0b4cfdf62a4..6b2f609ccfbf 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -44,9 +44,6 @@ static inline unsigned int bio_max_segs(unsigned int nr_segs)
 #define bio_offset(bio)		bio_iter_offset((bio), (bio)->bi_iter)
 #define bio_iovec(bio)		bio_iter_iovec((bio), (bio)->bi_iter)
 
-#define bio_multiple_segments(bio)				\
-	((bio)->bi_iter.bi_size != bio_iovec(bio).bv_len)
-
 #define bvec_iter_sectors(iter)	((iter).bi_size >> 9)
 #define bvec_iter_end_sector(iter) ((iter).bi_sector + bvec_iter_sectors((iter)))
 
@@ -271,7 +268,7 @@ static inline void bio_clear_flag(struct bio *bio, unsigned int bit)
 
 static inline void bio_get_first_bvec(struct bio *bio, struct bio_vec *bv)
 {
-	*bv = bio_iovec(bio);
+	*bv = mp_bvec_iter_bvec(bio->bi_io_vec, bio->bi_iter);
 }
 
 static inline void bio_get_last_bvec(struct bio *bio, struct bio_vec *bv)
@@ -279,10 +276,10 @@ static inline void bio_get_last_bvec(struct bio *bio, struct bio_vec *bv)
 	struct bvec_iter iter = bio->bi_iter;
 	int idx;
 
-	if (unlikely(!bio_multiple_segments(bio))) {
-		*bv = bio_iovec(bio);
+	/* this bio has only one bvec */
+	*bv = mp_bvec_iter_bvec(bio->bi_io_vec, bio->bi_iter);
+	if (bv->bv_len == bio->bi_iter.bi_size)
 		return;
-	}
 
 	bio_advance_iter(bio, &iter, iter.bi_size);
 
-- 
2.17.1
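
The mismatch described in the commit message can be illustrated with a small userspace sketch. It is not kernel code: `struct seg`, `segs_mergeable()`, `first_page_view()` and `SKETCH_PAGE_SIZE` are hypothetical names invented here, and the check only models the contiguity and seg_boundary_mask comparison under the assumption of 4 KiB pages.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a bvec: one physically contiguous byte range. */
struct seg {
	uint64_t phys;	/* physical start address */
	uint32_t len;	/* length in bytes, may span several pages */
};

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/*
 * Model of the merge test: the two ranges must be physically contiguous,
 * and the first byte of prev and the last byte of next must fall inside
 * the same boundary window described by boundary_mask.
 */
static bool segs_mergeable(uint64_t boundary_mask, struct seg prev,
			   struct seg next)
{
	if (prev.phys + prev.len != next.phys)
		return false;	/* not contiguous */

	return (prev.phys | boundary_mask) ==
	       ((next.phys + next.len - 1) | boundary_mask);
}

/* Clip a range to its first page, as a single-page view of a bvec would. */
static struct seg first_page_view(struct seg s)
{
	uint32_t room = SKETCH_PAGE_SIZE - (uint32_t)(s.phys % SKETCH_PAGE_SIZE);

	if (s.len > room)
		s.len = room;
	return s;
}
```

With a 64 KiB boundary (mask 0xFFFF), a previous range at 0xE000 of length 0x1000, and a next range at 0xF000 of length 0x2000, the first-page view of the next range passes the test while the full two-page range fails it, because its last byte lies beyond 0xFFFF. That is the kind of disagreement between the merge-time check and the later SG mapping that the patch removes.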



* Re: [Patch v2] block: return the correct bvec when checking for gaps
  2021-06-04 23:37 [Patch v2] block: return the correct bvec when checking for gaps longli
@ 2021-06-07  0:09 ` Ming Lei
  2021-06-07  7:09 ` Christoph Hellwig
  1 sibling, 0 replies; 4+ messages in thread
From: Ming Lei @ 2021-06-07  0:09 UTC (permalink / raw)
  To: longli
  Cc: linux-block, Long Li, Jens Axboe, Johannes Thumshirn,
	Pavel Begunkov, Tejun Heo, Matthew Wilcox (Oracle),
	Jeffle Xu, linux-kernel, stable

On Fri, Jun 04, 2021 at 04:37:19PM -0700, longli@linuxonhyperv.com wrote:
> From: Long Li <longli@microsoft.com>
> 
> After commit 07173c3ec276 ("block: enable multipage bvecs"), a bvec can
> have multiple pages. But bio_will_gap() still assumes one page bvec while
> checking for merging. If the pages in the bvec go across the
> seg_boundary_mask, this check for merging can potentially succeed if only
> the 1st page is tested, and can fail if all the pages are tested.
> 
> Later, when SCSI builds the SG list the same check for merging is done in
> __blk_segment_map_sg_merge() with all the pages in the bvec tested. This
> time the check may fail if the pages in bvec go across the
> seg_boundary_mask (but tested okay in bio_will_gap() earlier, so those
> BIOs were merged). If this check fails, we end up with a broken SG list
> for drivers assuming the SG list not having offsets in intermediate pages.
> This results in incorrect pages written to the disk.
> 
> Fix this by returning the multi-page bvec when testing gaps for merging.
> 
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> Cc: Pavel Begunkov <asml.silence@gmail.com>
> Cc: Ming Lei <ming.lei@redhat.com>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> Cc: Jeffle Xu <jefflexu@linux.alibaba.com>
> Cc: linux-kernel@vger.kernel.org
> Cc: stable@vger.kernel.org
> Fixes: 07173c3ec276 ("block: enable multipage bvecs")
> Signed-off-by: Long Li <longli@microsoft.com>
> ---
> Change from v1: add commit details on how data corruption happens

Reviewed-by: Ming Lei <ming.lei@redhat.com>

-- 
Ming



* Re: [Patch v2] block: return the correct bvec when checking for gaps
  2021-06-04 23:37 [Patch v2] block: return the correct bvec when checking for gaps longli
  2021-06-07  0:09 ` Ming Lei
@ 2021-06-07  7:09 ` Christoph Hellwig
  2021-06-07 18:00   ` Long Li
  1 sibling, 1 reply; 4+ messages in thread
From: Christoph Hellwig @ 2021-06-07  7:09 UTC (permalink / raw)
  To: longli
  Cc: linux-block, Long Li, Jens Axboe, Johannes Thumshirn,
	Pavel Begunkov, Ming Lei, Tejun Heo, Matthew Wilcox (Oracle),
	Jeffle Xu, linux-kernel, stable

On Fri, Jun 04, 2021 at 04:37:19PM -0700, longli@linuxonhyperv.com wrote:
> From: Long Li <longli@microsoft.com>
> 
> After commit 07173c3ec276 ("block: enable multipage bvecs"), a bvec can
> have multiple pages. But bio_will_gap() still assumes one page bvec while
> checking for merging. If the pages in the bvec go across the
> seg_boundary_mask, this check for merging can potentially succeed if only
> the 1st page is tested, and can fail if all the pages are tested.
> 
> Later, when SCSI builds the SG list the same check for merging is done in
> __blk_segment_map_sg_merge() with all the pages in the bvec tested. This
> time the check may fail if the pages in bvec go across the
> seg_boundary_mask (but tested okay in bio_will_gap() earlier, so those
> BIOs were merged). If this check fails, we end up with a broken SG list
> for drivers assuming the SG list not having offsets in intermediate pages.
> This results in incorrect pages written to the disk.
> 
> Fix this by returning the multi-page bvec when testing gaps for merging.
> 
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> Cc: Pavel Begunkov <asml.silence@gmail.com>
> Cc: Ming Lei <ming.lei@redhat.com>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> Cc: Jeffle Xu <jefflexu@linux.alibaba.com>
> Cc: linux-kernel@vger.kernel.org
> Cc: stable@vger.kernel.org
> Fixes: 07173c3ec276 ("block: enable multipage bvecs")
> Signed-off-by: Long Li <longli@microsoft.com>
> ---
> Change from v1: add commit details on how data corruption happens
> 
>  include/linux/bio.h | 11 ++++-------
>  1 file changed, 4 insertions(+), 7 deletions(-)
> 
> diff --git a/include/linux/bio.h b/include/linux/bio.h
> index a0b4cfdf62a4..6b2f609ccfbf 100644
> --- a/include/linux/bio.h
> +++ b/include/linux/bio.h
> @@ -44,9 +44,6 @@ static inline unsigned int bio_max_segs(unsigned int nr_segs)
>  #define bio_offset(bio)		bio_iter_offset((bio), (bio)->bi_iter)
>  #define bio_iovec(bio)		bio_iter_iovec((bio), (bio)->bi_iter)
>  
> -#define bio_multiple_segments(bio)				\
> -	((bio)->bi_iter.bi_size != bio_iovec(bio).bv_len)
> -
>  #define bvec_iter_sectors(iter)	((iter).bi_size >> 9)
>  #define bvec_iter_end_sector(iter) ((iter).bi_sector + bvec_iter_sectors((iter)))
>  
> @@ -271,7 +268,7 @@ static inline void bio_clear_flag(struct bio *bio, unsigned int bit)
>  
>  static inline void bio_get_first_bvec(struct bio *bio, struct bio_vec *bv)
>  {
> -	*bv = bio_iovec(bio);
> +	*bv = mp_bvec_iter_bvec(bio->bi_io_vec, bio->bi_iter);
>  }
>  
>  static inline void bio_get_last_bvec(struct bio *bio, struct bio_vec *bv)
> @@ -279,10 +276,10 @@ static inline void bio_get_last_bvec(struct bio *bio, struct bio_vec *bv)
>  	struct bvec_iter iter = bio->bi_iter;
>  	int idx;
>  
> -	if (unlikely(!bio_multiple_segments(bio))) {
> -		*bv = bio_iovec(bio);
> +	/* this bio has only one bvec */
> +	*bv = mp_bvec_iter_bvec(bio->bi_io_vec, bio->bi_iter);
> +	if (bv->bv_len == bio->bi_iter.bi_size)
>  		return;

Nit: I'd move the comment a bit as the current placement confused me at
first.  Also maybe use bio_get_first_bvec here to make it even more
obvious:

	bio_get_first_bvec(bio, bv);
	if (bv->bv_len == bio->bi_iter.bi_size)
		return;		/* this bio only has a single bvec */
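
The control flow suggested above can be tried out in a simplified userspace model. This is only a sketch: `struct mvec`, `struct mbio`, `mbio_first()`, `mbio_last()`, `last_len()` and `sample_vecs` are hypothetical names, only bvec lengths are modeled, and the walk to the end of the bio is a plain loop standing in for the iterator advance done in the kernel.

```c
#include <stddef.h>

/* Hypothetical multi-page bvec; only the length is modeled. */
struct mvec {
	unsigned int len;
};

/* Hypothetical bio: an array of bvecs plus an iterator position. */
struct mbio {
	const struct mvec *vecs;
	unsigned int idx;	/* current bvec index */
	unsigned int done;	/* bytes already consumed in vecs[idx] */
	unsigned int size;	/* bytes remaining in the bio */
};

static unsigned int min_u(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* First bvec: the unconsumed part of vecs[idx], capped at the bio size. */
static void mbio_first(const struct mbio *b, struct mvec *out)
{
	out->len = min_u(b->size, b->vecs[b->idx].len - b->done);
}

static void mbio_last(const struct mbio *b, struct mvec *out)
{
	unsigned int idx = b->idx, done = b->done, left = b->size;

	mbio_first(b, out);
	if (out->len == b->size)
		return;		/* this bio only has a single bvec */

	/* Walk the iterator to the end of the bio. */
	while (left) {
		unsigned int step = min_u(left, b->vecs[idx].len - done);

		left -= step;
		done += step;
		if (done == b->vecs[idx].len) {
			idx++;
			done = 0;
		}
	}

	/* done == 0 means the bio ended exactly at the end of vecs[idx - 1]. */
	out->len = done ? done : b->vecs[idx - 1].len;
}

/* Sample data: three bvecs of 8 KiB, 4 KiB and 12 KiB. */
static const struct mvec sample_vecs[] = { { 8192 }, { 4096 }, { 12288 } };

/* Convenience wrapper: length of the last bvec of a bio of the given size. */
static unsigned int last_len(const struct mvec *vecs, unsigned int size)
{
	struct mbio b = { vecs, 0, 0, size };
	struct mvec out;

	mbio_last(&b, &out);
	return out.len;
}
```

In this model, a bio of 8192 bytes over `sample_vecs` takes the early return, while a bio of 13288 bytes walks past the first two bvecs and reports a last bvec of 1000 bytes.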


* RE: [Patch v2] block: return the correct bvec when checking for gaps
  2021-06-07  7:09 ` Christoph Hellwig
@ 2021-06-07 18:00   ` Long Li
  0 siblings, 0 replies; 4+ messages in thread
From: Long Li @ 2021-06-07 18:00 UTC (permalink / raw)
  To: Christoph Hellwig, longli
  Cc: linux-block, Jens Axboe, Johannes Thumshirn, Pavel Begunkov,
	Ming Lei, Tejun Heo, Matthew Wilcox (Oracle),
	Jeffle Xu, linux-kernel, stable

> Subject: Re: [Patch v2] block: return the correct bvec when checking for gaps
> 
> On Fri, Jun 04, 2021 at 04:37:19PM -0700, longli@linuxonhyperv.com wrote:
> > From: Long Li <longli@microsoft.com>
> >
> > After commit 07173c3ec276 ("block: enable multipage bvecs"), a bvec
> > can have multiple pages. But bio_will_gap() still assumes one page
> > bvec while checking for merging. If the pages in the bvec go across
> > the seg_boundary_mask, this check for merging can potentially succeed
> > if only the 1st page is tested, and can fail if all the pages are tested.
> >
> > Later, when SCSI builds the SG list the same check for merging is done in
> > __blk_segment_map_sg_merge() with all the pages in the bvec tested.
> > This time the check may fail if the pages in bvec go across the
> > seg_boundary_mask (but tested okay in bio_will_gap() earlier, so those
> > BIOs were merged). If this check fails, we end up with a broken SG
> > list for drivers assuming the SG list not having offsets in intermediate pages.
> > This results in incorrect pages written to the disk.
> >
> > Fix this by returning the multi-page bvec when testing gaps for merging.
> >
> > Cc: Jens Axboe <axboe@kernel.dk>
> > Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> > Cc: Pavel Begunkov <asml.silence@gmail.com>
> > Cc: Ming Lei <ming.lei@redhat.com>
> > Cc: Tejun Heo <tj@kernel.org>
> > Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> > Cc: Jeffle Xu <jefflexu@linux.alibaba.com>
> > Cc: linux-kernel@vger.kernel.org
> > Cc: stable@vger.kernel.org
> > Fixes: 07173c3ec276 ("block: enable multipage bvecs")
> > Signed-off-by: Long Li <longli@microsoft.com>
> > ---
> > Change from v1: add commit details on how data corruption happens
> >
> >  include/linux/bio.h | 11 ++++-------
> >  1 file changed, 4 insertions(+), 7 deletions(-)
> >
> > diff --git a/include/linux/bio.h b/include/linux/bio.h
> > index a0b4cfdf62a4..6b2f609ccfbf 100644
> > --- a/include/linux/bio.h
> > +++ b/include/linux/bio.h
> > @@ -44,9 +44,6 @@ static inline unsigned int bio_max_segs(unsigned int nr_segs)
> >  #define bio_offset(bio)		bio_iter_offset((bio), (bio)->bi_iter)
> >  #define bio_iovec(bio)		bio_iter_iovec((bio), (bio)->bi_iter)
> >
> > -#define bio_multiple_segments(bio)				\
> > -	((bio)->bi_iter.bi_size != bio_iovec(bio).bv_len)
> > -
> >  #define bvec_iter_sectors(iter)	((iter).bi_size >> 9)
> >  #define bvec_iter_end_sector(iter) ((iter).bi_sector + bvec_iter_sectors((iter)))
> >
> > @@ -271,7 +268,7 @@ static inline void bio_clear_flag(struct bio *bio, unsigned int bit)
> >
> >  static inline void bio_get_first_bvec(struct bio *bio, struct bio_vec *bv)
> >  {
> > -	*bv = bio_iovec(bio);
> > +	*bv = mp_bvec_iter_bvec(bio->bi_io_vec, bio->bi_iter);
> >  }
> >
> >  static inline void bio_get_last_bvec(struct bio *bio, struct bio_vec *bv)
> > @@ -279,10 +276,10 @@ static inline void bio_get_last_bvec(struct bio *bio, struct bio_vec *bv)
> >  	struct bvec_iter iter = bio->bi_iter;
> >  	int idx;
> >
> > -	if (unlikely(!bio_multiple_segments(bio))) {
> > -		*bv = bio_iovec(bio);
> > +	/* this bio has only one bvec */
> > +	*bv = mp_bvec_iter_bvec(bio->bi_io_vec, bio->bi_iter);
> > +	if (bv->bv_len == bio->bi_iter.bi_size)
> >  		return;
> 
> Nit: I'd move the comment a bit as the current placement confused me at
> first.  Also maybe use bio_get_first_bvec here to make it even more
> obvious:
> 
> 	bio_get_first_bvec(bio, bv);
> 	if (bv->bv_len == bio->bi_iter.bi_size)
> 		return;		/* this bio only has a single bvec */

Thanks! I'll send v3 with those changes.
