* [PATCH v2 0/3] add flag for tracking bio allocation @ 2019-03-22 13:13 Johannes Thumshirn 2019-03-22 13:13 ` [PATCH v2 1/3] block: bio: kill BIO_SEG_VALID flag Johannes Thumshirn ` (2 more replies) 0 siblings, 3 replies; 18+ messages in thread From: Johannes Thumshirn @ 2019-03-22 13:13 UTC (permalink / raw) To: Jens Axboe Cc: Hannes Reinecke, Bart Van Assche, Christoph Hellwig, Jan Kara, Linux Block Layer Mailinglist, Linux FSDEVEL Mailinglist, Johannes Thumshirn Hannes recently sent a patch in [1] to fix an Oops caused by __blkdev_direct_IO_simple() doing bio submissions from the stack and which ended up being freed bio bio_free(). As bio_free() expected a bio which was allocated by bio_alloc_bioset() it crashed. I've sent out a different aproach to tackling this problem and both Jens and Jan leaned to my solution, namely adding another bio flag tracking the allocation. As we start to run out of flags, Jens has killed the BIO_SEG_VALID flag in a preparation patch and I added a compile time check as a safety net, so we're not accidentially overriding the high 3 bits of bi_flags, which are used for the BVEC_POOL_IDX(). This set has passed a full xfstests run on both BTRFS and XFS (against zram disks) and a full blktests run without introducing any regressions. [1] https://lore.kernel.org/linux-block/20190320081253.129688-1-hare@suse.de/ Jens Axboe (1): block: bio: kill BIO_SEG_VALID flag Johannes Thumshirn (2): block: bio: ensure newly added bio flags don't override BVEC_POOL_IDX block: bio: introduce BIO_ALLOCED flag and check it in bio_free block/bio.c | 17 ++++++++++++----- block/blk-merge.c | 11 ++++------- drivers/md/raid5.c | 2 +- include/linux/blk_types.h | 27 +++++++++++++++------------ 4 files changed, 32 insertions(+), 25 deletions(-) -- 2.16.4 ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v2 1/3] block: bio: kill BIO_SEG_VALID flag 2019-03-22 13:13 [PATCH v2 0/3] add flag for tracking bio allocation Johannes Thumshirn @ 2019-03-22 13:13 ` Johannes Thumshirn 2019-03-22 14:01 ` Christoph Hellwig ` (3 more replies) 2019-03-22 13:13 ` [PATCH v2 2/3] block: bio: ensure newly added bio flags don't override BVEC_POOL_IDX Johannes Thumshirn 2019-03-22 13:13 ` [PATCH v2 3/3] block: bio: introduce BIO_ALLOCED flag and check it in bio_free Johannes Thumshirn 2 siblings, 4 replies; 18+ messages in thread From: Johannes Thumshirn @ 2019-03-22 13:13 UTC (permalink / raw) To: Jens Axboe Cc: Hannes Reinecke, Bart Van Assche, Christoph Hellwig, Jan Kara, Linux Block Layer Mailinglist, Linux FSDEVEL Mailinglist, Jens Axboe, Johannes Thumshirn From: Jens Axboe <axboe@kernel.dk> Kill the BIO_SEG_VALID flag. We should just use ->bi_phys_segments to tell if it's valid or not. This patch uses -1 to signify it's not. Signed-off-by: Jens Axboe <axboe@kernel.dk> [ jth initialize pc bios with 1 phys segment and WARN if we're submitting bios with -1 segments ] Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> --- block/bio.c | 15 +++++++++------ block/blk-core.c | 1 + block/blk-merge.c | 13 ++++++------- drivers/md/raid5.c | 2 +- include/linux/blk_types.h | 1 - 5 files changed, 17 insertions(+), 15 deletions(-) diff --git a/block/bio.c b/block/bio.c index 71a78d9fb8b7..a79b2bbcffd0 100644 --- a/block/bio.c +++ b/block/bio.c @@ -280,6 +280,7 @@ void bio_init(struct bio *bio, struct bio_vec *table, unsigned short max_vecs) { memset(bio, 0, sizeof(*bio)); + bio->bi_phys_segments = -1; atomic_set(&bio->__bi_remaining, 1); atomic_set(&bio->__bi_cnt, 1); @@ -305,6 +306,7 @@ void bio_reset(struct bio *bio) bio_uninit(bio); memset(bio, 0, BIO_RESET_BYTES); + bio->bi_phys_segments = -1; bio->bi_flags = flags; atomic_set(&bio->__bi_remaining, 1); } @@ -573,7 +575,7 @@ EXPORT_SYMBOL(bio_put); int bio_phys_segments(struct request_queue *q, struct bio *bio) { - if (unlikely(!bio_flagged(bio, BIO_SEG_VALID))) + if (unlikely(bio->bi_phys_segments == -1)) blk_recount_segments(q, bio); return bio->bi_phys_segments; @@ -712,7 +714,10 @@ int bio_add_pc_page(struct request_queue *q, struct bio *bio, struct page bvec->bv_len = len; bvec->bv_offset = offset; bio->bi_vcnt++; - bio->bi_phys_segments++; + if (bio->bi_phys_segments == -1) + bio->bi_phys_segments = 1; + else + bio->bi_phys_segments++; bio->bi_iter.bi_size += len; /* @@ -731,7 +736,7 @@ int bio_add_pc_page(struct request_queue *q, struct bio *bio, struct page /* If we may be able to merge these biovecs, force a recount */ if (bio->bi_vcnt > 1 && biovec_phys_mergeable(q, bvec - 1, bvec)) - bio_clear_flag(bio, BIO_SEG_VALID); + bio->bi_phys_segments = -1; done: return len; @@ -1913,10 +1918,8 @@ void bio_trim(struct bio *bio, int offset, int size) if (offset == 0 && size == bio->bi_iter.bi_size) return; - bio_clear_flag(bio, BIO_SEG_VALID); - + bio->bi_phys_segments = -1; bio_advance(bio, offset << 9); - bio->bi_iter.bi_size = size; if (bio_integrity(bio)) diff --git a/block/blk-core.c b/block/blk-core.c index 4673ebe42255..53372a16dd7c 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1514,6 +1514,7 @@ void blk_rq_bio_prep(struct request_queue *q, struct request *rq, else if (bio_op(bio) == REQ_OP_DISCARD) rq->nr_phys_segments = 1; + WARN_ON(rq->nr_phys_segments == -1); rq->__data_len = bio->bi_iter.bi_size; rq->bio = rq->biotail = bio; diff --git a/block/blk-merge.c b/block/blk-merge.c index 1c9d4f0f96ea..16c2ae69c46b 100644 --- a/block/blk-merge.c +++ b/block/blk-merge.c @@ -343,7 +343,6 @@ void blk_queue_split(struct request_queue *q, struct bio **bio) /* physical segments can be figured out during splitting */ res = split ? split : *bio; res->bi_phys_segments = nsegs; - bio_set_flag(res, BIO_SEG_VALID); if (split) { /* there isn't chance to merge the splitted bio */ @@ -440,8 +439,6 @@ void blk_recount_segments(struct request_queue *q, struct bio *bio) bio->bi_next = NULL; bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio); bio->bi_next = nxt; - - bio_set_flag(bio, BIO_SEG_VALID); } static int blk_phys_contig_segment(struct request_queue *q, struct bio *bio, @@ -620,6 +617,8 @@ static inline int ll_new_hw_segment(struct request_queue *q, { int nr_phys_segs = bio_phys_segments(q, bio); + if (WARN_ON(nr_phys_segs == -1)) + nr_phys_segs = 0; if (req->nr_phys_segments + nr_phys_segs > queue_max_segments(q)) goto no_merge; @@ -651,9 +650,9 @@ int ll_back_merge_fn(struct request_queue *q, struct request *req, req_set_nomerge(q, req); return 0; } - if (!bio_flagged(req->biotail, BIO_SEG_VALID)) + if (req->biotail->bi_phys_segments == -1) blk_recount_segments(q, req->biotail); - if (!bio_flagged(bio, BIO_SEG_VALID)) + if (bio->bi_phys_segments == -1) blk_recount_segments(q, bio); return ll_new_hw_segment(q, req, bio); @@ -673,9 +672,9 @@ int ll_front_merge_fn(struct request_queue *q, struct request *req, req_set_nomerge(q, req); return 0; } - if (!bio_flagged(bio, BIO_SEG_VALID)) + if (bio->bi_phys_segments == -1) blk_recount_segments(q, bio); - if (!bio_flagged(req->bio, BIO_SEG_VALID)) + if (req->bio->bi_phys_segments == -1) blk_recount_segments(q, req->bio); return ll_new_hw_segment(q, req, bio); diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index c033bfcb209e..79eb54dcf0f9 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -5247,7 +5247,7 @@ static int raid5_read_one_chunk(struct mddev *mddev, struct bio *raid_bio) rcu_read_unlock(); raid_bio->bi_next = (void*)rdev; bio_set_dev(align_bi, rdev->bdev); - bio_clear_flag(align_bi, BIO_SEG_VALID); + align_bi->bi_phys_segments = -1; if (is_badblock(rdev, align_bi->bi_iter.bi_sector, bio_sectors(align_bi), diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index d66bf5f32610..472059e92071 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -215,7 +215,6 @@ struct bio { /* * bio flags */ -#define BIO_SEG_VALID 1 /* bi_phys_segments valid */ #define BIO_CLONED 2 /* doesn't own data */ #define BIO_BOUNCED 3 /* bio is a bounce bio */ #define BIO_USER_MAPPED 4 /* contains user pages */ -- 2.16.4 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v2 1/3] block: bio: kill BIO_SEG_VALID flag 2019-03-22 13:13 ` [PATCH v2 1/3] block: bio: kill BIO_SEG_VALID flag Johannes Thumshirn @ 2019-03-22 14:01 ` Christoph Hellwig 2019-03-25 8:02 ` Johannes Thumshirn 2019-03-22 14:06 ` Hannes Reinecke ` (2 subsequent siblings) 3 siblings, 1 reply; 18+ messages in thread From: Christoph Hellwig @ 2019-03-22 14:01 UTC (permalink / raw) To: Johannes Thumshirn Cc: Jens Axboe, Hannes Reinecke, Bart Van Assche, Christoph Hellwig, Jan Kara, Linux Block Layer Mailinglist, Linux FSDEVEL Mailinglist, Jens Axboe > + if (bio->bi_phys_segments == -1) > + bio->bi_phys_segments = 1; > + else > + bio->bi_phys_segments++; Minor nitpickm but I'd kust write this as: if (bio->bi_phys_segments == -1) bio->bi_phys_segments = 0; bio->bi_phys_segments++; Otherwise this looks fine: Reviewed-by: Christoph Hellwig <hch@lst.de> ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 1/3] block: bio: kill BIO_SEG_VALID flag 2019-03-22 14:01 ` Christoph Hellwig @ 2019-03-25 8:02 ` Johannes Thumshirn 0 siblings, 0 replies; 18+ messages in thread From: Johannes Thumshirn @ 2019-03-25 8:02 UTC (permalink / raw) To: Christoph Hellwig Cc: Jens Axboe, Hannes Reinecke, Bart Van Assche, Jan Kara, Linux Block Layer Mailinglist, Linux FSDEVEL Mailinglist, Jens Axboe On 22/03/2019 15:01, Christoph Hellwig wrote: >> + if (bio->bi_phys_segments == -1) >> + bio->bi_phys_segments = 1; >> + else >> + bio->bi_phys_segments++; > > Minor nitpickm but I'd kust write this as: > > if (bio->bi_phys_segments == -1) > bio->bi_phys_segments = 0; > bio->bi_phys_segments++; Yeah, that looks more obvious -- Johannes Thumshirn SUSE Labs Filesystems jthumshirn@suse.de +49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850 ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 1/3] block: bio: kill BIO_SEG_VALID flag 2019-03-22 13:13 ` [PATCH v2 1/3] block: bio: kill BIO_SEG_VALID flag Johannes Thumshirn 2019-03-22 14:01 ` Christoph Hellwig @ 2019-03-22 14:06 ` Hannes Reinecke 2019-03-22 22:00 ` Jens Axboe 2019-03-22 22:40 ` Ming Lei 3 siblings, 0 replies; 18+ messages in thread From: Hannes Reinecke @ 2019-03-22 14:06 UTC (permalink / raw) To: Johannes Thumshirn, Jens Axboe Cc: Bart Van Assche, Christoph Hellwig, Jan Kara, Linux Block Layer Mailinglist, Linux FSDEVEL Mailinglist, Jens Axboe On 3/22/19 2:13 PM, Johannes Thumshirn wrote: > From: Jens Axboe <axboe@kernel.dk> > > Kill the BIO_SEG_VALID flag. We should just use ->bi_phys_segments to tell > if it's valid or not. > > This patch uses -1 to signify it's not. > > Signed-off-by: Jens Axboe <axboe@kernel.dk> > [ jth initialize pc bios with 1 phys segment and WARN if we're submitting > bios with -1 segments ] > Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> > --- > block/bio.c | 15 +++++++++------ > block/blk-core.c | 1 + > block/blk-merge.c | 13 ++++++------- > drivers/md/raid5.c | 2 +- > include/linux/blk_types.h | 1 - > 5 files changed, 17 insertions(+), 15 deletions(-) > Reviewed-by: Hannes Reinecke <hare@suse.com> Cheers, Hannes -- Dr. Hannes Reinecke Teamlead Storage & Networking hare@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 21284 (AG Nürnberg) ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 1/3] block: bio: kill BIO_SEG_VALID flag 2019-03-22 13:13 ` [PATCH v2 1/3] block: bio: kill BIO_SEG_VALID flag Johannes Thumshirn 2019-03-22 14:01 ` Christoph Hellwig 2019-03-22 14:06 ` Hannes Reinecke @ 2019-03-22 22:00 ` Jens Axboe 2019-03-23 19:31 ` Jens Axboe 2019-03-22 22:40 ` Ming Lei 3 siblings, 1 reply; 18+ messages in thread From: Jens Axboe @ 2019-03-22 22:00 UTC (permalink / raw) To: Johannes Thumshirn Cc: Hannes Reinecke, Bart Van Assche, Christoph Hellwig, Jan Kara, Linux Block Layer Mailinglist, Linux FSDEVEL Mailinglist, Jens Axboe On 3/22/19 7:13 AM, Johannes Thumshirn wrote: > @@ -712,7 +714,10 @@ int bio_add_pc_page(struct request_queue *q, struct bio *bio, struct page > bvec->bv_len = len; > bvec->bv_offset = offset; > bio->bi_vcnt++; > - bio->bi_phys_segments++; > + if (bio->bi_phys_segments == -1) > + bio->bi_phys_segments = 1; > + else > + bio->bi_phys_segments++; > bio->bi_iter.bi_size += len; Echo Christophs suggestion here. > diff --git a/block/blk-core.c b/block/blk-core.c > index 4673ebe42255..53372a16dd7c 100644 > --- a/block/blk-core.c > +++ b/block/blk-core.c > @@ -1514,6 +1514,7 @@ void blk_rq_bio_prep(struct request_queue *q, struct request *rq, > else if (bio_op(bio) == REQ_OP_DISCARD) > rq->nr_phys_segments = 1; > > + WARN_ON(rq->nr_phys_segments == -1); > rq->__data_len = bio->bi_iter.bi_size; > rq->bio = rq->biotail = bio; Just make that: else rq->nr_phys_segments = 0; for the third case? -- Jens Axboe ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 1/3] block: bio: kill BIO_SEG_VALID flag 2019-03-22 22:00 ` Jens Axboe @ 2019-03-23 19:31 ` Jens Axboe 2019-03-25 13:32 ` Johannes Thumshirn 0 siblings, 1 reply; 18+ messages in thread From: Jens Axboe @ 2019-03-23 19:31 UTC (permalink / raw) To: Johannes Thumshirn Cc: Hannes Reinecke, Bart Van Assche, Christoph Hellwig, Jan Kara, Linux Block Layer Mailinglist, Linux FSDEVEL Mailinglist On 3/22/19 4:00 PM, Jens Axboe wrote: > On 3/22/19 7:13 AM, Johannes Thumshirn wrote: >> @@ -712,7 +714,10 @@ int bio_add_pc_page(struct request_queue *q, struct bio *bio, struct page >> bvec->bv_len = len; >> bvec->bv_offset = offset; >> bio->bi_vcnt++; >> - bio->bi_phys_segments++; >> + if (bio->bi_phys_segments == -1) >> + bio->bi_phys_segments = 1; >> + else >> + bio->bi_phys_segments++; >> bio->bi_iter.bi_size += len; > > Echo Christophs suggestion here. > >> diff --git a/block/blk-core.c b/block/blk-core.c >> index 4673ebe42255..53372a16dd7c 100644 >> --- a/block/blk-core.c >> +++ b/block/blk-core.c >> @@ -1514,6 +1514,7 @@ void blk_rq_bio_prep(struct request_queue *q, struct request *rq, >> else if (bio_op(bio) == REQ_OP_DISCARD) >> rq->nr_phys_segments = 1; >> >> + WARN_ON(rq->nr_phys_segments == -1); >> rq->__data_len = bio->bi_iter.bi_size; >> rq->bio = rq->biotail = bio; > > Just make that: > > else > rq->nr_phys_segments = 0; > > for the third case? Would be great if you could spin a v3 with the mentioned changes, against master since the BIO_NO_PAGE_REF change is now merged there. -- Jens Axboe ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 1/3] block: bio: kill BIO_SEG_VALID flag 2019-03-23 19:31 ` Jens Axboe @ 2019-03-25 13:32 ` Johannes Thumshirn 0 siblings, 0 replies; 18+ messages in thread From: Johannes Thumshirn @ 2019-03-25 13:32 UTC (permalink / raw) To: Jens Axboe Cc: Hannes Reinecke, Bart Van Assche, Christoph Hellwig, Jan Kara, Linux Block Layer Mailinglist, Linux FSDEVEL Mailinglist On 23/03/2019 20:31, Jens Axboe wrote: [...] > Would be great if you could spin a v3 with the mentioned changes, against > master since the BIO_NO_PAGE_REF change is now merged there. Done. I just need to figure out the blktests failure form the kbuild robot first. It takes some time to trigger > 100 runs of nvme/005 and I still don't understand what's happening and why. Byte, Johannes -- Johannes Thumshirn SUSE Labs Filesystems jthumshirn@suse.de +49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850 ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 1/3] block: bio: kill BIO_SEG_VALID flag 2019-03-22 13:13 ` [PATCH v2 1/3] block: bio: kill BIO_SEG_VALID flag Johannes Thumshirn ` (2 preceding siblings ...) 2019-03-22 22:00 ` Jens Axboe @ 2019-03-22 22:40 ` Ming Lei 3 siblings, 0 replies; 18+ messages in thread From: Ming Lei @ 2019-03-22 22:40 UTC (permalink / raw) To: Johannes Thumshirn Cc: Jens Axboe, Hannes Reinecke, Bart Van Assche, Christoph Hellwig, Jan Kara, Linux Block Layer Mailinglist, Linux FSDEVEL Mailinglist, Jens Axboe On Fri, Mar 22, 2019 at 9:14 PM Johannes Thumshirn <jthumshirn@suse.de> wrote: > > From: Jens Axboe <axboe@kernel.dk> > > Kill the BIO_SEG_VALID flag. We should just use ->bi_phys_segments to tell > if it's valid or not. > > This patch uses -1 to signify it's not. > > Signed-off-by: Jens Axboe <axboe@kernel.dk> > [ jth initialize pc bios with 1 phys segment and WARN if we're submitting > bios with -1 segments ] > Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> > --- > block/bio.c | 15 +++++++++------ > block/blk-core.c | 1 + > block/blk-merge.c | 13 ++++++------- > drivers/md/raid5.c | 2 +- > include/linux/blk_types.h | 1 - > 5 files changed, 17 insertions(+), 15 deletions(-) > > diff --git a/block/bio.c b/block/bio.c > index 71a78d9fb8b7..a79b2bbcffd0 100644 > --- a/block/bio.c > +++ b/block/bio.c > @@ -280,6 +280,7 @@ void bio_init(struct bio *bio, struct bio_vec *table, > unsigned short max_vecs) > { > memset(bio, 0, sizeof(*bio)); > + bio->bi_phys_segments = -1; > atomic_set(&bio->__bi_remaining, 1); > atomic_set(&bio->__bi_cnt, 1); > > @@ -305,6 +306,7 @@ void bio_reset(struct bio *bio) > bio_uninit(bio); > > memset(bio, 0, BIO_RESET_BYTES); > + bio->bi_phys_segments = -1; > bio->bi_flags = flags; > atomic_set(&bio->__bi_remaining, 1); > } > @@ -573,7 +575,7 @@ EXPORT_SYMBOL(bio_put); > > int bio_phys_segments(struct request_queue *q, struct bio *bio) > { > - if (unlikely(!bio_flagged(bio, BIO_SEG_VALID))) > + if (unlikely(bio->bi_phys_segments == -1)) > blk_recount_segments(q, bio); > > return bio->bi_phys_segments; > @@ -712,7 +714,10 @@ int bio_add_pc_page(struct request_queue *q, struct bio *bio, struct page > bvec->bv_len = len; > bvec->bv_offset = offset; > bio->bi_vcnt++; > - bio->bi_phys_segments++; > + if (bio->bi_phys_segments == -1) > + bio->bi_phys_segments = 1; > + else > + bio->bi_phys_segments++; > bio->bi_iter.bi_size += len; > > /* > @@ -731,7 +736,7 @@ int bio_add_pc_page(struct request_queue *q, struct bio *bio, struct page > > /* If we may be able to merge these biovecs, force a recount */ > if (bio->bi_vcnt > 1 && biovec_phys_mergeable(q, bvec - 1, bvec)) > - bio_clear_flag(bio, BIO_SEG_VALID); > + bio->bi_phys_segments = -1; > > done: > return len; > @@ -1913,10 +1918,8 @@ void bio_trim(struct bio *bio, int offset, int size) > if (offset == 0 && size == bio->bi_iter.bi_size) > return; > > - bio_clear_flag(bio, BIO_SEG_VALID); > - > + bio->bi_phys_segments = -1; > bio_advance(bio, offset << 9); > - > bio->bi_iter.bi_size = size; > > if (bio_integrity(bio)) > diff --git a/block/blk-core.c b/block/blk-core.c > index 4673ebe42255..53372a16dd7c 100644 > --- a/block/blk-core.c > +++ b/block/blk-core.c > @@ -1514,6 +1514,7 @@ void blk_rq_bio_prep(struct request_queue *q, struct request *rq, > else if (bio_op(bio) == REQ_OP_DISCARD) > rq->nr_phys_segments = 1; > > + WARN_ON(rq->nr_phys_segments == -1); > rq->__data_len = bio->bi_iter.bi_size; > rq->bio = rq->biotail = bio; > > diff --git a/block/blk-merge.c b/block/blk-merge.c > index 1c9d4f0f96ea..16c2ae69c46b 100644 > --- a/block/blk-merge.c > +++ b/block/blk-merge.c > @@ -343,7 +343,6 @@ void blk_queue_split(struct request_queue *q, struct bio **bio) > /* physical segments can be figured out during splitting */ > res = split ? split : *bio; > res->bi_phys_segments = nsegs; > - bio_set_flag(res, BIO_SEG_VALID); > > if (split) { > /* there isn't chance to merge the splitted bio */ > @@ -440,8 +439,6 @@ void blk_recount_segments(struct request_queue *q, struct bio *bio) > bio->bi_next = NULL; > bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio); > bio->bi_next = nxt; > - > - bio_set_flag(bio, BIO_SEG_VALID); > } > > static int blk_phys_contig_segment(struct request_queue *q, struct bio *bio, > @@ -620,6 +617,8 @@ static inline int ll_new_hw_segment(struct request_queue *q, > { > int nr_phys_segs = bio_phys_segments(q, bio); > > + if (WARN_ON(nr_phys_segs == -1)) > + nr_phys_segs = 0; > if (req->nr_phys_segments + nr_phys_segs > queue_max_segments(q)) > goto no_merge; > > @@ -651,9 +650,9 @@ int ll_back_merge_fn(struct request_queue *q, struct request *req, > req_set_nomerge(q, req); > return 0; > } > - if (!bio_flagged(req->biotail, BIO_SEG_VALID)) > + if (req->biotail->bi_phys_segments == -1) > blk_recount_segments(q, req->biotail); > - if (!bio_flagged(bio, BIO_SEG_VALID)) > + if (bio->bi_phys_segments == -1) > blk_recount_segments(q, bio); > > return ll_new_hw_segment(q, req, bio); > @@ -673,9 +672,9 @@ int ll_front_merge_fn(struct request_queue *q, struct request *req, > req_set_nomerge(q, req); > return 0; > } > - if (!bio_flagged(bio, BIO_SEG_VALID)) > + if (bio->bi_phys_segments == -1) > blk_recount_segments(q, bio); > - if (!bio_flagged(req->bio, BIO_SEG_VALID)) > + if (req->bio->bi_phys_segments == -1) > blk_recount_segments(q, req->bio); > > return ll_new_hw_segment(q, req, bio); > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c > index c033bfcb209e..79eb54dcf0f9 100644 > --- a/drivers/md/raid5.c > +++ b/drivers/md/raid5.c > @@ -5247,7 +5247,7 @@ static int raid5_read_one_chunk(struct mddev *mddev, struct bio *raid_bio) > rcu_read_unlock(); > raid_bio->bi_next = (void*)rdev; > bio_set_dev(align_bi, rdev->bdev); > - bio_clear_flag(align_bi, BIO_SEG_VALID); > + align_bi->bi_phys_segments = -1; > > if (is_badblock(rdev, align_bi->bi_iter.bi_sector, > bio_sectors(align_bi), > diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h > index d66bf5f32610..472059e92071 100644 > --- a/include/linux/blk_types.h > +++ b/include/linux/blk_types.h > @@ -215,7 +215,6 @@ struct bio { > /* > * bio flags > */ > -#define BIO_SEG_VALID 1 /* bi_phys_segments valid */ > #define BIO_CLONED 2 /* doesn't own data */ > #define BIO_BOUNCED 3 /* bio is a bounce bio */ > #define BIO_USER_MAPPED 4 /* contains user pages */ Some drivers set max segments as USHRT_MAX, will this patch break this kind of driver? Thanks, Ming Lei ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v2 2/3] block: bio: ensure newly added bio flags don't override BVEC_POOL_IDX 2019-03-22 13:13 [PATCH v2 0/3] add flag for tracking bio allocation Johannes Thumshirn 2019-03-22 13:13 ` [PATCH v2 1/3] block: bio: kill BIO_SEG_VALID flag Johannes Thumshirn @ 2019-03-22 13:13 ` Johannes Thumshirn 2019-03-22 14:01 ` Christoph Hellwig 2019-03-22 14:07 ` Hannes Reinecke 2019-03-22 13:13 ` [PATCH v2 3/3] block: bio: introduce BIO_ALLOCED flag and check it in bio_free Johannes Thumshirn 2 siblings, 2 replies; 18+ messages in thread From: Johannes Thumshirn @ 2019-03-22 13:13 UTC (permalink / raw) To: Jens Axboe Cc: Hannes Reinecke, Bart Van Assche, Christoph Hellwig, Jan Kara, Linux Block Layer Mailinglist, Linux FSDEVEL Mailinglist, Johannes Thumshirn With the introduction of BIO_ALLOCED we've used up all available bits in bio::bi_flags. Make sure no-one adds a new one and thus overrides the BVEC_POOL_IDX Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> --- block/bio.c | 3 +++ include/linux/blk_types.h | 25 ++++++++++++++----------- 2 files changed, 17 insertions(+), 11 deletions(-) diff --git a/block/bio.c b/block/bio.c index a79b2bbcffd0..87a515e93bee 100644 --- a/block/bio.c +++ b/block/bio.c @@ -2198,6 +2198,9 @@ static int __init init_bio(void) bio_slab_nr = 0; bio_slabs = kcalloc(bio_slab_max, sizeof(struct bio_slab), GFP_KERNEL); + + BUILD_BUG_ON(BIO_FLAG_LAST > BVEC_POOL_OFFSET); + if (!bio_slabs) panic("bio: can't allocate bios\n"); diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index 472059e92071..1d28992a20f0 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -215,19 +215,22 @@ struct bio { /* * bio flags */ -#define BIO_CLONED 2 /* doesn't own data */ -#define BIO_BOUNCED 3 /* bio is a bounce bio */ -#define BIO_USER_MAPPED 4 /* contains user pages */ -#define BIO_NULL_MAPPED 5 /* contains invalid user pages */ -#define BIO_QUIET 6 /* Make BIO Quiet */ -#define BIO_CHAIN 7 /* chained bio, ->bi_remaining in effect */ -#define BIO_REFFED 8 /* bio has elevated ->bi_cnt */ -#define BIO_THROTTLED 9 /* This bio has already been subjected to +enum { + BIO_CLONED, /* doesn't own data */ + BIO_BOUNCED, /* bio is a bounce bio */ + BIO_USER_MAPPED, /* contains user pages */ + BIO_NULL_MAPPED, /* contains invalid user pages */ + BIO_QUIET, /* Make BIO Quiet */ + BIO_CHAIN, /* chained bio, ->bi_remaining in effect */ + BIO_REFFED, /* bio has elevated ->bi_cnt */ + BIO_THROTTLED, /* This bio has already been subjected to * throttling rules. Don't do it again. */ -#define BIO_TRACE_COMPLETION 10 /* bio_endio() should trace the final completion + BIO_TRACE_COMPLETION, /* bio_endio() should trace the final completion * of this bio. */ -#define BIO_QUEUE_ENTERED 11 /* can use blk_queue_enter_live() */ -#define BIO_TRACKED 12 /* set if bio goes through the rq_qos path */ + BIO_QUEUE_ENTERED, /* can use blk_queue_enter_live() */ + BIO_TRACKED, /* set if bio goes through the rq_qos path */ + BIO_FLAG_LAST +}; /* See BVEC_POOL_OFFSET below before adding new flags */ -- 2.16.4 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v2 2/3] block: bio: ensure newly added bio flags don't override BVEC_POOL_IDX 2019-03-22 13:13 ` [PATCH v2 2/3] block: bio: ensure newly added bio flags don't override BVEC_POOL_IDX Johannes Thumshirn @ 2019-03-22 14:01 ` Christoph Hellwig 2019-03-22 14:07 ` Hannes Reinecke 1 sibling, 0 replies; 18+ messages in thread From: Christoph Hellwig @ 2019-03-22 14:01 UTC (permalink / raw) To: Johannes Thumshirn Cc: Jens Axboe, Hannes Reinecke, Bart Van Assche, Christoph Hellwig, Jan Kara, Linux Block Layer Mailinglist, Linux FSDEVEL Mailinglist Looks fine, Reviewed-by: Christoph Hellwig <hch@lst.de> ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 2/3] block: bio: ensure newly added bio flags don't override BVEC_POOL_IDX 2019-03-22 13:13 ` [PATCH v2 2/3] block: bio: ensure newly added bio flags don't override BVEC_POOL_IDX Johannes Thumshirn 2019-03-22 14:01 ` Christoph Hellwig @ 2019-03-22 14:07 ` Hannes Reinecke 1 sibling, 0 replies; 18+ messages in thread From: Hannes Reinecke @ 2019-03-22 14:07 UTC (permalink / raw) To: Johannes Thumshirn, Jens Axboe Cc: Bart Van Assche, Christoph Hellwig, Jan Kara, Linux Block Layer Mailinglist, Linux FSDEVEL Mailinglist On 3/22/19 2:13 PM, Johannes Thumshirn wrote: > With the introduction of BIO_ALLOCED we've used up all available bits in > bio::bi_flags. > > Make sure no-one adds a new one and thus overrides the BVEC_POOL_IDX > > Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> > --- > block/bio.c | 3 +++ > include/linux/blk_types.h | 25 ++++++++++++++----------- > 2 files changed, 17 insertions(+), 11 deletions(-) > Reviewed-by: Hannes Reinecke <hare@suse.com> Cheers, Hannes -- Dr. Hannes Reinecke Teamlead Storage & Networking hare@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 21284 (AG Nürnberg) ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v2 3/3] block: bio: introduce BIO_ALLOCED flag and check it in bio_free 2019-03-22 13:13 [PATCH v2 0/3] add flag for tracking bio allocation Johannes Thumshirn 2019-03-22 13:13 ` [PATCH v2 1/3] block: bio: kill BIO_SEG_VALID flag Johannes Thumshirn 2019-03-22 13:13 ` [PATCH v2 2/3] block: bio: ensure newly added bio flags don't override BVEC_POOL_IDX Johannes Thumshirn @ 2019-03-22 13:13 ` Johannes Thumshirn 2019-03-22 14:02 ` Christoph Hellwig 2019-03-22 14:10 ` Hannes Reinecke 2 siblings, 2 replies; 18+ messages in thread From: Johannes Thumshirn @ 2019-03-22 13:13 UTC (permalink / raw) To: Jens Axboe Cc: Hannes Reinecke, Bart Van Assche, Christoph Hellwig, Jan Kara, Linux Block Layer Mailinglist, Linux FSDEVEL Mailinglist, Johannes Thumshirn When we're submitting a bio from stack and this ends up being split, we call bio_put(). bio_put() will eventually call bio_free() if the reference count drops to 0. But freeing the bio is wrong, as it was never allocated out of the bio's mempool. Flag each normally allocated bio as 'BIO_ALLOCATED' and skip freeing if the flag isn't set. Fixes: 189ce2b9dcc3 ("block: fast-path for small and simple direct I/O requests") Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> --- block/bio.c | 4 ++++ include/linux/blk_types.h | 1 + 2 files changed, 5 insertions(+) diff --git a/block/bio.c b/block/bio.c index 87a515e93bee..ba6949f111b7 100644 --- a/block/bio.c +++ b/block/bio.c @@ -255,6 +255,9 @@ static void bio_free(struct bio *bio) bio_uninit(bio); + if (!bio_flagged(bio, BIO_ALLOCED)) + return; + if (bs) { bvec_free(&bs->bvec_pool, bio->bi_io_vec, BVEC_POOL_IDX(bio)); @@ -523,6 +526,7 @@ struct bio *bio_alloc_bioset(gfp_t gfp_mask, unsigned int nr_iovecs, bvl = bio->bi_inline_vecs; } + bio_set_flag(bio, BIO_ALLOCED); bio->bi_pool = bs; bio->bi_max_vecs = nr_iovecs; bio->bi_io_vec = bvl; diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index 1d28992a20f0..19d7402a9af3 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -229,6 +229,7 @@ enum { * of this bio. */ BIO_QUEUE_ENTERED, /* can use blk_queue_enter_live() */ BIO_TRACKED, /* set if bio goes through the rq_qos path */ + BIO_ALLOCED, /* bio allocated by bio_alloc_bioset */ BIO_FLAG_LAST }; -- 2.16.4 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v2 3/3] block: bio: introduce BIO_ALLOCED flag and check it in bio_free 2019-03-22 13:13 ` [PATCH v2 3/3] block: bio: introduce BIO_ALLOCED flag and check it in bio_free Johannes Thumshirn @ 2019-03-22 14:02 ` Christoph Hellwig 2019-03-22 14:05 ` Hannes Reinecke 2019-03-22 14:10 ` Hannes Reinecke 1 sibling, 1 reply; 18+ messages in thread From: Christoph Hellwig @ 2019-03-22 14:02 UTC (permalink / raw) To: Johannes Thumshirn Cc: Jens Axboe, Hannes Reinecke, Bart Van Assche, Christoph Hellwig, Jan Kara, Linux Block Layer Mailinglist, Linux FSDEVEL Mailinglist Looks fine, Reviewed-by: Christoph Hellwig <hch@lst.de> But how do you manage to get the tiny on-stack bios split? What kind of setup is this? ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 3/3] block: bio: introduce BIO_ALLOCED flag and check it in bio_free 2019-03-22 14:02 ` Christoph Hellwig @ 2019-03-22 14:05 ` Hannes Reinecke 2019-03-22 21:30 ` Keith Busch 0 siblings, 1 reply; 18+ messages in thread From: Hannes Reinecke @ 2019-03-22 14:05 UTC (permalink / raw) To: Christoph Hellwig, Johannes Thumshirn Cc: Jens Axboe, Bart Van Assche, Jan Kara, Linux Block Layer Mailinglist, Linux FSDEVEL Mailinglist On 3/22/19 3:02 PM, Christoph Hellwig wrote: > Looks fine, > > Reviewed-by: Christoph Hellwig <hch@lst.de> > > But how do you manage to get the tiny on-stack bios split? What kind > of setup is this? > It's not tiny if you send a 2M file via direct-io, _and_ have a non-zero MDTS setting... Cheers, Hannes -- Dr. Hannes Reinecke Teamlead Storage & Networking hare@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 21284 (AG Nürnberg) ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 3/3] block: bio: introduce BIO_ALLOCED flag and check it in bio_free 2019-03-22 14:05 ` Hannes Reinecke @ 2019-03-22 21:30 ` Keith Busch 2019-03-22 23:04 ` Ming Lei 0 siblings, 1 reply; 18+ messages in thread From: Keith Busch @ 2019-03-22 21:30 UTC (permalink / raw) To: Hannes Reinecke Cc: Christoph Hellwig, Johannes Thumshirn, Jens Axboe, Bart Van Assche, Jan Kara, Linux Block Layer Mailinglist, Linux FSDEVEL Mailinglist On Fri, Mar 22, 2019 at 03:05:46PM +0100, Hannes Reinecke wrote: > On 3/22/19 3:02 PM, Christoph Hellwig wrote: > > But how do you manage to get the tiny on-stack bios split? What kind > > of setup is this? > > > It's not tiny if you send a 2M file via direct-io, _and_ have a non-zero > MDTS setting... I see a larger request can create a bio chain (though I think 1M is the max that goes through _simple), but how does the stack bio get used in a bio_put()? The chained bios are allocated through a bio_set, and their bio_end_io() calls bio_put() on themselves through bio_chain_endio(), but that's it. The original's endio is never modified from blkdev_bio_end_io_simple(), so who's calling bio_put() on the on-stack bio? ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 3/3] block: bio: introduce BIO_ALLOCED flag and check it in bio_free 2019-03-22 21:30 ` Keith Busch @ 2019-03-22 23:04 ` Ming Lei 0 siblings, 0 replies; 18+ messages in thread From: Ming Lei @ 2019-03-22 23:04 UTC (permalink / raw) To: Keith Busch Cc: Hannes Reinecke, Christoph Hellwig, Johannes Thumshirn, Jens Axboe, Bart Van Assche, Jan Kara, Linux Block Layer Mailinglist, Linux FSDEVEL Mailinglist On Sat, Mar 23, 2019 at 5:30 AM Keith Busch <kbusch@kernel.org> wrote: > > On Fri, Mar 22, 2019 at 03:05:46PM +0100, Hannes Reinecke wrote: > > On 3/22/19 3:02 PM, Christoph Hellwig wrote: > > > But how do you manage to get the tiny on-stack bios split? What kind > > > of setup is this? > > > > > It's not tiny if you send a 2M file via direct-io, _and_ have a non-zero > > MDTS setting... > > I see a larger request can create a bio chain (though I think 1M > is the max that goes through _simple), but how does the stack bio multi-page bvec is merged now, 1M isn't the max size any more. > get used in a bio_put()? The chained bios are allocated through a > bio_set, and their bio_end_io() calls bio_put() on themselves through > bio_chain_endio(), but that's it. The original's endio is never modified > from blkdev_bio_end_io_simple(), so who's calling bio_put() on the > on-stack bio? I have the same concern, blkdev_bio_end_io_simple() doesn't release the bio, and this bio shouldn't be released until __blkdev_direct_IO_simple() returns. How can any underlying driver or block layer free such bio coming from upper layer? thanks, Ming Lei ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 3/3] block: bio: introduce BIO_ALLOCED flag and check it in bio_free 2019-03-22 13:13 ` [PATCH v2 3/3] block: bio: introduce BIO_ALLOCED flag and check it in bio_free Johannes Thumshirn 2019-03-22 14:02 ` Christoph Hellwig @ 2019-03-22 14:10 ` Hannes Reinecke 1 sibling, 0 replies; 18+ messages in thread From: Hannes Reinecke @ 2019-03-22 14:10 UTC (permalink / raw) To: Johannes Thumshirn, Jens Axboe Cc: Bart Van Assche, Christoph Hellwig, Jan Kara, Linux Block Layer Mailinglist, Linux FSDEVEL Mailinglist On 3/22/19 2:13 PM, Johannes Thumshirn wrote: > When we're submitting a bio from stack and this ends up being split, we > call bio_put(). bio_put() will eventually call bio_free() if the reference > count drops to 0. But freeing the bio is wrong, as it was never allocated > out of the bio's mempool. > > Flag each normally allocated bio as 'BIO_ALLOCATED' and skip freeing if the > flag isn't set. > > Fixes: 189ce2b9dcc3 ("block: fast-path for small and simple direct I/O requests") > Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> > --- > block/bio.c | 4 ++++ > include/linux/blk_types.h | 1 + > 2 files changed, 5 insertions(+) > Reviewed-by: Hannes Reinecke <hare@suse.com> Although I do question the 'Fixes' tag here; without the first two this patch would be pretty pointless, so I'd rather have all 3 marked with 'Fixes'. Cheers, Hannes -- Dr. Hannes Reinecke Teamlead Storage & Networking hare@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 21284 (AG Nürnberg) ^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2019-03-25 13:32 UTC | newest] Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-03-22 13:13 [PATCH v2 0/3] add flag for tracking bio allocation Johannes Thumshirn 2019-03-22 13:13 ` [PATCH v2 1/3] block: bio: kill BIO_SEG_VALID flag Johannes Thumshirn 2019-03-22 14:01 ` Christoph Hellwig 2019-03-25 8:02 ` Johannes Thumshirn 2019-03-22 14:06 ` Hannes Reinecke 2019-03-22 22:00 ` Jens Axboe 2019-03-23 19:31 ` Jens Axboe 2019-03-25 13:32 ` Johannes Thumshirn 2019-03-22 22:40 ` Ming Lei 2019-03-22 13:13 ` [PATCH v2 2/3] block: bio: ensure newly added bio flags don't override BVEC_POOL_IDX Johannes Thumshirn 2019-03-22 14:01 ` Christoph Hellwig 2019-03-22 14:07 ` Hannes Reinecke 2019-03-22 13:13 ` [PATCH v2 3/3] block: bio: introduce BIO_ALLOCED flag and check it in bio_free Johannes Thumshirn 2019-03-22 14:02 ` Christoph Hellwig 2019-03-22 14:05 ` Hannes Reinecke 2019-03-22 21:30 ` Keith Busch 2019-03-22 23:04 ` Ming Lei 2019-03-22 14:10 ` Hannes Reinecke
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).