* [PATCH 0/2] add flag for tracking bio allocation @ 2019-03-21 12:30 Johannes Thumshirn 2019-03-21 12:30 ` [PATCH 1/2] block: bio: ensure newly added bio flags don't override BVEC_POOL_IDX Johannes Thumshirn 2019-03-21 12:30 ` [PATCH 2/2] bio: introduce BIO_ALLOCED flag and check it in bio_free Johannes Thumshirn 0 siblings, 2 replies; 11+ messages in thread From: Johannes Thumshirn @ 2019-03-21 12:30 UTC (permalink / raw) To: Jens Axboe Cc: Hannes Reinecke, Bart Van Assche, Christoph Hellwig, Jan Kara, Linux Block Layer Mailinglist, Linux FSDEVEL Mailinglist, Johannes Thumshirn Hannes recently sent a patch in [1] to fix an Oops caused by __blkdev_direct_IO_simple() doing bio submissions from the stack and which ended up being freed bio bio_free(). As bio_free() expected a bio which was allocated by bio_alloc_bioset() it crashed. I've sent out a different aproach to tackling this problem and both Jens and Jan leaned to my solution, namely adding another bio flag tracking the allocation. As flag is the last available bit, add a compile time check as a safety net, so we're not accidentially overriding the high 3 bits of bi_flags, which are used for the BVEC_POOL_IDX(). Note, Jens also staged a patch in his io_uring-next branch taking the last flag. For this reason patch 2/2 might not be applied in this form, but 1/2 is still applicable then. [1] https://lore.kernel.org/linux-block/20190320081253.129688-1-hare@suse.de/ Johannes Thumshirn (2): block: bio: ensure newly added bio flags don't override BVEC_POOL_IDX bio: introduce BIO_ALLOCED flag and check it in bio_free block/bio.c | 12 ++++++++++++ include/linux/blk_types.h | 32 ++++++++++++++++++-------------- 2 files changed, 30 insertions(+), 14 deletions(-) -- 2.16.4 ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/2] block: bio: ensure newly added bio flags don't override BVEC_POOL_IDX 2019-03-21 12:30 [PATCH 0/2] add flag for tracking bio allocation Johannes Thumshirn @ 2019-03-21 12:30 ` Johannes Thumshirn 2019-03-21 13:18 ` Hannes Reinecke 2019-03-21 14:15 ` Jens Axboe 2019-03-21 12:30 ` [PATCH 2/2] bio: introduce BIO_ALLOCED flag and check it in bio_free Johannes Thumshirn 1 sibling, 2 replies; 11+ messages in thread From: Johannes Thumshirn @ 2019-03-21 12:30 UTC (permalink / raw) To: Jens Axboe Cc: Hannes Reinecke, Bart Van Assche, Christoph Hellwig, Jan Kara, Linux Block Layer Mailinglist, Linux FSDEVEL Mailinglist, Johannes Thumshirn With the introduction of BIO_ALLOCED we've used up all available bits in bio::bi_flags. Make sure no-one adds a new one and thus overrides the BVEC_POOL_IDX Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> --- block/bio.c | 8 ++++++++ include/linux/blk_types.h | 31 +++++++++++++++++-------------- 2 files changed, 25 insertions(+), 14 deletions(-) diff --git a/block/bio.c b/block/bio.c index 4db1008309ed..8c689aed46a0 100644 --- a/block/bio.c +++ b/block/bio.c @@ -2126,12 +2126,20 @@ static void __init biovec_init_slabs(void) } } +static inline void bio_flag_buildtime_check(void) +{ + BUILD_BUG_ON(BIO_FLAG_LAST > BVEC_POOL_OFFSET); +} + static int __init init_bio(void) { bio_slab_max = 2; bio_slab_nr = 0; bio_slabs = kcalloc(bio_slab_max, sizeof(struct bio_slab), GFP_KERNEL); + + bio_flag_buildtime_check(); + if (!bio_slabs) panic("bio: can't allocate bios\n"); diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index d66bf5f32610..0273bad71a96 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -215,20 +215,23 @@ struct bio { /* * bio flags */ -#define BIO_SEG_VALID 1 /* bi_phys_segments valid */ -#define BIO_CLONED 2 /* doesn't own data */ -#define BIO_BOUNCED 3 /* bio is a bounce bio */ -#define BIO_USER_MAPPED 4 /* contains user pages */ -#define BIO_NULL_MAPPED 5 /* contains invalid user pages */ -#define BIO_QUIET 6 /* Make BIO Quiet */ -#define BIO_CHAIN 7 /* chained bio, ->bi_remaining in effect */ -#define BIO_REFFED 8 /* bio has elevated ->bi_cnt */ -#define BIO_THROTTLED 9 /* This bio has already been subjected to - * throttling rules. Don't do it again. */ -#define BIO_TRACE_COMPLETION 10 /* bio_endio() should trace the final completion - * of this bio. */ -#define BIO_QUEUE_ENTERED 11 /* can use blk_queue_enter_live() */ -#define BIO_TRACKED 12 /* set if bio goes through the rq_qos path */ +enum { + BIO_SEG_VALID = 1, /* bi_phys_segments valid */ + BIO_CLONED = 2, /* doesn't own data */ + BIO_BOUNCED = 3, /* bio is a bounce bio */ + BIO_USER_MAPPED = 4, /* contains user pages */ + BIO_NULL_MAPPED = 5, /* contains invalid user pages */ + BIO_QUIET = 6, /* Make BIO Quiet */ + BIO_CHAIN = 7, /* chained bio, ->bi_remaining in effect */ + BIO_REFFED = 8, /* bio has elevated ->bi_cnt */ + BIO_THROTTLED = 9, /* This bio has already been subjected to + * throttling rules. Don't do it again. */ + BIO_TRACE_COMPLETION = 10, /* bio_endio() should trace the final completion + * of this bio. */ + BIO_QUEUE_ENTERED = 11, /* can use blk_queue_enter_live() */ + BIO_TRACKED = 12, /* set if bio goes through the rq_qos path */ + BIO_FLAG_LAST +}; /* See BVEC_POOL_OFFSET below before adding new flags */ -- 2.16.4 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] block: bio: ensure newly added bio flags don't override BVEC_POOL_IDX 2019-03-21 12:30 ` [PATCH 1/2] block: bio: ensure newly added bio flags don't override BVEC_POOL_IDX Johannes Thumshirn @ 2019-03-21 13:18 ` Hannes Reinecke 2019-03-21 13:22 ` Johannes Thumshirn 2019-03-21 14:15 ` Jens Axboe 1 sibling, 1 reply; 11+ messages in thread From: Hannes Reinecke @ 2019-03-21 13:18 UTC (permalink / raw) To: Johannes Thumshirn, Jens Axboe Cc: Bart Van Assche, Christoph Hellwig, Jan Kara, Linux Block Layer Mailinglist, Linux FSDEVEL Mailinglist On 3/21/19 1:30 PM, Johannes Thumshirn wrote: > With the introduction of BIO_ALLOCED we've used up all available bits in > bio::bi_flags. > > Make sure no-one adds a new one and thus overrides the BVEC_POOL_IDX > > Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> > --- > block/bio.c | 8 ++++++++ > include/linux/blk_types.h | 31 +++++++++++++++++-------------- > 2 files changed, 25 insertions(+), 14 deletions(-) > > diff --git a/block/bio.c b/block/bio.c > index 4db1008309ed..8c689aed46a0 100644 > --- a/block/bio.c > +++ b/block/bio.c > @@ -2126,12 +2126,20 @@ static void __init biovec_init_slabs(void) > } > } > > +static inline void bio_flag_buildtime_check(void) > +{ > + BUILD_BUG_ON(BIO_FLAG_LAST > BVEC_POOL_OFFSET); > +} > + > static int __init init_bio(void) > { > bio_slab_max = 2; > bio_slab_nr = 0; > bio_slabs = kcalloc(bio_slab_max, sizeof(struct bio_slab), > GFP_KERNEL); > + > + bio_flag_buildtime_check(); > + > if (!bio_slabs) > panic("bio: can't allocate bios\n"); > > diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h > index d66bf5f32610..0273bad71a96 100644 > --- a/include/linux/blk_types.h > +++ b/include/linux/blk_types.h > @@ -215,20 +215,23 @@ struct bio { > /* > * bio flags > */ > -#define BIO_SEG_VALID 1 /* bi_phys_segments valid */ > -#define BIO_CLONED 2 /* doesn't own data */ > -#define BIO_BOUNCED 3 /* bio is a bounce bio */ > -#define BIO_USER_MAPPED 4 /* contains user pages */ > -#define BIO_NULL_MAPPED 5 /* contains invalid user pages */ > -#define BIO_QUIET 6 /* Make BIO Quiet */ > -#define BIO_CHAIN 7 /* chained bio, ->bi_remaining in effect */ > -#define BIO_REFFED 8 /* bio has elevated ->bi_cnt */ > -#define BIO_THROTTLED 9 /* This bio has already been subjected to > - * throttling rules. Don't do it again. */ > -#define BIO_TRACE_COMPLETION 10 /* bio_endio() should trace the final completion > - * of this bio. */ > -#define BIO_QUEUE_ENTERED 11 /* can use blk_queue_enter_live() */ > -#define BIO_TRACKED 12 /* set if bio goes through the rq_qos path */ > +enum { > + BIO_SEG_VALID = 1, /* bi_phys_segments valid */ > + BIO_CLONED = 2, /* doesn't own data */ > + BIO_BOUNCED = 3, /* bio is a bounce bio */ > + BIO_USER_MAPPED = 4, /* contains user pages */ > + BIO_NULL_MAPPED = 5, /* contains invalid user pages */ > + BIO_QUIET = 6, /* Make BIO Quiet */ > + BIO_CHAIN = 7, /* chained bio, ->bi_remaining in effect */ > + BIO_REFFED = 8, /* bio has elevated ->bi_cnt */ > + BIO_THROTTLED = 9, /* This bio has already been subjected to > + * throttling rules. Don't do it again. */ > + BIO_TRACE_COMPLETION = 10, /* bio_endio() should trace the final completion > + * of this bio. */ > + BIO_QUEUE_ENTERED = 11, /* can use blk_queue_enter_live() */ > + BIO_TRACKED = 12, /* set if bio goes through the rq_qos path */ > + BIO_FLAG_LAST > +}; > > /* See BVEC_POOL_OFFSET below before adding new flags */ > > Where's the point of the enum if all values are initialized? Cheers, Hannes -- Dr. Hannes Reinecke Teamlead Storage & Networking hare@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] block: bio: ensure newly added bio flags don't override BVEC_POOL_IDX 2019-03-21 13:18 ` Hannes Reinecke @ 2019-03-21 13:22 ` Johannes Thumshirn 0 siblings, 0 replies; 11+ messages in thread From: Johannes Thumshirn @ 2019-03-21 13:22 UTC (permalink / raw) To: Hannes Reinecke, Jens Axboe Cc: Bart Van Assche, Christoph Hellwig, Jan Kara, Linux Block Layer Mailinglist, Linux FSDEVEL Mailinglist On 21/03/2019 14:18, Hannes Reinecke wrote: > Where's the point of the enum if all values are initialized? Enums start at 0 if not otherwise initialized and I wanted to keep the unused 0th bit as unused in this patch, for either the patch from Jens or 2/2 in this series. It's the only reason and it is not wrong to do and not the only place where we have it: jthumshirn@glass:linux (master)$ git grep -A 1 "enum {" | grep "= 0" | wc -l 2206 -- Johannes Thumshirn SUSE Labs Filesystems jthumshirn@suse.de +49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850 ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] block: bio: ensure newly added bio flags don't override BVEC_POOL_IDX 2019-03-21 12:30 ` [PATCH 1/2] block: bio: ensure newly added bio flags don't override BVEC_POOL_IDX Johannes Thumshirn 2019-03-21 13:18 ` Hannes Reinecke @ 2019-03-21 14:15 ` Jens Axboe 2019-03-21 14:21 ` Jens Axboe 1 sibling, 1 reply; 11+ messages in thread From: Jens Axboe @ 2019-03-21 14:15 UTC (permalink / raw) To: Johannes Thumshirn Cc: Hannes Reinecke, Bart Van Assche, Christoph Hellwig, Jan Kara, Linux Block Layer Mailinglist, Linux FSDEVEL Mailinglist On 3/21/19 6:30 AM, Johannes Thumshirn wrote: > With the introduction of BIO_ALLOCED we've used up all available bits in > bio::bi_flags. > > Make sure no-one adds a new one and thus overrides the BVEC_POOL_IDX Two minor things: > diff --git a/block/bio.c b/block/bio.c > index 4db1008309ed..8c689aed46a0 100644 > --- a/block/bio.c > +++ b/block/bio.c > @@ -2126,12 +2126,20 @@ static void __init biovec_init_slabs(void) > } > } > > +static inline void bio_flag_buildtime_check(void) > +{ > + BUILD_BUG_ON(BIO_FLAG_LAST > BVEC_POOL_OFFSET); > +} > + > static int __init init_bio(void) > { > bio_slab_max = 2; > bio_slab_nr = 0; > bio_slabs = kcalloc(bio_slab_max, sizeof(struct bio_slab), > GFP_KERNEL); > + > + bio_flag_buildtime_check(); > + Just add the BUILD_BUG_ON() here, don't put it in a function. > +enum { > + BIO_SEG_VALID = 1, /* bi_phys_segments valid */ > + BIO_CLONED = 2, /* doesn't own data */ > + BIO_BOUNCED = 3, /* bio is a bounce bio */ > + BIO_USER_MAPPED = 4, /* contains user pages */ > + BIO_NULL_MAPPED = 5, /* contains invalid user pages */ > + BIO_QUIET = 6, /* Make BIO Quiet */ > + BIO_CHAIN = 7, /* chained bio, ->bi_remaining in effect */ > + BIO_REFFED = 8, /* bio has elevated ->bi_cnt */ > + BIO_THROTTLED = 9, /* This bio has already been subjected to > + * throttling rules. Don't do it again. */ > + BIO_TRACE_COMPLETION = 10, /* bio_endio() should trace the final completion > + * of this bio. */ > + BIO_QUEUE_ENTERED = 11, /* can use blk_queue_enter_live() */ > + BIO_TRACKED = 12, /* set if bio goes through the rq_qos path */ > + BIO_FLAG_LAST > +}; Let's order this after the BIO_NO_PAGE_REF patch, so we can avoid having to explicitly initialize the values. You also haven't solved the issue of now having an extra bit, 2/2 uses the last bit which the other patch already took... -- Jens Axboe ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] block: bio: ensure newly added bio flags don't override BVEC_POOL_IDX 2019-03-21 14:15 ` Jens Axboe @ 2019-03-21 14:21 ` Jens Axboe 2019-03-21 14:23 ` Johannes Thumshirn 0 siblings, 1 reply; 11+ messages in thread From: Jens Axboe @ 2019-03-21 14:21 UTC (permalink / raw) To: Johannes Thumshirn Cc: Hannes Reinecke, Bart Van Assche, Christoph Hellwig, Jan Kara, Linux Block Layer Mailinglist, Linux FSDEVEL Mailinglist On 3/21/19 8:15 AM, Jens Axboe wrote: > You also haven't solved the issue of now having an extra bit, 2/2 uses > the last bit which the other patch already took... Here's one way - kill BIO_SEG_VALID. We should just use ->bi_phys_segments to tell if it's valid or not. This patch uses -1 to signify it's not. Totally untested... diff --git a/block/bio.c b/block/bio.c index 71a78d9fb8b7..4bc165e7ca43 100644 --- a/block/bio.c +++ b/block/bio.c @@ -280,6 +280,7 @@ void bio_init(struct bio *bio, struct bio_vec *table, unsigned short max_vecs) { memset(bio, 0, sizeof(*bio)); + bio->bi_phys_segments = -1; atomic_set(&bio->__bi_remaining, 1); atomic_set(&bio->__bi_cnt, 1); @@ -305,6 +306,7 @@ void bio_reset(struct bio *bio) bio_uninit(bio); memset(bio, 0, BIO_RESET_BYTES); + bio->bi_phys_segments = -1; bio->bi_flags = flags; atomic_set(&bio->__bi_remaining, 1); } @@ -573,7 +575,7 @@ EXPORT_SYMBOL(bio_put); int bio_phys_segments(struct request_queue *q, struct bio *bio) { - if (unlikely(!bio_flagged(bio, BIO_SEG_VALID))) + if (unlikely(bio->bi_phys_segments == -1)) blk_recount_segments(q, bio); return bio->bi_phys_segments; @@ -731,7 +733,7 @@ int bio_add_pc_page(struct request_queue *q, struct bio *bio, struct page /* If we may be able to merge these biovecs, force a recount */ if (bio->bi_vcnt > 1 && biovec_phys_mergeable(q, bvec - 1, bvec)) - bio_clear_flag(bio, BIO_SEG_VALID); + bio->bi_phys_segments = -1; done: return len; @@ -1913,10 +1915,8 @@ void bio_trim(struct bio *bio, int offset, int size) if (offset == 0 && size == bio->bi_iter.bi_size) return; - bio_clear_flag(bio, BIO_SEG_VALID); - + bio->bi_phys_segments = -1; bio_advance(bio, offset << 9); - bio->bi_iter.bi_size = size; if (bio_integrity(bio)) diff --git a/block/blk-merge.c b/block/blk-merge.c index 1c9d4f0f96ea..57cea2782a76 100644 --- a/block/blk-merge.c +++ b/block/blk-merge.c @@ -343,7 +343,6 @@ void blk_queue_split(struct request_queue *q, struct bio **bio) /* physical segments can be figured out during splitting */ res = split ? split : *bio; res->bi_phys_segments = nsegs; - bio_set_flag(res, BIO_SEG_VALID); if (split) { /* there isn't chance to merge the splitted bio */ @@ -440,8 +439,6 @@ void blk_recount_segments(struct request_queue *q, struct bio *bio) bio->bi_next = NULL; bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio); bio->bi_next = nxt; - - bio_set_flag(bio, BIO_SEG_VALID); } static int blk_phys_contig_segment(struct request_queue *q, struct bio *bio, @@ -651,9 +648,9 @@ int ll_back_merge_fn(struct request_queue *q, struct request *req, req_set_nomerge(q, req); return 0; } - if (!bio_flagged(req->biotail, BIO_SEG_VALID)) + if (req->biotail->bi_phys_segments == -1) blk_recount_segments(q, req->biotail); - if (!bio_flagged(bio, BIO_SEG_VALID)) + if (bio->bi_phys_segments == -1) blk_recount_segments(q, bio); return ll_new_hw_segment(q, req, bio); @@ -673,9 +670,9 @@ int ll_front_merge_fn(struct request_queue *q, struct request *req, req_set_nomerge(q, req); return 0; } - if (!bio_flagged(bio, BIO_SEG_VALID)) + if (bio->bi_phys_segments == -1) blk_recount_segments(q, bio); - if (!bio_flagged(req->bio, BIO_SEG_VALID)) + if (req->bio->bi_phys_segments == -1) blk_recount_segments(q, req->bio); return ll_new_hw_segment(q, req, bio); diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index c033bfcb209e..79eb54dcf0f9 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -5247,7 +5247,7 @@ static int raid5_read_one_chunk(struct mddev *mddev, struct bio *raid_bio) rcu_read_unlock(); raid_bio->bi_next = (void*)rdev; bio_set_dev(align_bi, rdev->bdev); - bio_clear_flag(align_bi, BIO_SEG_VALID); + align_bi->bi_phys_segments = -1; if (is_badblock(rdev, align_bi->bi_iter.bi_sector, bio_sectors(align_bi), diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index d66bf5f32610..472059e92071 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -215,7 +215,6 @@ struct bio { /* * bio flags */ -#define BIO_SEG_VALID 1 /* bi_phys_segments valid */ #define BIO_CLONED 2 /* doesn't own data */ #define BIO_BOUNCED 3 /* bio is a bounce bio */ #define BIO_USER_MAPPED 4 /* contains user pages */ -- Jens Axboe ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] block: bio: ensure newly added bio flags don't override BVEC_POOL_IDX 2019-03-21 14:21 ` Jens Axboe @ 2019-03-21 14:23 ` Johannes Thumshirn 2019-03-21 14:25 ` Jens Axboe 0 siblings, 1 reply; 11+ messages in thread From: Johannes Thumshirn @ 2019-03-21 14:23 UTC (permalink / raw) To: Jens Axboe Cc: Hannes Reinecke, Bart Van Assche, Christoph Hellwig, Jan Kara, Linux Block Layer Mailinglist, Linux FSDEVEL Mailinglist On 21/03/2019 15:21, Jens Axboe wrote: > On 3/21/19 8:15 AM, Jens Axboe wrote: >> You also haven't solved the issue of now having an extra bit, 2/2 uses >> the last bit which the other patch already took... > > Here's one way - kill BIO_SEG_VALID. We should just use > ->bi_phys_segments to tell if it's valid or not. This patch uses -1 to > signify it's not. > > Totally untested... That sounds like an idea, I'll trow some testing at it and report back. -- Johannes Thumshirn SUSE Labs Filesystems jthumshirn@suse.de +49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850 ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] block: bio: ensure newly added bio flags don't override BVEC_POOL_IDX 2019-03-21 14:23 ` Johannes Thumshirn @ 2019-03-21 14:25 ` Jens Axboe 2019-03-21 15:54 ` Hannes Reinecke 2019-03-22 10:24 ` Johannes Thumshirn 0 siblings, 2 replies; 11+ messages in thread From: Jens Axboe @ 2019-03-21 14:25 UTC (permalink / raw) To: Johannes Thumshirn Cc: Hannes Reinecke, Bart Van Assche, Christoph Hellwig, Jan Kara, Linux Block Layer Mailinglist, Linux FSDEVEL Mailinglist On 3/21/19 8:23 AM, Johannes Thumshirn wrote: > On 21/03/2019 15:21, Jens Axboe wrote: >> On 3/21/19 8:15 AM, Jens Axboe wrote: >>> You also haven't solved the issue of now having an extra bit, 2/2 uses >>> the last bit which the other patch already took... >> >> Here's one way - kill BIO_SEG_VALID. We should just use >> ->bi_phys_segments to tell if it's valid or not. This patch uses -1 to >> signify it's not. >> >> Totally untested... > > > That sounds like an idea, I'll trow some testing at it and report back. That'd be great, thanks. If we can do that as a prep patch for you, then you can just use an enum and not worry about initializing since I'll just shove BIO_PAGE_NO_REF at the end. And this is simpler than having to shift the masks around. -- Jens Axboe ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] block: bio: ensure newly added bio flags don't override BVEC_POOL_IDX 2019-03-21 14:25 ` Jens Axboe @ 2019-03-21 15:54 ` Hannes Reinecke 2019-03-22 10:24 ` Johannes Thumshirn 1 sibling, 0 replies; 11+ messages in thread From: Hannes Reinecke @ 2019-03-21 15:54 UTC (permalink / raw) To: Jens Axboe, Johannes Thumshirn Cc: Bart Van Assche, Christoph Hellwig, Jan Kara, Linux Block Layer Mailinglist, Linux FSDEVEL Mailinglist On 3/21/19 3:25 PM, Jens Axboe wrote: > On 3/21/19 8:23 AM, Johannes Thumshirn wrote: >> On 21/03/2019 15:21, Jens Axboe wrote: >>> On 3/21/19 8:15 AM, Jens Axboe wrote: >>>> You also haven't solved the issue of now having an extra bit, 2/2 uses >>>> the last bit which the other patch already took... >>> >>> Here's one way - kill BIO_SEG_VALID. We should just use >>> ->bi_phys_segments to tell if it's valid or not. This patch uses -1 to >>> signify it's not. >>> >>> Totally untested... >> >> >> That sounds like an idea, I'll trow some testing at it and report back. > > That'd be great, thanks. If we can do that as a prep patch for you, then > you can just use an enum and not worry about initializing since I'll just > shove BIO_PAGE_NO_REF at the end. > > And this is simpler than having to shift the masks around. > And it would also remove the possiblity that SEG_VALID and nr_phys_segments get out of sync. We still have some unresolved issues revolving around this problem... So yeah, I do like it, too. Cheers, Hannes -- Dr. Hannes Reinecke Teamlead Storage & Networking hare@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 21284 (AG Nürnberg) ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] block: bio: ensure newly added bio flags don't override BVEC_POOL_IDX 2019-03-21 14:25 ` Jens Axboe 2019-03-21 15:54 ` Hannes Reinecke @ 2019-03-22 10:24 ` Johannes Thumshirn 1 sibling, 0 replies; 11+ messages in thread From: Johannes Thumshirn @ 2019-03-22 10:24 UTC (permalink / raw) To: Jens Axboe Cc: Hannes Reinecke, Bart Van Assche, Christoph Hellwig, Jan Kara, Linux Block Layer Mailinglist, Linux FSDEVEL Mailinglist On 21/03/2019 15:25, Jens Axboe wrote: > On 3/21/19 8:23 AM, Johannes Thumshirn wrote: >> On 21/03/2019 15:21, Jens Axboe wrote: >>> On 3/21/19 8:15 AM, Jens Axboe wrote: >>>> You also haven't solved the issue of now having an extra bit, 2/2 uses >>>> the last bit which the other patch already took... >>> >>> Here's one way - kill BIO_SEG_VALID. We should just use >>> ->bi_phys_segments to tell if it's valid or not. This patch uses -1 to >>> signify it's not. >>> >>> Totally untested... >> >> >> That sounds like an idea, I'll trow some testing at it and report back. > > That'd be great, thanks. If we can do that as a prep patch for you, then > you can just use an enum and not worry about initializing since I'll just > shove BIO_PAGE_NO_REF at the end. > > And this is simpler than having to shift the masks around. > OK it survived xfstests on BTRFS and XFS(on zram) but blktests block/002 runs into this: [ 7.730469] WARNING: CPU: 1 PID: 283 at drivers/scsi/scsi_lib.c:1018 scsi_init_io+0x132/0x160 [scsi_mod] [ 7.732062] Modules linked in: scsi_debug(E+) scsi_mod(E) zram(E) [ 7.733094] CPU: 1 PID: 283 Comm: modprobe Tainted: G E 5.1.0-rc1-default+ #943 [ 7.735046] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fc-prebuilt.qemu-project.org 04/01/2014 [ 7.737504] RIP: 0010:scsi_init_io+0x132/0x160 [scsi_mod] [ 7.738408] Code: 39 c5 7c 47 48 8b 55 00 0f b7 92 78 04 00 00 39 d0 7f 36 4c 89 a3 10 01 00 00 41 89 44 24 08 31 c0 5b 5d 41 5c 41 5d 41 5e c3 <0f> 0b b8 0a 00 00 00 eb 98 0f 0b b8 09 00 00 00 eb 8f 0f 0b 41 be [ 7.741442] RSP: 0018:ffffb35a402eb6d8 EFLAGS: 00010246 [ 7.742296] RAX: 0000000000000002 RBX: ffff8d805f4d0118 RCX: 0000000000000024 [ 7.743451] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8d805f4d0118 [ 7.744637] RBP: ffff8d805f4d0000 R08: 0000000000000000 R09: ffff8d805f4d0150 [ 7.745819] R10: 0000000000000024 R11: ffffb35a402eb7b8 R12: ffff8d805f4d0000 [ 7.746977] R13: ffff8d8052fbd000 R14: ffff8d805f4d0118 R15: ffff8d805fbe4c00 [ 7.748132] FS: 00007ffa8a26e700(0000) GS:ffff8d805cb00000(0000) knlGS:0000000000000000 [ 7.749438] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 7.750366] CR2: 00007ffa899c39b2 CR3: 00000000137d6000 CR4: 00000000000006e0 [ 7.751520] Call Trace: [ 7.751940] scsi_queue_rq+0x72a/0x9a0 [scsi_mod] [ 7.753358] blk_mq_dispatch_rq_list+0x80/0x4b0 [ 7.754100] ? dd_dispatch_request+0x21/0x220 [ 7.754815] blk_mq_do_dispatch_sched+0x64/0xd0 [ 7.755575] blk_mq_sched_dispatch_requests+0xe9/0x160 [ 7.756417] __blk_mq_run_hw_queue+0x5f/0xf0 [ 7.757108] __blk_mq_delay_run_hw_queue+0x106/0x150 [ 7.757913] blk_mq_run_hw_queue+0x51/0x100 [ 7.758595] blk_mq_sched_insert_request+0xa2/0x130 [ 7.759383] blk_execute_rq+0x3b/0x80 [ 7.759981] __scsi_execute+0xb5/0x1e0 [scsi_mod] [ 7.760754] scsi_probe_and_add_lun+0x270/0xd60 [scsi_mod] [ 7.761641] ? scsi_alloc_target+0x274/0x340 [scsi_mod] [ 7.762487] __scsi_scan_target+0xcc/0x220 [scsi_mod] [ 7.763307] scsi_scan_channel.part.14+0x52/0x70 [scsi_mod] [ 7.764210] scsi_scan_host_selected+0xe3/0x190 [scsi_mod] [ 7.765098] ? __pm_runtime_resume+0x48/0x60 [ 7.765807] scsi_scan_host+0x16a/0x1b0 [scsi_mod] [ 7.766580] sdebug_driver_probe+0x159/0x2c0 [scsi_debug] [ 7.767442] ? driver_sysfs_add+0x71/0xd0 [ 7.768101] ? driver_allows_async_probing+0x50/0x50 [ 7.768893] really_probe+0x229/0x420 [ 7.769492] ? driver_allows_async_probing+0x50/0x50 [ 7.770287] driver_probe_device+0x115/0x130 [ 7.770976] ? driver_allows_async_probing+0x50/0x50 [ 7.771770] bus_for_each_drv+0x54/0x80 [ 7.772398] __device_attach+0xb2/0x130 [ 7.773016] bus_probe_device+0x8a/0xa0 [ 7.773639] device_add+0x496/0x670 [ 7.774209] sdebug_add_adapter+0xf1/0x1b0 [scsi_debug] [ 7.775049] scsi_debug_init+0x6e3/0x1000 [scsi_debug] [ 7.775888] ? 0xffffffffc011a000 [ 7.776430] do_one_initcall+0x4e/0x1d4 [ 7.777048] do_init_module+0x5a/0x21d [ 7.777652] load_module+0x1926/0x1f00 [ 7.778255] ? m_show+0x1c0/0x1c0 [ 7.778794] __do_sys_finit_module+0x83/0xc0 [ 7.779482] do_syscall_64+0x5b/0x180 [ 7.780079] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 7.780882] RIP: 0033:0x7ffa89959969 [ 7.781459] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ef 64 2b 00 f7 d8 64 89 01 48 [ 7.784402] RSP: 002b:00007ffca3d05938 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 7.785602] RAX: ffffffffffffffda RBX: 00000000010e6400 RCX: 00007ffa89959969 [ 7.786740] RDX: 0000000000000000 RSI: 000000000042ac19 RDI: 0000000000000004 [ 7.787858] RBP: 000000000042ac19 R08: 0000000000000000 R09: 00000000010e6310 [ 7.788985] R10: 0000000000000004 R11: 0000000000000246 R12: 0000000000040000 [ 7.790107] R13: 0000000000000000 R14: 00000000010e6530 R15: 0000000000000000 1013 blk_status_t scsi_init_io(struct scsi_cmnd *cmd) 1014 { 1015 struct request *rq = cmd->request; 1016 blk_status_t ret; 1017 1018 if (WARN_ON_ONCE(!blk_rq_nr_phys_segments(rq))) 1019 return BLK_STS_IOERR; Looking into it now. -- Johannes Thumshirn SUSE Labs Filesystems jthumshirn@suse.de +49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850 ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 2/2] bio: introduce BIO_ALLOCED flag and check it in bio_free 2019-03-21 12:30 [PATCH 0/2] add flag for tracking bio allocation Johannes Thumshirn 2019-03-21 12:30 ` [PATCH 1/2] block: bio: ensure newly added bio flags don't override BVEC_POOL_IDX Johannes Thumshirn @ 2019-03-21 12:30 ` Johannes Thumshirn 1 sibling, 0 replies; 11+ messages in thread From: Johannes Thumshirn @ 2019-03-21 12:30 UTC (permalink / raw) To: Jens Axboe Cc: Hannes Reinecke, Bart Van Assche, Christoph Hellwig, Jan Kara, Linux Block Layer Mailinglist, Linux FSDEVEL Mailinglist, Johannes Thumshirn When we're submitting a bio from stack and this ends up being split, we call bio_put(). bio_put() will eventually call bio_free() if the reference count drops to 0. But freeing the bio is wrong, as it was never allocated out of the bio's mempool. Flag each normally allocated bio as 'BIO_ALLOCATED' and skip freeing if the flag isn't set. Fixes: 189ce2b9dcc3 ("block: fast-path for small and simple direct I/O requests") Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> --- block/bio.c | 4 ++++ include/linux/blk_types.h | 1 + 2 files changed, 5 insertions(+) diff --git a/block/bio.c b/block/bio.c index 8c689aed46a0..3282479d511d 100644 --- a/block/bio.c +++ b/block/bio.c @@ -255,6 +255,9 @@ static void bio_free(struct bio *bio) bio_uninit(bio); + if (!bio_flagged(bio, BIO_ALLOCED)) + return; + if (bs) { bvec_free(&bs->bvec_pool, bio->bi_io_vec, BVEC_POOL_IDX(bio)); @@ -521,6 +524,7 @@ struct bio *bio_alloc_bioset(gfp_t gfp_mask, unsigned int nr_iovecs, bvl = bio->bi_inline_vecs; } + bio_set_flag(bio, BIO_ALLOCED); bio->bi_pool = bs; bio->bi_max_vecs = nr_iovecs; bio->bi_io_vec = bvl; diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index 0273bad71a96..d0c9d6fd6e71 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -216,6 +216,7 @@ struct bio { * bio flags */ enum { + BIO_ALLOCED = 0, /* bio allocated by bio_alloc_bioset */ BIO_SEG_VALID = 1, /* bi_phys_segments valid */ BIO_CLONED = 2, /* doesn't own data */ BIO_BOUNCED = 3, /* bio is a bounce bio */ -- 2.16.4 ^ permalink raw reply related [flat|nested] 11+ messages in thread
end of thread, other threads:[~2019-03-22 10:24 UTC | newest] Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-03-21 12:30 [PATCH 0/2] add flag for tracking bio allocation Johannes Thumshirn 2019-03-21 12:30 ` [PATCH 1/2] block: bio: ensure newly added bio flags don't override BVEC_POOL_IDX Johannes Thumshirn 2019-03-21 13:18 ` Hannes Reinecke 2019-03-21 13:22 ` Johannes Thumshirn 2019-03-21 14:15 ` Jens Axboe 2019-03-21 14:21 ` Jens Axboe 2019-03-21 14:23 ` Johannes Thumshirn 2019-03-21 14:25 ` Jens Axboe 2019-03-21 15:54 ` Hannes Reinecke 2019-03-22 10:24 ` Johannes Thumshirn 2019-03-21 12:30 ` [PATCH 2/2] bio: introduce BIO_ALLOCED flag and check it in bio_free Johannes Thumshirn
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).