* [PATCH] dm: max_segments=1 if merge_bvec_fn is not supported @ 2010-03-06 21:10 Lars Ellenberg 2010-03-08 5:33 ` Neil Brown 0 siblings, 1 reply; 9+ messages in thread From: Lars Ellenberg @ 2010-03-06 21:10 UTC (permalink / raw) To: Alasdair G Kergon Cc: device-mapper development, Mikulas Patocka, Neil Brown, jens.axboe, linux-kernel If the lower device exposes a merge_bvec_fn, dm_set_device_limits() restricts max_sectors to PAGE_SIZE "just to be safe". This is not sufficient, however. If someone uses bio_add_page() to add 8 disjunct 512 byte partial pages to a bio, it would succeed, but could still cross a border of whatever restrictions are below us (e.g. raid10 stripe boundary). An attempted bio_split() would not succeed, because bi_vcnt is 8. One example that triggered this frequently is the xen io layer. raid10_make_request bug: can't convert block across chunks or bigger than 64k 209265151 1 Signed-off-by: Lars <lars.ellenberg@linbit.com> --- Neil: you may want to double check linear.c and raid*.c for similar logic, even though it is unlikely that someone puts md raid6 on top of something exposing a merge_bvec_fn. This is not the first time this has been patched, btw. See https://bugzilla.redhat.com/show_bug.cgi?id=440093 and the patch by Mikulas: https://bugzilla.redhat.com/attachment.cgi?id=342638&action=diff --- drivers/md/dm-table.c | 12 ++++++++++-- 1 files changed, 10 insertions(+), 2 deletions(-) diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c index 4b22feb..c686ff4 100644 --- a/drivers/md/dm-table.c +++ b/drivers/md/dm-table.c @@ -515,14 +515,22 @@ int dm_set_device_limits(struct dm_target *ti, struct dm_dev *dev, /* * Check if merge fn is supported. - * If not we'll force DM to use PAGE_SIZE or + * If not we'll force DM to use single bio_vec of PAGE_SIZE or * smaller I/O, just to be safe. 
*/ - if (q->merge_bvec_fn && !ti->type->merge) + if (q->merge_bvec_fn && !ti->type->merge) { limits->max_sectors = min_not_zero(limits->max_sectors, (unsigned int) (PAGE_SIZE >> 9)); + /* Restricting max_sectors is not enough. + * If someone uses bio_add_page to add 8 disjunct 512 byte + * partial pages to a bio, it would succeed, + * but could still cross a border of whatever restrictions + * are below us (e.g. raid0 stripe boundary). An attempted + * bio_split() would not succeed, because bi_vcnt is 8. */ + limits->max_segments = 1; + } return 0; } EXPORT_SYMBOL_GPL(dm_set_device_limits); -- 1.6.3.3 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] dm: max_segments=1 if merge_bvec_fn is not supported 2010-03-06 21:10 [PATCH] dm: max_segments=1 if merge_bvec_fn is not supported Lars Ellenberg @ 2010-03-08 5:33 ` Neil Brown 2010-03-08 8:35 ` Mikulas Patocka 0 siblings, 1 reply; 9+ messages in thread From: Neil Brown @ 2010-03-08 5:33 UTC (permalink / raw) To: Lars Ellenberg Cc: Alasdair G Kergon, device-mapper development, Mikulas Patocka, jens.axboe, linux-kernel On Sat, 6 Mar 2010 22:10:13 +0100 Lars Ellenberg <lars.ellenberg@linbit.com> wrote: > If the lower device exposes a merge_bvec_fn, > dm_set_device_limits() restricts max_sectors > to PAGE_SIZE "just to be safe". > > This is not sufficient, however. > > If someone uses bio_add_page() to add 8 disjunct 512 byte partial > pages to a bio, it would succeed, but could still cross a border > of whatever restrictions are below us (e.g. raid10 stripe boundary). > An attempted bio_split() would not succeed, because bi_vcnt is 8. > > One example that triggered this frequently is the xen io layer. > > raid10_make_request bug: can't convert block across chunks or bigger than 64k 209265151 1 > > Signed-off-by: Lars <lars.ellenberg@linbit.com> > > --- > > Neil: you may want to double check linear.c and raid*.c for similar logic, > even though it is unlikely that someone puts md raid6 on top of something > exposing a merge_bvec_fn. > Unlikely, but by no means impossible. I have queued up an appropriate fix for md. Thanks! NeilBrown > This is not the first time this has been patched, btw. 
> See https://bugzilla.redhat.com/show_bug.cgi?id=440093 > and the patch by Mikulas: > https://bugzilla.redhat.com/attachment.cgi?id=342638&action=diff > > --- > drivers/md/dm-table.c | 12 ++++++++++-- > 1 files changed, 10 insertions(+), 2 deletions(-) > > diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c > index 4b22feb..c686ff4 100644 > --- a/drivers/md/dm-table.c > +++ b/drivers/md/dm-table.c > @@ -515,14 +515,22 @@ int dm_set_device_limits(struct dm_target *ti, struct dm_dev *dev, > > /* > * Check if merge fn is supported. > - * If not we'll force DM to use PAGE_SIZE or > + * If not we'll force DM to use single bio_vec of PAGE_SIZE or > * smaller I/O, just to be safe. > */ > > - if (q->merge_bvec_fn && !ti->type->merge) > + if (q->merge_bvec_fn && !ti->type->merge) { > limits->max_sectors = > min_not_zero(limits->max_sectors, > (unsigned int) (PAGE_SIZE >> 9)); > + /* Restricting max_sectors is not enough. > + * If someone uses bio_add_page to add 8 disjunct 512 byte > + * partial pages to a bio, it would succeed, > + * but could still cross a border of whatever restrictions > + * are below us (e.g. raid0 stripe boundary). An attempted > + * bio_split() would not succeed, because bi_vcnt is 8. */ > + limits->max_segments = 1; > + } > return 0; > } > EXPORT_SYMBOL_GPL(dm_set_device_limits); ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] dm: max_segments=1 if merge_bvec_fn is not supported 2010-03-08 5:33 ` Neil Brown @ 2010-03-08 8:35 ` Mikulas Patocka 2010-03-08 13:14 ` Lars Ellenberg 0 siblings, 1 reply; 9+ messages in thread From: Mikulas Patocka @ 2010-03-08 8:35 UTC (permalink / raw) To: Neil Brown Cc: Lars Ellenberg, Alasdair G Kergon, device-mapper development, jens.axboe, linux-kernel On Mon, 8 Mar 2010, Neil Brown wrote: > On Sat, 6 Mar 2010 22:10:13 +0100 > Lars Ellenberg <lars.ellenberg@linbit.com> wrote: > > > If the lower device exposes a merge_bvec_fn, > > dm_set_device_limits() restricts max_sectors > > to PAGE_SIZE "just to be safe". > > > > This is not sufficient, however. > > > > If someone uses bio_add_page() to add 8 disjunct 512 byte partial > > pages to a bio, it would succeed, but could still cross a border > > of whatever restrictions are below us (e.g. raid10 stripe boundary). > > An attempted bio_split() would not succeed, because bi_vcnt is 8. > > > > One example that triggered this frequently is the xen io layer. > > > > raid10_make_request bug: can't convert block across chunks or bigger than 64k 209265151 1 > > > > Signed-off-by: Lars <lars.ellenberg@linbit.com> > > > > --- > > > > Neil: you may want to double check linear.c and raid*.c for similar logic, > > even though it is unlikely that someone puts md raid6 on top of something > > exposing a merge_bvec_fn. > > > > Unlikely, but by no means impossible. I have queued up an appropriate fix > for md. > > Thanks! > > NeilBrown Hi That patch with limits->max_segments = 1; is wrong. It fixes this bug in some cases but not in others. The problem is, if someone attempts to create a bio with two vector entries, the first maps the last sector contained in some page and the second maps the first sector of the next physical page: it has one segment, it has size <= PAGE_SIZE, but it may still cross a raid stripe and the raid driver will reject it. > > This is not the first time this has been patched, btw. 
> > See https://bugzilla.redhat.com/show_bug.cgi?id=440093 > > and the patch by Mikulas: > > https://bugzilla.redhat.com/attachment.cgi?id=342638&action=diff Look at this patch; it is the proper way to fix it: create a merge_bvec_fn that rejects more than one biovec entry. Mikulas > > --- > > drivers/md/dm-table.c | 12 ++++++++++-- > > 1 files changed, 10 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c > > index 4b22feb..c686ff4 100644 > > --- a/drivers/md/dm-table.c > > +++ b/drivers/md/dm-table.c > > @@ -515,14 +515,22 @@ int dm_set_device_limits(struct dm_target *ti, struct dm_dev *dev, > > > > /* > > * Check if merge fn is supported. > > - * If not we'll force DM to use PAGE_SIZE or > > + * If not we'll force DM to use single bio_vec of PAGE_SIZE or > > * smaller I/O, just to be safe. > > */ > > > > - if (q->merge_bvec_fn && !ti->type->merge) > > + if (q->merge_bvec_fn && !ti->type->merge) { > > limits->max_sectors = > > min_not_zero(limits->max_sectors, > > (unsigned int) (PAGE_SIZE >> 9)); > > + /* Restricting max_sectors is not enough. > > + * If someone uses bio_add_page to add 8 disjunct 512 byte > > + * partial pages to a bio, it would succeed, > > + * but could still cross a border of whatever restrictions > > + * are below us (e.g. raid0 stripe boundary). An attempted > > + * bio_split() would not succeed, because bi_vcnt is 8. */ > > + limits->max_segments = 1; > > + } > > return 0; > > } > > EXPORT_SYMBOL_GPL(dm_set_device_limits); > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] dm: max_segments=1 if merge_bvec_fn is not supported 2010-03-08 8:35 ` Mikulas Patocka @ 2010-03-08 13:14 ` Lars Ellenberg 2010-03-18 18:48 ` Andrew Morton 2010-12-04 6:43 ` [PATCH] dm: check max_sectors in dm_merge_bvec (was: Re: dm: max_segments=1 if merge_bvec_fn is not supported) Mike Snitzer 0 siblings, 2 replies; 9+ messages in thread From: Lars Ellenberg @ 2010-03-08 13:14 UTC (permalink / raw) To: Mikulas Patocka Cc: Neil Brown, Alasdair G Kergon, device-mapper development, jens.axboe, linux-kernel On Mon, Mar 08, 2010 at 03:35:37AM -0500, Mikulas Patocka wrote: > Hi > > That patch with limits->max_segments = 1; is wrong. It fixes this bug > sometimes and sometimes not. > > The problem is, if someone attempts to create a bio with two vector > entries, the first maps the last sector contained in some page and the > second maps the first sector of the next physical page: it has one > segment, it has size <= PAGE_SIZE, but it still may cross raid stripe and > the raid driver will reject it. Now that you put it that way ;) You are right. My assumption that "single segment" was equivalent in practice to "single bvec" does not hold true in that case. Then, what about adding seg_boundary_mask restrictions as well? max_sectors = PAGE_SIZE >> 9; max_segments = 1; seg_boundary_mask = PAGE_SIZE -1; or some such. > > > This is not the first time this has been patched, btw. > > > See https://bugzilla.redhat.com/show_bug.cgi?id=440093 > > > and the patch by Mikulas: > > > https://bugzilla.redhat.com/attachment.cgi?id=342638&action=diff > > Look at this patch, it is the proper way how to fix it: create a > merge_bvec_fn that reject more than one biovec entry. If adding seg_boundary_mask is still not sufficient, let's merge that patch instead? Why was it dropped, or rather never merged? It became obsolete for dm-linear by 7bc3447b, but in general the bug is still there, or am I missing something? 
Lars ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] dm: max_segments=1 if merge_bvec_fn is not supported 2010-03-08 13:14 ` Lars Ellenberg @ 2010-03-18 18:48 ` Andrew Morton 2010-03-18 21:48 ` Neil Brown 2010-12-04 6:43 ` [PATCH] dm: check max_sectors in dm_merge_bvec (was: Re: dm: max_segments=1 if merge_bvec_fn is not supported) Mike Snitzer 1 sibling, 1 reply; 9+ messages in thread From: Andrew Morton @ 2010-03-18 18:48 UTC (permalink / raw) To: Lars Ellenberg Cc: Mikulas Patocka, Neil Brown, Alasdair G Kergon, device-mapper development, jens.axboe, linux-kernel On Mon, 8 Mar 2010 14:14:49 +0100 Lars Ellenberg <lars.ellenberg@linbit.com> wrote: > On Mon, Mar 08, 2010 at 03:35:37AM -0500, Mikulas Patocka wrote: > > Hi > > > > That patch with limits->max_segments = 1; is wrong. It fixes this bug > > sometimes and sometimes not. > > > > The problem is, if someone attempts to create a bio with two vector > > entries, the first maps the last sector contained in some page and the > > second maps the first sector of the next physical page: it has one > > segment, it has size <= PAGE_SIZE, but it still may cross raid stripe and > > the raid driver will reject it. > > Now that you put it that way ;) > You are right. > > My asumption that "single segment" was > equalvalent in practice with "single bvec" > does not hold true in that case. > > Then, what about adding seg_boundary_mask restrictions as well? > max_sectors = PAGE_SIZE >> 9; > max_segments = 1; > seg_boundary_mask = PAGE_SIZE -1; > or some such. > > > > > This is not the first time this has been patched, btw. > > > > See https://bugzilla.redhat.com/show_bug.cgi?id=440093 > > > > and the patch by Mikulas: > > > > https://bugzilla.redhat.com/attachment.cgi?id=342638&action=diff > > > > Look at this patch, it is the proper way how to fix it: create a > > merge_bvec_fn that reject more than one biovec entry. > > If adding seg_boundary_mask is still not sufficient, > lets merge that patch instead? 
> Why has it been dropped, respectively never been merged? > It became obsolete for dm-linear by 7bc3447b, > but in general the bug is still there, or am I missing something? > This all seemed to die. Does Neil's mystery patch fix all these issues? Neil, was that patch tagged for -stable backporting? Thanks. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] dm: max_segments=1 if merge_bvec_fn is not supported 2010-03-18 18:48 ` Andrew Morton @ 2010-03-18 21:48 ` Neil Brown 0 siblings, 0 replies; 9+ messages in thread From: Neil Brown @ 2010-03-18 21:48 UTC (permalink / raw) To: Andrew Morton Cc: Lars Ellenberg, Mikulas Patocka, Alasdair G Kergon, device-mapper development, jens.axboe, linux-kernel On Thu, 18 Mar 2010 11:48:58 -0700 Andrew Morton <akpm@linux-foundation.org> wrote: > On Mon, 8 Mar 2010 14:14:49 +0100 > Lars Ellenberg <lars.ellenberg@linbit.com> wrote: > > > On Mon, Mar 08, 2010 at 03:35:37AM -0500, Mikulas Patocka wrote: > > > Hi > > > > > > That patch with limits->max_segments = 1; is wrong. It fixes this bug > > > sometimes and sometimes not. > > > > > > The problem is, if someone attempts to create a bio with two vector > > > entries, the first maps the last sector contained in some page and the > > > second maps the first sector of the next physical page: it has one > > > segment, it has size <= PAGE_SIZE, but it still may cross raid stripe and > > > the raid driver will reject it. > > > > Now that you put it that way ;) > > You are right. > > > > My asumption that "single segment" was > > equalvalent in practice with "single bvec" > > does not hold true in that case. > > > > Then, what about adding seg_boundary_mask restrictions as well? > > max_sectors = PAGE_SIZE >> 9; > > max_segments = 1; > > seg_boundary_mask = PAGE_SIZE -1; > > or some such. > > > > > > > This is not the first time this has been patched, btw. > > > > > See https://bugzilla.redhat.com/show_bug.cgi?id=440093 > > > > > and the patch by Mikulas: > > > > > https://bugzilla.redhat.com/attachment.cgi?id=342638&action=diff > > > > > > Look at this patch, it is the proper way how to fix it: create a > > > merge_bvec_fn that reject more than one biovec entry. > > > > If adding seg_boundary_mask is still not sufficient, > > lets merge that patch instead? > > Why has it been dropped, respectively never been merged? 
> > It became obsolete for dm-linear by 7bc3447b, > > but in general the bug is still there, or am I missing something? > > > > This all seemed to die. Does Neil's mysterypatch fix all these issues? > > Neil, was that patch tagged for -stable backporting? The patch at the top of my 'for-linus' branch (which Linus doesn't seem to have pulled yet) fixes this for md and is tagged for -stable backporting. It just sets max_segments and seg_boundary_mask. There is no point setting max_sectors as well. I found that setting merge_bvec_fn, while a perfectly correct approach, was more cumbersome. My patch doesn't fix this for dm. I assume the dm developers will do that. NeilBrown ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] dm: check max_sectors in dm_merge_bvec (was: Re: dm: max_segments=1 if merge_bvec_fn is not supported) 2010-03-08 13:14 ` Lars Ellenberg 2010-03-18 18:48 ` Andrew Morton @ 2010-12-04 6:43 ` Mike Snitzer 2010-12-04 16:03 ` Lars Ellenberg 1 sibling, 1 reply; 9+ messages in thread From: Mike Snitzer @ 2010-12-04 6:43 UTC (permalink / raw) To: device-mapper development Cc: Mikulas Patocka, linux-kernel, Alasdair G Kergon, jaxboe, lars.ellenberg I'm late to this old thread but I stumbled across it while auditing the various dm-devel patchwork patches, e.g.: https://patchwork.kernel.org/patch/83666/ https://patchwork.kernel.org/patch/83932/ On Mon, Mar 08 2010 at 8:14am -0500, Lars Ellenberg <lars.ellenberg@linbit.com> wrote: > On Mon, Mar 08, 2010 at 03:35:37AM -0500, Mikulas Patocka wrote: > > Hi > > > > That patch with limits->max_segments = 1; is wrong. It fixes this bug > > sometimes and sometimes not. > > > > The problem is, if someone attempts to create a bio with two vector > > entries, the first maps the last sector contained in some page and the > > second maps the first sector of the next physical page: it has one > > segment, it has size <= PAGE_SIZE, but it still may cross raid stripe and > > the raid driver will reject it. > > Now that you put it that way ;) > You are right. > > My asumption that "single segment" was > equalvalent in practice with "single bvec" > does not hold true in that case. > > Then, what about adding seg_boundary_mask restrictions as well? > max_sectors = PAGE_SIZE >> 9; > max_segments = 1; > seg_boundary_mask = PAGE_SIZE -1; > or some such. > > > > > This is not the first time this has been patched, btw. > > > > See https://bugzilla.redhat.com/show_bug.cgi?id=440093 > > > > and the patch by Mikulas: > > > > https://bugzilla.redhat.com/attachment.cgi?id=342638&action=diff > > > > Look at this patch, it is the proper way how to fix it: create a > > merge_bvec_fn that reject more than one biovec entry. 
> > If adding seg_boundary_mask is still not sufficient, > lets merge that patch instead? > Why has it been dropped, respectively never been merged? > It became obsolete for dm-linear by 7bc3447b, > but in general the bug is still there, or am I missing something? No it _should_ be fixed in general given DM's dm_merge_bvec() _but_ I did uncover what I think is a subtle oversight in its implementation. Given dm_set_device_limits() sets q->limits->max_sectors, shouldn't dm_merge_bvec() be using queue_max_sectors rather than queue_max_hw_sectors? blk_queue_max_hw_sectors() establishes that max_hw_sectors is the hard limit and max_sectors the soft. But AFAICT no relation is maintained between the two over time (even though max_sectors <= max_hw_sectors _should_ be enforced; in practice there is no blk_queue_max_sectors setter that uniformly enforces as much). Anyway, I think we need the following patch: -- From: Mike Snitzer <snitzer@redhat.com> Subject: dm: check max_sectors in dm_merge_bvec dm_set_device_limits() will set q->limits->max_sectors to <= PAGE_SIZE if an underlying device has a merge_bvec_fn. Therefore, dm_merge_bvec() must use queue_max_sectors() rather than queue_max_hw_sectors() to check the appropriate limit. Signed-off-by: Mike Snitzer <snitzer@redhat.com> --- drivers/md/dm.c | 5 ++--- 1 files changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/md/dm.c b/drivers/md/dm.c index 7cb1352..e83dcc8 100644 --- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -1358,12 +1358,11 @@ static int dm_merge_bvec(struct request_queue *q, /* * If the target doesn't support merge method and some of the devices * provided their merge_bvec method (we know this by looking at - * queue_max_hw_sectors), then we can't allow bios with multiple vector + * queue_max_sectors), then we can't allow bios with multiple vector * entries. So always set max_size to 0, and the code below allows * just one page. 
*/ - else if (queue_max_hw_sectors(q) <= PAGE_SIZE >> 9) - + else if (queue_max_sectors(q) <= PAGE_SIZE >> 9) max_size = 0; out_table: ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] dm: check max_sectors in dm_merge_bvec (was: Re: dm: max_segments=1 if merge_bvec_fn is not supported) 2010-12-04 6:43 ` [PATCH] dm: check max_sectors in dm_merge_bvec (was: Re: dm: max_segments=1 if merge_bvec_fn is not supported) Mike Snitzer @ 2010-12-04 16:03 ` Lars Ellenberg 2010-12-04 19:21 ` Mike Snitzer 0 siblings, 1 reply; 9+ messages in thread From: Lars Ellenberg @ 2010-12-04 16:03 UTC (permalink / raw) To: Mike Snitzer Cc: device-mapper development, Mikulas Patocka, linux-kernel, Alasdair G Kergon, jaxboe On Sat, Dec 04, 2010 at 01:43:08AM -0500, Mike Snitzer wrote: > I'm late to this old thread but I stumbled across it while auditing the > various dm-devel patchwork patches, e.g.: > https://patchwork.kernel.org/patch/83666/ > https://patchwork.kernel.org/patch/83932/ > > On Mon, Mar 08 2010 at 8:14am -0500, > Lars Ellenberg <lars.ellenberg@linbit.com> wrote: > > > On Mon, Mar 08, 2010 at 03:35:37AM -0500, Mikulas Patocka wrote: > > > Hi > > > > > > That patch with limits->max_segments = 1; is wrong. It fixes this bug > > > sometimes and sometimes not. > > > > > > The problem is, if someone attempts to create a bio with two vector > > > entries, the first maps the last sector contained in some page and the > > > second maps the first sector of the next physical page: it has one > > > segment, it has size <= PAGE_SIZE, but it still may cross raid stripe and > > > the raid driver will reject it. > > > > Now that you put it that way ;) > > You are right. > > > > My asumption that "single segment" was > > equalvalent in practice with "single bvec" > > does not hold true in that case. > > > > Then, what about adding seg_boundary_mask restrictions as well? > > max_sectors = PAGE_SIZE >> 9; > > max_segments = 1; > > seg_boundary_mask = PAGE_SIZE -1; > > or some such. > > > > > > > This is not the first time this has been patched, btw. 
> > > > > See https://bugzilla.redhat.com/show_bug.cgi?id=440093 > > > > > and the patch by Mikulas: > > > > > https://bugzilla.redhat.com/attachment.cgi?id=342638&action=diff > > > > > > Look at this patch, it is the proper way how to fix it: create a > > > merge_bvec_fn that reject more than one biovec entry. > > > > If adding seg_boundary_mask is still not sufficient, > > lets merge that patch instead? > > Why has it been dropped, respectively never been merged? > > It became obsolete for dm-linear by 7bc3447b, > > but in general the bug is still there, or am I missing something? > > No it _should_ be fixed in general given DM's dm_merge_bvec() _but_ I > did uncover what I think is a subtle oversight in its implementation. > > Given dm_set_device_limits() sets q->limits->max_sectors, > shouldn't dm_merge_bvec() be using queue_max_sectors rather than > queue_max_hw_sectors? > > blk_queue_max_hw_sectors() establishes that max_hw_sectors is the hard > limit and max_sectors the soft. But AFAICT no relation is maintained > between the two over time (even though max_sectors <= max_hw_sectors > _should_ be enforced; in practice there is no blk_queue_max_sectors > setter that uniformly enforces as much). Just for the record, in case someone finds this in the archives, and wants to backport or base his own work on this: A long time ago, there was no .max_hw_sectors. Then max_hw_sectors got introduced, but without an accessor function. Before 2.6.31, there was no blk_queue_max_hw_sectors(), only blk_queue_max_sectors(), which set both. 2.6.31 introduced a blk_queue_max_hw_sectors(), which _only_ set max_hw_sectors, and enforced a lower limit of BLK_DEF_MAX_SECTORS, so using it alone, you could not actually set limits lower than 512 kB. With 2.6.31 to 2.6.33, inclusive, you still need to use blk_queue_max_sectors() to set your limits. 
2.6.34 finally dropped the newly introduced function again, but renamed the other, so starting with 2.6.34 you need to use blk_queue_max_hw_sectors(), which now basically has the function body blk_queue_max_sectors() had up until 2.6.33. > > dm_set_device_limits() will set q->limits->max_sectors to <= PAGE_SIZE > > if an underlying device has a merge_bvec_fn. Therefore, dm_merge_bvec() > > must use queue_max_sectors() rather than queue_max_hw_sectors() to check > > the appropriate limit. IMO, you should not do this. max_sectors is a user tunable, capped by max_hw_sectors. max_hw_sectors is the driver limit. Please set max_hw_sectors in dm_set_device_limits instead. BTW, e.g. O_DIRECT will adhere to max_hw_sectors, but happily ignore max_sectors, I think. > Signed-off-by: Mike Snitzer <snitzer@redhat.com> > --- > drivers/md/dm.c | 5 ++--- > 1 files changed, 2 insertions(+), 3 deletions(-) > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c > index 7cb1352..e83dcc8 100644 > --- a/drivers/md/dm.c > +++ b/drivers/md/dm.c > @@ -1358,12 +1358,11 @@ static int dm_merge_bvec(struct request_queue *q, > /* > * If the target doesn't support merge method and some of the devices > * provided their merge_bvec method (we know this by looking at > - * queue_max_hw_sectors), then we can't allow bios with multiple vector > + * queue_max_sectors), then we can't allow bios with multiple vector > * entries. So always set max_size to 0, and the code below allows > * just one page. > */ > - else if (queue_max_hw_sectors(q) <= PAGE_SIZE >> 9) > - > + else if (queue_max_sectors(q) <= PAGE_SIZE >> 9) > max_size = 0; > > out_table: Lars ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: dm: check max_sectors in dm_merge_bvec (was: Re: dm: max_segments=1 if merge_bvec_fn is not supported) 2010-12-04 16:03 ` Lars Ellenberg @ 2010-12-04 19:21 ` Mike Snitzer 0 siblings, 0 replies; 9+ messages in thread From: Mike Snitzer @ 2010-12-04 19:21 UTC (permalink / raw) To: Lars Ellenberg Cc: device-mapper development, Mikulas Patocka, linux-kernel, Alasdair G Kergon, jaxboe On Sat, Dec 04 2010 at 11:03am -0500, Lars Ellenberg <lars.ellenberg@linbit.com> wrote: > On Sat, Dec 04, 2010 at 01:43:08AM -0500, Mike Snitzer wrote: > > > Given dm_set_device_limits() sets q->limits->max_sectors, > > shouldn't dm_merge_bvec() be using queue_max_sectors rather than > > queue_max_hw_sectors? > > > > blk_queue_max_hw_sectors() establishes that max_hw_sectors is the hard > > limit and max_sectors the soft. But AFAICT no relation is maintained > > between the two over time (even though max_sectors <= max_hw_sectors > > _should_ be enforced; in practice there is no blk_queue_max_sectors > > setter that uniformly enforces as much). > > Just for the record, in case someone finds this in the archives, > and wants to backport or base his own work on this: > > A long time ago, there was no .max_hw_sectors. Then max_hw_sectors got > introduced, but without accessor function. > > Before 2.6.31, there was no blk_queue_max_hw_sectors(), > only blk_queue_max_sectors(), which set both. > > 2.6.31 introduced some blk_queue_max_hw_sectors(), which _only_ set > max_hw_sectors, and enforced a lower limit of BLK_DEF_MAX_SECTORS, so > using that only, you have not been able to actually set lower limits > than 512 kB. With 2.6.31 to 2.6.33, inclusive, you still need to use > blk_queue_max_sectors() to set your limits. > > 2.6.34 finally dropped the newly introduced function again, > but renamed the other, so starting with 2.6.34 you need to use > blk_queue_max_hw_sectors(), which now basically has the function body > blk_queue_max_sectors() had up until 2.6.33. 
> > > dm_set_device_limits() will set q->limits->max_sectors to <= PAGE_SIZE > > if an underlying device has a merge_bvec_fn. Therefore, dm_merge_bvec() > > must use queue_max_sectors() rather than queue_max_hw_sectors() to check > > the appropriate limit. > > IMO, you should not do this. > max_sectors is a user tunable, capped by max_hw_sectors. > max_hw_sectors is the driver limit. > > Please set max_hw_sectors in dm_set_device_limits instead. Right, good point.. will do (unless I happen upon a reason not to or someone else shouts). Thanks, Mike ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2010-12-04 19:21 UTC | newest] Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2010-03-06 21:10 [PATCH] dm: max_segments=1 if merge_bvec_fn is not supported Lars Ellenberg 2010-03-08 5:33 ` Neil Brown 2010-03-08 8:35 ` Mikulas Patocka 2010-03-08 13:14 ` Lars Ellenberg 2010-03-18 18:48 ` Andrew Morton 2010-03-18 21:48 ` Neil Brown 2010-12-04 6:43 ` [PATCH] dm: check max_sectors in dm_merge_bvec (was: Re: dm: max_segments=1 if merge_bvec_fn is not supported) Mike Snitzer 2010-12-04 16:03 ` Lars Ellenberg 2010-12-04 19:21 ` Mike Snitzer