All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] dm: Don't install merge function if not needed
@ 2011-06-21 20:08 Mikulas Patocka
  2011-06-21 20:24 ` Alasdair G Kergon
  2011-06-21 21:05 ` [PATCH] dm: Don't install merge function if not needed Alasdair G Kergon
  0 siblings, 2 replies; 7+ messages in thread
From: Mikulas Patocka @ 2011-06-21 20:08 UTC (permalink / raw)
  To: Alasdair G. Kergon; +Cc: dm-devel

Hi

This is the patch that allows larger bios to snapshots and improve 
snapshot performance.

The logic and reason is explained in the patch header.

Note that providing a merge function on a snapshot target doesn't work, 
the target merge function doesn't know where the bio will go, thus it 
cannot call underlying merge function accurately. Guessing is not 
possible, because the merge function must be obeyed.

This patch may also improve CPU consumption a little bit by not providing 
a merge function on linear devices, where it is not needed.

Mikulas

---

dm: Don't install merge function if not needed

This patch changes dm to not install merge function when not needed.
Merge functions is installed when the table needs it. It is never
uninstalled --- uninstalling it is not thread-safe.


The reason for this change is this:


The specification for allowed bio size is this:
* a bio containing just one page is always allowed
* if the bio contains more pages, it must conform to queue limits and
  the merge function. The bio must not be larger than the size allowed
  by the queue's merge function.

The limit set by the "merge" function must be obeyed. If we don't obey
this limit, "md" driver doesn't process the bio and returns an error.


The snapshot target can provide its own merge function, but when this
merge function is called, it is unclear to which location the bio will go.
We would know where the bio will go in case of already reallocated chunk,
but in case of read or write to not-yet-reallocated chunk, it is impossible
to say where this chunk will be eventually reallocated.

"Guessing" where the bio will go is not allowed, because the guess will
eventually go wrong. Incorrect guess could allow too large bio to
be created. When such large bio is passed to "md" driver, the "md" driver
rejects it with an error.


Consequently --- if the snapshot "cow" device has a merge function,
we must not allow bios larger than a page to go to that snapshot.

Therefore, we could allow bios larger than a page and improve snapshot
performance by not setting a merge function for a "cow" device.

The "cow" device is device mapper device, it is usually composed of
one or more linear targets, these targets do not need merge function
if the underlying disk doesn't have a merge function.

This patch introduces this logic:
* the device mapper provides a merge function for its device
  if one of the underlying devices have a merge function OR
  if one of the targets have nonzero "split_io".

Consequently, if the "cow" device is a linear target and if the underlying disk
doesn't have a merge function, the "cow" device doesn't have a merge function
either. Thus, the snapshot target can allow bios larger than a page.

This patch (together with the previous patch to not copy on full chunk write)
improves performance when writing to ext2 filesystem created on a sparse device
with 8k chunk from 22MB/s (before the patch) to 40MB/s (after the patch).

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/md/dm-table.c |   36 ++++++++++++++++++++++++++++++++++++
 drivers/md/dm.c       |    6 ++----
 drivers/md/dm.h       |    3 +++
 3 files changed, 41 insertions(+), 4 deletions(-)

Index: linux-2.6.39-fast/drivers/md/dm-table.c
===================================================================
--- linux-2.6.39-fast.orig/drivers/md/dm-table.c	2011-06-21 21:18:48.000000000 +0200
+++ linux-2.6.39-fast/drivers/md/dm-table.c	2011-06-21 21:32:55.000000000 +0200
@@ -1152,6 +1152,39 @@ combine_limits:
 	return validate_hardware_logical_block_alignment(table, limits);
 }
 
+static int device_needs_merge(struct dm_target *ti, struct dm_dev *dev,
+			      sector_t start, sector_t len, void *data)
+{
+	struct block_device *bdev = dev->bdev;
+	struct request_queue *q = bdev_get_queue(bdev);
+
+	if (q->merge_bvec_fn)
+		return 1;
+
+	return 0;
+}
+
+static int dm_table_needs_merge(struct dm_table *t)
+{
+	unsigned i = 0;
+	while (i < dm_table_get_num_targets(t)) {
+		struct dm_target *ti;
+
+		ti = dm_table_get_target(t, i++);
+
+		if (ti->split_io)
+			return 1;
+
+		if (!ti->type->iterate_devices)
+			continue;
+
+		if (ti->type->iterate_devices(ti, device_needs_merge,
+					      NULL))
+			return 1;
+	}
+	return 0;
+}
+
 /*
  * Set the integrity profile for this device if all devices used have
  * matching profiles.  We're quite deep in the resume path but still
@@ -1185,6 +1218,9 @@ void dm_table_set_restrictions(struct dm
 	 */
 	q->limits = *limits;
 
+	if (dm_table_needs_merge(t))
+		blk_queue_merge_bvec(q, dm_merge_bvec);
+
 	if (!dm_table_supports_discards(t))
 		queue_flag_clear_unlocked(QUEUE_FLAG_DISCARD, q);
 	else
Index: linux-2.6.39-fast/drivers/md/dm.c
===================================================================
--- linux-2.6.39-fast.orig/drivers/md/dm.c	2011-06-21 21:17:05.000000000 +0200
+++ linux-2.6.39-fast/drivers/md/dm.c	2011-06-21 21:33:31.000000000 +0200
@@ -1320,9 +1320,8 @@ static void __split_and_process_bio(stru
  * CRUD END
  *---------------------------------------------------------------*/
 
-static int dm_merge_bvec(struct request_queue *q,
-			 struct bvec_merge_data *bvm,
-			 struct bio_vec *biovec)
+int dm_merge_bvec(struct request_queue *q, struct bvec_merge_data *bvm,
+		  struct bio_vec *biovec)
 {
 	struct mapped_device *md = q->queuedata;
 	struct dm_table *map = dm_get_live_table(md);
@@ -1799,7 +1798,6 @@ static void dm_init_md_queue(struct mapp
 	md->queue->backing_dev_info.congested_data = md;
 	blk_queue_make_request(md->queue, dm_request);
 	blk_queue_bounce_limit(md->queue, BLK_BOUNCE_ANY);
-	blk_queue_merge_bvec(md->queue, dm_merge_bvec);
 	blk_queue_flush(md->queue, REQ_FLUSH | REQ_FUA);
 }
 
Index: linux-2.6.39-fast/drivers/md/dm.h
===================================================================
--- linux-2.6.39-fast.orig/drivers/md/dm.h	2011-06-21 21:19:26.000000000 +0200
+++ linux-2.6.39-fast/drivers/md/dm.h	2011-06-21 21:22:03.000000000 +0200
@@ -41,6 +41,9 @@ struct dm_dev_internal {
 struct dm_table;
 struct dm_md_mempools;
 
+int dm_merge_bvec(struct request_queue *q, struct bvec_merge_data *bvm,
+		  struct bio_vec *biovec);
+
 /*-----------------------------------------------------------------
  * Internal table functions.
  *---------------------------------------------------------------*/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] dm: Don't install merge function if not needed
  2011-06-21 20:08 [PATCH] dm: Don't install merge function if not needed Mikulas Patocka
@ 2011-06-21 20:24 ` Alasdair G Kergon
  2011-06-21 22:23   ` Alasdair G Kergon
  2011-06-29  8:45   ` Fedora 15 dm-linear device lookup failed Stamper, Brian P. (ARC-D)[Stinger Ghaffarian Technologies Inc. (SGT Inc.)]
  2011-06-21 21:05 ` [PATCH] dm: Don't install merge function if not needed Alasdair G Kergon
  1 sibling, 2 replies; 7+ messages in thread
From: Alasdair G Kergon @ 2011-06-21 20:24 UTC (permalink / raw)
  To: device-mapper development

On Tue, Jun 21, 2011 at 04:08:52PM -0400, Mikulas Patocka wrote:
> possible, because the merge function must be obeyed.

The merge function is only a hint, albeit a strong hint - not a hard
requirement.  (Otherwise we're back to the ideas of block-layer
reservations/cancellations and limitations on suspend which we didn't
want to implement.)

If you are still seeing a problem with md in this respect, please could
you post the details for us to discuss?

Alasdair

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] dm: Don't install merge function if not needed
  2011-06-21 20:08 [PATCH] dm: Don't install merge function if not needed Mikulas Patocka
  2011-06-21 20:24 ` Alasdair G Kergon
@ 2011-06-21 21:05 ` Alasdair G Kergon
  1 sibling, 0 replies; 7+ messages in thread
From: Alasdair G Kergon @ 2011-06-21 21:05 UTC (permalink / raw)
  To: device-mapper development

On Tue, Jun 21, 2011 at 04:08:52PM -0400, Mikulas Patocka wrote:
> This patch may also improve CPU consumption a little bit by not providing 
> a merge function on linear devices, where it is not needed.

That would be a trade-off - dependent on the actual table construction.
The risk from not calling the merge function is that you'll necessitate
splitting (and consequent possible reordering).  Historically we found
it highly desirable to avoid splitting vs. an unnoticeable overhead from
calling the merge function.  Probably needs some numbers with different
sorts of tables and workloads to see where the trade-off might now lie.

Essentially my position is that the linear merge function:
  Improves *some* configurations and processing of specific bios noticeably;
  Has neglible overhead in the cases it doesn't improve.

So to counter that I'd like to see evidence of counter-examples,
e.g. It has noticeable overhead in some configurations that don't
benefit from it.

Then we can debate whether we can distinguish between these classes
automatically or make the change globally.

(BTW md linear should be included in these tests, as the same
considerations ought to apply - if there are solid grounds to make
a change, those grounds may apply to both md and dm.)

Alasdair

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] dm: Don't install merge function if not needed
  2011-06-21 20:24 ` Alasdair G Kergon
@ 2011-06-21 22:23   ` Alasdair G Kergon
  2011-06-29  8:45   ` Fedora 15 dm-linear device lookup failed Stamper, Brian P. (ARC-D)[Stinger Ghaffarian Technologies Inc. (SGT Inc.)]
  1 sibling, 0 replies; 7+ messages in thread
From: Alasdair G Kergon @ 2011-06-21 22:23 UTC (permalink / raw)
  To: device-mapper development; +Cc: Mikulas Patocka

On Tue, Jun 21, 2011 at 09:24:24PM +0100, Alasdair G Kergon wrote:
> If you are still seeing a problem with md in this respect, please could
> you post the details for us to discuss?
 
We've discussed this on IRC and indeed we think stacking of dm over md is still
broken with respect to reconfigurations while bios are being submitted.
Possible long-term solutions may involve changing the way the dm map
functions work (dealing with the overlap with the merge fns) and moving the
responsibility for splitting back into the core block layer, rather than
relying on dm and md to each deal with it.

Alasdair

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Fedora 15 dm-linear device lookup failed
  2011-06-21 20:24 ` Alasdair G Kergon
  2011-06-21 22:23   ` Alasdair G Kergon
@ 2011-06-29  8:45   ` Stamper, Brian P. (ARC-D)[Stinger Ghaffarian Technologies Inc. (SGT Inc.)]
  2011-06-29 19:47     ` Stamper, Brian P. (ARC-D)[Stinger Ghaffarian Technologies Inc. (SGT Inc.)]
  1 sibling, 1 reply; 7+ messages in thread
From: Stamper, Brian P. (ARC-D)[Stinger Ghaffarian Technologies Inc. (SGT Inc.)] @ 2011-06-29  8:45 UTC (permalink / raw)
  To: device-mapper development


[-- Attachment #1.1: Type: text/plain, Size: 1210 bytes --]

I just upgraded a Fedora system from 13 to 15, and now I can't get it to boot.  I'm getting dumped to a dracut shell and getting the error dm-linear device lookup failed.  In the dracut output with debug on, I see dracut scan my drives, and I see it find my 3 LVs, but it then tries to do something with them, I get the error, and it then fails out with root device not found and drops me into dracut.

Here's the odd thing, booting off the rescue dvd my volumes are fine.  The LVM can be mounted no problem.

This makes me think that there's something missing from the kernel or initrd.  I can't really figure out what though.  I've gone through the output of lsmod when running from the rescue dvd.  I've even build a few ramdisks including all the modules that I thought might be related (dm, lvm, mdraid, dmraid, multipath).  I tried building them with mkinitrd -with=module, and I tried again with dracut -add=modules.  No change in results.

I'm not sure where to go from here.  I feel like the system can be rescued, but I don't know how to work backward through dracut to find out why it's unable to find my root volume.  Any help would be appreciated.  Cross-posting to lvm.

Thanks,
-Brian

[-- Attachment #1.2: Type: text/html, Size: 1574 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Fedora 15 dm-linear device lookup failed
  2011-06-29  8:45   ` Fedora 15 dm-linear device lookup failed Stamper, Brian P. (ARC-D)[Stinger Ghaffarian Technologies Inc. (SGT Inc.)]
@ 2011-06-29 19:47     ` Stamper, Brian P. (ARC-D)[Stinger Ghaffarian Technologies Inc. (SGT Inc.)]
  2011-06-29 21:35       ` Gianluca Cecchi
  0 siblings, 1 reply; 7+ messages in thread
From: Stamper, Brian P. (ARC-D)[Stinger Ghaffarian Technologies Inc. (SGT Inc.)] @ 2011-06-29 19:47 UTC (permalink / raw)
  To: device-mapper development


[-- Attachment #1.1: Type: text/plain, Size: 3376 bytes --]


I was able to boot this system by taking an initramfs from another Fedora 15 test system and putting it in /boot and referencing it in grub.conf.  However, I need to find out why an initramfs I build on the system is unable to boot.

When I'm in dracut, I was able to do some checking.  Using the lvm commands, I was able to activate and deactivate the vg, display the lvs, but not use them.  I would get "device not found", and in /dev/mapper, the lvm devices were not there.  There were other lvm devices there, but not from my primary system lvm.  The only thing that stands out about it is that it is ext4, not ext3.

Comparing the decompiled initrds, there's no modules that stand out.  Sas.out is a listing of the WORKING initrd, tfs.out is the NOT WORKING initrd.

[root@testfs ramdisks]# diff sas.out tfs.out
718d717
< ./lib/firmware/ql2500_fw.bin
1248,1249d1246
< ./etc/ld.so.conf.d/openmotif-64.conf
< ./etc/ld.so.conf.d/kernel-2.6.38.7-30.fc15.x86_64.conf
1252c1249
< ./etc/ld.so.conf.d/kernel-2.6.34.8-68.fc13.x86_64.conf
---
> ./etc/ld.so.conf.d/qt-x86_64.conf
1262d1258
< ./etc/modprobe.d/openfwwf.conf
1278a1275
> ./etc/udev/rules.d/40-multipath.rules
1317a1315
> ./lib64/libcryptsetup.so.1.2.0
1322a1321
> ./lib64/libcap.so
1407d1405
< ./lib64/libcryptsetup.so.1.1.0
1419a1418
> ./lib64/libnss_files-2.12.2.so
1480a1480
> ./usr/bin/checkisomd5
1491d1490
< ./usr/lib64/libnss_files.so
1517a1517
> ./usr/share/plymouth/themes/default.plymouth

Nothing there looks relevant.  Libcryptsetup needs updating on the test system I pulled the initrd from.  The q-logic firmware is irrelevant.  There are a couple of older kernels on the test system.  Libcap.so is a dependency of legato backup software.  Libnss_files should have nothing to do with accessing an LVM.

So I'm stuck.  My system is up, but only if I build an initrd on another system and copy it over, meaning next time I update the kernel it will not boot.  Nothing's jumping out at me as obviously missing modules from the broken initrd.  Where else can I look?

-Brian




On 6/29/11 1:45 AM, "Brian Stamper" <brian.p.stamper@nasa.gov> wrote:

I just upgraded a Fedora system from 13 to 15, and now I can't get it to boot.  I'm getting dumped to a dracut shell and getting the error dm-linear device lookup failed.  In the dracut output with debug on, I see dracut scan my drives, and I see it find my 3 LVs, but it then tries to do something with them, I get the error, and it then fails out with root device not found and drops me into dracut.

Here's the odd thing, booting off the rescue dvd my volumes are fine.  The LVM can be mounted no problem.

This makes me think that there's something missing from the kernel or initrd.  I can't really figure out what though.  I've gone through the output of lsmod when running from the rescue dvd.  I've even build a few ramdisks including all the modules that I thought might be related (dm, lvm, mdraid, dmraid, multipath).  I tried building them with mkinitrd -with=module, and I tried again with dracut -add=modules.  No change in results.

I'm not sure where to go from here.  I feel like the system can be rescued, but I don't know how to work backward through dracut to find out why it's unable to find my root volume.  Any help would be appreciated.  Cross-posting to lvm.

Thanks,
-Brian

[-- Attachment #1.2: Type: text/html, Size: 4269 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Fedora 15 dm-linear device lookup failed
  2011-06-29 19:47     ` Stamper, Brian P. (ARC-D)[Stinger Ghaffarian Technologies Inc. (SGT Inc.)]
@ 2011-06-29 21:35       ` Gianluca Cecchi
  0 siblings, 0 replies; 7+ messages in thread
From: Gianluca Cecchi @ 2011-06-29 21:35 UTC (permalink / raw)
  To: device-mapper development

On Wed, Jun 29, 2011 at 9:47 PM, Stamper, Brian P. (ARC-D)[Stinger
Ghaffarian Technologies Inc. (SGT Inc.)]  wrote:
>
> I was able to boot this system by taking an initramfs from another Fedora 15
> test system and putting it in /boot and referencing it in grub.conf.
>  However, I need to find out why an initramfs I build on the system is
> unable to boot.
>

what are a diff between the "init" files under the root of your two
initrd files?
what are the difference between the dev directories contents of your
two initrd files

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2011-06-29 21:35 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-06-21 20:08 [PATCH] dm: Don't install merge function if not needed Mikulas Patocka
2011-06-21 20:24 ` Alasdair G Kergon
2011-06-21 22:23   ` Alasdair G Kergon
2011-06-29  8:45   ` Fedora 15 dm-linear device lookup failed Stamper, Brian P. (ARC-D)[Stinger Ghaffarian Technologies Inc. (SGT Inc.)]
2011-06-29 19:47     ` Stamper, Brian P. (ARC-D)[Stinger Ghaffarian Technologies Inc. (SGT Inc.)]
2011-06-29 21:35       ` Gianluca Cecchi
2011-06-21 21:05 ` [PATCH] dm: Don't install merge function if not needed Alasdair G Kergon

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.