All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] Fix SMP races in dm-crypt
@ 2007-03-21  7:06 Olaf Kirch
  2007-03-21 10:03 ` Alasdair G Kergon
  2007-03-22 20:28 ` Christophe Saout
  0 siblings, 2 replies; 11+ messages in thread
From: Olaf Kirch @ 2007-03-21  7:06 UTC (permalink / raw)
  To: dm-devel, Alasdair G Kergon, Andrew Morton

(I'm resending this, because it seems mailman ate this posting
the first time around - apologies if you're seeing this twice).

I investigated kernel Bug 5948 (Oops when accessing SATA with device mapper on
AMD 64 X2) to help cut down on the number of open bugs.

The problem reported in that bug seems to be caused by memory pressure
when writing - crypt_alloc_buffer can return less than the full number of
pages required. In that case, the bio is split, but we're keeping a pointer
to the first chunk of the request around in first_clone, and copy it. If the
first clone completes I/O while we're cloning it, we oops on garbage in
the bvec.

I was able to reproduce this on 2.6.20 - and the following patches
seem to fix it for me.

dm-crypt-clone_init-early
	We need to call clone_init early on, before we can conceivably
	ever call bio_put on the clone

dm-crypt-clone-race
	Do no access bios after generic_make_request

dm-crypt-drop-first_clone
	Drop the first_clone business, and always allocate a fresh bio.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
okir@lst.de |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/3] Fix SMP races in dm-crypt
  2007-03-21  7:06 [PATCH 0/3] Fix SMP races in dm-crypt Olaf Kirch
@ 2007-03-21 10:03 ` Alasdair G Kergon
  2007-03-21 10:47   ` [PATCH 4/3] Allocate smaller clones Olaf Kirch
  2007-03-21 13:32   ` Re: [PATCH 0/3] Fix SMP races in dm-crypt Jim Meyering
  2007-03-22 20:28 ` Christophe Saout
  1 sibling, 2 replies; 11+ messages in thread
From: Alasdair G Kergon @ 2007-03-21 10:03 UTC (permalink / raw)
  To: Olaf Kirch; +Cc: dm-devel, Andrew Morton

On Wed, Mar 21, 2007 at 08:06:29AM +0100, Olaf Kirch wrote:
> (I'm resending this, because it seems mailman ate this posting
> the first time around - apologies if you're seeing this twice).
 
The first emails got trapped for moderation.

The 3rd patch was hit by line-wrap and needed fixing up.

> I investigated kernel Bug 5948 (Oops when accessing SATA with device mapper on
> AMD 64 X2) to help cut down on the number of open bugs.
 
Thanks for your detective work on this!  The patches look good and I've
added them to
  http://www.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/
(with prefix dm-crypt-fix) and will review them in full in the next
couple of days.

Alasdair
-- 
agk@redhat.com

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 4/3] Allocate smaller clones
  2007-03-21 10:03 ` Alasdair G Kergon
@ 2007-03-21 10:47   ` Olaf Kirch
  2007-03-21 13:32   ` Re: [PATCH 0/3] Fix SMP races in dm-crypt Jim Meyering
  1 sibling, 0 replies; 11+ messages in thread
From: Olaf Kirch @ 2007-03-21 10:47 UTC (permalink / raw)
  To: dm-devel, Andrew Morton, Alasdair G Kergon

Hi Alasdair,

FWIW, here's a fourth patch that tidies up crypt_allocate_buffer
a little. It's not strictly required to fix the bug, but the code
looks simpler this way, I think.

Olaf
------------------------
Allocate smaller clones

With the previous dm-crypt fixes, there is no need for the clone
bios to have the same bvec size as the original - we just
need to make them big enough for the remaining number of
pages. The only requirement is that we clear the "out" index
in convert_context, so that crypt_convert starts storing data
at the right position within the clone bio.

Signed-off-by: olaf.kirch@oracle.com
---
 drivers/md/dm-crypt.c |   28 ++++++++--------------------
 1 file changed, 8 insertions(+), 20 deletions(-)

Index: linux-2.6.20/drivers/md/dm-crypt.c
===================================================================
--- linux-2.6.20.orig/drivers/md/dm-crypt.c
+++ linux-2.6.20/drivers/md/dm-crypt.c
@@ -380,7 +380,7 @@ static int crypt_convert(struct crypt_co
  * May return a smaller bio when running out of pages
  */
 static struct bio *
-crypt_alloc_buffer(struct crypt_io *io, unsigned int size, unsigned int *bio_vec_idx)
+crypt_alloc_buffer(struct crypt_io *io, unsigned int size)
 {
 	struct crypt_config *cc = io->target->private;
 	struct bio *clone;
@@ -394,16 +394,7 @@ crypt_alloc_buffer(struct crypt_io *io, 
 
 	clone_init(io, clone);
 
-	/* if the last bio was not complete, continue where that one ended */
-	clone->bi_idx = *bio_vec_idx;
-	clone->bi_vcnt = *bio_vec_idx;
-	clone->bi_size = 0;
-	clone->bi_flags &= ~(1 << BIO_SEG_VALID);
-
-	/* clone->bi_idx pages have already been allocated */
-	size -= clone->bi_idx * PAGE_SIZE;
-
-	for (i = clone->bi_idx; i < nr_iovecs; i++) {
+	for (i = 0; i < nr_iovecs; i++) {
 		struct bio_vec *bv = bio_iovec_idx(clone, i);
 
 		bv->bv_page = mempool_alloc(cc->page_pool, gfp_mask);
@@ -415,7 +406,7 @@ crypt_alloc_buffer(struct crypt_io *io, 
 		 * return a partially allocated bio, the caller will then try
 		 * to allocate additional bios while submitting this partial bio
 		 */
-		if ((i - clone->bi_idx) == (MIN_BIO_PAGES - 1))
+		if (i == (MIN_BIO_PAGES - 1))
 			gfp_mask = (gfp_mask | __GFP_NOWARN) & ~__GFP_WAIT;
 
 		bv->bv_offset = 0;
@@ -434,12 +425,6 @@ crypt_alloc_buffer(struct crypt_io *io, 
 		return NULL;
 	}
 
-	/*
-	 * Remember the last bio_vec allocated to be able
-	 * to correctly continue after the splitting.
-	 */
-	*bio_vec_idx = clone->bi_vcnt;
-
 	return clone;
 }
 
@@ -597,7 +582,6 @@ static void process_write(struct crypt_i
 	struct convert_context ctx;
 	unsigned remaining = base_bio->bi_size;
 	sector_t sector = base_bio->bi_sector - io->target->begin;
-	unsigned bvec_idx = 0;
 
 	atomic_inc(&io->pending);
 
@@ -608,13 +592,14 @@ static void process_write(struct crypt_i
 	 * so repeat the whole process until all the data can be handled.
 	 */
 	while (remaining) {
-		clone = crypt_alloc_buffer(io, base_bio->bi_size, &bvec_idx);
+		clone = crypt_alloc_buffer(io, remaining);
 		if (unlikely(!clone)) {
 			dec_pending(io, -ENOMEM);
 			return;
 		}
 
 		ctx.bio_out = clone;
+		ctx.idx_out = 0;
 
 		if (unlikely(crypt_convert(cc, &ctx) < 0)) {
 			crypt_free_buffer_pages(cc, clone, clone->bi_size);
@@ -623,6 +608,9 @@ static void process_write(struct crypt_i
 			return;
 		}
 
+		/* crypt_convert should have filled the clone bio */
+		BUG_ON(ctx.idx_out < clone->bi_vcnt);
+
 		clone->bi_sector = cc->start + sector;
 		remaining -= clone->bi_size;
 		sector += bio_sectors(clone);

-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
okir@lst.de |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Re: [PATCH 0/3] Fix SMP races in dm-crypt
  2007-03-21 10:03 ` Alasdair G Kergon
  2007-03-21 10:47   ` [PATCH 4/3] Allocate smaller clones Olaf Kirch
@ 2007-03-21 13:32   ` Jim Meyering
  2007-03-21 15:44     ` Alasdair G Kergon
  1 sibling, 1 reply; 11+ messages in thread
From: Jim Meyering @ 2007-03-21 13:32 UTC (permalink / raw)
  To: Alasdair Kergon; +Cc: dm-devel, Andrew Morton

Alasdair G Kergon <agk@redhat.com> wrote:
> On Wed, Mar 21, 2007 at 08:06:29AM +0100, Olaf Kirch wrote:
>> (I'm resending this, because it seems mailman ate this posting

On a similar note, I sent two patches on Mar 12.
While they did reach the archive, perhaps you haven't seen them?

    https://www.redhat.com/archives/dm-devel/2007-March/msg00010.html
    https://www.redhat.com/archives/dm-devel/2007-March/msg00011.html

2007-03-12  Jim Meyering  <jim@meyering.net>

        * lib/fs/libdevmapper.c (do_suspend): Detect write failure reliably.
        (do_load): Likewise.
        (find_mount_point): Avoid buffer overrun for unusually long inputs.

2007-03-11  Jim Meyering  <meyering@redhat.com>

        * dmsetup/dmsetup.c (_loop_table): Detect close failure.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Re: [PATCH 0/3] Fix SMP races in dm-crypt
  2007-03-21 13:32   ` Re: [PATCH 0/3] Fix SMP races in dm-crypt Jim Meyering
@ 2007-03-21 15:44     ` Alasdair G Kergon
  0 siblings, 0 replies; 11+ messages in thread
From: Alasdair G Kergon @ 2007-03-21 15:44 UTC (permalink / raw)
  To: Jim Meyering; +Cc: dm-devel, Andrew Morton

On Wed, Mar 21, 2007 at 02:32:42PM +0100, Jim Meyering wrote:
> 2007-03-12  Jim Meyering  <jim@meyering.net>
> 
>         * lib/fs/libdevmapper.c (do_suspend): Detect write failure reliably.
>         (do_load): Likewise.
>         (find_mount_point): Avoid buffer overrun for unusually long inputs.
 
That userspace code is neither used nor maintained any more.  It's
probably time to drop it from CVS so people don't waste time looking at
it:-)

> 2007-03-11  Jim Meyering  <meyering@redhat.com>
> 
>         * dmsetup/dmsetup.c (_loop_table): Detect close failure.

I'll apply that, thanks, but I'm in no hurry because that section of
code is unfinished and only partially reviewed.

Alasdair
-- 
agk@redhat.com

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/3] Fix SMP races in dm-crypt
  2007-03-21  7:06 [PATCH 0/3] Fix SMP races in dm-crypt Olaf Kirch
  2007-03-21 10:03 ` Alasdair G Kergon
@ 2007-03-22 20:28 ` Christophe Saout
  2007-03-22 20:41   ` Andrew Morton
  1 sibling, 1 reply; 11+ messages in thread
From: Christophe Saout @ 2007-03-22 20:28 UTC (permalink / raw)
  To: device-mapper development; +Cc: Andrew Morton, Alasdair G Kergon

Hi Olaf,

> (I'm resending this, because it seems mailman ate this posting
> the first time around - apologies if you're seeing this twice).
> 
> I investigated kernel Bug 5948 (Oops when accessing SATA with device mapper on
> AMD 64 X2) to help cut down on the number of open bugs.
> 
> The problem reported in that bug seems to be caused by memory pressure
> when writing - crypt_alloc_buffer can return less than the full number of
> pages required. In that case, the bio is split, but we're keeping a pointer
> to the first chunk of the request around in first_clone, and copy it. If the
> first clone completes I/O while we're cloning it, we oops on garbage in
> the bvec.
> 
> I was able to reproduce this on 2.6.20 - and the following patches
> seem to fix it for me.
> 
> dm-crypt-clone_init-early
> 	We need to call clone_init early on, before we can conceivably
> 	ever call bio_put on the clone
> 
> dm-crypt-clone-race
> 	Do no access bios after generic_make_request
> 
> dm-crypt-drop-first_clone
> 	Drop the first_clone business, and always allocate a fresh bio.

I fully agree. I didn't notice when the sharing of the bvec went away,
but it's no longer needed and the code is simpler. As for the other two
- ouch, you're right.

Thanks for investigating fixing these things.

All four patches are

Acked-By: Christophe Saout <christophe@saout.de>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/3] Fix SMP races in dm-crypt
  2007-03-22 20:28 ` Christophe Saout
@ 2007-03-22 20:41   ` Andrew Morton
  2007-03-22 20:51     ` Christophe Saout
  2007-03-22 21:53     ` Alasdair G Kergon
  0 siblings, 2 replies; 11+ messages in thread
From: Andrew Morton @ 2007-03-22 20:41 UTC (permalink / raw)
  To: Christophe Saout; +Cc: device-mapper development, Alasdair G Kergon

On Thu, 22 Mar 2007 21:28:16 +0100
Christophe Saout <christophe@saout.de> wrote:

> Hi Olaf,

err, which Olaf are you addressing?  He is not cc'ed.

> > (I'm resending this, because it seems mailman ate this posting
> > the first time around - apologies if you're seeing this twice).
> > 
> > I investigated kernel Bug 5948 (Oops when accessing SATA with device mapper on
> > AMD 64 X2) to help cut down on the number of open bugs.
> > 
> > The problem reported in that bug seems to be caused by memory pressure
> > when writing - crypt_alloc_buffer can return less than the full number of
> > pages required. In that case, the bio is split, but we're keeping a pointer
> > to the first chunk of the request around in first_clone, and copy it. If the
> > first clone completes I/O while we're cloning it, we oops on garbage in
> > the bvec.
> > 
> > I was able to reproduce this on 2.6.20 - and the following patches
> > seem to fix it for me.
> > 
> > dm-crypt-clone_init-early
> > 	We need to call clone_init early on, before we can conceivably
> > 	ever call bio_put on the clone
> > 
> > dm-crypt-clone-race
> > 	Do no access bios after generic_make_request
> > 
> > dm-crypt-drop-first_clone
> > 	Drop the first_clone business, and always allocate a fresh bio.
> 
> I fully agree. I didn't notice when the sharing of the bvec went away,
> but it's no longer needed and the code is simpler. As for the other two
> - ouch, you're right.
> 
> Thanks for investigating fixing these things.
> 
> All four patches are
> 
> Acked-By: Christophe Saout <christophe@saout.de>

I'm hunting around, but cannot find the patches to which you refer.  Maybe
they're on dm-devel only?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/3] Fix SMP races in dm-crypt
  2007-03-22 20:41   ` Andrew Morton
@ 2007-03-22 20:51     ` Christophe Saout
  2007-03-22 21:53     ` Alasdair G Kergon
  1 sibling, 0 replies; 11+ messages in thread
From: Christophe Saout @ 2007-03-22 20:51 UTC (permalink / raw)
  To: Andrew Morton, Olaf Kirch; +Cc: device-mapper development, Alasdair G Kergon

Am Donnerstag, den 22.03.2007, 13:41 -0700 schrieb Andrew Morton:

> > Hi Olaf,
> 
> err, which Olaf are you addressing?  He is not cc'ed.

Stupid client... didn't notice. Thanks .

> > > (I'm resending this, because it seems mailman ate this posting
> > > the first time around - apologies if you're seeing this twice).
> > > 
> > > I investigated kernel Bug 5948 (Oops when accessing SATA with device mapper on
> > > AMD 64 X2) to help cut down on the number of open bugs.
> > > 
> > > The problem reported in that bug seems to be caused by memory pressure
> > > when writing - crypt_alloc_buffer can return less than the full number of
> > > pages required. In that case, the bio is split, but we're keeping a pointer
> > > to the first chunk of the request around in first_clone, and copy it. If the
> > > first clone completes I/O while we're cloning it, we oops on garbage in
> > > the bvec.
> > > 
> > > I was able to reproduce this on 2.6.20 - and the following patches
> > > seem to fix it for me.
> > > 
> > > dm-crypt-clone_init-early
> > > 	We need to call clone_init early on, before we can conceivably
> > > 	ever call bio_put on the clone
> > > 
> > > dm-crypt-clone-race
> > > 	Do no access bios after generic_make_request
> > > 
> > > dm-crypt-drop-first_clone
> > > 	Drop the first_clone business, and always allocate a fresh bio.
> > 
> > I fully agree. I didn't notice when the sharing of the bvec went away,
> > but it's no longer needed and the code is simpler. As for the other two
> > - ouch, you're right.
> > 
> > Thanks for investigating fixing these things.
> > 
> > All four patches are
> > 
> > Acked-By: Christophe Saout <christophe@saout.de>
> 
> I'm hunting around, but cannot find the patches to which you refer.  Maybe
> they're on dm-devel only?

Yes, but Olaf also Cc'ed you. Didn't you get them?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/3] Fix SMP races in dm-crypt
  2007-03-22 20:41   ` Andrew Morton
  2007-03-22 20:51     ` Christophe Saout
@ 2007-03-22 21:53     ` Alasdair G Kergon
  2007-03-22 22:44       ` Andrew Morton
  1 sibling, 1 reply; 11+ messages in thread
From: Alasdair G Kergon @ 2007-03-22 21:53 UTC (permalink / raw)
  To: Andrew Morton; +Cc: device-mapper development, Christophe Saout

On Thu, Mar 22, 2007 at 01:41:22PM -0700, Andrew Morton wrote:
> On Thu, 22 Mar 2007 21:28:16 +0100
> Christophe Saout <christophe@saout.de> wrote:
> I'm hunting around, but cannot find the patches to which you refer.  Maybe
> they're on dm-devel only?

These four:
  http://www.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/dm-crypt-fix-call-to-clone_init.patch
  http://www.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/dm-crypt-fix-avoid-cloned-bio-ref-after-free.patch
  http://www.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/dm-crypt-fix-remove-first_clone.patch
  http://www.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/dm-crypt-use-smaller-bvecs-in-clones.patch

Grab them from there, or I'll send you them properly at the start of next
week as part of the next dm patch set.

Alasdair
-- 
agk@redhat.com

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/3] Fix SMP races in dm-crypt
  2007-03-22 21:53     ` Alasdair G Kergon
@ 2007-03-22 22:44       ` Andrew Morton
  0 siblings, 0 replies; 11+ messages in thread
From: Andrew Morton @ 2007-03-22 22:44 UTC (permalink / raw)
  To: Alasdair G Kergon; +Cc: device-mapper development, Christophe Saout, Olaf Kirch

On Thu, 22 Mar 2007 21:53:53 +0000
Alasdair G Kergon <agk@redhat.com> wrote:

> On Thu, Mar 22, 2007 at 01:41:22PM -0700, Andrew Morton wrote:
> > On Thu, 22 Mar 2007 21:28:16 +0100
> > Christophe Saout <christophe@saout.de> wrote:
> > I'm hunting around, but cannot find the patches to which you refer.  Maybe
> > they're on dm-devel only?
> 
> These four:
>   http://www.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/dm-crypt-fix-call-to-clone_init.patch
>   http://www.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/dm-crypt-fix-avoid-cloned-bio-ref-after-free.patch
>   http://www.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/dm-crypt-fix-remove-first_clone.patch
>   http://www.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/dm-crypt-use-smaller-bvecs-in-clones.patch
> 
> Grab them from there, or I'll send you them properly at the start of next
> week as part of the next dm patch set.
> 

yup, thanks - Christophe forwarded them.

(And I merged them with the wrong From: attribution (fixes that))

btw, should I include
http://www.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/* in
the -mm lineup?  It's easy to do from this end.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 0/3] Fix SMP races in dm-crypt
@ 2007-03-20 19:54 Olaf Kirch
  0 siblings, 0 replies; 11+ messages in thread
From: Olaf Kirch @ 2007-03-20 19:54 UTC (permalink / raw)
  To: dm-devel

I investigated kernel Bug 5948 (Oops when accessing SATA with device mapper on
AMD 64 X2) to help cut down on the number of open bugs.

The problem reported in that bug seems to be caused by memory pressure
when writing - crypt_alloc_buffer can return less than the full number of
pages required. In that case, the bio is split, but we're keeping a pointer
to the first chunk of the request around in first_clone, and copy it. If the
first clone completes I/O while we're cloning it, we oops on garbage in
the bvec.

I was able to reproduce this on 2.6.20 - and the following patches
seem to fix it for me.

dm-crypt-clone_init-early
	We need to call clone_init early on, before we can conceivably
	ever call bio_put on the clone

dm-crypt-clone-race
	Do no access bios after generic_make_request

dm-crypt-drop-first_clone
	Drop the first_clone business, and always allocate a fresh bio.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
okir@lst.de |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2007-03-22 22:44 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-03-21  7:06 [PATCH 0/3] Fix SMP races in dm-crypt Olaf Kirch
2007-03-21 10:03 ` Alasdair G Kergon
2007-03-21 10:47   ` [PATCH 4/3] Allocate smaller clones Olaf Kirch
2007-03-21 13:32   ` Re: [PATCH 0/3] Fix SMP races in dm-crypt Jim Meyering
2007-03-21 15:44     ` Alasdair G Kergon
2007-03-22 20:28 ` Christophe Saout
2007-03-22 20:41   ` Andrew Morton
2007-03-22 20:51     ` Christophe Saout
2007-03-22 21:53     ` Alasdair G Kergon
2007-03-22 22:44       ` Andrew Morton
  -- strict thread matches above, loose matches on Subject: below --
2007-03-20 19:54 Olaf Kirch

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.