linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/3] xen/blkback: several fixes of resource management
@ 2017-05-18 15:28 Juergen Gross
  2017-05-18 15:28 ` [PATCH v2 1/3] xen/blkback: fix disconnect while I/Os in flight Juergen Gross
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Juergen Gross @ 2017-05-18 15:28 UTC (permalink / raw)
  To: linux-kernel, xen-devel; +Cc: konrad.wilk, roger.pau, netwiz, Juergen Gross

Destroying a Xen guest domain while it was doing I/Os via xen-blkback
leaked several resources, including references of the guest's memory
pages.

This patch series addresses those leaks by correcting usage of
reference counts and the sequence when to free which resource.

The series applies on top of commit 2d4456c73a487abe ("block:
xen-blkback: add null check to avoid null pointer dereference") in
Jens Axboe's tree kernel/git/axboe/linux-block.git

V2: changed flag to type bool in patch 1 (Dietmar Hahn)

Juergen Gross (3):
  xen/blkback: fix disconnect while I/Os in flight
  xen/blkback: don't free be structure too early
  xen/blkback: don't use xen_blkif_get() in xen-blkback kthread

 drivers/block/xen-blkback/blkback.c |  3 ---
 drivers/block/xen-blkback/common.h  |  1 +
 drivers/block/xen-blkback/xenbus.c  | 15 ++++++++-------
 3 files changed, 9 insertions(+), 10 deletions(-)

-- 
2.12.0

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v2 1/3] xen/blkback: fix disconnect while I/Os in flight
  2017-05-18 15:28 [PATCH v2 0/3] xen/blkback: several fixes of resource management Juergen Gross
@ 2017-05-18 15:28 ` Juergen Gross
  2017-05-18 15:28 ` [PATCH v2 2/3] xen/blkback: don't free be structure too early Juergen Gross
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Juergen Gross @ 2017-05-18 15:28 UTC (permalink / raw)
  To: linux-kernel, xen-devel
  Cc: konrad.wilk, roger.pau, netwiz, Juergen Gross, stable

Today disconnecting xen-blkback is broken in case there are still
I/Os in flight: xen_blkif_disconnect() will bail out early without
releasing all resources in the hope it will be called again when
the last request has terminated. This, however, won't happen as
xen_blkif_free() won't be called on termination of the last running
request: xen_blkif_put() won't decrement the blkif refcnt to 0 as
xen_blkif_disconnect() didn't finish before thus some xen_blkif_put()
calls in xen_blkif_disconnect() didn't happen.

To solve this deadlock xen_blkif_disconnect() and
xen_blkif_alloc_rings() shouldn't use xen_blkif_put() and
xen_blkif_get() but use some other way to do their accounting of
resources.

This at once fixes another error in xen_blkif_disconnect(): when it
returned early with -EBUSY for another ring than 0 it would call
xen_blkif_put() again for already handled rings on a subsequent call.
This will lead to inconsistencies in the refcnt handling.

Cc: stable@vger.kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
Tested-by: Steven Haigh <netwiz@crc.id.au>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
---
 drivers/block/xen-blkback/common.h | 1 +
 drivers/block/xen-blkback/xenbus.c | 7 +++++--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/block/xen-blkback/common.h b/drivers/block/xen-blkback/common.h
index dea61f6ab8cb..638597b17a38 100644
--- a/drivers/block/xen-blkback/common.h
+++ b/drivers/block/xen-blkback/common.h
@@ -281,6 +281,7 @@ struct xen_blkif_ring {
 
 	wait_queue_head_t	wq;
 	atomic_t		inflight;
+	bool			active;
 	/* One thread per blkif ring. */
 	struct task_struct	*xenblkd;
 	unsigned int		waiting_reqs;
diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
index 1f3dfaa54d87..998915174bb8 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -159,7 +159,7 @@ static int xen_blkif_alloc_rings(struct xen_blkif *blkif)
 		init_waitqueue_head(&ring->shutdown_wq);
 		ring->blkif = blkif;
 		ring->st_print = jiffies;
-		xen_blkif_get(blkif);
+		ring->active = true;
 	}
 
 	return 0;
@@ -249,6 +249,9 @@ static int xen_blkif_disconnect(struct xen_blkif *blkif)
 		struct xen_blkif_ring *ring = &blkif->rings[r];
 		unsigned int i = 0;
 
+		if (!ring->active)
+			continue;
+
 		if (ring->xenblkd) {
 			kthread_stop(ring->xenblkd);
 			wake_up(&ring->shutdown_wq);
@@ -296,7 +299,7 @@ static int xen_blkif_disconnect(struct xen_blkif *blkif)
 		BUG_ON(ring->free_pages_num != 0);
 		BUG_ON(ring->persistent_gnt_c != 0);
 		WARN_ON(i != (XEN_BLKIF_REQS_PER_PAGE * blkif->nr_ring_pages));
-		xen_blkif_put(blkif);
+		ring->active = false;
 	}
 	blkif->nr_ring_pages = 0;
 	/*
-- 
2.12.0

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v2 2/3] xen/blkback: don't free be structure too early
  2017-05-18 15:28 [PATCH v2 0/3] xen/blkback: several fixes of resource management Juergen Gross
  2017-05-18 15:28 ` [PATCH v2 1/3] xen/blkback: fix disconnect while I/Os in flight Juergen Gross
@ 2017-05-18 15:28 ` Juergen Gross
  2017-05-18 15:28 ` [PATCH v2 3/3] xen/blkback: don't use xen_blkif_get() in xen-blkback kthread Juergen Gross
  2017-06-07 12:36 ` [PATCH v2 0/3] xen/blkback: several fixes of resource management Steven Haigh
  3 siblings, 0 replies; 7+ messages in thread
From: Juergen Gross @ 2017-05-18 15:28 UTC (permalink / raw)
  To: linux-kernel, xen-devel; +Cc: konrad.wilk, roger.pau, netwiz, Juergen Gross

The be structure must nor be freed when freeing the blkif structure
isn't done. Otherwise a use-after-free of be when unmapping the ring
used for communicating with the frontend will occur in case of a
late call of xenblk_disconnect() (e.g. due to an I/O still active
when trying to disconnect).

Signed-off-by: Juergen Gross <jgross@suse.com>
Tested-by: Steven Haigh <netwiz@crc.id.au>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
---
 drivers/block/xen-blkback/xenbus.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
index 998915174bb8..4cdf0490983e 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -315,9 +315,10 @@ static int xen_blkif_disconnect(struct xen_blkif *blkif)
 
 static void xen_blkif_free(struct xen_blkif *blkif)
 {
-
-	xen_blkif_disconnect(blkif);
+	WARN_ON(xen_blkif_disconnect(blkif));
 	xen_vbd_free(&blkif->vbd);
+	kfree(blkif->be->mode);
+	kfree(blkif->be);
 
 	/* Make sure everything is drained before shutting down */
 	kmem_cache_free(xen_blkif_cachep, blkif);
@@ -514,8 +515,6 @@ static int xen_blkbk_remove(struct xenbus_device *dev)
 		xen_blkif_put(be->blkif);
 	}
 
-	kfree(be->mode);
-	kfree(be);
 	return 0;
 }
 
-- 
2.12.0

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v2 3/3] xen/blkback: don't use xen_blkif_get() in xen-blkback kthread
  2017-05-18 15:28 [PATCH v2 0/3] xen/blkback: several fixes of resource management Juergen Gross
  2017-05-18 15:28 ` [PATCH v2 1/3] xen/blkback: fix disconnect while I/Os in flight Juergen Gross
  2017-05-18 15:28 ` [PATCH v2 2/3] xen/blkback: don't free be structure too early Juergen Gross
@ 2017-05-18 15:28 ` Juergen Gross
  2017-06-07 12:36 ` [PATCH v2 0/3] xen/blkback: several fixes of resource management Steven Haigh
  3 siblings, 0 replies; 7+ messages in thread
From: Juergen Gross @ 2017-05-18 15:28 UTC (permalink / raw)
  To: linux-kernel, xen-devel; +Cc: konrad.wilk, roger.pau, netwiz, Juergen Gross

There is no need to use xen_blkif_get()/xen_blkif_put() in the kthread
of xen-blkback. Thread stopping is synchronous and using the blkif
reference counting in the kthread will avoid to ever let the reference
count drop to zero at the end of an I/O running concurrent to
disconnecting and multiple rings.

Setting ring->xenblkd to NULL after stopping the kthread isn't needed
as the kthread does this already.

Signed-off-by: Juergen Gross <jgross@suse.com>
Tested-by: Steven Haigh <netwiz@crc.id.au>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
---
 drivers/block/xen-blkback/blkback.c | 3 ---
 drivers/block/xen-blkback/xenbus.c  | 1 -
 2 files changed, 4 deletions(-)

diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
index 726c32e35db9..6b14c509f3c7 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -609,8 +609,6 @@ int xen_blkif_schedule(void *arg)
 	unsigned long timeout;
 	int ret;
 
-	xen_blkif_get(blkif);
-
 	set_freezable();
 	while (!kthread_should_stop()) {
 		if (try_to_freeze())
@@ -665,7 +663,6 @@ int xen_blkif_schedule(void *arg)
 		print_stats(ring);
 
 	ring->xenblkd = NULL;
-	xen_blkif_put(blkif);
 
 	return 0;
 }
diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
index 4cdf0490983e..792da683e70d 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -255,7 +255,6 @@ static int xen_blkif_disconnect(struct xen_blkif *blkif)
 		if (ring->xenblkd) {
 			kthread_stop(ring->xenblkd);
 			wake_up(&ring->shutdown_wq);
-			ring->xenblkd = NULL;
 		}
 
 		/* The above kthread_stop() guarantees that at this point we
-- 
2.12.0

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 0/3] xen/blkback: several fixes of resource management
  2017-05-18 15:28 [PATCH v2 0/3] xen/blkback: several fixes of resource management Juergen Gross
                   ` (2 preceding siblings ...)
  2017-05-18 15:28 ` [PATCH v2 3/3] xen/blkback: don't use xen_blkif_get() in xen-blkback kthread Juergen Gross
@ 2017-06-07 12:36 ` Steven Haigh
  2017-06-07 13:52   ` [Xen-devel] " Konrad Rzeszutek Wilk
  3 siblings, 1 reply; 7+ messages in thread
From: Steven Haigh @ 2017-06-07 12:36 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, linux-kernel, xen-devel, roger.pau

[-- Attachment #1: Type: text/plain, Size: 1348 bytes --]

On Friday, 19 May 2017 1:28:46 AM AEST Juergen Gross wrote:
> Destroying a Xen guest domain while it was doing I/Os via xen-blkback
> leaked several resources, including references of the guest's memory
> pages.
> 
> This patch series addresses those leaks by correcting usage of
> reference counts and the sequence when to free which resource.
> 
> The series applies on top of commit 2d4456c73a487abe ("block:
> xen-blkback: add null check to avoid null pointer dereference") in
> Jens Axboe's tree kernel/git/axboe/linux-block.git
> 
> V2: changed flag to type bool in patch 1 (Dietmar Hahn)
> 
> Juergen Gross (3):
>   xen/blkback: fix disconnect while I/Os in flight
>   xen/blkback: don't free be structure too early
>   xen/blkback: don't use xen_blkif_get() in xen-blkback kthread
> 
>  drivers/block/xen-blkback/blkback.c |  3 ---
>  drivers/block/xen-blkback/common.h  |  1 +
>  drivers/block/xen-blkback/xenbus.c  | 15 ++++++++-------
>  3 files changed, 9 insertions(+), 10 deletions(-)

Just wanted to give this a bit of a prod.

Is there any plans in having this hit the kernel.org kernels?

My testing was purely on kernel 4.9 branch - but it doesn't look like this has 
shown up there yet?

-- 
Steven Haigh

📧 netwiz@crc.id.au	💻 http://www.crc.id.au
📞 +61 (3) 9001 6090	📱 0412 935 897

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Xen-devel] [PATCH v2 0/3] xen/blkback: several fixes of resource management
  2017-06-07 12:36 ` [PATCH v2 0/3] xen/blkback: several fixes of resource management Steven Haigh
@ 2017-06-07 13:52   ` Konrad Rzeszutek Wilk
  2017-06-07 14:08     ` Steven Haigh
  0 siblings, 1 reply; 7+ messages in thread
From: Konrad Rzeszutek Wilk @ 2017-06-07 13:52 UTC (permalink / raw)
  To: Steven Haigh; +Cc: xen-devel, Juergen Gross, xen-devel, linux-kernel, roger.pau

On Wed, Jun 07, 2017 at 10:36:58PM +1000, Steven Haigh wrote:
> On Friday, 19 May 2017 1:28:46 AM AEST Juergen Gross wrote:
> > Destroying a Xen guest domain while it was doing I/Os via xen-blkback
> > leaked several resources, including references of the guest's memory
> > pages.
> > 
> > This patch series addresses those leaks by correcting usage of
> > reference counts and the sequence when to free which resource.
> > 
> > The series applies on top of commit 2d4456c73a487abe ("block:
> > xen-blkback: add null check to avoid null pointer dereference") in
> > Jens Axboe's tree kernel/git/axboe/linux-block.git
> > 
> > V2: changed flag to type bool in patch 1 (Dietmar Hahn)
> > 
> > Juergen Gross (3):
> >   xen/blkback: fix disconnect while I/Os in flight
> >   xen/blkback: don't free be structure too early
> >   xen/blkback: don't use xen_blkif_get() in xen-blkback kthread
> > 
> >  drivers/block/xen-blkback/blkback.c |  3 ---
> >  drivers/block/xen-blkback/common.h  |  1 +
> >  drivers/block/xen-blkback/xenbus.c  | 15 ++++++++-------
> >  3 files changed, 9 insertions(+), 10 deletions(-)
> 
> Just wanted to give this a bit of a prod.

<ugh> Ouch!

> 
> Is there any plans in having this hit the kernel.org kernels?

Yes.
> 
> My testing was purely on kernel 4.9 branch - but it doesn't look like this has 
> shown up there yet?

Correct. I am thinking to send these to Jens around June 20th or so.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Xen-devel] [PATCH v2 0/3] xen/blkback: several fixes of resource management
  2017-06-07 13:52   ` [Xen-devel] " Konrad Rzeszutek Wilk
@ 2017-06-07 14:08     ` Steven Haigh
  0 siblings, 0 replies; 7+ messages in thread
From: Steven Haigh @ 2017-06-07 14:08 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, Juergen Gross; +Cc: xen-devel, linux-kernel, roger.pau

[-- Attachment #1: Type: text/plain, Size: 1954 bytes --]

On Wednesday, 7 June 2017 11:52:34 PM AEST Konrad Rzeszutek Wilk wrote:
> On Wed, Jun 07, 2017 at 10:36:58PM +1000, Steven Haigh wrote:
> > On Friday, 19 May 2017 1:28:46 AM AEST Juergen Gross wrote:
> > > Destroying a Xen guest domain while it was doing I/Os via xen-blkback
> > > leaked several resources, including references of the guest's memory
> > > pages.
> > > 
> > > This patch series addresses those leaks by correcting usage of
> > > reference counts and the sequence when to free which resource.
> > > 
> > > The series applies on top of commit 2d4456c73a487abe ("block:
> > > xen-blkback: add null check to avoid null pointer dereference") in
> > > Jens Axboe's tree kernel/git/axboe/linux-block.git
> > > 
> > > V2: changed flag to type bool in patch 1 (Dietmar Hahn)
> > > 
> > > Juergen Gross (3):
> > >   xen/blkback: fix disconnect while I/Os in flight
> > >   xen/blkback: don't free be structure too early
> > >   xen/blkback: don't use xen_blkif_get() in xen-blkback kthread
> > >  
> > >  drivers/block/xen-blkback/blkback.c |  3 ---
> > >  drivers/block/xen-blkback/common.h  |  1 +
> > >  drivers/block/xen-blkback/xenbus.c  | 15 ++++++++-------
> > >  3 files changed, 9 insertions(+), 10 deletions(-)
> > 
> > Just wanted to give this a bit of a prod.
> 
> <ugh> Ouch!
> 
> > Is there any plans in having this hit the kernel.org kernels?
> 
> Yes.
> 
> > My testing was purely on kernel 4.9 branch - but it doesn't look like this
> > has shown up there yet?
> 
> Correct. I am thinking to send these to Jens around June 20th or so.

Ok, all understood. Thanks for the clarifications.

At the moment, I'm just including them in my kernel builds - then expecting 
them to fail at one point in the future when the patch fails due to 
upstreaming. I'll just keep doing this.

-- 
Steven Haigh

📧 netwiz@crc.id.au	💻 http://www.crc.id.au
📞 +61 (3) 9001 6090	📱 0412 935 897

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2017-06-07 14:08 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-18 15:28 [PATCH v2 0/3] xen/blkback: several fixes of resource management Juergen Gross
2017-05-18 15:28 ` [PATCH v2 1/3] xen/blkback: fix disconnect while I/Os in flight Juergen Gross
2017-05-18 15:28 ` [PATCH v2 2/3] xen/blkback: don't free be structure too early Juergen Gross
2017-05-18 15:28 ` [PATCH v2 3/3] xen/blkback: don't use xen_blkif_get() in xen-blkback kthread Juergen Gross
2017-06-07 12:36 ` [PATCH v2 0/3] xen/blkback: several fixes of resource management Steven Haigh
2017-06-07 13:52   ` [Xen-devel] " Konrad Rzeszutek Wilk
2017-06-07 14:08     ` Steven Haigh

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).