linux-kernel.vger.kernel.org archive mirror
* [PATCH v1] vhost: avoid large order allocations
@ 2014-05-13  8:35 Michael Mueller
  2014-05-13  8:35 ` Michael Mueller
  0 siblings, 1 reply; 9+ messages in thread
From: Michael Mueller @ 2014-05-13  8:35 UTC (permalink / raw)
  To: mst, kvm, virtualization, netdev, linux-kernel
  Cc: borntraeger, cornelia.huck, mimu, ddongch

A test case that generates memory pressure while performing guest administration
fails: vhost triggers a "page allocation failure" and the guest does not start up.

After some analysis we found that the allocation order used by vhost is responsible
for this behaviour. We therefore suggest patch 1/1, which dynamically allocates the
required memory. Please see its description for details.

Thanks,
Michael 

Dong Dong Chen (1):
  vhost: avoid large order allocations

 drivers/vhost/net.c   | 4 ++--
 drivers/vhost/scsi.c  | 4 ++--
 drivers/vhost/test.c  | 2 +-
 drivers/vhost/vhost.c | 6 +++++-
 drivers/vhost/vhost.h | 2 +-
 5 files changed, 11 insertions(+), 7 deletions(-)

-- 
1.8.3.1



* [PATCH v1] vhost: avoid large order allocations
  2014-05-13  8:35 [PATCH v1] vhost: avoid large order allocations Michael Mueller
@ 2014-05-13  8:35 ` Michael Mueller
  2014-05-13  8:40   ` Michael S. Tsirkin
  0 siblings, 1 reply; 9+ messages in thread
From: Michael Mueller @ 2014-05-13  8:35 UTC (permalink / raw)
  To: mst, kvm, virtualization, netdev, linux-kernel
  Cc: borntraeger, cornelia.huck, mimu, ddongch

From: Dong Dong Chen <ddongch@cn.ibm.com>

Under memory pressure we observe load issues with module vhost_net, as the
vhost setup code will try to do an order 4 allocation for the device.
The likelihood of this issue can be reduced if the embedded array "iov" in
"struct vhost_virtqueue" is instead allocated dynamically with exactly the
required size, reducing this to an order 2 allocation.
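
To make the order figures in this description concrete, here is a small
userspace sketch of the arithmetic. It assumes 4 KiB pages and a 64-bit
"struct iovec" of 16 bytes; the helpers "alloc_order" and "iov_array_bytes"
and the "*_ASSUMED" macros are made up for this illustration and are not
kernel API. Under those assumptions a single "iov" array of UIO_MAXIOV (1024)
entries is 16 KiB, exactly an order 2 request, while any structure that embeds
two such arrays is larger than 32 KiB and rounds up to order 4.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE_ASSUMED  4096u  /* assumption: 4 KiB pages */
#define UIO_MAXIOV_ASSUMED 1024u  /* UIO_MAXIOV as defined in linux/uio.h */
#define IOVEC_SIZE_ASSUMED 16u    /* assumption: 64-bit pointer + 64-bit length */

/*
 * Smallest order n such that (page_size << n) >= size.
 * This mirrors the rounding the page allocator applies to a request;
 * it is a userspace re-implementation, not the kernel's get_order().
 */
static unsigned int alloc_order(size_t size, size_t page_size)
{
	unsigned int order = 0;
	size_t span = page_size;

	assert(page_size > 0);
	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}

/* Bytes needed for one iov array of UIO_MAXIOV entries. */
static size_t iov_array_bytes(void)
{
	return (size_t)UIO_MAXIOV_ASSUMED * IOVEC_SIZE_ASSUMED;
}
```

With these assumptions, alloc_order(iov_array_bytes(), PAGE_SIZE_ASSUMED)
evaluates to 2, and alloc_order(2 * iov_array_bytes() + 1, PAGE_SIZE_ASSUMED)
evaluates to 4.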

Signed-off-by: Dong Dong Chen <ddongch@cn.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael Mueller <mimu@linux.vnet.ibm.com>
---
 drivers/vhost/net.c   | 4 ++--
 drivers/vhost/scsi.c  | 4 ++--
 drivers/vhost/test.c  | 2 +-
 drivers/vhost/vhost.c | 6 +++++-
 drivers/vhost/vhost.h | 2 +-
 5 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index be414d2..e3a9a68 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -374,7 +374,7 @@ static void handle_tx(struct vhost_net *net)
 			break;
 
 		head = vhost_get_vq_desc(&net->dev, vq, vq->iov,
-					 ARRAY_SIZE(vq->iov),
+					 UIO_MAXIOV,
 					 &out, &in,
 					 NULL, NULL);
 		/* On error, stop handling until the next kick. */
@@ -506,7 +506,7 @@ static int get_rx_bufs(struct vhost_virtqueue *vq,
 			goto err;
 		}
 		r = vhost_get_vq_desc(vq->dev, vq, vq->iov + seg,
-				      ARRAY_SIZE(vq->iov) - seg, &out,
+				      UIO_MAXIOV - seg, &out,
 				      &in, log, log_num);
 		if (unlikely(r < 0))
 			goto err;
diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index cf50ce9..a70f1d9 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -607,7 +607,7 @@ tcm_vhost_do_evt_work(struct vhost_scsi *vs, struct tcm_vhost_evt *evt)
 again:
 	vhost_disable_notify(&vs->dev, vq);
 	head = vhost_get_vq_desc(&vs->dev, vq, vq->iov,
-			ARRAY_SIZE(vq->iov), &out, &in,
+			UIO_MAXIOV, &out, &in,
 			NULL, NULL);
 	if (head < 0) {
 		vs->vs_events_missed = true;
@@ -946,7 +946,7 @@ vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
 
 	for (;;) {
 		head = vhost_get_vq_desc(&vs->dev, vq, vq->iov,
-					ARRAY_SIZE(vq->iov), &out, &in,
+					UIO_MAXIOV, &out, &in,
 					NULL, NULL);
 		pr_debug("vhost_get_vq_desc: head: %d, out: %u in: %u\n",
 					head, out, in);
diff --git a/drivers/vhost/test.c b/drivers/vhost/test.c
index c2a54fb..2e01920 100644
--- a/drivers/vhost/test.c
+++ b/drivers/vhost/test.c
@@ -54,7 +54,7 @@ static void handle_vq(struct vhost_test *n)
 
 	for (;;) {
 		head = vhost_get_vq_desc(&n->dev, vq, vq->iov,
-					 ARRAY_SIZE(vq->iov),
+					 UIO_MAXIOV,
 					 &out, &in,
 					 NULL, NULL);
 		/* On error, stop handling until the next kick. */
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 78987e4..9017a55 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -251,6 +251,8 @@ static int vhost_worker(void *data)
 
 static void vhost_vq_free_iovecs(struct vhost_virtqueue *vq)
 {
+	kfree(vq->iov);
+	vq->iov = NULL;
 	kfree(vq->indirect);
 	vq->indirect = NULL;
 	kfree(vq->log);
@@ -267,11 +269,12 @@ static long vhost_dev_alloc_iovecs(struct vhost_dev *dev)
 
 	for (i = 0; i < dev->nvqs; ++i) {
 		vq = dev->vqs[i];
+		vq->iov = kmalloc(sizeof(*vq->iov) * UIO_MAXIOV, GFP_KERNEL);
 		vq->indirect = kmalloc(sizeof *vq->indirect * UIO_MAXIOV,
 				       GFP_KERNEL);
 		vq->log = kmalloc(sizeof *vq->log * UIO_MAXIOV, GFP_KERNEL);
 		vq->heads = kmalloc(sizeof *vq->heads * UIO_MAXIOV, GFP_KERNEL);
-		if (!vq->indirect || !vq->log || !vq->heads)
+		if (!vq->iov || !vq->indirect || !vq->log || !vq->heads)
 			goto err_nomem;
 	}
 	return 0;
@@ -310,6 +313,7 @@ void vhost_dev_init(struct vhost_dev *dev,
 	for (i = 0; i < dev->nvqs; ++i) {
 		vq = dev->vqs[i];
 		vq->log = NULL;
+		vq->iov = NULL;
 		vq->indirect = NULL;
 		vq->heads = NULL;
 		vq->dev = dev;
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 35eeb2a..541f757 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -100,7 +100,7 @@ struct vhost_virtqueue {
 	bool log_used;
 	u64 log_addr;
 
-	struct iovec iov[UIO_MAXIOV];
+	struct iovec *iov;
 	struct iovec *indirect;
 	struct vring_used_elem *heads;
 	/* Protected by virtqueue mutex. */
-- 
1.8.3.1



* Re: [PATCH v1] vhost: avoid large order allocations
  2014-05-13  8:35 ` Michael Mueller
@ 2014-05-13  8:40   ` Michael S. Tsirkin
  2014-05-13  8:57     ` Michael Mueller
  2014-05-13 14:29     ` Romain Francoise
  0 siblings, 2 replies; 9+ messages in thread
From: Michael S. Tsirkin @ 2014-05-13  8:40 UTC (permalink / raw)
  To: Michael Mueller
  Cc: kvm, virtualization, netdev, linux-kernel, borntraeger,
	cornelia.huck, ddongch

On Tue, May 13, 2014 at 10:35:33AM +0200, Michael Mueller wrote:
> From: Dong Dong Chen <ddongch@cn.ibm.com>
> 
> Under memory pressure we observe load issues with module vhost_net, as the
> vhost setup code will try to do an order 4 allocation for the device.
> The likelihood of this issue can be reduced if the embedded array "iov" in
> "struct vhost_virtqueue" is instead allocated dynamically with exactly the
> required size, reducing this to an order 2 allocation.
> 
> Signed-off-by: Dong Dong Chen <ddongch@cn.ibm.com>
> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
> Reviewed-by: Michael Mueller <mimu@linux.vnet.ibm.com>

Please don't do this; extra indirection hurts performance.
Instead, please change vhost_net_open and scsi to allocate the whole
structure with vmalloc if kmalloc fails, along the lines of
74d332c13b2148ae934ea94dac1745ae92efe8e5
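
The fallback idea can be modelled in userspace roughly as follows. This is
only a sketch of the pattern, not the vhost code and not the referenced
commit: "contig_zalloc" and "fallback_zalloc" are hypothetical stand-ins for
a physically contiguous allocator (kmalloc/kzalloc in the kernel) and a
virtually contiguous one (vmalloc/vzalloc), and "CONTIG_LIMIT" is an invented
threshold that makes the first allocator fail for large requests, imitating
fragmented memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Invented limit: larger "contiguous" requests are made to fail. */
#define CONTIG_LIMIT (32u * 1024u)

struct dev_mem {
	void *ptr;          /* zeroed memory, or NULL if both allocators failed */
	bool  via_fallback; /* true when the second allocator supplied it */
};

/* Stand-in for a physically contiguous allocator. */
static void *contig_zalloc(size_t size)
{
	return size > CONTIG_LIMIT ? NULL : calloc(1, size);
}

/* Stand-in for a virtually contiguous allocator. */
static void *fallback_zalloc(size_t size)
{
	return calloc(1, size);
}

/* Try the contiguous allocator first and fall back only if it fails. */
static struct dev_mem dev_mem_alloc(size_t size)
{
	struct dev_mem m = { contig_zalloc(size), false };

	if (!m.ptr) {
		m.ptr = fallback_zalloc(size);
		m.via_fallback = (m.ptr != NULL);
	}
	return m;
}

static void dev_mem_free(struct dev_mem *m)
{
	assert(m != NULL);
	/*
	 * Both stand-ins use calloc(), so a single free() is enough here.
	 * Kernel code has to release the memory with the routine that
	 * matches the allocator that actually succeeded.
	 */
	free(m->ptr);
	m->ptr = NULL;
	m->via_fallback = false;
}
```

The point of the design is that the whole structure stays one allocation, so
the hot path keeps accessing "iov" without an extra pointer dereference; only
the open and release paths need to know which allocator was used.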

> ---
>  drivers/vhost/net.c   | 4 ++--
>  drivers/vhost/scsi.c  | 4 ++--
>  drivers/vhost/test.c  | 2 +-
>  drivers/vhost/vhost.c | 6 +++++-
>  drivers/vhost/vhost.h | 2 +-
>  5 files changed, 11 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index be414d2..e3a9a68 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -374,7 +374,7 @@ static void handle_tx(struct vhost_net *net)
>  			break;
>  
>  		head = vhost_get_vq_desc(&net->dev, vq, vq->iov,
> -					 ARRAY_SIZE(vq->iov),
> +					 UIO_MAXIOV,
>  					 &out, &in,
>  					 NULL, NULL);
>  		/* On error, stop handling until the next kick. */
> @@ -506,7 +506,7 @@ static int get_rx_bufs(struct vhost_virtqueue *vq,
>  			goto err;
>  		}
>  		r = vhost_get_vq_desc(vq->dev, vq, vq->iov + seg,
> -				      ARRAY_SIZE(vq->iov) - seg, &out,
> +				      UIO_MAXIOV - seg, &out,
>  				      &in, log, log_num);
>  		if (unlikely(r < 0))
>  			goto err;
> diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
> index cf50ce9..a70f1d9 100644
> --- a/drivers/vhost/scsi.c
> +++ b/drivers/vhost/scsi.c
> @@ -607,7 +607,7 @@ tcm_vhost_do_evt_work(struct vhost_scsi *vs, struct tcm_vhost_evt *evt)
>  again:
>  	vhost_disable_notify(&vs->dev, vq);
>  	head = vhost_get_vq_desc(&vs->dev, vq, vq->iov,
> -			ARRAY_SIZE(vq->iov), &out, &in,
> +			UIO_MAXIOV, &out, &in,
>  			NULL, NULL);
>  	if (head < 0) {
>  		vs->vs_events_missed = true;
> @@ -946,7 +946,7 @@ vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
>  
>  	for (;;) {
>  		head = vhost_get_vq_desc(&vs->dev, vq, vq->iov,
> -					ARRAY_SIZE(vq->iov), &out, &in,
> +					UIO_MAXIOV, &out, &in,
>  					NULL, NULL);
>  		pr_debug("vhost_get_vq_desc: head: %d, out: %u in: %u\n",
>  					head, out, in);
> diff --git a/drivers/vhost/test.c b/drivers/vhost/test.c
> index c2a54fb..2e01920 100644
> --- a/drivers/vhost/test.c
> +++ b/drivers/vhost/test.c
> @@ -54,7 +54,7 @@ static void handle_vq(struct vhost_test *n)
>  
>  	for (;;) {
>  		head = vhost_get_vq_desc(&n->dev, vq, vq->iov,
> -					 ARRAY_SIZE(vq->iov),
> +					 UIO_MAXIOV,
>  					 &out, &in,
>  					 NULL, NULL);
>  		/* On error, stop handling until the next kick. */
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index 78987e4..9017a55 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -251,6 +251,8 @@ static int vhost_worker(void *data)
>  
>  static void vhost_vq_free_iovecs(struct vhost_virtqueue *vq)
>  {
> +	kfree(vq->iov);
> +	vq->iov = NULL;
>  	kfree(vq->indirect);
>  	vq->indirect = NULL;
>  	kfree(vq->log);
> @@ -267,11 +269,12 @@ static long vhost_dev_alloc_iovecs(struct vhost_dev *dev)
>  
>  	for (i = 0; i < dev->nvqs; ++i) {
>  		vq = dev->vqs[i];
> +		vq->iov = kmalloc(sizeof(*vq->iov) * UIO_MAXIOV, GFP_KERNEL);
>  		vq->indirect = kmalloc(sizeof *vq->indirect * UIO_MAXIOV,
>  				       GFP_KERNEL);
>  		vq->log = kmalloc(sizeof *vq->log * UIO_MAXIOV, GFP_KERNEL);
>  		vq->heads = kmalloc(sizeof *vq->heads * UIO_MAXIOV, GFP_KERNEL);
> -		if (!vq->indirect || !vq->log || !vq->heads)
> +		if (!vq->iov || !vq->indirect || !vq->log || !vq->heads)
>  			goto err_nomem;
>  	}
>  	return 0;
> @@ -310,6 +313,7 @@ void vhost_dev_init(struct vhost_dev *dev,
>  	for (i = 0; i < dev->nvqs; ++i) {
>  		vq = dev->vqs[i];
>  		vq->log = NULL;
> +		vq->iov = NULL;
>  		vq->indirect = NULL;
>  		vq->heads = NULL;
>  		vq->dev = dev;
> diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
> index 35eeb2a..541f757 100644
> --- a/drivers/vhost/vhost.h
> +++ b/drivers/vhost/vhost.h
> @@ -100,7 +100,7 @@ struct vhost_virtqueue {
>  	bool log_used;
>  	u64 log_addr;
>  
> -	struct iovec iov[UIO_MAXIOV];
> +	struct iovec *iov;
>  	struct iovec *indirect;
>  	struct vring_used_elem *heads;
>  	/* Protected by virtqueue mutex. */
> -- 
> 1.8.3.1


* Re: [PATCH v1] vhost: avoid large order allocations
  2014-05-13  8:40   ` Michael S. Tsirkin
@ 2014-05-13  8:57     ` Michael Mueller
  2014-05-13 14:29     ` Romain Francoise
  1 sibling, 0 replies; 9+ messages in thread
From: Michael Mueller @ 2014-05-13  8:57 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, virtualization, netdev, linux-kernel, borntraeger,
	cornelia.huck, ddongch, mimu

On Tue, 13 May 2014 11:40:49 +0300
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> Please don't do this; extra indirection hurts performance.
> Instead, please change vhost_net_open and scsi to allocate the whole
> structure with vmalloc if kmalloc fails, along the lines of
> 74d332c13b2148ae934ea94dac1745ae92efe8e5

Thanks for pointing us to: net: extend net_device allocation to vmalloc()

We'll try to adapt.

Michael  



* Re: [PATCH v1] vhost: avoid large order allocations
  2014-05-13  8:40   ` Michael S. Tsirkin
  2014-05-13  8:57     ` Michael Mueller
@ 2014-05-13 14:29     ` Romain Francoise
  2014-05-13 15:07       ` Michael Mueller
  2014-05-13 15:15       ` Michael S. Tsirkin
  1 sibling, 2 replies; 9+ messages in thread
From: Romain Francoise @ 2014-05-13 14:29 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Michael Mueller, kvm, virtualization, netdev, linux-kernel,
	borntraeger, cornelia.huck, ddongch

"Michael S. Tsirkin" <mst@redhat.com> writes:

> Please don't do this; extra indirection hurts performance.
> Instead, please change vhost_net_open and scsi to allocate the whole
> structure with vmalloc if kmalloc fails, along the lines of
> 74d332c13b2148ae934ea94dac1745ae92efe8e5

Back in January 2013, you didn't seem to think it was a good idea:

https://lkml.org/lkml/2013/1/23/492


* Re: [PATCH v1] vhost: avoid large order allocations
  2014-05-13 14:29     ` Romain Francoise
@ 2014-05-13 15:07       ` Michael Mueller
  2014-05-13 15:15       ` Michael S. Tsirkin
  1 sibling, 0 replies; 9+ messages in thread
From: Michael Mueller @ 2014-05-13 15:07 UTC (permalink / raw)
  To: Romain Francoise
  Cc: Michael S. Tsirkin, kvm, virtualization, netdev, linux-kernel,
	borntraeger, cornelia.huck, ddongch

On Tue, 13 May 2014 16:29:58 +0200
Romain Francoise <romain@orebokech.com> wrote:

> "Michael S. Tsirkin" <mst@redhat.com> writes:
> 
> > Please don't do this; extra indirection hurts performance.
> > Instead, please change vhost_net_open and scsi to allocate the whole
> > structure with vmalloc if kmalloc fails, along the lines of
> > 74d332c13b2148ae934ea94dac1745ae92efe8e5
> 
> Back in January 2013, you didn't seem to think it was a good idea:
> 
> https://lkml.org/lkml/2013/1/23/492
> 

Hi Romain,

In that case I'd suggest that you submit your patch; ours will look pretty much the same!

Cheers
Michael



* Re: [PATCH v1] vhost: avoid large order allocations
  2014-05-13 14:29     ` Romain Francoise
  2014-05-13 15:07       ` Michael Mueller
@ 2014-05-13 15:15       ` Michael S. Tsirkin
  2014-05-14  8:11         ` Michael Mueller
  2014-06-13 11:56         ` Michael Mueller
  1 sibling, 2 replies; 9+ messages in thread
From: Michael S. Tsirkin @ 2014-05-13 15:15 UTC (permalink / raw)
  To: Romain Francoise
  Cc: Michael Mueller, kvm, virtualization, netdev, linux-kernel,
	borntraeger, cornelia.huck, ddongch

On Tue, May 13, 2014 at 04:29:58PM +0200, Romain Francoise wrote:
> "Michael S. Tsirkin" <mst@redhat.com> writes:
> 
> > Please don't do this; extra indirection hurts performance.
> > Instead, please change vhost_net_open and scsi to allocate the whole
> > structure with vmalloc if kmalloc fails, along the lines of
> > 74d332c13b2148ae934ea94dac1745ae92efe8e5
> 
> Back in January 2013, you didn't seem to think it was a good idea:
> 
> https://lkml.org/lkml/2013/1/23/492

Hmm true, and Dave thought the structure's too large.
I'll have to do some benchmarks to see what the effect
of Michael's patch is, performance-wise.
If it's too expensive I can pick up your patch, no need to
repost.

-- 
MST


* Re: [PATCH v1] vhost: avoid large order allocations
  2014-05-13 15:15       ` Michael S. Tsirkin
@ 2014-05-14  8:11         ` Michael Mueller
  2014-06-13 11:56         ` Michael Mueller
  1 sibling, 0 replies; 9+ messages in thread
From: Michael Mueller @ 2014-05-14  8:11 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Romain Francoise, kvm, virtualization, netdev, linux-kernel,
	borntraeger, cornelia.huck, ddongch

On Tue, 13 May 2014 18:15:27 +0300
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Tue, May 13, 2014 at 04:29:58PM +0200, Romain Francoise wrote:
> > "Michael S. Tsirkin" <mst@redhat.com> writes:
> > 
> > > Please don't do this; extra indirection hurts performance.
> > > Instead, please change vhost_net_open and scsi to allocate the whole
> > > structure with vmalloc if kmalloc fails, along the lines of
> > > 74d332c13b2148ae934ea94dac1745ae92efe8e5
> > 
> > Back in January 2013, you didn't seem to think it was a good idea:
> > 
> > https://lkml.org/lkml/2013/1/23/492
> 
> Hmm true, and Dave thought the structure's too large.
> I'll have to do some benchmarks to see what the effect
> of Michael's patch is, performance-wise.
> If it's too expensive I can pick up your patch, no need to
> repost.
> 

Thanks, let us know then.



* Re: [PATCH v1] vhost: avoid large order allocations
  2014-05-13 15:15       ` Michael S. Tsirkin
  2014-05-14  8:11         ` Michael Mueller
@ 2014-06-13 11:56         ` Michael Mueller
  1 sibling, 0 replies; 9+ messages in thread
From: Michael Mueller @ 2014-06-13 11:56 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Romain Francoise, kvm, virtualization, netdev, linux-kernel,
	borntraeger, cornelia.huck, ddongch

On Tue, 13 May 2014 18:15:27 +0300
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Tue, May 13, 2014 at 04:29:58PM +0200, Romain Francoise wrote:
> > "Michael S. Tsirkin" <mst@redhat.com> writes:
> > 
> > > Please don't do this; extra indirection hurts performance.
> > > Instead, please change vhost_net_open and scsi to allocate the whole
> > > structure with vmalloc if kmalloc fails, along the lines of
> > > 74d332c13b2148ae934ea94dac1745ae92efe8e5
> > 
> > Back in January 2013, you didn't seem to think it was a good idea:
> > 
> > https://lkml.org/lkml/2013/1/23/492
> 
> Hmm true, and Dave thought the structure's too large.
> I'll have to do some benchmarks to see what the effect
> of Michael's patch is, performance-wise.
> If it's too expensive I can pick up your patch, no need to
> repost.
> 

Hi Michael,

do you have any update on this for us?

Michael



