All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] fix stuck in catting hwrng attributes
@ 2014-09-10  9:07 Amos Kong
  2014-09-10  9:07 ` [PATCH 1/2] virtio-rng cleanup: move some code out of mutex protection Amos Kong
  2014-09-10  9:07 ` [PATCH 2/2] virtio-rng: fix stuck in catting hwrng attributes Amos Kong
  0 siblings, 2 replies; 14+ messages in thread
From: Amos Kong @ 2014-09-10  9:07 UTC (permalink / raw)
  To: virtualization; +Cc: amit.shah, kvm

If we read hwrng by long-running dd process, it takes too much cpu time.
When we check hwrng attributes from sysfs by cat, it gets stuck.
The problem can only be reproduced with non-smp guest with slow backend.

This patchset changed hwrng core to always delay 10 jiffies, cat process
have chance to execute protected code, the problem is resolved.

Thanks.

Amos Kong (2):
  virtio-rng cleanup: move some code out of mutex protection
  virtio-rng: fix stuck in catting hwrng attributes

 drivers/char/hw_random/core.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

-- 
1.9.3

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/2] virtio-rng cleanup: move some code out of mutex protection
  2014-09-10  9:07 [PATCH 0/2] fix stuck in catting hwrng attributes Amos Kong
@ 2014-09-10  9:07 ` Amos Kong
  2014-09-11  6:07   ` Amit Shah
  2014-09-10  9:07 ` [PATCH 2/2] virtio-rng: fix stuck in catting hwrng attributes Amos Kong
  1 sibling, 1 reply; 14+ messages in thread
From: Amos Kong @ 2014-09-10  9:07 UTC (permalink / raw)
  To: virtualization; +Cc: amit.shah, kvm

It doesn't save too much cpu time as expected, just a cleanup.

Signed-off-by: Amos Kong <akong@redhat.com>
---
 drivers/char/hw_random/core.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
index aa30a25..c591d7e 100644
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -270,8 +270,8 @@ static ssize_t hwrng_attr_current_show(struct device *dev,
 		return -ERESTARTSYS;
 	if (current_rng)
 		name = current_rng->name;
-	ret = snprintf(buf, PAGE_SIZE, "%s\n", name);
 	mutex_unlock(&rng_mutex);
+	ret = snprintf(buf, PAGE_SIZE, "%s\n", name);
 
 	return ret;
 }
@@ -284,19 +284,19 @@ static ssize_t hwrng_attr_available_show(struct device *dev,
 	ssize_t ret = 0;
 	struct hwrng *rng;
 
+	buf[0] = '\0';
 	err = mutex_lock_interruptible(&rng_mutex);
 	if (err)
 		return -ERESTARTSYS;
-	buf[0] = '\0';
 	list_for_each_entry(rng, &rng_list, list) {
 		strncat(buf, rng->name, PAGE_SIZE - ret - 1);
 		ret += strlen(rng->name);
 		strncat(buf, " ", PAGE_SIZE - ret - 1);
 		ret++;
 	}
+	mutex_unlock(&rng_mutex);
 	strncat(buf, "\n", PAGE_SIZE - ret - 1);
 	ret++;
-	mutex_unlock(&rng_mutex);
 
 	return ret;
 }
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/2] virtio-rng: fix stuck in catting hwrng attributes
  2014-09-10  9:07 [PATCH 0/2] fix stuck in catting hwrng attributes Amos Kong
  2014-09-10  9:07 ` [PATCH 1/2] virtio-rng cleanup: move some code out of mutex protection Amos Kong
@ 2014-09-10  9:07 ` Amos Kong
  2014-09-11  6:08   ` Amit Shah
  2014-09-11 11:38   ` Rusty Russell
  1 sibling, 2 replies; 14+ messages in thread
From: Amos Kong @ 2014-09-10  9:07 UTC (permalink / raw)
  To: virtualization; +Cc: amit.shah, kvm

When I check hwrng attributes in sysfs, cat process always gets
stuck if guest has only 1 vcpu and uses a slow rng backend.

Currently we check if there is any tasks waiting to be run on
current cpu in rng_dev_read() by need_resched(). But need_resched()
doesn't work because rng_dev_read() is executing in user context.

This patch removed need_resched() and increase delay to 10 jiffies,
then other tasks can have chance to execute protected code.
Delaying 1 jiffy also works, but 10 jiffies is safer.

Signed-off-by: Amos Kong <akong@redhat.com>
---
 drivers/char/hw_random/core.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
index c591d7e..b5d1b6f 100644
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -195,8 +195,7 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
 
 		mutex_unlock(&rng_mutex);
 
-		if (need_resched())
-			schedule_timeout_interruptible(1);
+		schedule_timeout_interruptible(10);
 
 		if (signal_pending(current)) {
 			err = -ERESTARTSYS;
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] virtio-rng cleanup: move some code out of mutex protection
  2014-09-10  9:07 ` [PATCH 1/2] virtio-rng cleanup: move some code out of mutex protection Amos Kong
@ 2014-09-11  6:07   ` Amit Shah
  0 siblings, 0 replies; 14+ messages in thread
From: Amit Shah @ 2014-09-11  6:07 UTC (permalink / raw)
  To: Amos Kong; +Cc: herbert.xu, kvm, virtualization

On (Wed) 10 Sep 2014 [17:07:06], Amos Kong wrote:
> It doesn't save too much cpu time as expected, just a cleanup.

Frankly I won't bother with this.  It doesn't completely remove all
copying from the mutex, so it's not worthwhile.

> Signed-off-by: Amos Kong <akong@redhat.com>
> ---
>  drivers/char/hw_random/core.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
> index aa30a25..c591d7e 100644
> --- a/drivers/char/hw_random/core.c
> +++ b/drivers/char/hw_random/core.c
> @@ -270,8 +270,8 @@ static ssize_t hwrng_attr_current_show(struct device *dev,
>  		return -ERESTARTSYS;
>  	if (current_rng)
>  		name = current_rng->name;
> -	ret = snprintf(buf, PAGE_SIZE, "%s\n", name);
>  	mutex_unlock(&rng_mutex);
> +	ret = snprintf(buf, PAGE_SIZE, "%s\n", name);
>  
>  	return ret;
>  }
> @@ -284,19 +284,19 @@ static ssize_t hwrng_attr_available_show(struct device *dev,
>  	ssize_t ret = 0;
>  	struct hwrng *rng;
>  
> +	buf[0] = '\0';
>  	err = mutex_lock_interruptible(&rng_mutex);
>  	if (err)
>  		return -ERESTARTSYS;
> -	buf[0] = '\0';
>  	list_for_each_entry(rng, &rng_list, list) {
>  		strncat(buf, rng->name, PAGE_SIZE - ret - 1);
>  		ret += strlen(rng->name);
>  		strncat(buf, " ", PAGE_SIZE - ret - 1);
>  		ret++;
>  	}
> +	mutex_unlock(&rng_mutex);
>  	strncat(buf, "\n", PAGE_SIZE - ret - 1);
>  	ret++;
> -	mutex_unlock(&rng_mutex);
>  
>  	return ret;
>  }
> -- 
> 1.9.3
> 


		Amit

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] virtio-rng: fix stuck in catting hwrng attributes
  2014-09-10  9:07 ` [PATCH 2/2] virtio-rng: fix stuck in catting hwrng attributes Amos Kong
@ 2014-09-11  6:08   ` Amit Shah
  2014-09-14  1:16     ` Amos Kong
  2014-09-11 11:38   ` Rusty Russell
  1 sibling, 1 reply; 14+ messages in thread
From: Amit Shah @ 2014-09-11  6:08 UTC (permalink / raw)
  To: Amos Kong; +Cc: herbert.xu, kvm, virtualization

On (Wed) 10 Sep 2014 [17:07:07], Amos Kong wrote:
> When I check hwrng attributes in sysfs, cat process always gets
> stuck if guest has only 1 vcpu and uses a slow rng backend.
> 
> Currently we check if there is any tasks waiting to be run on
> current cpu in rng_dev_read() by need_resched(). But need_resched()
> doesn't work because rng_dev_read() is executing in user context.
> 
> This patch removed need_resched() and increase delay to 10 jiffies,
> then other tasks can have chance to execute protected code.
> Delaying 1 jiffy also works, but 10 jiffies is safer.

I'd prefer two patches for this one: one to remove the need_resched()
check, and the other to increase the timeout.

Anyway,

Reviewed-by: Amit Shah <amit.shah@redhat.com>

> 
> Signed-off-by: Amos Kong <akong@redhat.com>
> ---
>  drivers/char/hw_random/core.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
> index c591d7e..b5d1b6f 100644
> --- a/drivers/char/hw_random/core.c
> +++ b/drivers/char/hw_random/core.c
> @@ -195,8 +195,7 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
>  
>  		mutex_unlock(&rng_mutex);
>  
> -		if (need_resched())
> -			schedule_timeout_interruptible(1);
> +		schedule_timeout_interruptible(10);
>  
>  		if (signal_pending(current)) {
>  			err = -ERESTARTSYS;
> -- 
> 1.9.3
> 

		Amit

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] virtio-rng: fix stuck in catting hwrng attributes
  2014-09-10  9:07 ` [PATCH 2/2] virtio-rng: fix stuck in catting hwrng attributes Amos Kong
  2014-09-11  6:08   ` Amit Shah
@ 2014-09-11 11:38   ` Rusty Russell
  2014-09-13 17:12     ` Amos Kong
  1 sibling, 1 reply; 14+ messages in thread
From: Rusty Russell @ 2014-09-11 11:38 UTC (permalink / raw)
  To: Amos Kong, virtualization; +Cc: amit.shah, kvm

Amos Kong <akong@redhat.com> writes:
> When I check hwrng attributes in sysfs, cat process always gets
> stuck if guest has only 1 vcpu and uses a slow rng backend.
>
> Currently we check if there is any tasks waiting to be run on
> current cpu in rng_dev_read() by need_resched(). But need_resched()
> doesn't work because rng_dev_read() is executing in user context.

I don't understand this explanation?  I'd expect the sysfs process to be
woken by the mutex_unlock().

If we're really high priority (vs. the sysfs process) then I can see why
we'd need schedule_timeout_interruptible() instead of just schedule(),
and in that case, need_resched() would be false too.

You could argue that's intended behaviour, but I can't see how it
happens in the normal case anyway.

What am I missing?

Thanks,
Rusty.

> This patch removed need_resched() and increase delay to 10 jiffies,
> then other tasks can have chance to execute protected code.
> Delaying 1 jiffy also works, but 10 jiffies is safer.
>
> Signed-off-by: Amos Kong <akong@redhat.com>
> ---
>  drivers/char/hw_random/core.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
> index c591d7e..b5d1b6f 100644
> --- a/drivers/char/hw_random/core.c
> +++ b/drivers/char/hw_random/core.c
> @@ -195,8 +195,7 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
>  
>  		mutex_unlock(&rng_mutex);
>  
> -		if (need_resched())
> -			schedule_timeout_interruptible(1);
> +		schedule_timeout_interruptible(10);
>  
>  		if (signal_pending(current)) {
>  			err = -ERESTARTSYS;
> -- 
> 1.9.3

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] virtio-rng: fix stuck in catting hwrng attributes
  2014-09-11 11:38   ` Rusty Russell
@ 2014-09-13 17:12     ` Amos Kong
  2014-09-14  1:12       ` Amos Kong
  0 siblings, 1 reply; 14+ messages in thread
From: Amos Kong @ 2014-09-13 17:12 UTC (permalink / raw)
  To: Rusty Russell; +Cc: amit.shah, kvm, virtualization

On Thu, Sep 11, 2014 at 09:08:03PM +0930, Rusty Russell wrote:
> Amos Kong <akong@redhat.com> writes:
> > When I check hwrng attributes in sysfs, cat process always gets
> > stuck if guest has only 1 vcpu and uses a slow rng backend.
> >
> > Currently we check if there is any tasks waiting to be run on
> > current cpu in rng_dev_read() by need_resched(). But need_resched()
> > doesn't work because rng_dev_read() is executing in user context.
> 
> I don't understand this explanation?  I'd expect the sysfs process to be
> woken by the mutex_unlock().

But actually sysfs process's not woken always, this is they the
process gets stuck.
 
> If we're really high priority (vs. the sysfs process) then I can see why
> we'd need schedule_timeout_interruptible() instead of just schedule(),
> and in that case, need_resched() would be false too.
> 
> You could argue that's intended behaviour, but I can't see how it
> happens in the normal case anyway.
> 
> What am I missing?
> 
> Thanks,
> Rusty.
> 
> > This patch removed need_resched() and increase delay to 10 jiffies,
> > then other tasks can have chance to execute protected code.
> > Delaying 1 jiffy also works, but 10 jiffies is safer.
> >
> > Signed-off-by: Amos Kong <akong@redhat.com>
> > ---
> >  drivers/char/hw_random/core.c | 3 +--
> >  1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
> > index c591d7e..b5d1b6f 100644
> > --- a/drivers/char/hw_random/core.c
> > +++ b/drivers/char/hw_random/core.c
> > @@ -195,8 +195,7 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
> >  
> >  		mutex_unlock(&rng_mutex);
> >  
> > -		if (need_resched())
> > -			schedule_timeout_interruptible(1);
> > +		schedule_timeout_interruptible(10);
> >  
> >  		if (signal_pending(current)) {
> >  			err = -ERESTARTSYS;
> > -- 
> > 1.9.3

-- 
			Amos.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] virtio-rng: fix stuck in catting hwrng attributes
  2014-09-13 17:12     ` Amos Kong
@ 2014-09-14  1:12       ` Amos Kong
  2014-09-14  2:25         ` Amos Kong
  2014-09-16 15:35         ` Rusty Russell
  0 siblings, 2 replies; 14+ messages in thread
From: Amos Kong @ 2014-09-14  1:12 UTC (permalink / raw)
  To: Rusty Russell; +Cc: amit.shah, kvm, virtualization

On Sun, Sep 14, 2014 at 01:12:58AM +0800, Amos Kong wrote:
> On Thu, Sep 11, 2014 at 09:08:03PM +0930, Rusty Russell wrote:
> > Amos Kong <akong@redhat.com> writes:
> > > When I check hwrng attributes in sysfs, cat process always gets
> > > stuck if guest has only 1 vcpu and uses a slow rng backend.
> > >
> > > Currently we check if there is any tasks waiting to be run on
> > > current cpu in rng_dev_read() by need_resched(). But need_resched()
> > > doesn't work because rng_dev_read() is executing in user context.
> > 
> > I don't understand this explanation?  I'd expect the sysfs process to be
> > woken by the mutex_unlock().
> 
> But actually sysfs process's not woken always, this is they the
> process gets stuck.

%s/they/why/

Hi Rusty,


Reference:
http://www.linuxgrill.com/anonymous/fire/netfilter/kernel-hacking-HOWTO-2.html

read() syscall of /dev/hwrng will enter into kernel, the read operation is
rng_dev_read(), it's userspace context (not interrupt context).

Userspace context doesn't allow other user contexts run on that CPU,
unless the kernel code sleeps for some reason.


In this case, the need_resched() doesn't work.

My solution is removing need_resched() and use an appropriate delay by 
schedule_timeout_interruptible(10).

Thanks, Amos
  
> > If we're really high priority (vs. the sysfs process) then I can see why
> > we'd need schedule_timeout_interruptible() instead of just schedule(),
> > and in that case, need_resched() would be false too.
> > 
> > You could argue that's intended behaviour, but I can't see how it
> > happens in the normal case anyway.
> > 
> > What am I missing?

> > Thanks,
> > Rusty.
> > 
> > > This patch removed need_resched() and increase delay to 10 jiffies,
> > > then other tasks can have chance to execute protected code.
> > > Delaying 1 jiffy also works, but 10 jiffies is safer.
> > >
> > > Signed-off-by: Amos Kong <akong@redhat.com>
> > > ---
> > >  drivers/char/hw_random/core.c | 3 +--
> > >  1 file changed, 1 insertion(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
> > > index c591d7e..b5d1b6f 100644
> > > --- a/drivers/char/hw_random/core.c
> > > +++ b/drivers/char/hw_random/core.c
> > > @@ -195,8 +195,7 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
> > >  
> > >  		mutex_unlock(&rng_mutex);
> > >  
> > > -		if (need_resched())
> > > -			schedule_timeout_interruptible(1);
> > > +		schedule_timeout_interruptible(10);
> > >  
> > >  		if (signal_pending(current)) {
> > >  			err = -ERESTARTSYS;
> > > -- 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] virtio-rng: fix stuck in catting hwrng attributes
  2014-09-11  6:08   ` Amit Shah
@ 2014-09-14  1:16     ` Amos Kong
  0 siblings, 0 replies; 14+ messages in thread
From: Amos Kong @ 2014-09-14  1:16 UTC (permalink / raw)
  To: Amit Shah; +Cc: herbert.xu, kvm, virtualization

On Thu, Sep 11, 2014 at 11:38:38AM +0530, Amit Shah wrote:
> On (Wed) 10 Sep 2014 [17:07:07], Amos Kong wrote:
> > When I check hwrng attributes in sysfs, cat process always gets
> > stuck if guest has only 1 vcpu and uses a slow rng backend.
> > 
> > Currently we check if there is any tasks waiting to be run on
> > current cpu in rng_dev_read() by need_resched(). But need_resched()
> > doesn't work because rng_dev_read() is executing in user context.
> > 
> > This patch removed need_resched() and increase delay to 10 jiffies,
> > then other tasks can have chance to execute protected code.
> > Delaying 1 jiffy also works, but 10 jiffies is safer.

Hi Amit,
 
> I'd prefer two patches for this one: one to remove the need_resched()
> check, and the other to increase the timeout.

If Rusty agrees with this fix, I will respin to update the commitlog
with clear description and split the patches to 3.

Thanks for the review.
 
> Anyway,
> 
> Reviewed-by: Amit Shah <amit.shah@redhat.com>

--
                Amos.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] virtio-rng: fix stuck in catting hwrng attributes
  2014-09-14  1:12       ` Amos Kong
@ 2014-09-14  2:25         ` Amos Kong
  2014-09-15 16:48           ` Radim Krčmář
  2014-09-16 15:35         ` Rusty Russell
  1 sibling, 1 reply; 14+ messages in thread
From: Amos Kong @ 2014-09-14  2:25 UTC (permalink / raw)
  To: Rusty Russell; +Cc: amit.shah, kvm, virtualization

On Sun, Sep 14, 2014 at 09:12:08AM +0800, Amos Kong wrote:

...
> > > > diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
> > > > index c591d7e..b5d1b6f 100644
> > > > --- a/drivers/char/hw_random/core.c
> > > > +++ b/drivers/char/hw_random/core.c
> > > > @@ -195,8 +195,7 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
> > > >  
> > > >  		mutex_unlock(&rng_mutex);
> > > >  
> > > > -		if (need_resched())
> > > > -			schedule_timeout_interruptible(1);
> > > > +		schedule_timeout_interruptible(10);

Problem only occurred in non-smp guest, we can improve it to:

                        if(!is_smp())
                                schedule_timeout_interruptible(10);

is_smp() is only available for arm arch, we need a general one.

> > > >  
> > > >  		if (signal_pending(current)) {
> > > >  			err = -ERESTARTSYS;
> > > > -- 

-- 
			Amos.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] virtio-rng: fix stuck in catting hwrng attributes
  2014-09-14  2:25         ` Amos Kong
@ 2014-09-15 16:48           ` Radim Krčmář
  2014-09-16  0:50               ` Amos Kong
  0 siblings, 1 reply; 14+ messages in thread
From: Radim Krčmář @ 2014-09-15 16:48 UTC (permalink / raw)
  To: Amos Kong; +Cc: amit.shah, kvm, virtualization

2014-09-14 10:25+0800, Amos Kong:
> On Sun, Sep 14, 2014 at 09:12:08AM +0800, Amos Kong wrote:
> 
> ...
> > > > > diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
> > > > > index c591d7e..b5d1b6f 100644
> > > > > --- a/drivers/char/hw_random/core.c
> > > > > +++ b/drivers/char/hw_random/core.c
> > > > > @@ -195,8 +195,7 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
> > > > >  
> > > > >  		mutex_unlock(&rng_mutex);
> > > > >  
> > > > > -		if (need_resched())
> > > > > -			schedule_timeout_interruptible(1);
> > > > > +		schedule_timeout_interruptible(10);

If cond_resched() does not work, it is a bug elsewehere.

> Problem only occurred in non-smp guest, we can improve it to:
> 
>                         if(!is_smp())
>                                 schedule_timeout_interruptible(10);
> 
> is_smp() is only available for arm arch, we need a general one.

(It is num_online_cpus() > 1.)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] virtio-rng: fix stuck in catting hwrng attributes
  2014-09-15 16:48           ` Radim Krčmář
@ 2014-09-16  0:50               ` Amos Kong
  0 siblings, 0 replies; 14+ messages in thread
From: Amos Kong @ 2014-09-16  0:50 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: Rusty Russell, virtualization, kvm, amit.shah, jasowang,
	linux-kernel, m, herbert, mpm

CC linux-kernel

Original thread: http://comments.gmane.org/gmane.linux.kernel.virtualization/22775

On Mon, Sep 15, 2014 at 06:48:46PM +0200, Radim Krčmář wrote:
> 2014-09-14 10:25+0800, Amos Kong:
> > On Sun, Sep 14, 2014 at 09:12:08AM +0800, Amos Kong wrote:
> > 
> > ...
> > > > > > diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
> > > > > > index c591d7e..b5d1b6f 100644
> > > > > > --- a/drivers/char/hw_random/core.c
> > > > > > +++ b/drivers/char/hw_random/core.c
> > > > > > @@ -195,8 +195,7 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
> > > > > >  
> > > > > >  		mutex_unlock(&rng_mutex);
> > > > > >  
> > > > > > -		if (need_resched())
> > > > > > -			schedule_timeout_interruptible(1);
> > > > > > +		schedule_timeout_interruptible(10);
> 
> If cond_resched() does not work, it is a bug elsewehere.

Thanks for your reply, Jason also told me the TIF_NEED_RESCHED should
be set in this case, then need_resched() returns true.

I will investigate the issue and reply you later.
 
> > Problem only occurred in non-smp guest, we can improve it to:
> > 
> >                         if(!is_smp())
> >                                 schedule_timeout_interruptible(10);
> > 
> > is_smp() is only available for arm arch, we need a general one.
> 
> (It is num_online_cpus() > 1.)

-- 
			Amos.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] virtio-rng: fix stuck in catting hwrng attributes
@ 2014-09-16  0:50               ` Amos Kong
  0 siblings, 0 replies; 14+ messages in thread
From: Amos Kong @ 2014-09-16  0:50 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: herbert, kvm, linux-kernel, virtualization, m, mpm, amit.shah

CC linux-kernel

Original thread: http://comments.gmane.org/gmane.linux.kernel.virtualization/22775

On Mon, Sep 15, 2014 at 06:48:46PM +0200, Radim Krčmář wrote:
> 2014-09-14 10:25+0800, Amos Kong:
> > On Sun, Sep 14, 2014 at 09:12:08AM +0800, Amos Kong wrote:
> > 
> > ...
> > > > > > diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
> > > > > > index c591d7e..b5d1b6f 100644
> > > > > > --- a/drivers/char/hw_random/core.c
> > > > > > +++ b/drivers/char/hw_random/core.c
> > > > > > @@ -195,8 +195,7 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
> > > > > >  
> > > > > >  		mutex_unlock(&rng_mutex);
> > > > > >  
> > > > > > -		if (need_resched())
> > > > > > -			schedule_timeout_interruptible(1);
> > > > > > +		schedule_timeout_interruptible(10);
> 
> If cond_resched() does not work, it is a bug elsewehere.

Thanks for your reply, Jason also told me the TIF_NEED_RESCHED should
be set in this case, then need_resched() returns true.

I will investigate the issue and reply you later.
 
> > Problem only occurred in non-smp guest, we can improve it to:
> > 
> >                         if(!is_smp())
> >                                 schedule_timeout_interruptible(10);
> > 
> > is_smp() is only available for arm arch, we need a general one.
> 
> (It is num_online_cpus() > 1.)

-- 
			Amos.
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] virtio-rng: fix stuck in catting hwrng attributes
  2014-09-14  1:12       ` Amos Kong
  2014-09-14  2:25         ` Amos Kong
@ 2014-09-16 15:35         ` Rusty Russell
  1 sibling, 0 replies; 14+ messages in thread
From: Rusty Russell @ 2014-09-16 15:35 UTC (permalink / raw)
  To: Amos Kong; +Cc: amit.shah, kvm, virtualization

Amos Kong <akong@redhat.com> writes:
> On Sun, Sep 14, 2014 at 01:12:58AM +0800, Amos Kong wrote:
>> On Thu, Sep 11, 2014 at 09:08:03PM +0930, Rusty Russell wrote:
>> > Amos Kong <akong@redhat.com> writes:
>> > > When I check hwrng attributes in sysfs, cat process always gets
>> > > stuck if guest has only 1 vcpu and uses a slow rng backend.
>> > >
>> > > Currently we check if there is any tasks waiting to be run on
>> > > current cpu in rng_dev_read() by need_resched(). But need_resched()
>> > > doesn't work because rng_dev_read() is executing in user context.
>> > 
>> > I don't understand this explanation?  I'd expect the sysfs process to be
>> > woken by the mutex_unlock().
>> 
>> But actually sysfs process's not woken always, this is they the
>> process gets stuck.
>
> %s/they/why/
>
> Hi Rusty,
>
>
> Reference:
> http://www.linuxgrill.com/anonymous/fire/netfilter/kernel-hacking-HOWTO-2.html

Sure, that was true when I wrote it, and is still true when preempt is
off.

> read() syscall of /dev/hwrng will enter into kernel, the read operation is
> rng_dev_read(), it's userspace context (not interrupt context).
>
> Userspace context doesn't allow other user contexts run on that CPU,
> unless the kernel code sleeps for some reason.

This is true assuming preempt is off, yes.

> In this case, the need_resched() doesn't work.

This is exactly what need_resched() is for: it should return true if
there's another process of sufficient priority waiting to be run.  It
implies that schedule() would run it.

git blame doesn't offer any enlightenment here, as to why we use
schedule_timeout_interruptible() at all.

I would expect mutex_unlock() to wake the other reader.  The code
certainly seems to, so it should now be runnable and need_resched()
should return true.

I suspect something else is happening which makes this "work".

Cheers,
Rusty.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2014-09-16 15:35 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-10  9:07 [PATCH 0/2] fix stuck in catting hwrng attributes Amos Kong
2014-09-10  9:07 ` [PATCH 1/2] virtio-rng cleanup: move some code out of mutex protection Amos Kong
2014-09-11  6:07   ` Amit Shah
2014-09-10  9:07 ` [PATCH 2/2] virtio-rng: fix stuck in catting hwrng attributes Amos Kong
2014-09-11  6:08   ` Amit Shah
2014-09-14  1:16     ` Amos Kong
2014-09-11 11:38   ` Rusty Russell
2014-09-13 17:12     ` Amos Kong
2014-09-14  1:12       ` Amos Kong
2014-09-14  2:25         ` Amos Kong
2014-09-15 16:48           ` Radim Krčmář
2014-09-16  0:50             ` Amos Kong
2014-09-16  0:50               ` Amos Kong
2014-09-16 15:35         ` Rusty Russell

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.