linux-kernel.vger.kernel.org archive mirror
* [PATCH] ipmi_si_intf: use usleep_range() instead of busy looping
@ 2019-07-09 21:06 Tejun Heo
  2019-07-09 21:46 ` Corey Minyard
  0 siblings, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2019-07-09 21:06 UTC (permalink / raw)
  To: Corey Minyard; +Cc: openipmi-developer, linux-kernel, kernel-team

ipmi_thread() uses back-to-back schedule() to poll for command
completion which, on some machines, can push up CPU consumption and
heavily tax the scheduler locks leading to noticeable overall
performance degradation.

This patch replaces schedule() with usleep_range(100, 200).  This
allows the sensor readings to finish reasonably fast while the CPU
consumption of the kthread stays below a few percent of a core.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 drivers/char/ipmi/ipmi_si_intf.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
index f124a2d2bb9f..2143e3c10623 100644
--- a/drivers/char/ipmi/ipmi_si_intf.c
+++ b/drivers/char/ipmi/ipmi_si_intf.c
@@ -1010,7 +1010,7 @@ static int ipmi_thread(void *data)
 		if (smi_result == SI_SM_CALL_WITHOUT_DELAY)
 			; /* do nothing */
 		else if (smi_result == SI_SM_CALL_WITH_DELAY && busy_wait)
-			schedule();
+			usleep_range(100, 200);
 		else if (smi_result == SI_SM_IDLE) {
 			if (atomic_read(&smi_info->need_watch)) {
 				schedule_timeout_interruptible(100);

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] ipmi_si_intf: use usleep_range() instead of busy looping
  2019-07-09 21:06 [PATCH] ipmi_si_intf: use usleep_range() instead of busy looping Tejun Heo
@ 2019-07-09 21:46 ` Corey Minyard
  2019-07-09 22:09   ` Tejun Heo
  2019-07-09 22:11   ` Tejun Heo
  0 siblings, 2 replies; 13+ messages in thread
From: Corey Minyard @ 2019-07-09 21:46 UTC (permalink / raw)
  To: Tejun Heo; +Cc: openipmi-developer, linux-kernel, kernel-team

On Tue, Jul 09, 2019 at 02:06:43PM -0700, Tejun Heo wrote:
> ipmi_thread() uses back-to-back schedule() to poll for command
> completion which, on some machines, can push up CPU consumption and
> heavily tax the scheduler locks leading to noticeable overall
> performance degradation.
> 
> This patch replaces schedule() with usleep_range(100, 200).  This
> allows the sensor readings to finish reasonably fast while the CPU
> consumption of the kthread stays below a few percent of a core.

The IPMI thread was not really designed for sensor reading, it was
designed so that firmware updates would happen in a reasonable time
on systems without an interrupt on the IPMI interface.  This change
will degrade performance for that function.  IIRC, the
people who did the patch tried this and it slowed things down too
much.

I'm also a little confused because the CPU in question shouldn't
be doing anything else if the schedule() immediately returns here,
so it's not wasting CPU that could be used on another process.  Or
is it lock contention that is causing an issue on other CPUs?

IMHO, this whole thing is stupid; if you design hardware with
stupid interfaces (byte at a time, no interrupts) you should
expect to get bad performance.  But I can't control what the
hardware vendors do.  This whole thing is a carefully tuned
compromise.

So I can't really take this as-is.

-corey

> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> ---
>  drivers/char/ipmi/ipmi_si_intf.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
> index f124a2d2bb9f..2143e3c10623 100644
> --- a/drivers/char/ipmi/ipmi_si_intf.c
> +++ b/drivers/char/ipmi/ipmi_si_intf.c
> @@ -1010,7 +1010,7 @@ static int ipmi_thread(void *data)
>  		if (smi_result == SI_SM_CALL_WITHOUT_DELAY)
>  			; /* do nothing */
>  		else if (smi_result == SI_SM_CALL_WITH_DELAY && busy_wait)
> -			schedule();
> +			usleep_range(100, 200);
>  		else if (smi_result == SI_SM_IDLE) {
>  			if (atomic_read(&smi_info->need_watch)) {
>  				schedule_timeout_interruptible(100);

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] ipmi_si_intf: use usleep_range() instead of busy looping
  2019-07-09 21:46 ` Corey Minyard
@ 2019-07-09 22:09   ` Tejun Heo
  2019-07-09 23:01     ` Corey Minyard
  2019-07-09 22:11   ` Tejun Heo
  1 sibling, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2019-07-09 22:09 UTC (permalink / raw)
  To: Corey Minyard; +Cc: openipmi-developer, linux-kernel, kernel-team

Hello, Corey.

On Tue, Jul 09, 2019 at 04:46:02PM -0500, Corey Minyard wrote:
> I'm also a little confused because the CPU in question shouldn't
> be doing anything else if the schedule() immediately returns here,
> so it's not wasting CPU that could be used on another process.  Or
> is it lock contention that is causing an issue on other CPUs?

Yeah, pretty pronounced too and it also keeps the CPU busy which makes
the load balancer deprioritize that CPU.  Busy looping is never free.

> IMHO, this whole thing is stupid; if you design hardware with
> stupid interfaces (byte at a time, no interrupts) you should
> expect to get bad performance.  But I can't control what the
> hardware vendors do.  This whole thing is a carefully tuned
> compromise.

I'm really not sure "carefully tuned" is applicable to indefinite busy
looping.

> So I can't really take this as-is.

We can go for shorter timeouts for sure but I don't think this sort of
busy looping is acceptable.  Is your position that this must be a busy
loop?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] ipmi_si_intf: use usleep_range() instead of busy looping
  2019-07-09 21:46 ` Corey Minyard
  2019-07-09 22:09   ` Tejun Heo
@ 2019-07-09 22:11   ` Tejun Heo
  2019-07-09 23:07     ` [Openipmi-developer] " Corey Minyard
  1 sibling, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2019-07-09 22:11 UTC (permalink / raw)
  To: Corey Minyard; +Cc: openipmi-developer, linux-kernel, kernel-team

On Tue, Jul 09, 2019 at 04:46:02PM -0500, Corey Minyard wrote:
> On Tue, Jul 09, 2019 at 02:06:43PM -0700, Tejun Heo wrote:
> > ipmi_thread() uses back-to-back schedule() to poll for command
> > completion which, on some machines, can push up CPU consumption and
> > heavily tax the scheduler locks leading to noticeable overall
> > performance degradation.
> > 
> > This patch replaces schedule() with usleep_range(100, 200).  This
> > allows the sensor readings to finish reasonably fast while the CPU
> > consumption of the kthread stays below a few percent of a core.
> 
> The IPMI thread was not really designed for sensor reading, it was
> designed so that firmware updates would happen in a reasonable time
> on systems without an interrupt on the IPMI interface.  This change
> will degrade performance for that function.  IIRC, the
> people who did the patch tried this and it slowed things down too
> much.

Also, can you point me to the exact patch?  I'm kinda curious what
kind of timing they used.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] ipmi_si_intf: use usleep_range() instead of busy looping
  2019-07-09 22:09   ` Tejun Heo
@ 2019-07-09 23:01     ` Corey Minyard
  2019-07-10 14:22       ` Tejun Heo
  0 siblings, 1 reply; 13+ messages in thread
From: Corey Minyard @ 2019-07-09 23:01 UTC (permalink / raw)
  To: Tejun Heo; +Cc: openipmi-developer, linux-kernel, kernel-team

On Tue, Jul 09, 2019 at 03:09:08PM -0700, Tejun Heo wrote:
> Hello, Corey.
> 
> On Tue, Jul 09, 2019 at 04:46:02PM -0500, Corey Minyard wrote:
> > I'm also a little confused because the CPU in question shouldn't
> > be doing anything else if the schedule() immediately returns here,
> > so it's not wasting CPU that could be used on another process.  Or
> > is it lock contention that is causing an issue on other CPUs?
> 
> Yeah, pretty pronounced too and it also keeps the CPU busy which makes
> the load balancer deprioritize that CPU.  Busy looping is never free.
> 
> > IMHO, this whole thing is stupid; if you design hardware with
> > stupid interfaces (byte at a time, no interrupts) you should
> > expect to get bad performance.  But I can't control what the
> > hardware vendors do.  This whole thing is a carefully tuned
> > compromise.
> 
> I'm really not sure "carefully tuned" is applicable to indefinite busy
> looping.

Well, yeah, but other things were tried and this was the only thing
we could find that worked.  That was before the kind of SMP stuff
we have now, though.

> 
> > So I can't really take this as-is.
> 
> We can go for shorter timeouts for sure but I don't think this sort of
> busy looping is acceptable.  Is your position that this must be a busy
> loop?

Well, no.  I want something that provides as high a throughput as
possible and doesn't cause scheduling issues.  But that may not be
possible.  Screwing up the scheduler is a lot worse than slow IPMI
firmware updates.

How short can the timeouts be and avoid issues?

Thanks,

-corey

> 
> Thanks.
> 
> -- 
> tejun

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Openipmi-developer] [PATCH] ipmi_si_intf: use usleep_range() instead of busy looping
  2019-07-09 22:11   ` Tejun Heo
@ 2019-07-09 23:07     ` Corey Minyard
  2019-07-10 14:12       ` Tejun Heo
  0 siblings, 1 reply; 13+ messages in thread
From: Corey Minyard @ 2019-07-09 23:07 UTC (permalink / raw)
  To: Tejun Heo; +Cc: openipmi-developer, kernel-team, linux-kernel

On Tue, Jul 09, 2019 at 03:11:47PM -0700, Tejun Heo wrote:
> On Tue, Jul 09, 2019 at 04:46:02PM -0500, Corey Minyard wrote:
> > On Tue, Jul 09, 2019 at 02:06:43PM -0700, Tejun Heo wrote:
> > > ipmi_thread() uses back-to-back schedule() to poll for command
> > > completion which, on some machines, can push up CPU consumption and
> > > heavily tax the scheduler locks leading to noticeable overall
> > > performance degradation.
> > > 
> > > This patch replaces schedule() with usleep_range(100, 200).  This
> > > allows the sensor readings to finish reasonably fast while the CPU
> > > consumption of the kthread stays below a few percent of a core.
> > 
> > The IPMI thread was not really designed for sensor reading, it was
> > designed so that firmware updates would happen in a reasonable time
> > on systems without an interrupt on the IPMI interface.  This change
> > will degrade performance for that function.  IIRC, the
> > people who did the patch tried this and it slowed things down too
> > much.
> 
> Also, can you point me to the exact patch?  I'm kinda curious what
> kind of timing they used.

I believe the change was 33979734cd35ae "IPMI: use schedule in kthread"
The original change that added the kthread was a9a2c44ff0a1350
"ipmi: add timer thread".

I misremembered this; we switched from doing a udelay() to
schedule(), but that udelay was 1us, so that's probably not helpful
information.

-corey

> 
> Thanks.
> 
> -- 
> tejun
> 
> 
> _______________________________________________
> Openipmi-developer mailing list
> Openipmi-developer@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/openipmi-developer

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Openipmi-developer] [PATCH] ipmi_si_intf: use usleep_range() instead of busy looping
  2019-07-09 23:07     ` [Openipmi-developer] " Corey Minyard
@ 2019-07-10 14:12       ` Tejun Heo
  0 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2019-07-10 14:12 UTC (permalink / raw)
  To: Corey Minyard; +Cc: openipmi-developer, kernel-team, linux-kernel

Hello, Corey.

On Tue, Jul 09, 2019 at 06:07:03PM -0500, Corey Minyard wrote:
> I believe the change was 33979734cd35ae "IPMI: use schedule in kthread"
> The original change that added the kthread was a9a2c44ff0a1350
> "ipmi: add timer thread".
> 
> I misremembered this; we switched from doing a udelay() to
> schedule(), but that udelay was 1us, so that's probably not helpful
> information.

I see, so it went from a non-yielding busy loop to a yielding one.
And, yeah, udelay(1) isn't much of a data point.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] ipmi_si_intf: use usleep_range() instead of busy looping
  2019-07-09 23:01     ` Corey Minyard
@ 2019-07-10 14:22       ` Tejun Heo
  2019-07-10 20:11         ` Corey Minyard
  2019-08-01 17:40         ` Corey Minyard
  0 siblings, 2 replies; 13+ messages in thread
From: Tejun Heo @ 2019-07-10 14:22 UTC (permalink / raw)
  To: Corey Minyard; +Cc: openipmi-developer, linux-kernel, kernel-team

Hello,

On Tue, Jul 09, 2019 at 06:01:44PM -0500, Corey Minyard wrote:
> > I'm really not sure "carefully tuned" is applicable to indefinite busy
> > looping.
> 
> Well, yeah, but other things were tried and this was the only thing
> we could find that worked.  That was before the kind of SMP stuff
> we have now, though.

I see.

> > We can go for shorter timeouts for sure but I don't think this sort of
> > busy looping is acceptable.  Is your position that this must be a busy
> > loop?
> 
> Well, no.  I want something that provides as high a throughput as
> possible and doesn't cause scheduling issues.  But that may not be
> possible.  Screwing up the scheduler is a lot worse than slow IPMI
> firmware updates.
> 
> How short can the timeouts be and avoid issues?

We first tried msleep(1) and that was too slow even for sensor reading
making it take longer than 50s.  With the 100us-200us sleep, it got
down to ~5s which was good enough for our use case and the cpu /
scheduler impact was still mostly negligible.  I can't tell for sure
without testing but going significantly below 100us is likely to
become visible pretty quickly.

We can also take a hybrid approach where we busy poll w/ 1us udelay
up to, say, fifty times and then switch to a sleeping poll.
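
Roughly what I have in mind, as an untested sketch (the 50-spin cutoff
is arbitrary and the spin counter would need resetting per message):

	else if (smi_result == SI_SM_CALL_WITH_DELAY && busy_wait) {
		/* spin briefly in case the BMC answers right away */
		if (spins++ < 50)
			udelay(1);
		else
			usleep_range(100, 200);	/* then back off and yield */
	}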

Are there some tests which can be used to verify the cases which may
get impacted by these changes?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] ipmi_si_intf: use usleep_range() instead of busy looping
  2019-07-10 14:22       ` Tejun Heo
@ 2019-07-10 20:11         ` Corey Minyard
  2019-08-01 17:40         ` Corey Minyard
  1 sibling, 0 replies; 13+ messages in thread
From: Corey Minyard @ 2019-07-10 20:11 UTC (permalink / raw)
  To: Tejun Heo; +Cc: openipmi-developer, linux-kernel, kernel-team

On Wed, Jul 10, 2019 at 07:22:21AM -0700, Tejun Heo wrote:
> Hello,
> 
> > > We can go for shorter timeouts for sure but I don't think this sort of
> > > busy looping is acceptable.  Is your position that this must be a busy
> > > loop?
> > 
> > Well, no.  I want something that provides as high a throughput as
> > possible and doesn't cause scheduling issues.  But that may not be
> > possible.  Screwing up the scheduler is a lot worse than slow IPMI
> > firmware updates.
> > 
> > How short can the timeouts be and avoid issues?
> 
> We first tried msleep(1) and that was too slow even for sensor reading
> making it take longer than 50s.  With the 100us-200us sleep, it got
> down to ~5s which was good enough for our use case and the cpu /
> scheduler impact was still mostly negligible.  I can't tell for sure
> without testing but going significantly below 100us is likely to
> become visible pretty quickly.

What was the time to read the sensors before you did the change?
It depends a lot on the system, so I can't really guess.

> 
> We can also take a hybrid approach where we busy poll w/ 1us udelay
> up to, say, fifty times and then switch to a sleeping poll.

I'm pretty sure we didn't try that in the original work, but I'm
not sure that would work.  Most of the initial spinning would be
pointless.

I would guess that you would decrease the delay and the performance
would improve linearly until you hit a certain point, and then
decreasing the delay wouldn't make a big difference.  That's the
point you want to use, I think.

What might actually be best would be for the driver to measure the
time it takes the BMC to respond and try to set the timeout based
on that information.  BMCs vary a lot; a constant probably won't
work.

And I was just saying that I wasn't expecting any big changes in
the IPMI driver any more...

> 
> Are there some tests which can be used to verify the cases which may
> get impacted by these changes?

Unfortunately not.  The original people at Dell that did the work
don't work there any more, I don't think.

I mostly use qemu now for testing, but this is not something you can
really simulate on qemu very well.  Can you do an IPMI firmware
update on your system?  That would be the easiest way to measure.

Thanks,

-corey

> 
> Thanks.
> 
> -- 
> tejun

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] ipmi_si_intf: use usleep_range() instead of busy looping
  2019-07-10 14:22       ` Tejun Heo
  2019-07-10 20:11         ` Corey Minyard
@ 2019-08-01 17:40         ` Corey Minyard
  2019-08-05 18:18           ` Tejun Heo
  1 sibling, 1 reply; 13+ messages in thread
From: Corey Minyard @ 2019-08-01 17:40 UTC (permalink / raw)
  To: Tejun Heo; +Cc: openipmi-developer, linux-kernel, kernel-team

On Wed, Jul 10, 2019 at 07:22:21AM -0700, Tejun Heo wrote:
> Hello,
> 
> On Tue, Jul 09, 2019 at 06:01:44PM -0500, Corey Minyard wrote:
> > > I'm really not sure "carefully tuned" is applicable to indefinite busy
> > > looping.
> > 
> > Well, yeah, but other things were tried and this was the only thing
> > we could find that worked.  That was before the kind of SMP stuff
> > we have now, though.
> 
> I see.
> 
> > > We can go for shorter timeouts for sure but I don't think this sort of
> > > busy looping is acceptable.  Is your position that this must be a busy
> > > loop?
> > 
> > Well, no.  I want something that provides as high a throughput as
> > possible and doesn't cause scheduling issues.  But that may not be
> > possible.  Screwing up the scheduler is a lot worse than slow IPMI
> > firmware updates.
> > 
> > How short can the timeouts be and avoid issues?
> 
> We first tried msleep(1) and that was too slow even for sensor reading
> making it take longer than 50s.  With the 100us-200us sleep, it got
> down to ~5s which was good enough for our use case and the cpu /
> scheduler impact was still mostly negligible.  I can't tell for sure
> without testing but going significantly below 100us is likely to
> become visible pretty quickly.

I spent some time looking at this.  Without the patch, I
measured around 3.5ms to send/receive a get device ID message
and it used 100% of the CPU on a core.

I measured your patch; it slowed things down to around 10.5ms
per message, which is not good.  Though it did just use a
few percent of the core.

I wrote some code to auto-adjust the timer.  It settled on
a delay around 35us, which gave 4.7ms per message, which is
probably acceptable, and used around 40% of the CPU.  If
I use any timeout (even a 0-10us range) it won't go below
4ms per message.
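
Roughly the shape of what I tried, from memory -- this is not the
actual code, and the names and constants here are invented:

	static unsigned int avg_us = 100;	/* rough BMC latency estimate */
	unsigned int took, delay_us;
	ktime_t start = ktime_get();

	/* ... drive the state machine for one transaction ... */

	took = ktime_us_delta(ktime_get(), start);
	avg_us = (avg_us * 7 + took) / 8;	/* exponential moving average */
	delay_us = clamp(avg_us / 2, 10u, 200u);
	usleep_range(delay_us, delay_us * 2);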

The process is running at nice 19 priority, so it won't
have a significant effect on other processes from a priority
point of view.  It may still be hitting the scheduler locks
pretty hard, though.  But I played with things quite a bit
and the behavior of the management controller is too
erratic to set a really good timeout.  Maybe other ones
are better, don't know.

One other option we have is that the driver has something
called "maintenance mode".  If it detect that you have
reset the management controller or are issuing firmware
commands, it will modify timeout behavior.  It can also
be activated manually.  I could also make it switch to
just calling schedule instead of delaying when in that
mode.
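
In the thread loop that would look something along these lines (sketch
only; the flag name is approximate):

	else if (smi_result == SI_SM_CALL_WITH_DELAY && busy_wait) {
		/*
		 * Keep hammering the interface during firmware updates
		 * and BMC resets; otherwise back off and yield.
		 */
		if (smi_info->in_maintenance_mode)
			schedule();
		else
			usleep_range(100, 200);
	}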

The right thing to do is complain bitterly to vendors who
build hardware that has to be polled.  But besides that,
I'm thinking the maintenance mode is the thing to do.
It will also change behavior if you reset the management
controller, but only for 30 seconds or so.  Does that
work?

-corey

> 
> We can also take a hybrid approach where we busy poll w/ 1us udelay
> up to, say, fifty times and then switch to a sleeping poll.
> 
> Are there some tests which can be used to verify the cases which may
> get impacted by these changes?
> 
> Thanks.
> 
> -- 
> tejun

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] ipmi_si_intf: use usleep_range() instead of busy looping
  2019-08-01 17:40         ` Corey Minyard
@ 2019-08-05 18:18           ` Tejun Heo
  2019-08-05 21:18             ` Corey Minyard
  0 siblings, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2019-08-05 18:18 UTC (permalink / raw)
  To: Corey Minyard; +Cc: openipmi-developer, linux-kernel, kernel-team

Hello, Corey.

On Thu, Aug 01, 2019 at 12:40:02PM -0500, Corey Minyard wrote:
> I spent some time looking at this.  Without the patch, I
> measured around 3.5ms to send/receive a get device ID message
> and it used 100% of the CPU on a core.
> 
> I measured your patch; it slowed things down to around 10.5ms
> per message, which is not good.  Though it did just use a
> few percent of the core.
> 
> I wrote some code to auto-adjust the timer.  It settled on
> a delay around 35us, which gave 4.7ms per message, which is
> probably acceptable, and used around 40% of the CPU.  If
> I use any timeout (even a 0-10us range) it won't go below
> 4ms per message.

Out of curiosity, what makes 4.7ms okay and 10.5ms not?  At least for
the use case we have in the fleet (sensor reading mostly), the less
disruptive the better as long as things don't time out and fail.

> The process is running at nice 19 priority, so it won't
> have a significant effect on other processes from a priority
> point of view.  It may still be hitting the scheduler locks
> pretty hard, though.  But I played with things quite a bit

And power.  Imagine a multi-six-digit number of machines burning a full
core just because of this busy loop to read temperature sensors some
msecs faster.

> > and the behavior of the management controller is too
> erratic to set a really good timeout.  Maybe other ones
> are better, don't know.
> 
> One other option we have is that the driver has something
> > called "maintenance mode".  If it detects that you have
> reset the management controller or are issuing firmware
> commands, it will modify timeout behavior.  It can also
> be activated manually.  I could also make it switch to
> just calling schedule instead of delaying when in that
> mode.

Yeah, whatever makes the common-case behavior avoid busy looping
would work.

> > The right thing to do is complain bitterly to vendors who
> build hardware that has to be polled.  But besides that,

For sure, but there already are a lot of machines with this thing and
it'll take multiple years for them to retire so...

> I'm thinking the maintenance mode is the thing to do.
> It will also change behavior if you reset the management
> controller, but only for 30 seconds or so.  Does that
> work?

Yeah, sounds good to me.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] ipmi_si_intf: use usleep_range() instead of busy looping
  2019-08-05 18:18           ` Tejun Heo
@ 2019-08-05 21:18             ` Corey Minyard
  2019-08-07 18:27               ` Tejun Heo
  0 siblings, 1 reply; 13+ messages in thread
From: Corey Minyard @ 2019-08-05 21:18 UTC (permalink / raw)
  To: Tejun Heo; +Cc: openipmi-developer, linux-kernel, kernel-team

On Mon, Aug 05, 2019 at 11:18:50AM -0700, Tejun Heo wrote:
> Hello, Corey.
> 
> On Thu, Aug 01, 2019 at 12:40:02PM -0500, Corey Minyard wrote:
> > I spent some time looking at this.  Without the patch, I
> > measured around 3.5ms to send/receive a get device ID message
> > and it used 100% of the CPU on a core.
> > 
> > I measured your patch; it slowed things down to around 10.5ms
> > per message, which is not good.  Though it did just use a
> > few percent of the core.
> > 
> > I wrote some code to auto-adjust the timer.  It settled on
> > a delay around 35us, which gave 4.7ms per message, which is
> > probably acceptable, and used around 40% of the CPU.  If
> > I use any timeout (even a 0-10us range) it won't go below
> > 4ms per message.
> 
> Out of curiosity, what makes 4.7ms okay and 10.5ms not?  At least for
> the use case we have in the fleet (sensor reading mostly), the less
> disruptive the better as long as things don't time out and fail.

Well, when you are loading firmware and it takes 10 minutes at
max speed, taking 20-30 minutes is a lot worse.  It's not reading
sensors, which would be fine; it's transferring large chunks of
data.

> 
> > The process is running at nice 19 priority, so it won't
> > have a significant effect on other processes from a priority
> > point of view.  It may still be hitting the scheduler locks
> > pretty hard, though.  But I played with things quite a bit
> 
> And power.  Imagine a multi-six-digit number of machines burning a full
> core just because of this busy loop to read temperature sensors some
> msecs faster.
> 
> > > and the behavior of the management controller is too
> > erratic to set a really good timeout.  Maybe other ones
> > are better, don't know.
> > 
> > One other option we have is that the driver has something
> > > called "maintenance mode".  If it detects that you have
> > reset the management controller or are issuing firmware
> > commands, it will modify timeout behavior.  It can also
> > be activated manually.  I could also make it switch to
> > just calling schedule instead of delaying when in that
> > mode.
> 
> Yeah, whatever makes the common-case behavior avoid busy looping
> would work.

Ok, it's queued in linux-next now (and has been for a few days).
I'll get it into the next kernel release (and I just noticed
a spelling error and fixed it).

-corey

> 
> > > The right thing to do is complain bitterly to vendors who
> > build hardware that has to be polled.  But besides that,
> 
> For sure, but there already are a lot of machines with this thing and
> it'll take multiple years for them to retire so...
> 
> > I'm thinking the maintenance mode is the thing to do.
> > It will also change behavior if you reset the management
> > controller, but only for 30 seconds or so.  Does that
> > work?
> 
> Yeah, sounds good to me.
> 
> Thnaks.
> 
> -- 
> tejun

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] ipmi_si_intf: use usleep_range() instead of busy looping
  2019-08-05 21:18             ` Corey Minyard
@ 2019-08-07 18:27               ` Tejun Heo
  0 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2019-08-07 18:27 UTC (permalink / raw)
  To: Corey Minyard; +Cc: openipmi-developer, linux-kernel, kernel-team

On Mon, Aug 05, 2019 at 04:18:21PM -0500, Corey Minyard wrote:
> > Yeah, whatever makes the common-case behavior avoid busy looping
> > would work.
> 
> Ok, it's queued in linux-next now (and has been for a few days).
> I'll get it into the next kernel release (and I just noticed
> a spelling error and fixed it).

Looks great to me.  Thanks a lot for working on the issue.

-- 
tejun

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2019-08-07 18:27 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-09 21:06 [PATCH] ipmi_si_intf: use usleep_range() instead of busy looping Tejun Heo
2019-07-09 21:46 ` Corey Minyard
2019-07-09 22:09   ` Tejun Heo
2019-07-09 23:01     ` Corey Minyard
2019-07-10 14:22       ` Tejun Heo
2019-07-10 20:11         ` Corey Minyard
2019-08-01 17:40         ` Corey Minyard
2019-08-05 18:18           ` Tejun Heo
2019-08-05 21:18             ` Corey Minyard
2019-08-07 18:27               ` Tejun Heo
2019-07-09 22:11   ` Tejun Heo
2019-07-09 23:07     ` [Openipmi-developer] " Corey Minyard
2019-07-10 14:12       ` Tejun Heo
