linux-kernel.vger.kernel.org archive mirror
* [v5.12-rc2 regression] io_uring: high CPU use after suspend-to-ram
@ 2021-03-10  1:55 Kevin Locke
  2021-03-10  2:18 ` Jens Axboe
  2021-03-10  2:48 ` Jens Axboe
  0 siblings, 2 replies; 5+ messages in thread
From: Kevin Locke @ 2021-03-10  1:55 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring, linux-kernel

With kernel 5.12-rc2 (and torvalds/master 144c79ef3353), if mpd is
playing or paused when my system is suspended-to-ram, when the system is
resumed mpd will consume ~200% CPU until killed.  It continues to
produce audio and respond to pause/play commands, which do not affect
CPU usage.  This occurs with either pulse (to PulseAudio or
PipeWire-as-PulseAudio) or alsa audio_output.

The issue appears to have been introduced by a combination of two
commits: 3bfe6106693b caused a freeze on suspend-to-ram when mpd is paused
or playing.  e4b4a13f4941 fixed suspend-to-ram, but introduced the high
CPU use on resume.

I attempted to further diagnose using `perf record -p $(pidof mpd)`.
Running for about a minute after resume shows ~280 MMAP2 events and
almost nothing else.  I'm not sure what to make of that or how to
further investigate.

Let me know if there's anything else I can do to help diagnose/test.

Thanks,
Kevin

* Re: [v5.12-rc2 regression] io_uring: high CPU use after suspend-to-ram
  2021-03-10  1:55 [v5.12-rc2 regression] io_uring: high CPU use after suspend-to-ram Kevin Locke
@ 2021-03-10  2:18 ` Jens Axboe
  2021-03-10  2:48 ` Jens Axboe
  1 sibling, 0 replies; 5+ messages in thread
From: Jens Axboe @ 2021-03-10  2:18 UTC (permalink / raw)
  To: Kevin Locke, io-uring, linux-kernel

On 3/9/21 6:55 PM, Kevin Locke wrote:
> With kernel 5.12-rc2 (and torvalds/master 144c79ef3353), if mpd is
> playing or paused when my system is suspended-to-ram, when the system is
> resumed mpd will consume ~200% CPU until killed.  It continues to
> produce audio and respond to pause/play commands, which do not affect
> CPU usage.  This occurs with either pulse (to PulseAudio or
> PipeWire-as-PulseAudio) or alsa audio_output.
> 
> The issue appears to have been introduced by a combination of two
> commits: 3bfe6106693b caused freeze on suspend-to-ram when mpd is paused
> or playing.  e4b4a13f4941 fixed suspend-to-ram, but introduced the high
> CPU on resume.
> 
> I attempted to further diagnose using `perf record -p $(pidof mpd)`.
> Running for about a minute after resume shows ~280 MMAP2 events and
> almost nothing else.  I'm not sure what to make of that or how to
> further investigate.
> 
> Let me know if there's anything else I can do to help diagnose/test.

Thanks for the report, let me take a look and try to reproduce (and
fix) it. I'll let you know if I can't reproduce it and need your
help in testing a fix!

-- 
Jens Axboe


* Re: [v5.12-rc2 regression] io_uring: high CPU use after suspend-to-ram
  2021-03-10  1:55 [v5.12-rc2 regression] io_uring: high CPU use after suspend-to-ram Kevin Locke
  2021-03-10  2:18 ` Jens Axboe
@ 2021-03-10  2:48 ` Jens Axboe
  2021-03-10  3:23   ` Kevin Locke
  1 sibling, 1 reply; 5+ messages in thread
From: Jens Axboe @ 2021-03-10  2:48 UTC (permalink / raw)
  To: Kevin Locke, io-uring, linux-kernel, rafael.j.wysocki

On 3/9/21 6:55 PM, Kevin Locke wrote:
> With kernel 5.12-rc2 (and torvalds/master 144c79ef3353), if mpd is
> playing or paused when my system is suspended-to-ram, when the system is
> resumed mpd will consume ~200% CPU until killed.  It continues to
> produce audio and respond to pause/play commands, which do not affect
> CPU usage.  This occurs with either pulse (to PulseAudio or
> PipeWire-as-PulseAudio) or alsa audio_output.
> 
> The issue appears to have been introduced by a combination of two
> commits: 3bfe6106693b caused freeze on suspend-to-ram when mpd is paused
> or playing.  e4b4a13f4941 fixed suspend-to-ram, but introduced the high
> CPU on resume.
> 
> I attempted to further diagnose using `perf record -p $(pidof mpd)`.
> Running for about a minute after resume shows ~280 MMAP2 events and
> almost nothing else.  I'm not sure what to make of that or how to
> further investigate.
> 
> Let me know if there's anything else I can do to help diagnose/test.

The below makes it work as expected for me - but I don't quite
understand why we're continually running after the freeze. Adding Rafael
to help understand this.

Rafael, what appears to happen here from a quick look is that the io
threads are frozen fine and the system suspends. But when we resume,
signal_pending() is perpetually true, and that is why we then see the
io_wq_manager() thread just looping like crazy. Is there anything
special I need to do? Note that these are not kthreads, PF_KTHREAD is
not true. I'm guessing it may have something to do with that, but
haven't dug deeper yet.


diff --git a/fs/io-wq.c b/fs/io-wq.c
index 3d7060ba547a..0ae9ecadf295 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -591,7 +591,7 @@ static bool create_io_worker(struct io_wq *wq, struct io_wqe *wqe, int index)
 	tsk->pf_io_worker = worker;
 	worker->task = tsk;
 	set_cpus_allowed_ptr(tsk, cpumask_of_node(wqe->node));
-	tsk->flags |= PF_NOFREEZE | PF_NO_SETAFFINITY;
+	tsk->flags |= PF_NO_SETAFFINITY;
 
 	raw_spin_lock_irq(&wqe->lock);
 	hlist_nulls_add_head_rcu(&worker->nulls_node, &wqe->free_list);
@@ -709,7 +709,6 @@ static int io_wq_manager(void *data)
 		set_current_state(TASK_INTERRUPTIBLE);
 		io_wq_check_workers(wq);
 		schedule_timeout(HZ);
-		try_to_freeze();
 		if (fatal_signal_pending(current))
 			set_bit(IO_WQ_BIT_EXIT, &wq->state);
 	} while (!test_bit(IO_WQ_BIT_EXIT, &wq->state));
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 280133f3abc4..8f4128eb4aa2 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -6735,7 +6735,6 @@ static int io_sq_thread(void *data)
 
 			up_read(&sqd->rw_lock);
 			schedule();
-			try_to_freeze();
 			down_read(&sqd->rw_lock);
 			list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
 				io_ring_clear_wakeup_flag(ctx);
diff --git a/kernel/fork.c b/kernel/fork.c
index d3171e8e88e5..72e444cd0ffe 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2436,6 +2436,7 @@ struct task_struct *create_io_thread(int (*fn)(void *), void *arg, int node)
 	if (!IS_ERR(tsk)) {
 		sigfillset(&tsk->blocked);
 		sigdelsetmask(&tsk->blocked, sigmask(SIGKILL));
+		tsk->flags |= PF_NOFREEZE;
 	}
 	return tsk;
 }

-- 
Jens Axboe
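
The symptom described above, a manager loop that never really sleeps
because signal_pending() stays true, can be illustrated with a small
user-space model. This is only a sketch and not kernel code: struct
model_task, model_sleep_interruptible() and model_manager_loop() are
hypothetical names invented for the illustration. The only behaviour it
assumes is that an interruptible sleep returns immediately while a signal
is pending. It does not explain why the flag stays set after thaw, which
the thread leaves open.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a task; not a kernel structure. */
struct model_task {
	bool signal_pending;	/* models signal_pending(current) staying true */
	bool exit_requested;	/* models IO_WQ_BIT_EXIT being set */
};

/*
 * Models an interruptible timed sleep and returns the ticks actually slept.
 * Assumption of the model: with a signal pending, the sleep returns at once.
 */
static long model_sleep_interruptible(const struct model_task *t, long timeout)
{
	assert(timeout >= 0);
	return t->signal_pending ? 0 : timeout;
}

/*
 * Runs a manager-style loop until budget_ticks of simulated time have
 * passed, exit is requested, or max_iters iterations have run.
 * Returns the number of iterations executed.
 */
long model_manager_loop(struct model_task *t, long budget_ticks, long max_iters)
{
	long elapsed = 0;
	long iters = 0;

	while (!t->exit_requested && elapsed < budget_ticks && iters < max_iters) {
		/* One tick here stands for one schedule_timeout(HZ) period. */
		elapsed += model_sleep_interruptible(t, 1);
		iters++;
	}
	return iters;
}
```

With signal_pending false, five ticks of simulated time take five
iterations. With it stuck at true, the loop uses up the whole iteration
cap without any simulated time passing, which is the shape of the CPU
spin seen after resume.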


* Re: [v5.12-rc2 regression] io_uring: high CPU use after suspend-to-ram
  2021-03-10  2:48 ` Jens Axboe
@ 2021-03-10  3:23   ` Kevin Locke
  2021-03-10 14:45     ` Jens Axboe
  0 siblings, 1 reply; 5+ messages in thread
From: Kevin Locke @ 2021-03-10  3:23 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring, linux-kernel, rafael.j.wysocki

On Tue, 2021-03-09 at 19:48 -0700, Jens Axboe wrote:
> On 3/9/21 6:55 PM, Kevin Locke wrote:
>> With kernel 5.12-rc2 (and torvalds/master 144c79ef3353), if mpd is
>> playing or paused when my system is suspended-to-ram, when the system is
>> resumed mpd will consume ~200% CPU until killed.  It continues to
>> produce audio and respond to pause/play commands, which do not affect
>> CPU usage.  This occurs with either pulse (to PulseAudio or
>> PipeWire-as-PulseAudio) or alsa audio_output.
> 
> The below makes it work as expected for me - but I don't quite
> understand why we're continually running after the freeze. Adding Rafael
> to help understand this.

I can confirm that your patch resolves the high CPU usage after suspend
on my system as well.  Many thanks!

Tested-by: Kevin Locke <kevin@kevinlocke.name>

Happy to test any future revisions as well.

Thanks again,
Kevin

* Re: [v5.12-rc2 regression] io_uring: high CPU use after suspend-to-ram
  2021-03-10  3:23   ` Kevin Locke
@ 2021-03-10 14:45     ` Jens Axboe
  0 siblings, 0 replies; 5+ messages in thread
From: Jens Axboe @ 2021-03-10 14:45 UTC (permalink / raw)
  To: Kevin Locke, io-uring, linux-kernel, rafael.j.wysocki

On 3/9/21 8:23 PM, Kevin Locke wrote:
> On Tue, 2021-03-09 at 19:48 -0700, Jens Axboe wrote:
>> On 3/9/21 6:55 PM, Kevin Locke wrote:
>>> With kernel 5.12-rc2 (and torvalds/master 144c79ef3353), if mpd is
>>> playing or paused when my system is suspended-to-ram, when the system is
>>> resumed mpd will consume ~200% CPU until killed.  It continues to
>>> produce audio and respond to pause/play commands, which do not affect
>>> CPU usage.  This occurs with either pulse (to PulseAudio or
>>> PipeWire-as-PulseAudio) or alsa audio_output.
>>
>> The below makes it work as expected for me - but I don't quite
>> understand why we're continually running after the freeze. Adding Rafael
>> to help understand this.
> 
> I can confirm that your patch resolves the high CPU usage after suspend
> on my system as well.  Many thanks!
> 
> Tested-by: Kevin Locke <kevin@kevinlocke.name>
> 
> Happy to test any future revisions as well.

Thanks, I'll just hold on to this version for now. It's how it would've
worked before the thread rework anyway. I'd still like to understand why
the thaw leaves them spinning, though :-). But once that is understood,
we can potentially just enable freezing again as a separate patch.
Fixing this one is more important for the time being.

-- 
Jens Axboe

