From: Juergen Gross <jgross@suse.com>
To: "Marek Marczykowski-Górecki" <marmarek@invisiblethingslab.com>
Cc: xen-devel <xen-devel@lists.xenproject.org>,
	Jan Beulich <jbeulich@suse.com>
Subject: Re: xen-balloon thread using 100% of CPU, regression in 5.4.150
Date: Tue, 5 Oct 2021 10:05:39 +0200
Message-ID: <f707c956-6cdc-9b32-5f22-227e0f5a9f10@suse.com>
In-Reply-To: <YVrF65BAVsXTgRsd@mail-itl>

On 04.10.21 11:14, Marek Marczykowski-Górecki wrote:
> On Mon, Oct 04, 2021 at 07:31:40AM +0200, Juergen Gross wrote:
>> On 03.10.21 06:47, Marek Marczykowski-Górecki wrote:
>>> Hi,
>>>
>>> After updating a PVH domU to 5.4.150, I see xen-balloon thread using
>>> 100% CPU (one thread).
>>> This is a domain started with memory=maxmem=716800KiB (via libvirt). Then,
>>> inside, I see:
>>>
>>> # cat /sys/devices/system/xen_memory/xen_memory0/target_kb
>>> 716924
>>> # cat /sys/devices/system/xen_memory/xen_memory0/info/current_kb
>>> 716400
>>>
>>> Doing `cat info/current_kb > target_kb` "fixes" the issue. But still,
>>> something is wrong - on earlier kernel (5.4.143 to be precise), it
>>> wasn't spinning, with exactly the same values reported in sysfs. It
>>> shouldn't run in circles if it can't get that much memory it wants. I
>>> strongly suspect "xen/balloon: use a kernel thread instead a workqueue"
>>> or related commit being responsible, but I haven't verified it.
>>
>> I think you are right. I need to handle the BP_ECANCELED case similar to
>> BP_EAGAIN in the kernel thread (wait until target size changes again).
>>
>> One further question: do you see any kernel message in the guest related
>> to the looping balloon thread?
> 
> Nothing, only the usual "xen:balloon: Initialising balloon driver", and
> nothing related to balloon after that.

Could you try the attached patch, please? I've tested it briefly with
PV and PVH guests.


Juergen


[-- Attachment #1.1.2: 0001-xen-balloon-fix-cancelled-balloon-action.patch --]
[-- Type: text/x-patch, Size: 2117 bytes --]

From c0901b425d5939b7f3ce6c3f4bb7a0161b819745 Mon Sep 17 00:00:00 2001
From: Juergen Gross <jgross@suse.com>
Date: Mon, 4 Oct 2021 17:05:48 +0200
Subject: [PATCH] xen/balloon: fix cancelled balloon action
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

In case a ballooning action is cancelled, the new kernel thread handling
the ballooning might end up in a busy loop.

Fix that by handling the cancelled action gracefully.

Cc: stable@vger.kernel.org
Fixes: 8480ed9c2bbd56 ("xen/balloon: use a kernel thread instead a workqueue")
Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
---
 drivers/xen/balloon.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 43ebfe36ac27..3a50f097ed3e 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -491,12 +491,12 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
 }
 
 /*
- * Stop waiting if either state is not BP_EAGAIN and ballooning action is
- * needed, or if the credit has changed while state is BP_EAGAIN.
+ * Stop waiting if either state is BP_DONE and ballooning action is
+ * needed, or if the credit has changed while state is not BP_DONE.
  */
 static bool balloon_thread_cond(enum bp_state state, long credit)
 {
-	if (state != BP_EAGAIN)
+	if (state == BP_DONE)
 		credit = 0;
 
 	return current_credit() != credit || kthread_should_stop();
@@ -516,10 +516,19 @@ static int balloon_thread(void *unused)
 
 	set_freezable();
 	for (;;) {
-		if (state == BP_EAGAIN)
-			timeout = balloon_stats.schedule_delay * HZ;
-		else
+		switch (state) {
+		case BP_DONE:
+		case BP_ECANCELED:
 			timeout = 3600 * HZ;
+			break;
+		case BP_EAGAIN:
+			timeout = balloon_stats.schedule_delay * HZ;
+			break;
+		case BP_WAIT:
+			timeout = HZ;
+			break;
+		}
+
 		credit = current_credit();
 
 		wait_event_freezable_timeout(balloon_thread_wq,
-- 
2.26.2
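
For readers following the thread, here is a minimal userspace sketch of why
the old wait condition spins and the patched one does not. It is a model
under stated assumptions, not the kernel code: cond_before(), cond_after()
and timeout_after() are hypothetical names, the enum ordering is assumed,
the kthread_should_stop() term is left out, and the current credit is
passed in as a plain argument instead of being read via current_credit().
The credit value 131 assumes 4 KiB pages and the sysfs values from the
report: (716924 - 716400) KiB / 4 KiB = 131 pages.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; names mirror the patch, ordering is assumed. */
enum bp_state { BP_DONE, BP_WAIT, BP_EAGAIN, BP_ECANCELED };

/*
 * Pre-patch wake-up condition: every state except BP_EAGAIN compares the
 * current credit against 0, so a cancelled action with a non-zero credit
 * is always "ready" and the thread never sleeps.
 */
static bool cond_before(enum bp_state state, long snapshot, long now)
{
	if (state != BP_EAGAIN)
		snapshot = 0;
	return now != snapshot;
}

/*
 * Patched wake-up condition: only BP_DONE compares against 0. For any
 * other state the thread wakes only once the credit differs from the
 * snapshot taken before sleeping.
 */
static bool cond_after(enum bp_state state, long snapshot, long now)
{
	if (state == BP_DONE)
		snapshot = 0;
	return now != snapshot;
}

/* Timeout selection of the patched loop, in ticks (hz ticks per second). */
static long timeout_after(enum bp_state state, long schedule_delay, long hz)
{
	assert(hz > 0);
	switch (state) {
	case BP_DONE:
	case BP_ECANCELED:
		return 3600 * hz;
	case BP_EAGAIN:
		return schedule_delay * hz;
	case BP_WAIT:
		return hz;
	}
	return hz; /* not reached for valid states */
}
```

With an unchanged credit of 131 pages and state BP_ECANCELED,
cond_before() is true right away (131 != 0), which matches the reported
busy loop, while cond_after() is false (131 == 131), so the thread sleeps
until the target changes or the timeout expires.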