All of lore.kernel.org
 help / color / mirror / Atom feed
* [PULL for 7.1 0/3] memory leak and testing tweaks
@ 2022-08-16 12:26 Alex Bennée
  2022-08-16 12:26 ` [PULL 1/3] linux-user: un-parent OBJECT(cpu) when closing thread Alex Bennée
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Alex Bennée @ 2022-08-16 12:26 UTC (permalink / raw)
  To: peter.maydell, richard.henderson; +Cc: qemu-devel, Alex Bennée

The following changes since commit d102b8162a1e5fe8288d4d5c01801ce6536ac2d1:

  Merge tag 'pull-la-20220814' of https://gitlab.com/rth7680/qemu into staging (2022-08-14 08:48:11 -0500)

are available in the Git repository at:

  https://github.com/stsquad/qemu.git tags/pull-for-7.1-fixes-160822-1

for you to fetch changes up to 65711f9a87874a9ec61108c6009f8baec07c8b0d:

  tests/avocado: apply a band aid to aspeed-evb login (2022-08-16 09:57:12 +0100)

----------------------------------------------------------------
A few small fixes:

  - properly un-parent OBJECT(cpu) when closing -user thread
  - add missing timeout to aspeed tests
  - reduce raciness of login: prompt handling for aspeed tests

----------------------------------------------------------------
Alex Bennée (3):
      linux-user: un-parent OBJECT(cpu) when closing thread
      tests/avocado: add timeout to the aspeed tests
      tests/avocado: apply a band aid to aspeed-evb login

 linux-user/syscall.c            | 13 +++++++------
 tests/avocado/machine_aspeed.py |  4 ++++
 2 files changed, 11 insertions(+), 6 deletions(-)


-- 
2.30.2



^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PULL 1/3] linux-user: un-parent OBJECT(cpu) when closing thread
  2022-08-16 12:26 [PULL for 7.1 0/3] memory leak and testing tweaks Alex Bennée
@ 2022-08-16 12:26 ` Alex Bennée
  2022-08-19  1:02   ` Richard Henderson
  2022-08-16 12:26 ` [PULL 2/3] tests/avocado: add timeout to the aspeed tests Alex Bennée
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Alex Bennée @ 2022-08-16 12:26 UTC (permalink / raw)
  To: peter.maydell, richard.henderson
  Cc: qemu-devel, Alex Bennée, Laurent Vivier

While forcing the CPU to unrealize by hand does trigger the clean-up
code we never fully free resources because refcount never reaches
zero. This is because QOM automatically added objects without an
explicit parent to /unattached/, incrementing the refcount.

Instead of manually triggering unrealization just unparent the object
and let the device machinery deal with that for us.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/866
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20220811151413.3350684-2-alex.bennee@linaro.org>

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index f409121202..bfdd60136b 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -8594,7 +8594,13 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
         if (CPU_NEXT(first_cpu)) {
             TaskState *ts = cpu->opaque;
 
-            object_property_set_bool(OBJECT(cpu), "realized", false, NULL);
+            if (ts->child_tidptr) {
+                put_user_u32(0, ts->child_tidptr);
+                do_sys_futex(g2h(cpu, ts->child_tidptr),
+                             FUTEX_WAKE, INT_MAX, NULL, NULL, 0);
+            }
+
+            object_unparent(OBJECT(cpu));
             object_unref(OBJECT(cpu));
             /*
              * At this point the CPU should be unrealized and removed
@@ -8604,11 +8610,6 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
 
             pthread_mutex_unlock(&clone_lock);
 
-            if (ts->child_tidptr) {
-                put_user_u32(0, ts->child_tidptr);
-                do_sys_futex(g2h(cpu, ts->child_tidptr),
-                             FUTEX_WAKE, INT_MAX, NULL, NULL, 0);
-            }
             thread_cpu = NULL;
             g_free(ts);
             rcu_unregister_thread();
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PULL 2/3] tests/avocado: add timeout to the aspeed tests
  2022-08-16 12:26 [PULL for 7.1 0/3] memory leak and testing tweaks Alex Bennée
  2022-08-16 12:26 ` [PULL 1/3] linux-user: un-parent OBJECT(cpu) when closing thread Alex Bennée
@ 2022-08-16 12:26 ` Alex Bennée
  2022-08-16 12:32   ` Peter Maydell
  2022-08-16 12:26 ` [PULL 3/3] tests/avocado: apply a band aid to aspeed-evb login Alex Bennée
  2022-08-16 15:58 ` [PULL for 7.1 0/3] memory leak and testing tweaks Richard Henderson
  3 siblings, 1 reply; 11+ messages in thread
From: Alex Bennée @ 2022-08-16 12:26 UTC (permalink / raw)
  To: peter.maydell, richard.henderson
  Cc: qemu-devel, Alex Bennée, Philippe Mathieu-Daudé,
	Cleber Rosa, Wainer dos Santos Moschetta, Beraldo Leal

On some systems the test can hang. At least defining a timeout stops
it from hanging forever.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220811151413.3350684-7-alex.bennee@linaro.org>

diff --git a/tests/avocado/machine_aspeed.py b/tests/avocado/machine_aspeed.py
index b4e35a3d07..c54da0fd8f 100644
--- a/tests/avocado/machine_aspeed.py
+++ b/tests/avocado/machine_aspeed.py
@@ -40,6 +40,8 @@ def test_ast1030_zephyros(self):
 
 class AST2x00Machine(QemuSystemTest):
 
+    timeout = 90
+
     def wait_for_console_pattern(self, success_message, vm=None):
         wait_for_console_pattern(self, success_message,
                                  failure_message='Kernel panic - not syncing',
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PULL 3/3] tests/avocado: apply a band aid to aspeed-evb login
  2022-08-16 12:26 [PULL for 7.1 0/3] memory leak and testing tweaks Alex Bennée
  2022-08-16 12:26 ` [PULL 1/3] linux-user: un-parent OBJECT(cpu) when closing thread Alex Bennée
  2022-08-16 12:26 ` [PULL 2/3] tests/avocado: add timeout to the aspeed tests Alex Bennée
@ 2022-08-16 12:26 ` Alex Bennée
  2022-08-16 15:58 ` [PULL for 7.1 0/3] memory leak and testing tweaks Richard Henderson
  3 siblings, 0 replies; 11+ messages in thread
From: Alex Bennée @ 2022-08-16 12:26 UTC (permalink / raw)
  To: peter.maydell, richard.henderson
  Cc: qemu-devel, Alex Bennée, Cédric Le Goater, John Snow,
	Cleber Rosa, Philippe Mathieu-Daudé,
	Wainer dos Santos Moschetta, Beraldo Leal

This is really a limitation of the underlying console code which
doesn't allow us to detect the login: and following "#" prompts
because it reads input line wise. By adding a small delay we ensure
that the login prompt has appeared so we don't accidentally spaff the
shell commands to a confused getty in the guest.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Acked-by: John Snow <jsnow@redhat.com>
Message-Id: <20220811151413.3350684-8-alex.bennee@linaro.org>

diff --git a/tests/avocado/machine_aspeed.py b/tests/avocado/machine_aspeed.py
index c54da0fd8f..65d38f4efa 100644
--- a/tests/avocado/machine_aspeed.py
+++ b/tests/avocado/machine_aspeed.py
@@ -101,7 +101,9 @@ def do_test_arm_aspeed_buidroot_start(self, image, cpu_id):
         self.wait_for_console_pattern('Starting kernel ...')
         self.wait_for_console_pattern('Booting Linux on physical CPU ' + cpu_id)
         self.wait_for_console_pattern('lease of 10.0.2.15')
+        # the line before login:
         self.wait_for_console_pattern('Aspeed EVB')
+        time.sleep(0.1)
         exec_command(self, 'root')
         time.sleep(0.1)
 
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PULL 2/3] tests/avocado: add timeout to the aspeed tests
  2022-08-16 12:26 ` [PULL 2/3] tests/avocado: add timeout to the aspeed tests Alex Bennée
@ 2022-08-16 12:32   ` Peter Maydell
  2022-08-16 13:31     ` Alex Bennée
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Maydell @ 2022-08-16 12:32 UTC (permalink / raw)
  To: Alex Bennée
  Cc: richard.henderson, qemu-devel, Philippe Mathieu-Daudé,
	Cleber Rosa, Wainer dos Santos Moschetta, Beraldo Leal

On Tue, 16 Aug 2022 at 13:26, Alex Bennée <alex.bennee@linaro.org> wrote:
>
> On some systems the test can hang. At least defining a timeout stops
> it from hanging forever.

Aha. Yeah, I've seen this test hang forever sometimes.

Is there some place (in the superclass??) that we can put a
default timeout that applies to *all* avocado tests, so we
don't have the risk of forgetting it in a particular test?

-- PMM


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PULL 2/3] tests/avocado: add timeout to the aspeed tests
  2022-08-16 12:32   ` Peter Maydell
@ 2022-08-16 13:31     ` Alex Bennée
  2022-08-16 13:40       ` Peter Maydell
  0 siblings, 1 reply; 11+ messages in thread
From: Alex Bennée @ 2022-08-16 13:31 UTC (permalink / raw)
  To: Peter Maydell
  Cc: richard.henderson, qemu-devel, Philippe Mathieu-Daudé,
	Cleber Rosa, Wainer dos Santos Moschetta, Beraldo Leal


Peter Maydell <peter.maydell@linaro.org> writes:

> On Tue, 16 Aug 2022 at 13:26, Alex Bennée <alex.bennee@linaro.org> wrote:
>>
>> On some systems the test can hang. At least defining a timeout stops
>> it from hanging forever.
>
> Aha. Yeah, I've seen this test hang forever sometimes.
>
> Is there some place (in the superclass??) that we can put a
> default timeout that applies to *all* avocado tests, so we
> don't have the risk of forgetting it in a particular test?

It's a bit muddy. Most tests are sub-classed on LinuxTest which does
define a default timeout:

  class LinuxTest(LinuxSSHMixIn, QemuSystemTest):
      """Facilitates having a cloud-image Linux based available.

      For tests that indent to interact with guests, this is a better choice
      to start with than the more vanilla `QemuSystemTest` class.
      """

      timeout = 900
      distro = None
      username = 'root'
      password = 'password'
      smp = '2'
      memory = '1024'

However the aspeed tests are directly derived from QemuSystemTest.
Perhaps we should just move the timeout down to that or maybe
QemuBaseTest?

>
> -- PMM


-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PULL 2/3] tests/avocado: add timeout to the aspeed tests
  2022-08-16 13:31     ` Alex Bennée
@ 2022-08-16 13:40       ` Peter Maydell
  0 siblings, 0 replies; 11+ messages in thread
From: Peter Maydell @ 2022-08-16 13:40 UTC (permalink / raw)
  To: Alex Bennée
  Cc: richard.henderson, qemu-devel, Philippe Mathieu-Daudé,
	Cleber Rosa, Wainer dos Santos Moschetta, Beraldo Leal

On Tue, 16 Aug 2022 at 14:34, Alex Bennée <alex.bennee@linaro.org> wrote:
> Peter Maydell <peter.maydell@linaro.org> writes:
> > Is there some place (in the superclass??) that we can put a
> > default timeout that applies to *all* avocado tests, so we
> > don't have the risk of forgetting it in a particular test?
>
> It's a bit muddy. Most tests are sub-classed on LinuxTest which does
> define a default timeout:
>
>   class LinuxTest(LinuxSSHMixIn, QemuSystemTest):
>       """Facilitates having a cloud-image Linux based available.
>
>       For tests that indent to interact with guests, this is a better choice
>       to start with than the more vanilla `QemuSystemTest` class.
>       """
>
>       timeout = 900
>       distro = None
>       username = 'root'
>       password = 'password'
>       smp = '2'
>       memory = '1024'
>
> However the aspeed tests are directly derived from QemuSystemTest.
> Perhaps we should just move the timeout down to that or maybe
> QemuBaseTest?

Ideally, we should do it at whatever level ensures it is applied
to every single test that 'check-avocado' runs, regardless of how
the test was written. "QemuBaseTest" still sounds a bit higher
than the absolute basic "this is a test" level, but maybe that's
the lowest level we have access to?

thanks
-- PMM


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PULL for 7.1 0/3] memory leak and testing tweaks
  2022-08-16 12:26 [PULL for 7.1 0/3] memory leak and testing tweaks Alex Bennée
                   ` (2 preceding siblings ...)
  2022-08-16 12:26 ` [PULL 3/3] tests/avocado: apply a band aid to aspeed-evb login Alex Bennée
@ 2022-08-16 15:58 ` Richard Henderson
  3 siblings, 0 replies; 11+ messages in thread
From: Richard Henderson @ 2022-08-16 15:58 UTC (permalink / raw)
  To: Alex Bennée, peter.maydell; +Cc: qemu-devel

On 8/16/22 07:26, Alex Bennée wrote:
> The following changes since commit d102b8162a1e5fe8288d4d5c01801ce6536ac2d1:
> 
>    Merge tag 'pull-la-20220814' of https://gitlab.com/rth7680/qemu into staging (2022-08-14 08:48:11 -0500)
> 
> are available in the Git repository at:
> 
>    https://github.com/stsquad/qemu.git tags/pull-for-7.1-fixes-160822-1
> 
> for you to fetch changes up to 65711f9a87874a9ec61108c6009f8baec07c8b0d:
> 
>    tests/avocado: apply a band aid to aspeed-evb login (2022-08-16 09:57:12 +0100)
> 
> ----------------------------------------------------------------
> A few small fixes:
> 
>    - properly un-parent OBJECT(cpu) when closing -user thread
>    - add missing timeout to aspeed tests
>    - reduce raciness of login: prompt handling for aspeed tests

Applied, thanks.  Please update https://wiki.qemu.org/ChangeLog/7.1 as appropriate.


r~



> 
> ----------------------------------------------------------------
> Alex Bennée (3):
>        linux-user: un-parent OBJECT(cpu) when closing thread
>        tests/avocado: add timeout to the aspeed tests
>        tests/avocado: apply a band aid to aspeed-evb login
> 
>   linux-user/syscall.c            | 13 +++++++------
>   tests/avocado/machine_aspeed.py |  4 ++++
>   2 files changed, 11 insertions(+), 6 deletions(-)
> 
> 



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PULL 1/3] linux-user: un-parent OBJECT(cpu) when closing thread
  2022-08-16 12:26 ` [PULL 1/3] linux-user: un-parent OBJECT(cpu) when closing thread Alex Bennée
@ 2022-08-19  1:02   ` Richard Henderson
  2022-08-19  8:37     ` Alex Bennée
  0 siblings, 1 reply; 11+ messages in thread
From: Richard Henderson @ 2022-08-19  1:02 UTC (permalink / raw)
  To: Alex Bennée, peter.maydell; +Cc: qemu-devel, Laurent Vivier

On 8/16/22 05:26, Alex Bennée wrote:
> While forcing the CPU to unrealize by hand does trigger the clean-up
> code we never fully free resources because refcount never reaches
> zero. This is because QOM automatically added objects without an
> explicit parent to /unattached/, incrementing the refcount.
> 
> Instead of manually triggering unrealization just unparent the object
> and let the device machinery deal with that for us.
> 
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/866
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> Reviewed-by: Laurent Vivier <laurent@vivier.eu>
> Message-Id: <20220811151413.3350684-2-alex.bennee@linaro.org>
> 
> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> index f409121202..bfdd60136b 100644
> --- a/linux-user/syscall.c
> +++ b/linux-user/syscall.c
> @@ -8594,7 +8594,13 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
>           if (CPU_NEXT(first_cpu)) {
>               TaskState *ts = cpu->opaque;
>   
> -            object_property_set_bool(OBJECT(cpu), "realized", false, NULL);
> +            if (ts->child_tidptr) {
> +                put_user_u32(0, ts->child_tidptr);
> +                do_sys_futex(g2h(cpu, ts->child_tidptr),
> +                             FUTEX_WAKE, INT_MAX, NULL, NULL, 0);
> +            }
> +
> +            object_unparent(OBJECT(cpu));

This has caused a regression in arm/aarch64.

We hard-code ARMCPRegInfo pointers into TranslationBlocks, for calling into 
helper_{get,set}cp_reg{,64}.  So we have a race condition between whichever cpu thread 
translates the code first (encoding the pointer), and that cpu thread exiting, so that the 
next execution of the TB references a freed data structure.

We shouldn't have N copies of these pointers in the first place.  This seems like 
something that ought to be placed on the ARMCPUClass, so that it could be shared by each cpu.

But that's going to be a complex fix, so I'm reverting this for rc4.


r~


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PULL 1/3] linux-user: un-parent OBJECT(cpu) when closing thread
  2022-08-19  1:02   ` Richard Henderson
@ 2022-08-19  8:37     ` Alex Bennée
  2022-08-19 14:36       ` Richard Henderson
  0 siblings, 1 reply; 11+ messages in thread
From: Alex Bennée @ 2022-08-19  8:37 UTC (permalink / raw)
  To: Richard Henderson; +Cc: peter.maydell, qemu-devel, Laurent Vivier


Richard Henderson <richard.henderson@linaro.org> writes:

> On 8/16/22 05:26, Alex Bennée wrote:
>> While forcing the CPU to unrealize by hand does trigger the clean-up
>> code we never fully free resources because refcount never reaches
>> zero. This is because QOM automatically added objects without an
>> explicit parent to /unattached/, incrementing the refcount.
>> Instead of manually triggering unrealization just unparent the
>> object
>> and let the device machinery deal with that for us.
>> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/866
>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>> Reviewed-by: Laurent Vivier <laurent@vivier.eu>
>> Message-Id: <20220811151413.3350684-2-alex.bennee@linaro.org>
>> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
>> index f409121202..bfdd60136b 100644
>> --- a/linux-user/syscall.c
>> +++ b/linux-user/syscall.c
>> @@ -8594,7 +8594,13 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
>>           if (CPU_NEXT(first_cpu)) {
>>               TaskState *ts = cpu->opaque;
>>   -            object_property_set_bool(OBJECT(cpu), "realized",
>> false, NULL);
>> +            if (ts->child_tidptr) {
>> +                put_user_u32(0, ts->child_tidptr);
>> +                do_sys_futex(g2h(cpu, ts->child_tidptr),
>> +                             FUTEX_WAKE, INT_MAX, NULL, NULL, 0);
>> +            }
>> +
>> +            object_unparent(OBJECT(cpu));
>
> This has caused a regression in arm/aarch64.
>
> We hard-code ARMCPRegInfo pointers into TranslationBlocks, for calling
> into helper_{get,set}cp_reg{,64}.  So we have a race condition between
> whichever cpu thread translates the code first (encoding the pointer),
> and that cpu thread exiting, so that the next execution of the TB
> references a freed data structure.

What is the test case that breaks this? I guess a multi-threaded
sysregs.c would trigger it?

> We shouldn't have N copies of these pointers in the first place.  This
> seems like something that ought to be placed on the ARMCPUClass, so
> that it could be shared by each cpu.
>
> But that's going to be a complex fix, so I'm reverting this for rc4.
>
>
> r~


-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PULL 1/3] linux-user: un-parent OBJECT(cpu) when closing thread
  2022-08-19  8:37     ` Alex Bennée
@ 2022-08-19 14:36       ` Richard Henderson
  0 siblings, 0 replies; 11+ messages in thread
From: Richard Henderson @ 2022-08-19 14:36 UTC (permalink / raw)
  To: Alex Bennée; +Cc: peter.maydell, qemu-devel, Laurent Vivier

On 8/19/22 01:37, Alex Bennée wrote:
>> This has caused a regression in arm/aarch64.
>>
>> We hard-code ARMCPRegInfo pointers into TranslationBlocks, for calling
>> into helper_{get,set}cp_reg{,64}.  So we have a race condition between
>> whichever cpu thread translates the code first (encoding the pointer),
>> and that cpu thread exiting, so that the next execution of the TB
>> references a freed data structure.
> 
> What is the test case that breaks this? I guess a multi-threaded
> sysregs.c would trigger it?

E.g. tests/tcg/aarch64-linux-user/signals.


r~


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-08-19 14:37 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-16 12:26 [PULL for 7.1 0/3] memory leak and testing tweaks Alex Bennée
2022-08-16 12:26 ` [PULL 1/3] linux-user: un-parent OBJECT(cpu) when closing thread Alex Bennée
2022-08-19  1:02   ` Richard Henderson
2022-08-19  8:37     ` Alex Bennée
2022-08-19 14:36       ` Richard Henderson
2022-08-16 12:26 ` [PULL 2/3] tests/avocado: add timeout to the aspeed tests Alex Bennée
2022-08-16 12:32   ` Peter Maydell
2022-08-16 13:31     ` Alex Bennée
2022-08-16 13:40       ` Peter Maydell
2022-08-16 12:26 ` [PULL 3/3] tests/avocado: apply a band aid to aspeed-evb login Alex Bennée
2022-08-16 15:58 ` [PULL for 7.1 0/3] memory leak and testing tweaks Richard Henderson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.