All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] timers: move back migrate_timers_from_cpu() invocation
@ 2019-04-11 10:45 ` Jan Beulich
  0 siblings, 0 replies; 10+ messages in thread
From: Jan Beulich @ 2019-04-11 10:45 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall

Commit 597fbb8be6 ("xen/timers: Fix memory leak with cpu unplug/plug")
went a little too far: Migrating timers away from a CPU being offlined
needs to heppen independent of whether it get parked or fully offlined.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/common/timer.c
+++ b/xen/common/timer.c
@@ -619,8 +619,6 @@ static void free_percpu_timers(unsigned
 {
     struct timers *ts = &per_cpu(timers, cpu);
 
-    migrate_timers_from_cpu(cpu);
-
     ASSERT(heap_metadata(ts->heap)->size == 0);
     if ( heap_metadata(ts->heap)->limit )
     {
@@ -648,6 +646,8 @@ static int cpu_callback(
     case CPU_UP_CANCELED:
     case CPU_DEAD:
     case CPU_RESUME_FAILED:
+        migrate_timers_from_cpu(cpu);
+
         if ( !park_offline_cpus && system_state != SYS_STATE_suspend )
             free_percpu_timers(cpu);
         break;



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Xen-devel] [PATCH] timers: move back migrate_timers_from_cpu() invocation
@ 2019-04-11 10:45 ` Jan Beulich
  0 siblings, 0 replies; 10+ messages in thread
From: Jan Beulich @ 2019-04-11 10:45 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall

Commit 597fbb8be6 ("xen/timers: Fix memory leak with cpu unplug/plug")
went a little too far: Migrating timers away from a CPU being offlined
needs to heppen independent of whether it get parked or fully offlined.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/common/timer.c
+++ b/xen/common/timer.c
@@ -619,8 +619,6 @@ static void free_percpu_timers(unsigned
 {
     struct timers *ts = &per_cpu(timers, cpu);
 
-    migrate_timers_from_cpu(cpu);
-
     ASSERT(heap_metadata(ts->heap)->size == 0);
     if ( heap_metadata(ts->heap)->limit )
     {
@@ -648,6 +646,8 @@ static int cpu_callback(
     case CPU_UP_CANCELED:
     case CPU_DEAD:
     case CPU_RESUME_FAILED:
+        migrate_timers_from_cpu(cpu);
+
         if ( !park_offline_cpus && system_state != SYS_STATE_suspend )
             free_percpu_timers(cpu);
         break;



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] timers: move back migrate_timers_from_cpu() invocation
@ 2019-04-11 10:47   ` Andrew Cooper
  0 siblings, 0 replies; 10+ messages in thread
From: Andrew Cooper @ 2019-04-11 10:47 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Julien Grall

On 11/04/2019 11:45, Jan Beulich wrote:
> Commit 597fbb8be6 ("xen/timers: Fix memory leak with cpu unplug/plug")
> went a little too far: Migrating timers away from a CPU being offlined
> needs to heppen independent of whether it get parked or fully offlined.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> --- a/xen/common/timer.c
> +++ b/xen/common/timer.c
> @@ -619,8 +619,6 @@ static void free_percpu_timers(unsigned
>  {
>      struct timers *ts = &per_cpu(timers, cpu);
>  
> -    migrate_timers_from_cpu(cpu);
> -
>      ASSERT(heap_metadata(ts->heap)->size == 0);
>      if ( heap_metadata(ts->heap)->limit )
>      {
> @@ -648,6 +646,8 @@ static int cpu_callback(
>      case CPU_UP_CANCELED:
>      case CPU_DEAD:
>      case CPU_RESUME_FAILED:
> +        migrate_timers_from_cpu(cpu);
> +
>          if ( !park_offline_cpus && system_state != SYS_STATE_suspend )
>              free_percpu_timers(cpu);
>          break;

I'm pretty sure you also need a call in the REMOVE_CASE.  The cpu comes
online for long enough to potentially gain a timer.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Xen-devel] [PATCH] timers: move back migrate_timers_from_cpu() invocation
@ 2019-04-11 10:47   ` Andrew Cooper
  0 siblings, 0 replies; 10+ messages in thread
From: Andrew Cooper @ 2019-04-11 10:47 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Julien Grall

On 11/04/2019 11:45, Jan Beulich wrote:
> Commit 597fbb8be6 ("xen/timers: Fix memory leak with cpu unplug/plug")
> went a little too far: Migrating timers away from a CPU being offlined
> needs to heppen independent of whether it get parked or fully offlined.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> --- a/xen/common/timer.c
> +++ b/xen/common/timer.c
> @@ -619,8 +619,6 @@ static void free_percpu_timers(unsigned
>  {
>      struct timers *ts = &per_cpu(timers, cpu);
>  
> -    migrate_timers_from_cpu(cpu);
> -
>      ASSERT(heap_metadata(ts->heap)->size == 0);
>      if ( heap_metadata(ts->heap)->limit )
>      {
> @@ -648,6 +646,8 @@ static int cpu_callback(
>      case CPU_UP_CANCELED:
>      case CPU_DEAD:
>      case CPU_RESUME_FAILED:
> +        migrate_timers_from_cpu(cpu);
> +
>          if ( !park_offline_cpus && system_state != SYS_STATE_suspend )
>              free_percpu_timers(cpu);
>          break;

I'm pretty sure you also need a call in the REMOVE_CASE.  The cpu comes
online for long enough to potentially gain a timer.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] timers: move back migrate_timers_from_cpu() invocation
@ 2019-04-11 10:55     ` Jan Beulich
  0 siblings, 0 replies; 10+ messages in thread
From: Jan Beulich @ 2019-04-11 10:55 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Julien Grall, xen-devel

>>> On 11.04.19 at 12:47, <andrew.cooper3@citrix.com> wrote:
> On 11/04/2019 11:45, Jan Beulich wrote:
>> @@ -648,6 +646,8 @@ static int cpu_callback(
>>      case CPU_UP_CANCELED:
>>      case CPU_DEAD:
>>      case CPU_RESUME_FAILED:
>> +        migrate_timers_from_cpu(cpu);
>> +
>>          if ( !park_offline_cpus && system_state != SYS_STATE_suspend )
>>              free_percpu_timers(cpu);
>>          break;
> 
> I'm pretty sure you also need a call in the REMOVE_CASE.  The cpu comes
> online for long enough to potentially gain a timer.

What do you mean by "comes online for long enough"? There's
no call site for the notifier with CPU_REMOVE right now, so what
the eventual behavior is going to be is simply unknown. I for one
don't expect the CPU to be brought back up again in that case.
I'm wondering if you're mixing this up with CPU_RESUME_FAILED.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Xen-devel] [PATCH] timers: move back migrate_timers_from_cpu() invocation
@ 2019-04-11 10:55     ` Jan Beulich
  0 siblings, 0 replies; 10+ messages in thread
From: Jan Beulich @ 2019-04-11 10:55 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Julien Grall, xen-devel

>>> On 11.04.19 at 12:47, <andrew.cooper3@citrix.com> wrote:
> On 11/04/2019 11:45, Jan Beulich wrote:
>> @@ -648,6 +646,8 @@ static int cpu_callback(
>>      case CPU_UP_CANCELED:
>>      case CPU_DEAD:
>>      case CPU_RESUME_FAILED:
>> +        migrate_timers_from_cpu(cpu);
>> +
>>          if ( !park_offline_cpus && system_state != SYS_STATE_suspend )
>>              free_percpu_timers(cpu);
>>          break;
> 
> I'm pretty sure you also need a call in the REMOVE_CASE.  The cpu comes
> online for long enough to potentially gain a timer.

What do you mean by "comes online for long enough"? There's
no call site for the notifier with CPU_REMOVE right now, so what
the eventual behavior is going to be is simply unknown. I for one
don't expect the CPU to be brought back up again in that case.
I'm wondering if you're mixing this up with CPU_RESUME_FAILED.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] timers: move back migrate_timers_from_cpu() invocation
@ 2019-04-11 20:03       ` Andrew Cooper
  0 siblings, 0 replies; 10+ messages in thread
From: Andrew Cooper @ 2019-04-11 20:03 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Julien Grall, xen-devel

On 11/04/2019 11:55, Jan Beulich wrote:
>>>> On 11.04.19 at 12:47, <andrew.cooper3@citrix.com> wrote:
>> On 11/04/2019 11:45, Jan Beulich wrote:
>>> @@ -648,6 +646,8 @@ static int cpu_callback(
>>>      case CPU_UP_CANCELED:
>>>      case CPU_DEAD:
>>>      case CPU_RESUME_FAILED:
>>> +        migrate_timers_from_cpu(cpu);
>>> +
>>>          if ( !park_offline_cpus && system_state != SYS_STATE_suspend )
>>>              free_percpu_timers(cpu);
>>>          break;
>> I'm pretty sure you also need a call in the REMOVE_CASE.  The cpu comes
>> online for long enough to potentially gain a timer.
> What do you mean by "comes online for long enough"? There's
> no call site for the notifier with CPU_REMOVE right now, so what
> the eventual behavior is going to be is simply unknown. I for one
> don't expect the CPU to be brought back up again in that case.
> I'm wondering if you're mixing this up with CPU_RESUME_FAILED.

Now I'm even more confused, but yes clearly - it wasn't the CPU_REMOVE case.

That said, by making this adjustment, you are now liable to hit an
assertion in the REMOVE case, which on a release build will result in
complete memory corruption, as it frees an in-use datastructure.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Xen-devel] [PATCH] timers: move back migrate_timers_from_cpu() invocation
@ 2019-04-11 20:03       ` Andrew Cooper
  0 siblings, 0 replies; 10+ messages in thread
From: Andrew Cooper @ 2019-04-11 20:03 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Julien Grall, xen-devel

On 11/04/2019 11:55, Jan Beulich wrote:
>>>> On 11.04.19 at 12:47, <andrew.cooper3@citrix.com> wrote:
>> On 11/04/2019 11:45, Jan Beulich wrote:
>>> @@ -648,6 +646,8 @@ static int cpu_callback(
>>>      case CPU_UP_CANCELED:
>>>      case CPU_DEAD:
>>>      case CPU_RESUME_FAILED:
>>> +        migrate_timers_from_cpu(cpu);
>>> +
>>>          if ( !park_offline_cpus && system_state != SYS_STATE_suspend )
>>>              free_percpu_timers(cpu);
>>>          break;
>> I'm pretty sure you also need a call in the REMOVE_CASE.  The cpu comes
>> online for long enough to potentially gain a timer.
> What do you mean by "comes online for long enough"? There's
> no call site for the notifier with CPU_REMOVE right now, so what
> the eventual behavior is going to be is simply unknown. I for one
> don't expect the CPU to be brought back up again in that case.
> I'm wondering if you're mixing this up with CPU_RESUME_FAILED.

Now I'm even more confused, but yes clearly - it wasn't the CPU_REMOVE case.

That said, by making this adjustment, you are now liable to hit an
assertion in the REMOVE case, which on a release build will result in
complete memory corruption, as it frees an in-use datastructure.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] timers: move back migrate_timers_from_cpu() invocation
@ 2019-04-12 11:27         ` Jan Beulich
  0 siblings, 0 replies; 10+ messages in thread
From: Jan Beulich @ 2019-04-12 11:27 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Julien Grall, xen-devel

>>> On 11.04.19 at 22:03, <andrew.cooper3@citrix.com> wrote:
> On 11/04/2019 11:55, Jan Beulich wrote:
>>>>> On 11.04.19 at 12:47, <andrew.cooper3@citrix.com> wrote:
>>> On 11/04/2019 11:45, Jan Beulich wrote:
>>>> @@ -648,6 +646,8 @@ static int cpu_callback(
>>>>      case CPU_UP_CANCELED:
>>>>      case CPU_DEAD:
>>>>      case CPU_RESUME_FAILED:
>>>> +        migrate_timers_from_cpu(cpu);
>>>> +
>>>>          if ( !park_offline_cpus && system_state != SYS_STATE_suspend )
>>>>              free_percpu_timers(cpu);
>>>>          break;
>>> I'm pretty sure you also need a call in the REMOVE_CASE.  The cpu comes
>>> online for long enough to potentially gain a timer.
>> What do you mean by "comes online for long enough"? There's
>> no call site for the notifier with CPU_REMOVE right now, so what
>> the eventual behavior is going to be is simply unknown. I for one
>> don't expect the CPU to be brought back up again in that case.
>> I'm wondering if you're mixing this up with CPU_RESUME_FAILED.
> 
> Now I'm even more confused, but yes clearly - it wasn't the CPU_REMOVE case.
> 
> That said, by making this adjustment, you are now liable to hit an
> assertion in the REMOVE case, which on a release build will result in
> complete memory corruption, as it frees an in-use datastructure.

Now I'm afraid I'm confused as well: Which assertion would trigger,
and which in-use data structure would get freed? For an offline
(and parked) CPU, the heap pointer would consistently point to
dummy_heap at all times.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Xen-devel] [PATCH] timers: move back migrate_timers_from_cpu() invocation
@ 2019-04-12 11:27         ` Jan Beulich
  0 siblings, 0 replies; 10+ messages in thread
From: Jan Beulich @ 2019-04-12 11:27 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Julien Grall, xen-devel

>>> On 11.04.19 at 22:03, <andrew.cooper3@citrix.com> wrote:
> On 11/04/2019 11:55, Jan Beulich wrote:
>>>>> On 11.04.19 at 12:47, <andrew.cooper3@citrix.com> wrote:
>>> On 11/04/2019 11:45, Jan Beulich wrote:
>>>> @@ -648,6 +646,8 @@ static int cpu_callback(
>>>>      case CPU_UP_CANCELED:
>>>>      case CPU_DEAD:
>>>>      case CPU_RESUME_FAILED:
>>>> +        migrate_timers_from_cpu(cpu);
>>>> +
>>>>          if ( !park_offline_cpus && system_state != SYS_STATE_suspend )
>>>>              free_percpu_timers(cpu);
>>>>          break;
>>> I'm pretty sure you also need a call in the REMOVE_CASE.  The cpu comes
>>> online for long enough to potentially gain a timer.
>> What do you mean by "comes online for long enough"? There's
>> no call site for the notifier with CPU_REMOVE right now, so what
>> the eventual behavior is going to be is simply unknown. I for one
>> don't expect the CPU to be brought back up again in that case.
>> I'm wondering if you're mixing this up with CPU_RESUME_FAILED.
> 
> Now I'm even more confused, but yes clearly - it wasn't the CPU_REMOVE case.
> 
> That said, by making this adjustment, you are now liable to hit an
> assertion in the REMOVE case, which on a release build will result in
> complete memory corruption, as it frees an in-use datastructure.

Now I'm afraid I'm confused as well: Which assertion would trigger,
and which in-use data structure would get freed? For an offline
(and parked) CPU, the heap pointer would consistently point to
dummy_heap at all times.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2019-04-12 11:27 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-11 10:45 [PATCH] timers: move back migrate_timers_from_cpu() invocation Jan Beulich
2019-04-11 10:45 ` [Xen-devel] " Jan Beulich
2019-04-11 10:47 ` Andrew Cooper
2019-04-11 10:47   ` [Xen-devel] " Andrew Cooper
2019-04-11 10:55   ` Jan Beulich
2019-04-11 10:55     ` [Xen-devel] " Jan Beulich
2019-04-11 20:03     ` Andrew Cooper
2019-04-11 20:03       ` [Xen-devel] " Andrew Cooper
2019-04-12 11:27       ` Jan Beulich
2019-04-12 11:27         ` [Xen-devel] " Jan Beulich

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.