All of lore.kernel.org
 help / color / mirror / Atom feed
* [Xen-devel] [PATCH v2] x86/cpu: Sync any remaining RCU callbacks after CPU up/down
@ 2020-02-19 17:25 Igor Druzhinin
  2020-02-21 15:46 ` Jürgen Groß
  2020-02-21 16:29 ` Jan Beulich
  0 siblings, 2 replies; 10+ messages in thread
From: Igor Druzhinin @ 2020-02-19 17:25 UTC (permalink / raw)
  To: xen-devel; +Cc: jgross, Igor Druzhinin, wl, andrew.cooper3, jbeulich, roger.pau

During CPU down operation RCU callbacks are scheduled to finish
off some actions later as soon as CPU is fully dead (the same applies
to CPU up operation in case error path is taken). If in the same grace
period another CPU up operation is performed on the same CPU, RCU callback
will be called later on a CPU in a potentially wrong (already up again
instead of still being down) state leading to eventual state inconsistency
and/or crash.

In order to avoid it - flush RCU callbacks explicitly upon finishing off
the current operation.

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
---
This got discovered trying to resume PV shim with multiple vCPUs on AMD
machine (where park_offline_cpus == 0). RCU callback responsible for
freeing percpu area on CPU offline got finally called after CPU went
online again as the guest performed regular vCPU offline/online operations
on resume.

Note: this patch requires RCU series from Juergen to be applied -
https://lists.xenproject.org/archives/html/xen-devel/2020-02/msg01221.html

v2: changed rcu_barrier() position, updated description
---
 xen/arch/x86/sysctl.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
index 4a76f0f..dd5a24f 100644
--- a/xen/arch/x86/sysctl.c
+++ b/xen/arch/x86/sysctl.c
@@ -78,8 +78,11 @@ static void l3_cache_get(void *arg)
 long cpu_up_helper(void *data)
 {
     unsigned int cpu = (unsigned long)data;
-    int ret = cpu_up(cpu);
+    int ret;
 
+    /* Flush potentially scheduled RCU work from preceding CPU offline */
+    rcu_barrier();
+    ret = cpu_up(cpu);
     if ( ret == -EBUSY )
     {
         /* On EBUSY, flush RCU work and have one more go. */
@@ -104,7 +107,11 @@ long cpu_up_helper(void *data)
 long cpu_down_helper(void *data)
 {
     int cpu = (unsigned long)data;
-    int ret = cpu_down(cpu);
+    int ret;
+
+    /* Flush potentially scheduled RCU work from preceding CPU online */
+    rcu_barrier();
+    ret = cpu_down(cpu);
     if ( ret == -EBUSY )
     {
         /* On EBUSY, flush RCU work and have one more go. */
-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [Xen-devel] [PATCH v2] x86/cpu: Sync any remaining RCU callbacks after CPU up/down
  2020-02-19 17:25 [Xen-devel] [PATCH v2] x86/cpu: Sync any remaining RCU callbacks after CPU up/down Igor Druzhinin
@ 2020-02-21 15:46 ` Jürgen Groß
  2020-02-21 16:29 ` Jan Beulich
  1 sibling, 0 replies; 10+ messages in thread
From: Jürgen Groß @ 2020-02-21 15:46 UTC (permalink / raw)
  To: Igor Druzhinin, xen-devel; +Cc: andrew.cooper3, wl, jbeulich, roger.pau

On 19.02.20 18:25, Igor Druzhinin wrote:
> During CPU down operation RCU callbacks are scheduled to finish
> off some actions later as soon as CPU is fully dead (the same applies
> to CPU up operation in case error path is taken). If in the same grace
> period another CPU up operation is performed on the same CPU, RCU callback
> will be called later on a CPU in a potentially wrong (already up again
> instead of still being down) state leading to eventual state inconsistency
> and/or crash.
> 
> In order to avoid it - flush RCU callbacks explicitly upon finishing off
> the current operation.
> 
> Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>

Reviewed-by: Juergen Gross <jgross@suse.com>


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Xen-devel] [PATCH v2] x86/cpu: Sync any remaining RCU callbacks after CPU up/down
  2020-02-19 17:25 [Xen-devel] [PATCH v2] x86/cpu: Sync any remaining RCU callbacks after CPU up/down Igor Druzhinin
  2020-02-21 15:46 ` Jürgen Groß
@ 2020-02-21 16:29 ` Jan Beulich
  2020-02-21 19:26   ` Igor Druzhinin
  1 sibling, 1 reply; 10+ messages in thread
From: Jan Beulich @ 2020-02-21 16:29 UTC (permalink / raw)
  To: Igor Druzhinin; +Cc: jgross, xen-devel, roger.pau, wl, andrew.cooper3

On 19.02.2020 18:25, Igor Druzhinin wrote:
> --- a/xen/arch/x86/sysctl.c
> +++ b/xen/arch/x86/sysctl.c
> @@ -78,8 +78,11 @@ static void l3_cache_get(void *arg)
>  long cpu_up_helper(void *data)
>  {
>      unsigned int cpu = (unsigned long)data;
> -    int ret = cpu_up(cpu);
> +    int ret;
>  
> +    /* Flush potentially scheduled RCU work from preceding CPU offline */
> +    rcu_barrier();
> +    ret = cpu_up(cpu);
>      if ( ret == -EBUSY )
>      {
>          /* On EBUSY, flush RCU work and have one more go. */
> @@ -104,7 +107,11 @@ long cpu_up_helper(void *data)
>  long cpu_down_helper(void *data)
>  {
>      int cpu = (unsigned long)data;
> -    int ret = cpu_down(cpu);
> +    int ret;
> +
> +    /* Flush potentially scheduled RCU work from preceding CPU online */
> +    rcu_barrier();
> +    ret = cpu_down(cpu);
>      if ( ret == -EBUSY )
>      {
>          /* On EBUSY, flush RCU work and have one more go. */
> 

There are more calls to cpu_up() / cpu_down(), most notably in
core_parking.c. Wouldn't it be better to make the barrier part
of the two functions? This would the also cover non-x86 in
case an arch wants to support offlining/onlining of CPUs.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Xen-devel] [PATCH v2] x86/cpu: Sync any remaining RCU callbacks after CPU up/down
  2020-02-21 16:29 ` Jan Beulich
@ 2020-02-21 19:26   ` Igor Druzhinin
  2020-02-24 15:33     ` Roger Pau Monné
  2020-02-25 12:22     ` Jan Beulich
  0 siblings, 2 replies; 10+ messages in thread
From: Igor Druzhinin @ 2020-02-21 19:26 UTC (permalink / raw)
  To: Jan Beulich; +Cc: jgross, xen-devel, roger.pau, wl, andrew.cooper3

On 21/02/2020 16:29, Jan Beulich wrote:
> On 19.02.2020 18:25, Igor Druzhinin wrote:
>> --- a/xen/arch/x86/sysctl.c
>> +++ b/xen/arch/x86/sysctl.c
>> @@ -78,8 +78,11 @@ static void l3_cache_get(void *arg)
>>  long cpu_up_helper(void *data)
>>  {
>>      unsigned int cpu = (unsigned long)data;
>> -    int ret = cpu_up(cpu);
>> +    int ret;
>>  
>> +    /* Flush potentially scheduled RCU work from preceding CPU offline */
>> +    rcu_barrier();
>> +    ret = cpu_up(cpu);
>>      if ( ret == -EBUSY )
>>      {
>>          /* On EBUSY, flush RCU work and have one more go. */
>> @@ -104,7 +107,11 @@ long cpu_up_helper(void *data)
>>  long cpu_down_helper(void *data)
>>  {
>>      int cpu = (unsigned long)data;
>> -    int ret = cpu_down(cpu);
>> +    int ret;
>> +
>> +    /* Flush potentially scheduled RCU work from preceding CPU online */
>> +    rcu_barrier();
>> +    ret = cpu_down(cpu);
>>      if ( ret == -EBUSY )
>>      {
>>          /* On EBUSY, flush RCU work and have one more go. */
>>
> 
> There are more calls to cpu_up() / cpu_down(), most notably in
> core_parking.c. Wouldn't it be better to make the barrier part
> of the two functions? This would the also cover non-x86 in
> case an arch wants to support offlining/onlining of CPUs.

Those functions are called from early init code and rcu_barrier() is
an expensive operation. I think it's better if caller is responsible
for syncing the state. This is the reason I moved rcu_barrier() in front
of cpu_up/down.

Igor

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Xen-devel] [PATCH v2] x86/cpu: Sync any remaining RCU callbacks after CPU up/down
  2020-02-21 19:26   ` Igor Druzhinin
@ 2020-02-24 15:33     ` Roger Pau Monné
  2020-02-25 12:22     ` Jan Beulich
  1 sibling, 0 replies; 10+ messages in thread
From: Roger Pau Monné @ 2020-02-24 15:33 UTC (permalink / raw)
  To: Igor Druzhinin; +Cc: jgross, xen-devel, wl, Jan Beulich, andrew.cooper3

On Fri, Feb 21, 2020 at 07:26:49PM +0000, Igor Druzhinin wrote:
> On 21/02/2020 16:29, Jan Beulich wrote:
> > On 19.02.2020 18:25, Igor Druzhinin wrote:
> >> --- a/xen/arch/x86/sysctl.c
> >> +++ b/xen/arch/x86/sysctl.c
> >> @@ -78,8 +78,11 @@ static void l3_cache_get(void *arg)
> >>  long cpu_up_helper(void *data)
> >>  {
> >>      unsigned int cpu = (unsigned long)data;
> >> -    int ret = cpu_up(cpu);
> >> +    int ret;
> >>  
> >> +    /* Flush potentially scheduled RCU work from preceding CPU offline */
> >> +    rcu_barrier();
> >> +    ret = cpu_up(cpu);
> >>      if ( ret == -EBUSY )
> >>      {
> >>          /* On EBUSY, flush RCU work and have one more go. */
> >> @@ -104,7 +107,11 @@ long cpu_up_helper(void *data)
> >>  long cpu_down_helper(void *data)
> >>  {
> >>      int cpu = (unsigned long)data;
> >> -    int ret = cpu_down(cpu);
> >> +    int ret;
> >> +
> >> +    /* Flush potentially scheduled RCU work from preceding CPU online */
> >> +    rcu_barrier();
> >> +    ret = cpu_down(cpu);
> >>      if ( ret == -EBUSY )
> >>      {
> >>          /* On EBUSY, flush RCU work and have one more go. */
> >>
> > 
> > There are more calls to cpu_up() / cpu_down(), most notably in
> > core_parking.c. Wouldn't it be better to make the barrier part
> > of the two functions? This would the also cover non-x86 in
> > case an arch wants to support offlining/onlining of CPUs.
> 
> Those functions are called from early init code and rcu_barrier() is
> an expensive operation. I think it's better if caller is responsible
> for syncing the state. This is the reason I moved rcu_barrier() in front
> of cpu_up/down.

Could you gate the call to rcu_barrier() on system_state >
SYS_STATE_smp_boot?

I think that would avoid calling them during early boot and smp
startup but would still allow for the call to rcu_barrier() to be
hidden inside cpu_{up/down}.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Xen-devel] [PATCH v2] x86/cpu: Sync any remaining RCU callbacks after CPU up/down
  2020-02-21 19:26   ` Igor Druzhinin
  2020-02-24 15:33     ` Roger Pau Monné
@ 2020-02-25 12:22     ` Jan Beulich
  2020-02-25 12:37       ` Igor Druzhinin
  1 sibling, 1 reply; 10+ messages in thread
From: Jan Beulich @ 2020-02-25 12:22 UTC (permalink / raw)
  To: Igor Druzhinin; +Cc: jgross, xen-devel, roger.pau, wl, andrew.cooper3

On 21.02.2020 20:26, Igor Druzhinin wrote:
> On 21/02/2020 16:29, Jan Beulich wrote:
>> On 19.02.2020 18:25, Igor Druzhinin wrote:
>>> --- a/xen/arch/x86/sysctl.c
>>> +++ b/xen/arch/x86/sysctl.c
>>> @@ -78,8 +78,11 @@ static void l3_cache_get(void *arg)
>>>  long cpu_up_helper(void *data)
>>>  {
>>>      unsigned int cpu = (unsigned long)data;
>>> -    int ret = cpu_up(cpu);
>>> +    int ret;
>>>  
>>> +    /* Flush potentially scheduled RCU work from preceding CPU offline */
>>> +    rcu_barrier();
>>> +    ret = cpu_up(cpu);
>>>      if ( ret == -EBUSY )
>>>      {
>>>          /* On EBUSY, flush RCU work and have one more go. */
>>> @@ -104,7 +107,11 @@ long cpu_up_helper(void *data)
>>>  long cpu_down_helper(void *data)
>>>  {
>>>      int cpu = (unsigned long)data;
>>> -    int ret = cpu_down(cpu);
>>> +    int ret;
>>> +
>>> +    /* Flush potentially scheduled RCU work from preceding CPU online */
>>> +    rcu_barrier();
>>> +    ret = cpu_down(cpu);
>>>      if ( ret == -EBUSY )
>>>      {
>>>          /* On EBUSY, flush RCU work and have one more go. */
>>>
>>
>> There are more calls to cpu_up() / cpu_down(), most notably in
>> core_parking.c. Wouldn't it be better to make the barrier part
>> of the two functions? This would the also cover non-x86 in
>> case an arch wants to support offlining/onlining of CPUs.
> 
> Those functions are called from early init code and rcu_barrier() is
> an expensive operation. I think it's better if caller is responsible
> for syncing the state. This is the reason I moved rcu_barrier() in front
> of cpu_up/down.

Well, there are two aspects here: One is to avoid the overhead where
it's not needed. The other is, as observed on this patch, that by
the chosen approach callers which in fact need amending may be
forgotten. To find middle grounds, perhaps the solution is to have
variants of cpu_{up,down}() which invoke the barrier, and which
would be used by all runtime invocations?

The other question of course is why early init code is special in
this regard. If it indeed was, perhaps the barrier invocation could
also be tied to certain SYS_STATE_* values?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Xen-devel] [PATCH v2] x86/cpu: Sync any remaining RCU callbacks after CPU up/down
  2020-02-25 12:22     ` Jan Beulich
@ 2020-02-25 12:37       ` Igor Druzhinin
  2020-02-25 12:40         ` Jan Beulich
  0 siblings, 1 reply; 10+ messages in thread
From: Igor Druzhinin @ 2020-02-25 12:37 UTC (permalink / raw)
  To: Jan Beulich; +Cc: jgross, xen-devel, roger.pau, wl, andrew.cooper3

On 25/02/2020 12:22, Jan Beulich wrote:
> On 21.02.2020 20:26, Igor Druzhinin wrote:
>> On 21/02/2020 16:29, Jan Beulich wrote:
>>> On 19.02.2020 18:25, Igor Druzhinin wrote:
>>>> --- a/xen/arch/x86/sysctl.c
>>>> +++ b/xen/arch/x86/sysctl.c
>>>> @@ -78,8 +78,11 @@ static void l3_cache_get(void *arg)
>>>>  long cpu_up_helper(void *data)
>>>>  {
>>>>      unsigned int cpu = (unsigned long)data;
>>>> -    int ret = cpu_up(cpu);
>>>> +    int ret;
>>>>  
>>>> +    /* Flush potentially scheduled RCU work from preceding CPU offline */
>>>> +    rcu_barrier();
>>>> +    ret = cpu_up(cpu);
>>>>      if ( ret == -EBUSY )
>>>>      {
>>>>          /* On EBUSY, flush RCU work and have one more go. */
>>>> @@ -104,7 +107,11 @@ long cpu_up_helper(void *data)
>>>>  long cpu_down_helper(void *data)
>>>>  {
>>>>      int cpu = (unsigned long)data;
>>>> -    int ret = cpu_down(cpu);
>>>> +    int ret;
>>>> +
>>>> +    /* Flush potentially scheduled RCU work from preceding CPU online */
>>>> +    rcu_barrier();
>>>> +    ret = cpu_down(cpu);
>>>>      if ( ret == -EBUSY )
>>>>      {
>>>>          /* On EBUSY, flush RCU work and have one more go. */
>>>>
>>>
>>> There are more calls to cpu_up() / cpu_down(), most notably in
>>> core_parking.c. Wouldn't it be better to make the barrier part
>>> of the two functions? This would the also cover non-x86 in
>>> case an arch wants to support offlining/onlining of CPUs.
>>
>> Those functions are called from early init code and rcu_barrier() is
>> an expensive operation. I think it's better if caller is responsible
>> for syncing the state. This is the reason I moved rcu_barrier() in front
>> of cpu_up/down.
> 
> Well, there are two aspects here: One is to avoid the overhead where
> it's not needed. The other is, as observed on this patch, that by
> the chosen approach callers which in fact need amending may be
> forgotten. To find middle grounds, perhaps the solution is to have
> variants of cpu_{up,down}() which invoke the barrier, and which
> would be used by all runtime invocations?
> 
> The other question of course is why early init code is special in
> this regard. If it indeed was, perhaps the barrier invocation could
> also be tied to certain SYS_STATE_* values?

It's not special - in fact it starts after RCU is initialized. The issue
is, as you said, unnecessary overhead. I'd prefer to avoid any conditional
magic and instead call "nosync" version directly from setup code.

Igor

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Xen-devel] [PATCH v2] x86/cpu: Sync any remaining RCU callbacks after CPU up/down
  2020-02-25 12:37       ` Igor Druzhinin
@ 2020-02-25 12:40         ` Jan Beulich
  2020-02-25 12:46           ` Igor Druzhinin
  0 siblings, 1 reply; 10+ messages in thread
From: Jan Beulich @ 2020-02-25 12:40 UTC (permalink / raw)
  To: Igor Druzhinin; +Cc: jgross, xen-devel, roger.pau, wl, andrew.cooper3

On 25.02.2020 13:37, Igor Druzhinin wrote:
> On 25/02/2020 12:22, Jan Beulich wrote:
>> On 21.02.2020 20:26, Igor Druzhinin wrote:
>>> On 21/02/2020 16:29, Jan Beulich wrote:
>>>> On 19.02.2020 18:25, Igor Druzhinin wrote:
>>>>> --- a/xen/arch/x86/sysctl.c
>>>>> +++ b/xen/arch/x86/sysctl.c
>>>>> @@ -78,8 +78,11 @@ static void l3_cache_get(void *arg)
>>>>>  long cpu_up_helper(void *data)
>>>>>  {
>>>>>      unsigned int cpu = (unsigned long)data;
>>>>> -    int ret = cpu_up(cpu);
>>>>> +    int ret;
>>>>>  
>>>>> +    /* Flush potentially scheduled RCU work from preceding CPU offline */
>>>>> +    rcu_barrier();
>>>>> +    ret = cpu_up(cpu);
>>>>>      if ( ret == -EBUSY )
>>>>>      {
>>>>>          /* On EBUSY, flush RCU work and have one more go. */
>>>>> @@ -104,7 +107,11 @@ long cpu_up_helper(void *data)
>>>>>  long cpu_down_helper(void *data)
>>>>>  {
>>>>>      int cpu = (unsigned long)data;
>>>>> -    int ret = cpu_down(cpu);
>>>>> +    int ret;
>>>>> +
>>>>> +    /* Flush potentially scheduled RCU work from preceding CPU online */
>>>>> +    rcu_barrier();
>>>>> +    ret = cpu_down(cpu);
>>>>>      if ( ret == -EBUSY )
>>>>>      {
>>>>>          /* On EBUSY, flush RCU work and have one more go. */
>>>>>
>>>>
>>>> There are more calls to cpu_up() / cpu_down(), most notably in
>>>> core_parking.c. Wouldn't it be better to make the barrier part
>>>> of the two functions? This would the also cover non-x86 in
>>>> case an arch wants to support offlining/onlining of CPUs.
>>>
>>> Those functions are called from early init code and rcu_barrier() is
>>> an expensive operation. I think it's better if caller is responsible
>>> for syncing the state. This is the reason I moved rcu_barrier() in front
>>> of cpu_up/down.
>>
>> Well, there are two aspects here: One is to avoid the overhead where
>> it's not needed. The other is, as observed on this patch, that by
>> the chosen approach callers which in fact need amending may be
>> forgotten. To find middle grounds, perhaps the solution is to have
>> variants of cpu_{up,down}() which invoke the barrier, and which
>> would be used by all runtime invocations?
>>
>> The other question of course is why early init code is special in
>> this regard. If it indeed was, perhaps the barrier invocation could
>> also be tied to certain SYS_STATE_* values?
> 
> It's not special - in fact it starts after RCU is initialized. The issue
> is, as you said, unnecessary overhead.

Well, if it's unnecessary overhead, then it is special in some way.
After all at runtime the overhead isn't unnecessary. Is it perhaps
just that currently we don't _happen_ to have anything that would
make use of an RCU barrier necessary in this case? In which case it
would be a problem waiting to bite us down the road?

Jan

> I'd prefer to avoid any conditional
> magic and instead call "nosync" version directly from setup code.
> 
> Igor
> 


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Xen-devel] [PATCH v2] x86/cpu: Sync any remaining RCU callbacks after CPU up/down
  2020-02-25 12:40         ` Jan Beulich
@ 2020-02-25 12:46           ` Igor Druzhinin
  2020-02-27 15:25             ` Igor Druzhinin
  0 siblings, 1 reply; 10+ messages in thread
From: Igor Druzhinin @ 2020-02-25 12:46 UTC (permalink / raw)
  To: Jan Beulich; +Cc: jgross, xen-devel, roger.pau, wl, andrew.cooper3

On 25/02/2020 12:40, Jan Beulich wrote:
> On 25.02.2020 13:37, Igor Druzhinin wrote:
>> On 25/02/2020 12:22, Jan Beulich wrote:
>>> On 21.02.2020 20:26, Igor Druzhinin wrote:
>>>> On 21/02/2020 16:29, Jan Beulich wrote:
>>>>> On 19.02.2020 18:25, Igor Druzhinin wrote:
>>>>>> --- a/xen/arch/x86/sysctl.c
>>>>>> +++ b/xen/arch/x86/sysctl.c
>>>>>> @@ -78,8 +78,11 @@ static void l3_cache_get(void *arg)
>>>>>>  long cpu_up_helper(void *data)
>>>>>>  {
>>>>>>      unsigned int cpu = (unsigned long)data;
>>>>>> -    int ret = cpu_up(cpu);
>>>>>> +    int ret;
>>>>>>  
>>>>>> +    /* Flush potentially scheduled RCU work from preceding CPU offline */
>>>>>> +    rcu_barrier();
>>>>>> +    ret = cpu_up(cpu);
>>>>>>      if ( ret == -EBUSY )
>>>>>>      {
>>>>>>          /* On EBUSY, flush RCU work and have one more go. */
>>>>>> @@ -104,7 +107,11 @@ long cpu_up_helper(void *data)
>>>>>>  long cpu_down_helper(void *data)
>>>>>>  {
>>>>>>      int cpu = (unsigned long)data;
>>>>>> -    int ret = cpu_down(cpu);
>>>>>> +    int ret;
>>>>>> +
>>>>>> +    /* Flush potentially scheduled RCU work from preceding CPU online */
>>>>>> +    rcu_barrier();
>>>>>> +    ret = cpu_down(cpu);
>>>>>>      if ( ret == -EBUSY )
>>>>>>      {
>>>>>>          /* On EBUSY, flush RCU work and have one more go. */
>>>>>>
>>>>>
>>>>> There are more calls to cpu_up() / cpu_down(), most notably in
>>>>> core_parking.c. Wouldn't it be better to make the barrier part
>>>>> of the two functions? This would the also cover non-x86 in
>>>>> case an arch wants to support offlining/onlining of CPUs.
>>>>
>>>> Those functions are called from early init code and rcu_barrier() is
>>>> an expensive operation. I think it's better if caller is responsible
>>>> for syncing the state. This is the reason I moved rcu_barrier() in front
>>>> of cpu_up/down.
>>>
>>> Well, there are two aspects here: One is to avoid the overhead where
>>> it's not needed. The other is, as observed on this patch, that by
>>> the chosen approach callers which in fact need amending may be
>>> forgotten. To find middle grounds, perhaps the solution is to have
>>> variants of cpu_{up,down}() which invoke the barrier, and which
>>> would be used by all runtime invocations?
>>>
>>> The other question of course is why early init code is special in
>>> this regard. If it indeed was, perhaps the barrier invocation could
>>> also be tied to certain SYS_STATE_* values?
>>
>> It's not special - in fact it starts after RCU is initialized. The issue
>> is, as you said, unnecessary overhead.
> 
> Well, if it's unnecessary overhead, then it is special in some way.
> After all at runtime the overhead isn't unnecessary. Is it perhaps
> just that currently we don't _happen_ to have anything that would
> make use of an RCU barrier necessary in this case? In which case it
> would be a problem waiting to bite us down the road?

I agree to a certain extent that it might be prudent to avoid special
casing even if we currently know that the case is safe. Let me see
if overhead is tolerable at CPU bring up on our largest system available
(448 CPUs).

Igor

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Xen-devel] [PATCH v2] x86/cpu: Sync any remaining RCU callbacks after CPU up/down
  2020-02-25 12:46           ` Igor Druzhinin
@ 2020-02-27 15:25             ` Igor Druzhinin
  0 siblings, 0 replies; 10+ messages in thread
From: Igor Druzhinin @ 2020-02-27 15:25 UTC (permalink / raw)
  To: Jan Beulich; +Cc: jgross, xen-devel, roger.pau, wl, andrew.cooper3

On 25/02/2020 12:46, Igor Druzhinin wrote:
> On 25/02/2020 12:40, Jan Beulich wrote:
>> On 25.02.2020 13:37, Igor Druzhinin wrote:
>>> On 25/02/2020 12:22, Jan Beulich wrote:
>>>> On 21.02.2020 20:26, Igor Druzhinin wrote:
>>>>> On 21/02/2020 16:29, Jan Beulich wrote:
>>>>>> On 19.02.2020 18:25, Igor Druzhinin wrote:
>>>>>>> --- a/xen/arch/x86/sysctl.c
>>>>>>> +++ b/xen/arch/x86/sysctl.c
>>>>>>> @@ -78,8 +78,11 @@ static void l3_cache_get(void *arg)
>>>>>>>  long cpu_up_helper(void *data)
>>>>>>>  {
>>>>>>>      unsigned int cpu = (unsigned long)data;
>>>>>>> -    int ret = cpu_up(cpu);
>>>>>>> +    int ret;
>>>>>>>  
>>>>>>> +    /* Flush potentially scheduled RCU work from preceding CPU offline */
>>>>>>> +    rcu_barrier();
>>>>>>> +    ret = cpu_up(cpu);
>>>>>>>      if ( ret == -EBUSY )
>>>>>>>      {
>>>>>>>          /* On EBUSY, flush RCU work and have one more go. */
>>>>>>> @@ -104,7 +107,11 @@ long cpu_up_helper(void *data)
>>>>>>>  long cpu_down_helper(void *data)
>>>>>>>  {
>>>>>>>      int cpu = (unsigned long)data;
>>>>>>> -    int ret = cpu_down(cpu);
>>>>>>> +    int ret;
>>>>>>> +
>>>>>>> +    /* Flush potentially scheduled RCU work from preceding CPU online */
>>>>>>> +    rcu_barrier();
>>>>>>> +    ret = cpu_down(cpu);
>>>>>>>      if ( ret == -EBUSY )
>>>>>>>      {
>>>>>>>          /* On EBUSY, flush RCU work and have one more go. */
>>>>>>>
>>>>>>
>>>>>> There are more calls to cpu_up() / cpu_down(), most notably in
>>>>>> core_parking.c. Wouldn't it be better to make the barrier part
>>>>>> of the two functions? This would the also cover non-x86 in
>>>>>> case an arch wants to support offlining/onlining of CPUs.
>>>>>
>>>>> Those functions are called from early init code and rcu_barrier() is
>>>>> an expensive operation. I think it's better if caller is responsible
>>>>> for syncing the state. This is the reason I moved rcu_barrier() in front
>>>>> of cpu_up/down.
>>>>
>>>> Well, there are two aspects here: One is to avoid the overhead where
>>>> it's not needed. The other is, as observed on this patch, that by
>>>> the chosen approach callers which in fact need amending may be
>>>> forgotten. To find middle grounds, perhaps the solution is to have
>>>> variants of cpu_{up,down}() which invoke the barrier, and which
>>>> would be used by all runtime invocations?
>>>>
>>>> The other question of course is why early init code is special in
>>>> this regard. If it indeed was, perhaps the barrier invocation could
>>>> also be tied to certain SYS_STATE_* values?
>>>
>>> It's not special - in fact it starts after RCU is initialized. The issue
>>> is, as you said, unnecessary overhead.
>>
>> Well, if it's unnecessary overhead, then it is special in some way.
>> After all at runtime the overhead isn't unnecessary. Is it perhaps
>> just that currently we don't _happen_ to have anything that would
>> make use of an RCU barrier necessary in this case? In which case it
>> would be a problem waiting to bite us down the road?
> 
> I agree to a certain extent that it might be prudent to avoid special
> casing even if we currently know that the case is safe. Let me see
> if overhead is tolerable at CPU bring up on our largest system available
> (448 CPUs).

I didn't see any significant slowdown in boot on 448 CPUs with latest
version of RCU series from Juergen. Will send v3 shortly.

Igor

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-02-27 15:25 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-19 17:25 [Xen-devel] [PATCH v2] x86/cpu: Sync any remaining RCU callbacks after CPU up/down Igor Druzhinin
2020-02-21 15:46 ` Jürgen Groß
2020-02-21 16:29 ` Jan Beulich
2020-02-21 19:26   ` Igor Druzhinin
2020-02-24 15:33     ` Roger Pau Monné
2020-02-25 12:22     ` Jan Beulich
2020-02-25 12:37       ` Igor Druzhinin
2020-02-25 12:40         ` Jan Beulich
2020-02-25 12:46           ` Igor Druzhinin
2020-02-27 15:25             ` Igor Druzhinin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.