linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH] powerpc/pseries: explicitly reschedule during drmem_lmb list traversal
@ 2020-07-28 17:37 Nathan Lynch
  2020-07-28 17:46 ` Laurent Dufour
  2020-09-09 13:27 ` Michael Ellerman
  0 siblings, 2 replies; 12+ messages in thread
From: Nathan Lynch @ 2020-07-28 17:37 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: tyreld, cheloha, ldufour

The drmem lmb list can have hundreds of thousands of entries, and
unfortunately lookups take the form of linear searches. As long as
this is the case, traversals have the potential to monopolize the CPU
and provoke lockup reports, workqueue stalls, and the like unless
they explicitly yield.

Rather than placing cond_resched() calls within various
for_each_drmem_lmb() loop blocks in the code, put it in the iteration
expression of the loop macro itself so users can't omit it.

Fixes: 6c6ea53725b3 ("powerpc/mm: Separate ibm,dynamic-memory data from DT format")
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
---
 arch/powerpc/include/asm/drmem.h | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/drmem.h b/arch/powerpc/include/asm/drmem.h
index 414d209f45bb..36d0ed04bda8 100644
--- a/arch/powerpc/include/asm/drmem.h
+++ b/arch/powerpc/include/asm/drmem.h
@@ -8,6 +8,8 @@
 #ifndef _ASM_POWERPC_LMB_H
 #define _ASM_POWERPC_LMB_H
 
+#include <linux/sched.h>
+
 struct drmem_lmb {
 	u64     base_addr;
 	u32     drc_index;
@@ -26,8 +28,14 @@ struct drmem_lmb_info {
 
 extern struct drmem_lmb_info *drmem_info;
 
+static inline struct drmem_lmb *drmem_lmb_next(struct drmem_lmb *lmb)
+{
+	cond_resched();
+	return ++lmb;
+}
+
 #define for_each_drmem_lmb_in_range(lmb, start, end)		\
-	for ((lmb) = (start); (lmb) < (end); (lmb)++)
+	for ((lmb) = (start); (lmb) < (end); lmb = drmem_lmb_next(lmb))
 
 #define for_each_drmem_lmb(lmb)					\
 	for_each_drmem_lmb_in_range((lmb),			\
-- 
2.25.4



* Re: [PATCH] powerpc/pseries: explicitly reschedule during drmem_lmb list traversal
  2020-07-28 17:37 [PATCH] powerpc/pseries: explicitly reschedule during drmem_lmb list traversal Nathan Lynch
@ 2020-07-28 17:46 ` Laurent Dufour
  2020-07-28 19:19   ` Nathan Lynch
  2020-09-09 13:27 ` Michael Ellerman
  1 sibling, 1 reply; 12+ messages in thread
From: Laurent Dufour @ 2020-07-28 17:46 UTC (permalink / raw)
  To: Nathan Lynch, linuxppc-dev; +Cc: tyreld, cheloha

On 28/07/2020 at 19:37, Nathan Lynch wrote:
> The drmem lmb list can have hundreds of thousands of entries, and
> unfortunately lookups take the form of linear searches. As long as
> this is the case, traversals have the potential to monopolize the CPU
> and provoke lockup reports, workqueue stalls, and the like unless
> they explicitly yield.
> 
> Rather than placing cond_resched() calls within various
> for_each_drmem_lmb() loop blocks in the code, put it in the iteration
> expression of the loop macro itself so users can't omit it.

Hi Nathan,

Isn't calling cond_resched() on every LMB too much?

Could it be less frequent, say every 10 or 100? I don't really know.

Cheers,
Laurent.
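A batched variant of the helper could take roughly the following shape. This is a hypothetical userspace sketch, not code from the thread: cond_resched() is stubbed out with a counter so the yield rate can be verified, the struct mirrors the one in the patch, and the interval of 20 is an arbitrary illustrative choice.

```c
#include <assert.h>
#include <stddef.h>

#define RESCHED_INTERVAL 20	/* arbitrary; could be 10 or 100 */

struct drmem_lmb {
	unsigned long long base_addr;
	unsigned int drc_index;
	unsigned int aa_index;
	unsigned int flags;
};

static unsigned long resched_calls;

/* Stand-in for the kernel's cond_resched(): just count the yields. */
static void cond_resched(void)
{
	resched_calls++;
}

/*
 * Advance to the next lmb, yielding once per RESCHED_INTERVAL elements.
 * The array base is needed to derive the element index from the pointer.
 */
static struct drmem_lmb *drmem_lmb_next(struct drmem_lmb *lmb,
					const struct drmem_lmb *start)
{
	if ((++lmb - start) % RESCHED_INTERVAL == 0)
		cond_resched();
	return lmb;
}

/* Walk n lmbs and report how many times the walk yielded. */
static unsigned long walk(struct drmem_lmb *lmbs, size_t n)
{
	struct drmem_lmb *lmb;

	resched_calls = 0;
	for (lmb = lmbs; lmb < lmbs + n; lmb = drmem_lmb_next(lmb, lmbs))
		;
	return resched_calls;
}
```

A walk over 100 elements yields 5 times instead of 100; the cost is that the iteration macro would also have to pass the array base to the helper.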
> 
> Fixes: 6c6ea53725b3 ("powerpc/mm: Separate ibm,dynamic-memory data from DT format")
> Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
> ---
>   arch/powerpc/include/asm/drmem.h | 10 +++++++++-
>   1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/include/asm/drmem.h b/arch/powerpc/include/asm/drmem.h
> index 414d209f45bb..36d0ed04bda8 100644
> --- a/arch/powerpc/include/asm/drmem.h
> +++ b/arch/powerpc/include/asm/drmem.h
> @@ -8,6 +8,8 @@
>   #ifndef _ASM_POWERPC_LMB_H
>   #define _ASM_POWERPC_LMB_H
>   
> +#include <linux/sched.h>
> +
>   struct drmem_lmb {
>   	u64     base_addr;
>   	u32     drc_index;
> @@ -26,8 +28,14 @@ struct drmem_lmb_info {
>   
>   extern struct drmem_lmb_info *drmem_info;
>   
> +static inline struct drmem_lmb *drmem_lmb_next(struct drmem_lmb *lmb)
> +{
> +	cond_resched();
> +	return ++lmb;
> +}
> +
>   #define for_each_drmem_lmb_in_range(lmb, start, end)		\
> -	for ((lmb) = (start); (lmb) < (end); (lmb)++)
> +	for ((lmb) = (start); (lmb) < (end); lmb = drmem_lmb_next(lmb))
>   
>   #define for_each_drmem_lmb(lmb)					\
>   	for_each_drmem_lmb_in_range((lmb),			\
> 



* Re: [PATCH] powerpc/pseries: explicitly reschedule during drmem_lmb list traversal
  2020-07-28 17:46 ` Laurent Dufour
@ 2020-07-28 19:19   ` Nathan Lynch
  2020-07-30  0:57     ` Michael Ellerman
  0 siblings, 1 reply; 12+ messages in thread
From: Nathan Lynch @ 2020-07-28 19:19 UTC (permalink / raw)
  To: Laurent Dufour; +Cc: tyreld, cheloha, linuxppc-dev

Hi Laurent,

Laurent Dufour <ldufour@linux.ibm.com> writes:
> On 28/07/2020 at 19:37, Nathan Lynch wrote:
>> The drmem lmb list can have hundreds of thousands of entries, and
>> unfortunately lookups take the form of linear searches. As long as
>> this is the case, traversals have the potential to monopolize the CPU
>> and provoke lockup reports, workqueue stalls, and the like unless
>> they explicitly yield.
>> 
>> Rather than placing cond_resched() calls within various
>> for_each_drmem_lmb() loop blocks in the code, put it in the iteration
>> expression of the loop macro itself so users can't omit it.
>
> Isn't calling cond_resched() on every LMB too much?
>
> Could it be less frequent, say every 10 or 100? I don't really know.

Everything done within for_each_drmem_lmb is relatively heavyweight
already. E.g. calling dlpar_remove_lmb()/dlpar_add_lmb() can take dozens
of milliseconds. I don't think cond_resched() is an expensive check in
this context.


* Re: [PATCH] powerpc/pseries: explicitly reschedule during drmem_lmb list traversal
  2020-07-28 19:19   ` Nathan Lynch
@ 2020-07-30  0:57     ` Michael Ellerman
  2020-07-30 15:01       ` Nathan Lynch
  0 siblings, 1 reply; 12+ messages in thread
From: Michael Ellerman @ 2020-07-30  0:57 UTC (permalink / raw)
  To: Nathan Lynch, Laurent Dufour; +Cc: tyreld, cheloha, linuxppc-dev

Nathan Lynch <nathanl@linux.ibm.com> writes:
> Laurent Dufour <ldufour@linux.ibm.com> writes:
>> On 28/07/2020 at 19:37, Nathan Lynch wrote:
>>> The drmem lmb list can have hundreds of thousands of entries, and
>>> unfortunately lookups take the form of linear searches. As long as
>>> this is the case, traversals have the potential to monopolize the CPU
>>> and provoke lockup reports, workqueue stalls, and the like unless
>>> they explicitly yield.
>>> 
>>> Rather than placing cond_resched() calls within various
>>> for_each_drmem_lmb() loop blocks in the code, put it in the iteration
>>> expression of the loop macro itself so users can't omit it.
>>
>> Isn't calling cond_resched() on every LMB too much?
>>
>> Could it be less frequent, say every 10 or 100? I don't really know.
>
> Everything done within for_each_drmem_lmb is relatively heavyweight
> already. E.g. calling dlpar_remove_lmb()/dlpar_add_lmb() can take dozens
> of milliseconds. I don't think cond_resched() is an expensive check in
> this context.

Hmm, mostly.

But there are quite a few cases like drmem_update_dt_v1():

	for_each_drmem_lmb(lmb) {
		dr_cell->base_addr = cpu_to_be64(lmb->base_addr);
		dr_cell->drc_index = cpu_to_be32(lmb->drc_index);
		dr_cell->aa_index = cpu_to_be32(lmb->aa_index);
		dr_cell->flags = cpu_to_be32(drmem_lmb_flags(lmb));

		dr_cell++;
	}

Which will compile to a pretty tight loop at the moment.

Or drmem_update_dt_v2() which has two loops over all lmbs.

And although the actual TIF check is cheap the function call to do it is
not free.

So I worry this is going to make some of those long loops take even longer.

At the same time I don't see an easy way to batch the calls to
cond_resched() without more intrusive changes.

cheers


* Re: [PATCH] powerpc/pseries: explicitly reschedule during drmem_lmb list traversal
  2020-07-30  0:57     ` Michael Ellerman
@ 2020-07-30 15:01       ` Nathan Lynch
  2020-07-31 13:16         ` Michael Ellerman
  0 siblings, 1 reply; 12+ messages in thread
From: Nathan Lynch @ 2020-07-30 15:01 UTC (permalink / raw)
  To: Michael Ellerman, Laurent Dufour; +Cc: tyreld, cheloha, linuxppc-dev

Michael Ellerman <mpe@ellerman.id.au> writes:
> Nathan Lynch <nathanl@linux.ibm.com> writes:
>> Laurent Dufour <ldufour@linux.ibm.com> writes:
>>> On 28/07/2020 at 19:37, Nathan Lynch wrote:
>>>> The drmem lmb list can have hundreds of thousands of entries, and
>>>> unfortunately lookups take the form of linear searches. As long as
>>>> this is the case, traversals have the potential to monopolize the CPU
>>>> and provoke lockup reports, workqueue stalls, and the like unless
>>>> they explicitly yield.
>>>> 
>>>> Rather than placing cond_resched() calls within various
>>>> for_each_drmem_lmb() loop blocks in the code, put it in the iteration
>>>> expression of the loop macro itself so users can't omit it.
>>>
>>> Isn't calling cond_resched() on every LMB too much?
>>>
>>> Could it be less frequent, say every 10 or 100? I don't really know.
>>
>> Everything done within for_each_drmem_lmb is relatively heavyweight
>> already. E.g. calling dlpar_remove_lmb()/dlpar_add_lmb() can take dozens
>> of milliseconds. I don't think cond_resched() is an expensive check in
>> this context.
>
> Hmm, mostly.
>
> But there are quite a few cases like drmem_update_dt_v1():
>
> 	for_each_drmem_lmb(lmb) {
> 		dr_cell->base_addr = cpu_to_be64(lmb->base_addr);
> 		dr_cell->drc_index = cpu_to_be32(lmb->drc_index);
> 		dr_cell->aa_index = cpu_to_be32(lmb->aa_index);
> 		dr_cell->flags = cpu_to_be32(drmem_lmb_flags(lmb));
>
> 		dr_cell++;
> 	}
>
> Which will compile to a pretty tight loop at the moment.
>
> Or drmem_update_dt_v2() which has two loops over all lmbs.
>
> And although the actual TIF check is cheap the function call to do it is
> not free.
>
> So I worry this is going to make some of those long loops take even
> longer.

That's fair, and I was wrong - some of the loop bodies are relatively
simple, not doing allocations or taking locks, etc.

One way to deal is to keep for_each_drmem_lmb() as-is and add a new
iterator that can reschedule, e.g. for_each_drmem_lmb_slow().

On the other hand... it's probably not too strong to say that the
drmem/hotplug code is in crisis with respect to correctness and
algorithmic complexity, so those are my overriding concerns right
now. Yes, this change will pessimize loops that are reinitializing the
entire drmem_lmb array on every DLPAR operation, but:

1. it doesn't make any user of for_each_drmem_lmb() less correct;
2. why is this code doing that in the first place, other than to
   accommodate a poor data structure choice?

The durations of the system calls where this code runs are measured in
minutes or hours on large configurations because of all the behaviors
that are at best O(n) with the amount of memory assigned to the
partition. For simplicity's sake I'd rather defer lower-level
performance considerations like this until the drmem data structures'
awful lookup properties are fixed -- hopefully in the 5.10 timeframe.

Thoughts?
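As an illustration, the kind of lookup fix in question could be as simple as a binary search over the lmb array. This is a hypothetical userspace sketch, not kernel code; it assumes the array is kept sorted by base_addr, which the sketch itself does not verify.

```c
#include <assert.h>
#include <stddef.h>

struct drmem_lmb {
	unsigned long long base_addr;
	unsigned int drc_index;
	unsigned int aa_index;
	unsigned int flags;
};

/*
 * O(log n) lookup by base address, in place of a linear
 * for_each_drmem_lmb() scan. Requires lmbs[] sorted by base_addr.
 */
static struct drmem_lmb *drmem_find_lmb(struct drmem_lmb *lmbs, size_t n,
					unsigned long long addr)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (lmbs[mid].base_addr < addr)
			lo = mid + 1;
		else
			hi = mid;
	}
	return (lo < n && lmbs[lo].base_addr == addr) ? &lmbs[lo] : NULL;
}
```

Each lookup drops from O(n) to O(log n), which would also shrink the traversal windows where cond_resched() matters at all.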


* Re: [PATCH] powerpc/pseries: explicitly reschedule during drmem_lmb list traversal
  2020-07-30 15:01       ` Nathan Lynch
@ 2020-07-31 13:16         ` Michael Ellerman
  2020-07-31 13:52           ` Nathan Lynch
  0 siblings, 1 reply; 12+ messages in thread
From: Michael Ellerman @ 2020-07-31 13:16 UTC (permalink / raw)
  To: Nathan Lynch, Laurent Dufour; +Cc: tyreld, cheloha, linuxppc-dev

Nathan Lynch <nathanl@linux.ibm.com> writes:
> Michael Ellerman <mpe@ellerman.id.au> writes:
>> Nathan Lynch <nathanl@linux.ibm.com> writes:
>>> Laurent Dufour <ldufour@linux.ibm.com> writes:
>>>> On 28/07/2020 at 19:37, Nathan Lynch wrote:
>>>>> The drmem lmb list can have hundreds of thousands of entries, and
>>>>> unfortunately lookups take the form of linear searches. As long as
>>>>> this is the case, traversals have the potential to monopolize the CPU
>>>>> and provoke lockup reports, workqueue stalls, and the like unless
>>>>> they explicitly yield.
>>>>> 
>>>>> Rather than placing cond_resched() calls within various
>>>>> for_each_drmem_lmb() loop blocks in the code, put it in the iteration
>>>>> expression of the loop macro itself so users can't omit it.
>>>>
>>>> Isn't calling cond_resched() on every LMB too much?
>>>>
>>>> Could it be less frequent, say every 10 or 100? I don't really know.
>>>
>>> Everything done within for_each_drmem_lmb is relatively heavyweight
>>> already. E.g. calling dlpar_remove_lmb()/dlpar_add_lmb() can take dozens
>>> of milliseconds. I don't think cond_resched() is an expensive check in
>>> this context.
>>
>> Hmm, mostly.
>>
>> But there are quite a few cases like drmem_update_dt_v1():
>>
>> 	for_each_drmem_lmb(lmb) {
>> 		dr_cell->base_addr = cpu_to_be64(lmb->base_addr);
>> 		dr_cell->drc_index = cpu_to_be32(lmb->drc_index);
>> 		dr_cell->aa_index = cpu_to_be32(lmb->aa_index);
>> 		dr_cell->flags = cpu_to_be32(drmem_lmb_flags(lmb));
>>
>> 		dr_cell++;
>> 	}
>>
>> Which will compile to a pretty tight loop at the moment.
>>
>> Or drmem_update_dt_v2() which has two loops over all lmbs.
>>
>> And although the actual TIF check is cheap the function call to do it is
>> not free.
>>
>> So I worry this is going to make some of those long loops take even
>> longer.
>
> That's fair, and I was wrong - some of the loop bodies are relatively
> simple, not doing allocations or taking locks, etc.
>
> One way to deal is to keep for_each_drmem_lmb() as-is and add a new
> iterator that can reschedule, e.g. for_each_drmem_lmb_slow().

If we did that, how many call-sites would need converting?
Is it ~2 or ~20 or ~200?

> On the other hand... it's probably not too strong to say that the
> drmem/hotplug code is in crisis with respect to correctness and
> algorithmic complexity, so those are my overriding concerns right
> now. Yes, this change will pessimize loops that are reinitializing the
> entire drmem_lmb array on every DLPAR operation, but:
>
> 1. it doesn't make any user of for_each_drmem_lmb() less correct;
> 2. why is this code doing that in the first place, other than to
>    accommodate a poor data structure choice?
>
> The durations of the system calls where this code runs are measured in
> minutes or hours on large configurations because of all the behaviors
> that are at best O(n) with the amount of memory assigned to the
> partition. For simplicity's sake I'd rather defer lower-level
> performance considerations like this until the drmem data structures'
> awful lookup properties are fixed -- hopefully in the 5.10 timeframe.

Yeah fair enough.

cheers


* Re: [PATCH] powerpc/pseries: explicitly reschedule during drmem_lmb list traversal
  2020-07-31 13:16         ` Michael Ellerman
@ 2020-07-31 13:52           ` Nathan Lynch
  2020-08-02 12:42             ` Michael Ellerman
  0 siblings, 1 reply; 12+ messages in thread
From: Nathan Lynch @ 2020-07-31 13:52 UTC (permalink / raw)
  To: Michael Ellerman, Laurent Dufour; +Cc: tyreld, cheloha, linuxppc-dev

Michael Ellerman <mpe@ellerman.id.au> writes:

> Nathan Lynch <nathanl@linux.ibm.com> writes:
>> Michael Ellerman <mpe@ellerman.id.au> writes:
>>> Nathan Lynch <nathanl@linux.ibm.com> writes:
>>>> Laurent Dufour <ldufour@linux.ibm.com> writes:
>>>>> On 28/07/2020 at 19:37, Nathan Lynch wrote:
>>>>>> The drmem lmb list can have hundreds of thousands of entries, and
>>>>>> unfortunately lookups take the form of linear searches. As long as
>>>>>> this is the case, traversals have the potential to monopolize the CPU
>>>>>> and provoke lockup reports, workqueue stalls, and the like unless
>>>>>> they explicitly yield.
>>>>>> 
>>>>>> Rather than placing cond_resched() calls within various
>>>>>> for_each_drmem_lmb() loop blocks in the code, put it in the iteration
>>>>>> expression of the loop macro itself so users can't omit it.
>>>>>
>>>>> Isn't calling cond_resched() on every LMB too much?
>>>>>
>>>>> Could it be less frequent, say every 10 or 100? I don't really know.
>>>>
>>>> Everything done within for_each_drmem_lmb is relatively heavyweight
>>>> already. E.g. calling dlpar_remove_lmb()/dlpar_add_lmb() can take dozens
>>>> of milliseconds. I don't think cond_resched() is an expensive check in
>>>> this context.
>>>
>>> Hmm, mostly.
>>>
>>> But there are quite a few cases like drmem_update_dt_v1():
>>>
>>> 	for_each_drmem_lmb(lmb) {
>>> 		dr_cell->base_addr = cpu_to_be64(lmb->base_addr);
>>> 		dr_cell->drc_index = cpu_to_be32(lmb->drc_index);
>>> 		dr_cell->aa_index = cpu_to_be32(lmb->aa_index);
>>> 		dr_cell->flags = cpu_to_be32(drmem_lmb_flags(lmb));
>>>
>>> 		dr_cell++;
>>> 	}
>>>
>>> Which will compile to a pretty tight loop at the moment.
>>>
>>> Or drmem_update_dt_v2() which has two loops over all lmbs.
>>>
>>> And although the actual TIF check is cheap the function call to do it is
>>> not free.
>>>
>>> So I worry this is going to make some of those long loops take even
>>> longer.
>>
>> That's fair, and I was wrong - some of the loop bodies are relatively
>> simple, not doing allocations or taking locks, etc.
>>
>> One way to deal is to keep for_each_drmem_lmb() as-is and add a new
>> iterator that can reschedule, e.g. for_each_drmem_lmb_slow().
>
> If we did that, how many call-sites would need converting?
> Is it ~2 or ~20 or ~200?

At a glance I would convert 15-20 out of the 24 users in the tree I'm
looking at. Let me know if I should do a v2 with that approach.


* Re: [PATCH] powerpc/pseries: explicitly reschedule during drmem_lmb list traversal
  2020-07-31 13:52           ` Nathan Lynch
@ 2020-08-02 12:42             ` Michael Ellerman
  2020-08-10 20:03               ` Nathan Lynch
  0 siblings, 1 reply; 12+ messages in thread
From: Michael Ellerman @ 2020-08-02 12:42 UTC (permalink / raw)
  To: Nathan Lynch, Laurent Dufour; +Cc: tyreld, cheloha, linuxppc-dev

Nathan Lynch <nathanl@linux.ibm.com> writes:
> Michael Ellerman <mpe@ellerman.id.au> writes:
>> Nathan Lynch <nathanl@linux.ibm.com> writes:
>>> Michael Ellerman <mpe@ellerman.id.au> writes:
>>>> Nathan Lynch <nathanl@linux.ibm.com> writes:
>>>>> Laurent Dufour <ldufour@linux.ibm.com> writes:
>>>>>> On 28/07/2020 at 19:37, Nathan Lynch wrote:
>>>>>>> The drmem lmb list can have hundreds of thousands of entries, and
>>>>>>> unfortunately lookups take the form of linear searches. As long as
>>>>>>> this is the case, traversals have the potential to monopolize the CPU
>>>>>>> and provoke lockup reports, workqueue stalls, and the like unless
>>>>>>> they explicitly yield.
>>>>>>> 
>>>>>>> Rather than placing cond_resched() calls within various
>>>>>>> for_each_drmem_lmb() loop blocks in the code, put it in the iteration
>>>>>>> expression of the loop macro itself so users can't omit it.
>>>>>>
>>>>>> Isn't calling cond_resched() on every LMB too much?
>>>>>>
>>>>>> Could it be less frequent, say every 10 or 100? I don't really know.
>>>>>
>>>>> Everything done within for_each_drmem_lmb is relatively heavyweight
>>>>> already. E.g. calling dlpar_remove_lmb()/dlpar_add_lmb() can take dozens
>>>>> of milliseconds. I don't think cond_resched() is an expensive check in
>>>>> this context.
>>>>
>>>> Hmm, mostly.
>>>>
>>>> But there are quite a few cases like drmem_update_dt_v1():
>>>>
>>>> 	for_each_drmem_lmb(lmb) {
>>>> 		dr_cell->base_addr = cpu_to_be64(lmb->base_addr);
>>>> 		dr_cell->drc_index = cpu_to_be32(lmb->drc_index);
>>>> 		dr_cell->aa_index = cpu_to_be32(lmb->aa_index);
>>>> 		dr_cell->flags = cpu_to_be32(drmem_lmb_flags(lmb));
>>>>
>>>> 		dr_cell++;
>>>> 	}
>>>>
>>>> Which will compile to a pretty tight loop at the moment.
>>>>
>>>> Or drmem_update_dt_v2() which has two loops over all lmbs.
>>>>
>>>> And although the actual TIF check is cheap the function call to do it is
>>>> not free.
>>>>
>>>> So I worry this is going to make some of those long loops take even
>>>> longer.
>>>
>>> That's fair, and I was wrong - some of the loop bodies are relatively
>>> simple, not doing allocations or taking locks, etc.
>>>
>>> One way to deal is to keep for_each_drmem_lmb() as-is and add a new
>>> iterator that can reschedule, e.g. for_each_drmem_lmb_slow().
>>
>> If we did that, how many call-sites would need converting?
>> Is it ~2 or ~20 or ~200?
>
> At a glance I would convert 15-20 out of the 24 users in the tree I'm
> looking at. Let me know if I should do a v2 with that approach.

OK, that's a bunch of churn then, if we're planning to rework the code
significantly in the near future.

One thought, which I possibly should not put in writing, is that we
could use the alignment of the pointer as a poor man's substitute for a
counter, eg:

+static inline struct drmem_lmb *drmem_lmb_next(struct drmem_lmb *lmb)
+{
+	if ((unsigned long)lmb % PAGE_SIZE == 0)
+		cond_resched();
+
+	return ++lmb;
+}

I think the lmbs are allocated in a block, so I think that will work.
Maybe PAGE_SIZE is not the right size to use, but you get the idea.

Gross I know, but might be OK as short term solution?
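How often would this alignment trick actually yield? A rough userspace estimate (hypothetical: it assumes sizeof(struct drmem_lmb) is 24 bytes, i.e. one u64 plus three u32s with tail padding, and a contiguous array starting on a page boundary):

```c
#include <assert.h>

#define SIM_PAGE_SIZE	4096UL
#define SIM_LMB_SIZE	24UL	/* assumed sizeof(struct drmem_lmb) */

/*
 * Walk n contiguous lmbs from a page-aligned base and count how many
 * times the ((unsigned long)lmb % PAGE_SIZE == 0) test would fire.
 */
static unsigned long count_yields(unsigned long n)
{
	unsigned long i, addr = 0, hits = 0;

	for (i = 0; i < n; i++, addr += SIM_LMB_SIZE)
		if (addr % SIM_PAGE_SIZE == 0)
			hits++;
	return hits;
}
```

Under these assumptions the condition fires once every lcm(4096, 24) / 24 = 512 elements, so a 100,000-entry array would yield about 196 times over a full traversal: frequent enough to keep the watchdog happy, sparse enough to be nearly free.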

cheers


* Re: [PATCH] powerpc/pseries: explicitly reschedule during drmem_lmb list traversal
  2020-08-02 12:42             ` Michael Ellerman
@ 2020-08-10 20:03               ` Nathan Lynch
  2020-08-12  1:32                 ` Nathan Lynch
  0 siblings, 1 reply; 12+ messages in thread
From: Nathan Lynch @ 2020-08-10 20:03 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: tyreld, cheloha, Laurent Dufour, linuxppc-dev

Michael Ellerman <mpe@ellerman.id.au> writes:
> One thought, which I possibly should not put in writing, is that we
> could use the alignment of the pointer as a poor man's substitute for a
> counter, eg:
>
> +static inline struct drmem_lmb *drmem_lmb_next(struct drmem_lmb *lmb)
> +{
> +	if ((unsigned long)lmb % PAGE_SIZE == 0)
> +		cond_resched();
> +
> +	return ++lmb;
> +}
>
> I think the lmbs are allocated in a block, so I think that will work.
> Maybe PAGE_SIZE is not the right size to use, but you get the idea.
>
> Gross I know, but might be OK as short term solution?

OK, looking into this.


* Re: [PATCH] powerpc/pseries: explicitly reschedule during drmem_lmb list traversal
  2020-08-10 20:03               ` Nathan Lynch
@ 2020-08-12  1:32                 ` Nathan Lynch
  0 siblings, 0 replies; 12+ messages in thread
From: Nathan Lynch @ 2020-08-12  1:32 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: tyreld, cheloha, Laurent Dufour, linuxppc-dev

Nathan Lynch <nathanl@linux.ibm.com> writes:
> Michael Ellerman <mpe@ellerman.id.au> writes:
>> One thought, which I possibly should not put in writing, is that we
>> could use the alignment of the pointer as a poor man's substitute for a
>> counter, eg:
>>
>> +static inline struct drmem_lmb *drmem_lmb_next(struct drmem_lmb *lmb)
>> +{
>> +	if ((unsigned long)lmb % PAGE_SIZE == 0)
>> +		cond_resched();
>> +
>> +	return ++lmb;
>> +}
>>
>> I think the lmbs are allocated in a block, so I think that will work.
>> Maybe PAGE_SIZE is not the right size to use, but you get the idea.
>>
>> Gross I know, but might be OK as short term solution?
>
> OK, looking into this.

To follow up:

I wasn't able to measure more than ~1% difference in DLPAR memory
performance with my original version of this, but that was on a
relatively small configuration - hundreds of elements in the array as
opposed to thousands. I took an educated guess at an appropriate
interval and posted v2:

https://lore.kernel.org/linuxppc-dev/20200812012005.1919255-1-nathanl@linux.ibm.com/


* Re: [PATCH] powerpc/pseries: explicitly reschedule during drmem_lmb list traversal
  2020-07-28 17:37 [PATCH] powerpc/pseries: explicitly reschedule during drmem_lmb list traversal Nathan Lynch
  2020-07-28 17:46 ` Laurent Dufour
@ 2020-09-09 13:27 ` Michael Ellerman
  2020-09-10  7:37   ` Michael Ellerman
  1 sibling, 1 reply; 12+ messages in thread
From: Michael Ellerman @ 2020-09-09 13:27 UTC (permalink / raw)
  To: linuxppc-dev, Nathan Lynch; +Cc: tyreld, cheloha, ldufour

On Tue, 28 Jul 2020 12:37:41 -0500, Nathan Lynch wrote:
> The drmem lmb list can have hundreds of thousands of entries, and
> unfortunately lookups take the form of linear searches. As long as
> this is the case, traversals have the potential to monopolize the CPU
> and provoke lockup reports, workqueue stalls, and the like unless
> they explicitly yield.
> 
> Rather than placing cond_resched() calls within various
> for_each_drmem_lmb() loop blocks in the code, put it in the iteration
> expression of the loop macro itself so users can't omit it.

Applied to powerpc/next.

[1/1] powerpc/pseries: explicitly reschedule during drmem_lmb list traversal
      https://git.kernel.org/powerpc/c/9d6792ffe140240ae54c881cc4183f9acc24b4df

cheers


* Re: [PATCH] powerpc/pseries: explicitly reschedule during drmem_lmb list traversal
  2020-09-09 13:27 ` Michael Ellerman
@ 2020-09-10  7:37   ` Michael Ellerman
  0 siblings, 0 replies; 12+ messages in thread
From: Michael Ellerman @ 2020-09-10  7:37 UTC (permalink / raw)
  To: Michael Ellerman, linuxppc-dev, Nathan Lynch; +Cc: tyreld, cheloha, ldufour

Michael Ellerman <patch-notifications@ellerman.id.au> writes:
> On Tue, 28 Jul 2020 12:37:41 -0500, Nathan Lynch wrote:
>> The drmem lmb list can have hundreds of thousands of entries, and
>> unfortunately lookups take the form of linear searches. As long as
>> this is the case, traversals have the potential to monopolize the CPU
>> and provoke lockup reports, workqueue stalls, and the like unless
>> they explicitly yield.
>> 
>> Rather than placing cond_resched() calls within various
>> for_each_drmem_lmb() loop blocks in the code, put it in the iteration
>> expression of the loop macro itself so users can't omit it.
>
> Applied to powerpc/next.
>
> [1/1] powerpc/pseries: explicitly reschedule during drmem_lmb list traversal
>       https://git.kernel.org/powerpc/c/9d6792ffe140240ae54c881cc4183f9acc24b4df

Some script gremlins here, I actually applied v3 (only once).

cheers


