* [PATCH 1/3] percpu: stop the loop when a cpu belongs to a new group
@ 2013-10-21  8:58 ` Wei Yang
  0 siblings, 0 replies; 32+ messages in thread
From: Wei Yang @ 2013-10-21  8:58 UTC (permalink / raw)
  To: tj, cl, linux-mm, linux-kernel; +Cc: weiyang

When a cpu belongs to a new group, no other cpu has the same group id. This
means it can be assigned a new group id without checking against all the others.

This patch does this optimization.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
---
 mm/percpu.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/mm/percpu.c b/mm/percpu.c
index 8c8e08f..536ca4f 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -1488,7 +1488,10 @@ static struct pcpu_alloc_info * __init pcpu_build_alloc_info(
 			    (cpu_distance_fn(cpu, tcpu) > LOCAL_DISTANCE ||
 			     cpu_distance_fn(tcpu, cpu) > LOCAL_DISTANCE)) {
 				group++;
-				nr_groups = max(nr_groups, group + 1);
+				if (group == nr_groups) {
+					nr_groups++;
+					break;
+				}
 				goto next_group;
 			}
 		}
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 2/3] percpu: merge two loops when setting up group info
  2013-10-21  8:58 ` Wei Yang
@ 2013-10-21  8:58   ` Wei Yang
  -1 siblings, 0 replies; 32+ messages in thread
From: Wei Yang @ 2013-10-21  8:58 UTC (permalink / raw)
  To: tj, cl, linux-mm, linux-kernel; +Cc: weiyang

There are two loops setting up the group info of pcpu_alloc_info. They share
the same logic, so merging them could save time when there are many
groups.

This patch merges these two loops into one.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
---
 mm/percpu.c |    8 +++-----
 1 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/mm/percpu.c b/mm/percpu.c
index 536ca4f..4f710a4f 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -1542,11 +1542,6 @@ static struct pcpu_alloc_info * __init pcpu_build_alloc_info(
 		return ERR_PTR(-ENOMEM);
 	cpu_map = ai->groups[0].cpu_map;
 
-	for (group = 0; group < nr_groups; group++) {
-		ai->groups[group].cpu_map = cpu_map;
-		cpu_map += roundup(group_cnt[group], upa);
-	}
-
 	ai->static_size = static_size;
 	ai->reserved_size = reserved_size;
 	ai->dyn_size = dyn_size;
@@ -1557,6 +1552,8 @@ static struct pcpu_alloc_info * __init pcpu_build_alloc_info(
 	for (group = 0, unit = 0; group_cnt[group]; group++) {
 		struct pcpu_group_info *gi = &ai->groups[group];
 
+		gi->cpu_map = cpu_map;
+
 		/*
 		 * Initialize base_offset as if all groups are located
 		 * back-to-back.  The caller should update this to
@@ -1568,6 +1565,7 @@ static struct pcpu_alloc_info * __init pcpu_build_alloc_info(
 			if (group_map[cpu] == group)
 				gi->cpu_map[gi->nr_units++] = cpu;
 		gi->nr_units = roundup(gi->nr_units, upa);
+		cpu_map += gi->nr_units;
 		unit += gi->nr_units;
 	}
 	BUG_ON(unit != nr_units);
-- 
1.7.5.4
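
To illustrate why the merge should not change the layout, here is a minimal
sketch, not the kernel code. It compares the starting offset of each group's
cpu_map slice in the two-loop form and in the merged form. The names
roundup_int(), offsets_separate() and offsets_merged() are hypothetical
helpers, and the sketch iterates with "group < nr_groups" in both loops,
whereas the kernel's second loop stops on group_cnt[group].

```c
#include <stddef.h>

/* Round x up to the next multiple of y (y > 0), like the kernel's roundup(). */
static int roundup_int(int x, int y)
{
	return ((x + y - 1) / y) * y;
}

/*
 * Two-loop form: every group's starting offset is derived up front from
 * group_cnt[], before any cpu is placed.
 */
static void offsets_separate(const int *group_cnt, int nr_groups, int upa,
			     int *start)
{
	int group, off = 0;

	for (group = 0; group < nr_groups; group++) {
		start[group] = off;
		off += roundup_int(group_cnt[group], upa);
	}
}

/*
 * Merged form: the offset is advanced by the rounded-up nr_units right
 * after the group's cpus have been counted.
 */
static void offsets_merged(const int *group_map, int nr_cpus, int nr_groups,
			   int upa, int *start)
{
	int group, cpu, off = 0;

	for (group = 0; group < nr_groups; group++) {
		int nr_units = 0;

		start[group] = off;
		for (cpu = 0; cpu < nr_cpus; cpu++)
			if (group_map[cpu] == group)
				nr_units++;
		nr_units = roundup_int(nr_units, upa);
		off += nr_units;
	}
}
```

Both forms give the same offsets as long as group_cnt[group] equals the number
of cpus mapped to that group, which is what the grouping loop earlier in the
function is expected to guarantee.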


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 3/3] percpu: little optimization on calculating pcpu_unit_size
  2013-10-21  8:58 ` Wei Yang
@ 2013-10-21  8:58   ` Wei Yang
  -1 siblings, 0 replies; 32+ messages in thread
From: Wei Yang @ 2013-10-21  8:58 UTC (permalink / raw)
  To: tj, cl, linux-mm, linux-kernel; +Cc: weiyang

pcpu_unit_size is exactly equal to ai->unit_size.

This patch assigns this value directly instead of calculating it from
pcpu_unit_pages. It also reorders the two assignments to make them easier to read.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
---
 mm/percpu.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/percpu.c b/mm/percpu.c
index 4f710a4f..74677e0 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -1300,8 +1300,8 @@ int __init pcpu_setup_first_chunk(const struct pcpu_alloc_info *ai,
 	pcpu_unit_offsets = unit_off;
 
 	/* determine basic parameters */
-	pcpu_unit_pages = ai->unit_size >> PAGE_SHIFT;
-	pcpu_unit_size = pcpu_unit_pages << PAGE_SHIFT;
+	pcpu_unit_size = ai->unit_size;
+	pcpu_unit_pages = pcpu_unit_size >> PAGE_SHIFT;
 	pcpu_atom_size = ai->atom_size;
 	pcpu_chunk_struct_size = sizeof(struct pcpu_chunk) +
 		BITS_TO_LONGS(pcpu_unit_pages) * sizeof(unsigned long);
-- 
1.7.5.4
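
The claim that the two values are equal depends on ai->unit_size being a whole
number of pages, which this thread does not show. A minimal sketch of the
arithmetic follows, assuming 4 KiB pages. SKETCH_PAGE_SHIFT,
unit_size_via_pages() and is_page_aligned() are hypothetical names used only
for illustration.

```c
#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Size as computed before the patch: truncate to whole pages, scale back. */
static unsigned long unit_size_via_pages(unsigned long unit_size)
{
	unsigned long pages = unit_size >> SKETCH_PAGE_SHIFT;

	return pages << SKETCH_PAGE_SHIFT;
}

/* Non-zero when unit_size is a whole number of pages. */
static int is_page_aligned(unsigned long unit_size)
{
	return (unit_size & ((1UL << SKETCH_PAGE_SHIFT) - 1)) == 0;
}
```

For a page-aligned input the shift round trip returns the input unchanged, so
the direct assignment is equivalent. For an unaligned input the old form would
round down while the new form would not.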


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/3] percpu: stop the loop when a cpu belongs to a new group
  2013-10-21  8:58 ` Wei Yang
@ 2013-10-27 12:30   ` Tejun Heo
  -1 siblings, 0 replies; 32+ messages in thread
From: Tejun Heo @ 2013-10-27 12:30 UTC (permalink / raw)
  To: Wei Yang; +Cc: cl, linux-mm, linux-kernel

On Mon, Oct 21, 2013 at 04:58:11PM +0800, Wei Yang wrote:
> When a cpu belongs to a new group, there is no cpu has the same group id. This
> means it can be assigned a new group id without checking with every others.
> 
> This patch does this optimiztion.

Does this actually matter?  If so, it'd probably make a lot more sense
to start inner loop at @cpu + 1 so that it becomes O(N).

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/3] percpu: merge two loops when setting up group info
  2013-10-21  8:58   ` Wei Yang
@ 2013-10-27 12:35     ` Tejun Heo
  -1 siblings, 0 replies; 32+ messages in thread
From: Tejun Heo @ 2013-10-27 12:35 UTC (permalink / raw)
  To: Wei Yang; +Cc: cl, linux-mm, linux-kernel

On Mon, Oct 21, 2013 at 04:58:12PM +0800, Wei Yang wrote:
> There are two loops setting up the group info of pcpu_alloc_info. They share
> the same logic, so merge them could be time efficient when there are many
> groups.
> 
> This patch merge these two loops into one.

It *looks* correct to me but I'd rather not change this unless you can
show me this actually matters, which I find extremely doubtful given
nr_groups would be in the order of few thousands even on an extremely
large machine.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 3/3] percpu: little optimization on calculating pcpu_unit_size
  2013-10-21  8:58   ` Wei Yang
@ 2013-10-27 12:36     ` Tejun Heo
  -1 siblings, 0 replies; 32+ messages in thread
From: Tejun Heo @ 2013-10-27 12:36 UTC (permalink / raw)
  To: Wei Yang; +Cc: cl, linux-mm, linux-kernel

On Mon, Oct 21, 2013 at 04:58:13PM +0800, Wei Yang wrote:
> pcpu_unit_size exactly equals to ai->unit_size.
> 
> This patch assign this value instead of calculating from pcpu_unit_pages. Also
> it reorder them to make it looks more friendly to audience.

Ditto.  I'd rather not change unless this is clearly better.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/3] percpu: merge two loops when setting up group info
  2013-10-27 12:35     ` Tejun Heo
@ 2013-10-28  2:37       ` Wei Yang
  -1 siblings, 0 replies; 32+ messages in thread
From: Wei Yang @ 2013-10-28  2:37 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Wei Yang, cl, linux-mm, linux-kernel

On Sun, Oct 27, 2013 at 08:35:42AM -0400, Tejun Heo wrote:
>On Mon, Oct 21, 2013 at 04:58:12PM +0800, Wei Yang wrote:
>> There are two loops setting up the group info of pcpu_alloc_info. They share
>> the same logic, so merge them could be time efficient when there are many
>> groups.
>> 
>> This patch merge these two loops into one.
>
>It *looks* correct to me but I'd rather not change this unless you can
>show me this actually matters, which I find extremely doubtful given
>nr_groups would be in the order of few thousands even on an extremely
>large machine.

Tejun, thanks for your review and comments.

I agree with you that the nr_groups won't be very large, which means it will
not bring many benefits.

This is just a small code refine. If you don't like it, just drop it :-)

>
>Thanks.
>
>-- 
>tejun

-- 
Richard Yang
Help you, Help me


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 3/3] percpu: little optimization on calculating pcpu_unit_size
  2013-10-27 12:36     ` Tejun Heo
@ 2013-10-28  2:43       ` Wei Yang
  -1 siblings, 0 replies; 32+ messages in thread
From: Wei Yang @ 2013-10-28  2:43 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Wei Yang, cl, linux-mm, linux-kernel

On Sun, Oct 27, 2013 at 08:36:34AM -0400, Tejun Heo wrote:
>On Mon, Oct 21, 2013 at 04:58:13PM +0800, Wei Yang wrote:
>> pcpu_unit_size exactly equals to ai->unit_size.
>> 
>> This patch assign this value instead of calculating from pcpu_unit_pages. Also
>> it reorder them to make it looks more friendly to audience.
>
>Ditto.  I'd rather not change unless this is clearly better.

This one changes a shift into an assignment, which in my mind is a little
faster.

Well, this is just executed once during the boot time, not a big deal.

>
>Thanks.
>
>-- 
>tejun

-- 
Richard Yang
Help you, Help me


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/3] percpu: stop the loop when a cpu belongs to a new group
  2013-10-27 12:30   ` Tejun Heo
@ 2013-10-28  3:00     ` Wei Yang
  -1 siblings, 0 replies; 32+ messages in thread
From: Wei Yang @ 2013-10-28  3:00 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Wei Yang, cl, linux-mm, linux-kernel

On Sun, Oct 27, 2013 at 08:30:08AM -0400, Tejun Heo wrote:
>On Mon, Oct 21, 2013 at 04:58:11PM +0800, Wei Yang wrote:
>> When a cpu belongs to a new group, there is no cpu has the same group id. This
>> means it can be assigned a new group id without checking with every others.
>> 
>> This patch does this optimiztion.
>
>Does this actually matter?  If so, it'd probably make a lot more sense
>to start inner loop at @cpu + 1 so that it becomes O(N).

One of the worst cases in my mind:

CPU:        0    1    2    3    4    ...
Group:      0    1    2    3    4    ...
(this sounds impossible in the real world)

Every time we encounter a new CPU and try to assign it to a group, we find
that it belongs to a new group. The original logic will iterate over all the
old CPUs again, while the new logic can skip this and assign it to a new group.
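
The worst case above can be sketched as a small counting exercise. This is a
simplified model of the grouping loop, not the kernel code: count_checks(),
all_remote() and the early_exit flag are hypothetical names, and all_remote()
stands in for a cpu_distance_fn that reports every pair of CPUs as farther
apart than LOCAL_DISTANCE.

```c
#include <stddef.h>

/* Hypothetical distance test: every CPU is remote from every other CPU. */
static int all_remote(int a, int b)
{
	return a != b;
}

/*
 * Assign groups the way the nested loop does and return how many
 * (cpu, tcpu) pairs the inner loop examined. early_exit selects the
 * behaviour proposed in this patch.
 */
static long count_checks(int nr_cpus, int (*remote)(int, int), int early_exit,
			 int *group_map, int *nr_groups_out)
{
	int nr_groups = 1;
	long checks = 0;
	int cpu, tcpu, group;

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		group = 0;
next_group:
		for (tcpu = 0; tcpu < cpu; tcpu++) {
			checks++;
			if (group_map[tcpu] == group &&
			    (remote(cpu, tcpu) || remote(tcpu, cpu))) {
				group++;
				if (early_exit) {
					/* brand new group: nothing left to compare */
					if (group == nr_groups) {
						nr_groups++;
						break;
					}
				} else if (group + 1 > nr_groups) {
					nr_groups = group + 1;
				}
				goto next_group;
			}
		}
		group_map[cpu] = group;
	}
	*nr_groups_out = nr_groups;
	return checks;
}
```

With 4 CPUs that are all remote from each other, this model examines 16 pairs
with the original logic and 10 with the early exit. Both variants end up with
4 groups. The saving for CPU k is the final pass over its k predecessors.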

Again, this is a tiny change, which doesn't matter much.

BTW, I don't get your point for "start inner loop at @cpu+1".

The original logic is:
	loop 1:   0 - nr_cpus
	loop 2:      0 - (cpu - 1)

If you find a better approach to improve the logic, I believe all the users
will appreciate your efforts :-)

Thanks for your review and comments again ~

>
>Thanks.
>
>-- 
>tejun

-- 
Richard Yang
Help you, Help me


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/3] percpu: stop the loop when a cpu belongs to a new group
  2013-10-28  3:00     ` Wei Yang
@ 2013-10-28 11:31       ` Tejun Heo
  -1 siblings, 0 replies; 32+ messages in thread
From: Tejun Heo @ 2013-10-28 11:31 UTC (permalink / raw)
  To: Wei Yang; +Cc: cl, linux-mm, linux-kernel

Hello,

On Mon, Oct 28, 2013 at 11:00:55AM +0800, Wei Yang wrote:
> >Does this actually matter?  If so, it'd probably make a lot more sense
> >to start inner loop at @cpu + 1 so that it becomes O(N).
> 
> One of the worst case in my mind:
> 
> CPU:        0    1    2    3    4    ...
> Group:      0    1    2    3    4    ...
> (sounds it is impossible in the real world)

I was wondering whether you had an actual case where this actually
matters or it's just something you thought of while reading the code.

> Every time, when we encounter a new CPU and try to assign it to a group, we
> found it belongs to a new group. The original logic will iterate on all old
> CPUs again, while the new logic could skip this and assign it to a new group.
> 
> Again, this is a tiny change, which doesn't matters a lot.

I think it *could* matter because the current implementation is O(N^2)
where N is the number of CPUs.  On machines, say, with 4k CPU, it's
gonna loop 16M times but then again even that takes only a few
millisecs on modern machines.

> BTW, I don't get your point for "start inner loop at @cpu+1".
> 
> The original logic is:
> 	loop 1:   0 - nr_cpus
> 	loop 2:      0 - (cpu - 1)
> 
> If you found one better approach to improve the logic, I believe all the users
> will appreciate your efforts :-)

Ooh, right, I forgot about the break and then I thought somehow that
would make it O(N).  Sorry about that.  I blame jetlag. :)

Yeah, I don't know.  The function is quite hairy which makes me keep
things simpler and reluctant to make changes unless it actually makes
non-trivial difference.  The change looks okay to me but it seems
neither necessary nor substantially beneficial and if my experience is
anything to go by, *any* change involves some risk of breakage no
matter how innocent it may look, so given the circumstances, I'd like
to keep things the way they are.

Thanks a lot!

-- 
tejun

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/3] percpu: stop the loop when a cpu belongs to a new group
  2013-10-28 11:31       ` Tejun Heo
@ 2013-10-28 15:17         ` Wei Yang
  -1 siblings, 0 replies; 32+ messages in thread
From: Wei Yang @ 2013-10-28 15:17 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Wei Yang, cl, linux-mm, linux-kernel

On Mon, Oct 28, 2013 at 07:31:20AM -0400, Tejun Heo wrote:
>Hello,
>
>On Mon, Oct 28, 2013 at 11:00:55AM +0800, Wei Yang wrote:
>> >Does this actually matter?  If so, it'd probably make a lot more sense
>> >to start inner loop at @cpu + 1 so that it becomes O(N).
>> 
>> One of the worst case in my mind:
>> 
>> CPU:        0    1    2    3    4    ...
>> Group:      0    1    2    3    4    ...
>> (sounds it is impossible in the real world)
>
>I was wondering whether you had an actual case where this actually
>matters or it's just something you thought of while reading the code.

Tejun,

Thanks for your comments.

I found this just in code review. :-)

>
>> Every time, when we encounter a new CPU and try to assign it to a group, we
>> found it belongs to a new group. The original logic will iterate on all old
>> CPUs again, while the new logic could skip this and assign it to a new group.
>> 
>> Again, this is a tiny change, which doesn't matters a lot.
>
>I think it *could* matter because the current implementation is O(N^2)
>where N is the number of CPUs.  On machines, say, with 4k CPU, it's
>gonna loop 16M times but then again even that takes only a few
>millisecs on modern machines.

I am not familiar with real-world CPU counts. Thanks for letting me
know there could be 4K CPUs.

Yep, a few millisecs does not sound like a big amount.

>
>> BTW, I don't get your point for "start inner loop at @cpu+1".
>> 
>> The original logic is:
>> 	loop 1:   0 - nr_cpus
>> 	loop 2:      0 - (cpu - 1)
>> 
>> If you found one better approach to improve the logic, I believe all the users
>> will appreciate your efforts :-)
>
>Ooh, right, I forgot about the break and then I thought somehow that
>would make it O(N).  Sorry about that.  I blame jetlag. :)
>
>Yeah, I don't know.  The function is quite hairy which makes me keep
>things simpler and reluctant to make changes unless it actually makes
>non-trivial difference.  The change looks okay to me but it seems
>neither necessary or substantially beneficial and if my experience is
>anything to go by, *any* change involves some risk of brekage no
>matter how innocent it may look, so given the circumstances, I'd like
>to keep things the way they are.

Yep, I really agree with you. If there is no big improvement, it is really not
necessary to change the code, since any change carries some risk.

Here I have another one, which in my mind will improve it in one case. Looking
forward to your comments :-) If I am not correct, please let me know. :-)

From bd70498b9df47b25ff20054e24bb510c5430c0c3 Mon Sep 17 00:00:00 2001
From: Wei Yang <weiyang@linux.vnet.ibm.com>
Date: Thu, 10 Oct 2013 09:42:14 +0800
Subject: [PATCH] percpu: optimize group assignment when cpu_distance_fn is
 NULL

When cpu_distance_fn is NULL, all CPUs belong to group 0. The original logic
will still go through each CPU and its predecessors. cpu_distance_fn is
always NULL when pcpu_build_alloc_info() is called from pcpu_page_first_chunk().

By applying this patch, the time complexity will drop to O(n) from O(n^2) in
case cpu_distance_fn is NULL.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
---
 mm/percpu.c |   23 ++++++++++++-----------
 1 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/mm/percpu.c b/mm/percpu.c
index f79c807..8e6034f 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -1481,20 +1481,21 @@ static struct pcpu_alloc_info * __init pcpu_build_alloc_info(
 	for_each_possible_cpu(cpu) {
 		group = 0;
 	next_group:
-		for_each_possible_cpu(tcpu) {
-			if (cpu == tcpu)
-				break;
-			if (group_map[tcpu] == group && cpu_distance_fn &&
-			    (cpu_distance_fn(cpu, tcpu) > LOCAL_DISTANCE ||
-			     cpu_distance_fn(tcpu, cpu) > LOCAL_DISTANCE)) {
-				group++;
-				if (group == nr_groups) {
-					nr_groups++;
+		if (cpu_distance_fn)
+			for_each_possible_cpu(tcpu) {
+				if (cpu == tcpu)
 					break;
+				if (group_map[tcpu] == group &&
+				    (cpu_distance_fn(cpu, tcpu) > LOCAL_DISTANCE ||
+				     cpu_distance_fn(tcpu, cpu) > LOCAL_DISTANCE)) {
+					group++;
+					if (group == nr_groups) {
+						nr_groups++;
+						break;
+					}
+					goto next_group;
 				}
-				goto next_group;
 			}
-		}
 		group_map[cpu] = group;
 		group_cnt[group]++;
 	}
-- 
1.7.5.4
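
A minimal sketch of the effect of the NULL check, not the kernel code:
grouping_checks(), distance_fn_t, skip_when_null and SKETCH_LOCAL_DISTANCE are
hypothetical names, and the nr_groups bookkeeping is left out to keep the
model short. It counts how many (cpu, tcpu) pairs the inner loop examines.

```c
#include <stddef.h>

#define SKETCH_LOCAL_DISTANCE 10	/* assumption: stands in for LOCAL_DISTANCE */

typedef int (*distance_fn_t)(int from, int to);

/*
 * Return the number of inner-loop checks. With skip_when_null set, the
 * inner loop is entered only when a distance function was supplied, as
 * in the patch above.
 */
static long grouping_checks(int nr_cpus, distance_fn_t distance_fn,
			    int skip_when_null, int *group_map)
{
	long checks = 0;
	int cpu, tcpu, group;

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		group = 0;
next_group:
		if (!skip_when_null || distance_fn) {
			for (tcpu = 0; tcpu < cpu; tcpu++) {
				checks++;
				if (group_map[tcpu] == group && distance_fn &&
				    (distance_fn(cpu, tcpu) > SKETCH_LOCAL_DISTANCE ||
				     distance_fn(tcpu, cpu) > SKETCH_LOCAL_DISTANCE)) {
					group++;
					goto next_group;
				}
			}
		}
		group_map[cpu] = group;
	}
	return checks;
}
```

With 8 CPUs and a NULL distance function, the unguarded loop examines
8 * 7 / 2 = 28 pairs and the guarded loop examines none. Every CPU ends up in
group 0 either way, so only the outer loop's linear work remains.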

BTW, this one is based on my previous patch.

>
>Thanks a lot!
>
>-- 
>tejun

-- 
Richard Yang
Help you, Help me


^ permalink raw reply related	[flat|nested] 32+ messages in thread
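
A minimal user-space sketch of the worst case discussed above, where every CPU
ends up in its own group. It is a model under stated assumptions, not the kernel
code: NR_CPUS, REMOTE_DISTANCE, all_remote() and count_pair_visits() are
hypothetical names, and LOCAL_DISTANCE is assumed to be 10. It counts how many
(cpu, tcpu) pairs the grouping loop inspects with and without the early break
from patch 1/3.

```c
#include <assert.h>

#define NR_CPUS 8           /* hypothetical CPU count, small enough to check by hand */
#define LOCAL_DISTANCE 10   /* assumed to match the kernel's NUMA-local distance */
#define REMOTE_DISTANCE 20  /* hypothetical; any value above LOCAL_DISTANCE works */

/* Contrived layout from the thread: every CPU is remote from every other. */
static int all_remote(int from, int to)
{
	return from == to ? LOCAL_DISTANCE : REMOTE_DISTANCE;
}

/*
 * Count the (cpu, tcpu) pairs inspected by the grouping loop.
 * stop_on_new_group == 0: loop as it was before patch 1/3.
 * stop_on_new_group != 0: loop with patch 1/3 applied.
 */
unsigned long count_pair_visits(int stop_on_new_group)
{
	int group_map[NR_CPUS] = { 0 };
	int nr_groups = 1;
	unsigned long visits = 0;
	int cpu, tcpu, group;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		group = 0;
next_group:
		for (tcpu = 0; tcpu < NR_CPUS; tcpu++) {
			if (cpu == tcpu)
				break;
			visits++;
			if (group_map[tcpu] == group &&
			    (all_remote(cpu, tcpu) > LOCAL_DISTANCE ||
			     all_remote(tcpu, cpu) > LOCAL_DISTANCE)) {
				group++;
				if (stop_on_new_group) {
					/* patched: a brand-new group needs no rescan */
					if (group == nr_groups) {
						nr_groups++;
						break;
					}
				} else if (group + 1 > nr_groups) {
					/* unpatched: nr_groups = max(nr_groups, group + 1) */
					nr_groups = group + 1;
				}
				goto next_group;
			}
		}
		group_map[cpu] = group;
	}
	/* In this layout every CPU must have received its own group. */
	assert(nr_groups == NR_CPUS);
	return visits;
}
```

For 8 CPUs this model inspects 112 pairs without the early break and 84 with it.
The difference, 28 = 0 + 1 + ... + 7, is the one final rescan per CPU that the
patch skips; the rescans for the lower group ids happen in both versions.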

* Re: [PATCH 1/3] percpu: stop the loop when a cpu belongs to a new group
  2013-10-28 15:17         ` Wei Yang
@ 2013-11-20  3:00           ` Wei Yang
  -1 siblings, 0 replies; 32+ messages in thread
From: Wei Yang @ 2013-11-20  3:00 UTC (permalink / raw)
  To: Wei Yang; +Cc: Tejun Heo, cl, linux-mm, linux-kernel

On Mon, Oct 28, 2013 at 11:17:46PM +0800, Wei Yang wrote:
>On Mon, Oct 28, 2013 at 07:31:20AM -0400, Tejun Heo wrote:
>>Hello,
>>
>>On Mon, Oct 28, 2013 at 11:00:55AM +0800, Wei Yang wrote:
>>> >Does this actually matter?  If so, it'd probably make a lot more sense
>>> >to start inner loop at @cpu + 1 so that it becomes O(N).
>>> 
>>> One of the worst cases I can think of:
>>> 
>>> CPU:        0    1    2    3    4    ...
>>> Group:      0    1    2    3    4    ...
>>> (it sounds impossible in the real world)
>>
>>I was wondering whether you had an actual case where this actually
>>matters or it's just something you thought of while reading the code.
>
>Tejun,
>
>Thanks for your comments.
>
>I found this just in code review. :-)
>
>>
>>> Every time we encounter a new CPU and try to assign it to a group, we
>>> find it belongs to a new group. The original logic will iterate over all old
>>> CPUs again, while the new logic can skip this and assign it to a new group.
>>> 
>>> Again, this is a tiny change, which doesn't matter a lot.
>>
>>I think it *could* matter because the current implementation is O(N^2)
>>where N is the number of CPUs.  On machines, say, with 4k CPU, it's
>>gonna loop 16M times but then again even that takes only a few
>>millisecs on modern machines.
>
>I am not familiar with real-world CPU counts. Thanks for letting me
>know there could be 4K CPUs.
>
>Yep, a few millisecs does not sound like a big amount.
>
>>
>>> BTW, I don't get your point for "start inner loop at @cpu+1".
>>> 
>>> The original logic is:
>>> 	loop 1:   0 - nr_cpus
>>> 	loop 2:      0 - (cpu - 1)
>>> 
>>> If you find a better approach to improve the logic, I believe all the users
>>> will appreciate your efforts :-)
>>
>>Ooh, right, I forgot about the break and then I thought somehow that
>>would make it O(N).  Sorry about that.  I blame jetlag. :)
>>
>>Yeah, I don't know.  The function is quite hairy which makes me keep
>>things simpler and reluctant to make changes unless it actually makes
>>non-trivial difference.  The change looks okay to me but it seems
>>neither necessary nor substantially beneficial and if my experience is
>>anything to go by, *any* change involves some risk of breakage no
>>matter how innocent it may look, so given the circumstances, I'd like
>>to keep things the way they are.
>
>Yep, I really agree with you. If there is no big improvement, it is really not
>necessary to change the code, which would carry some risk.
>
>Here I have another one, which in my mind will improve it in one case. Looking
>forward to your comments :-) If I am not correct, please let me know. :-)

Tejun,

What do you think about this one?

>
>From bd70498b9df47b25ff20054e24bb510c5430c0c3 Mon Sep 17 00:00:00 2001
>From: Wei Yang <weiyang@linux.vnet.ibm.com>
>Date: Thu, 10 Oct 2013 09:42:14 +0800
>Subject: [PATCH] percpu: optimize group assignment when cpu_distance_fn is
> NULL
>
>When cpu_distance_fn is NULL, all CPUs belong to group 0. The original logic
>still walks every CPU and all of its predecessors. cpu_distance_fn is
>always NULL when pcpu_build_alloc_info() is called from pcpu_page_first_chunk().
>
>With this patch, the time complexity drops from O(n^2) to O(n) when
>cpu_distance_fn is NULL.
>
>Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
>---
> mm/percpu.c |   23 ++++++++++++-----------
> 1 files changed, 12 insertions(+), 11 deletions(-)
>
>diff --git a/mm/percpu.c b/mm/percpu.c
>index f79c807..8e6034f 100644
>--- a/mm/percpu.c
>+++ b/mm/percpu.c
>@@ -1481,20 +1481,21 @@ static struct pcpu_alloc_info * __init pcpu_build_alloc_info(
> 	for_each_possible_cpu(cpu) {
> 		group = 0;
> 	next_group:
>-		for_each_possible_cpu(tcpu) {
>-			if (cpu == tcpu)
>-				break;
>-			if (group_map[tcpu] == group && cpu_distance_fn &&
>-			    (cpu_distance_fn(cpu, tcpu) > LOCAL_DISTANCE ||
>-			     cpu_distance_fn(tcpu, cpu) > LOCAL_DISTANCE)) {
>-				group++;
>-				if (group == nr_groups) {
>-					nr_groups++;
>+		if (cpu_distance_fn)
>+			for_each_possible_cpu(tcpu) {
>+				if (cpu == tcpu)
> 					break;
>+				if (group_map[tcpu] == group &&
>+				    (cpu_distance_fn(cpu, tcpu) > LOCAL_DISTANCE ||
>+				     cpu_distance_fn(tcpu, cpu) > LOCAL_DISTANCE)) {
>+					group++;
>+					if (group == nr_groups) {
>+						nr_groups++;
>+						break;
>+					}
>+					goto next_group;
> 				}
>-				goto next_group;
> 			}
>-		}
> 		group_map[cpu] = group;
> 		group_cnt[group]++;
> 	}
>-- 
>1.7.5.4
>
>BTW, this one is based on my previous patch.
>
>>
>>Thanks a lot!
>>
>>-- 
>>tejun
>
>-- 
>Richard Yang
>Help you, Help me

-- 
Richard Yang
Help you, Help me


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/3] percpu: stop the loop when a cpu belongs to a new group
  2013-11-20  3:00           ` Wei Yang
@ 2013-11-20  5:51             ` Tejun Heo
  -1 siblings, 0 replies; 32+ messages in thread
From: Tejun Heo @ 2013-11-20  5:51 UTC (permalink / raw)
  To: Wei Yang; +Cc: cl, linux-mm, linux-kernel

Hello,

On Wed, Nov 20, 2013 at 11:00:56AM +0800, Wei Yang wrote:
> What do you think about this one?
> 
> >
> >From bd70498b9df47b25ff20054e24bb510c5430c0c3 Mon Sep 17 00:00:00 2001
> >From: Wei Yang <weiyang@linux.vnet.ibm.com>
> >Date: Thu, 10 Oct 2013 09:42:14 +0800
> >Subject: [PATCH] percpu: optimize group assignment when cpu_distance_fn is
> > NULL
> >
> >When cpu_distance_fn is NULL, all CPUs belong to group 0. The original logic
> >still walks every CPU and all of its predecessors. cpu_distance_fn is
> >always NULL when pcpu_build_alloc_info() is called from pcpu_page_first_chunk().
> >
> >With this patch, the time complexity drops from O(n^2) to O(n) when
> >cpu_distance_fn is NULL.

The test was put in the inner loop because the nesting was already too
deep and cpu_distance_fn is unlikely to be NULL on machines where the
number of CPUs is high enough to matter.  If that O(n^2) loop is gonna
be a problem, it's gonna be a problem on large NUMA machines and we'll
have to do something about it for cases where cpu_distance_fn exists
anyway.

The patch is just extremely marginal.  Ah well... why not?  I'll apply
it once -rc1 drops.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 32+ messages in thread
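
The placement of the cpu_distance_fn test can be compared with a small
user-space model. This is a sketch under assumptions, not the kernel code:
NR_CPUS, two_node_distance(), group_inner_test(), group_hoisted_test(),
visits_with_null_fn() and same_groups() are hypothetical names, and
LOCAL_DISTANCE is assumed to be 10. It counts the (cpu, tcpu) pairs each loop
shape inspects and checks that both shapes produce the same group map.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 8          /* hypothetical CPU count for this model */
#define LOCAL_DISTANCE 10  /* assumed to match the kernel's NUMA-local distance */

typedef int (*distance_fn_t)(unsigned int from, unsigned int to);

/* Hypothetical topology: CPUs 0-3 on one node, CPUs 4-7 on another. */
int two_node_distance(unsigned int from, unsigned int to)
{
	return (from / 4 == to / 4) ? LOCAL_DISTANCE : 2 * LOCAL_DISTANCE;
}

/* NULL test inside the inner loop. Returns the number of pairs inspected. */
unsigned long group_inner_test(distance_fn_t fn, int *group_map)
{
	unsigned long visits = 0;
	unsigned int cpu, tcpu;
	int group, nr_groups = 1;

	assert(group_map != NULL);
	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		group = 0;
next_group:
		for (tcpu = 0; tcpu < NR_CPUS; tcpu++) {
			if (cpu == tcpu)
				break;
			visits++;
			if (group_map[tcpu] == group && fn &&
			    (fn(cpu, tcpu) > LOCAL_DISTANCE ||
			     fn(tcpu, cpu) > LOCAL_DISTANCE)) {
				group++;
				if (group == nr_groups) {
					nr_groups++;
					break;
				}
				goto next_group;
			}
		}
		group_map[cpu] = group;
	}
	return visits;
}

/* NULL test hoisted out of the inner loop, as in the proposed patch. */
unsigned long group_hoisted_test(distance_fn_t fn, int *group_map)
{
	unsigned long visits = 0;
	unsigned int cpu, tcpu;
	int group, nr_groups = 1;

	assert(group_map != NULL);
	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		group = 0;
next_group:
		if (fn)
			for (tcpu = 0; tcpu < NR_CPUS; tcpu++) {
				if (cpu == tcpu)
					break;
				visits++;
				if (group_map[tcpu] == group &&
				    (fn(cpu, tcpu) > LOCAL_DISTANCE ||
				     fn(tcpu, cpu) > LOCAL_DISTANCE)) {
					group++;
					if (group == nr_groups) {
						nr_groups++;
						break;
					}
					goto next_group;
				}
			}
		group_map[cpu] = group;
	}
	return visits;
}

/* Pairs inspected when the distance callback is NULL. */
unsigned long visits_with_null_fn(int hoisted)
{
	int map[NR_CPUS] = { 0 };

	return hoisted ? group_hoisted_test(NULL, map) : group_inner_test(NULL, map);
}

/* Returns 1 when both loop shapes assign every CPU to the same group. */
int same_groups(distance_fn_t fn)
{
	int a[NR_CPUS] = { 0 }, b[NR_CPUS] = { 0 };
	unsigned int i;

	group_inner_test(fn, a);
	group_hoisted_test(fn, b);
	for (i = 0; i < NR_CPUS; i++)
		if (a[i] != b[i])
			return 0;
	return 1;
}
```

With a NULL callback and 8 CPUs, the inner-test shape inspects 28 pairs
(0 + 1 + ... + 7) while the hoisted shape inspects none. With a non-NULL
callback both shapes run the same inner loop and build the same group map, so
in this model the gain is limited to the NULL case.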

* Re: [PATCH 1/3] percpu: stop the loop when a cpu belongs to a new group
  2013-11-20  5:51             ` Tejun Heo
@ 2013-11-20  6:58               ` Wei Yang
  -1 siblings, 0 replies; 32+ messages in thread
From: Wei Yang @ 2013-11-20  6:58 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Wei Yang, cl, linux-mm, linux-kernel

On Wed, Nov 20, 2013 at 12:51:21AM -0500, Tejun Heo wrote:
>Hello,
>
>On Wed, Nov 20, 2013 at 11:00:56AM +0800, Wei Yang wrote:
>> What do you think about this one?
>> 
>> >
>> >From bd70498b9df47b25ff20054e24bb510c5430c0c3 Mon Sep 17 00:00:00 2001
>> >From: Wei Yang <weiyang@linux.vnet.ibm.com>
>> >Date: Thu, 10 Oct 2013 09:42:14 +0800
>> >Subject: [PATCH] percpu: optimize group assignment when cpu_distance_fn is
>> > NULL
>> >
>> >When cpu_distance_fn is NULL, all CPUs belong to group 0. The original logic
>> >still walks every CPU and all of its predecessors. cpu_distance_fn is
>> >always NULL when pcpu_build_alloc_info() is called from pcpu_page_first_chunk().
>> >
>> >With this patch, the time complexity drops from O(n^2) to O(n) when
>> >cpu_distance_fn is NULL.
>
>The test was put in the inner loop because the nesting was already too
>deep and cpu_distance_fn is unlikely to be NULL on machines where the
>number of CPUs is high enough to matter.  If that O(n^2) loop is gonna
>be a problem, it's gonna be a problem on large NUMA machines and we'll
>have to do something about it for cases where cpu_distance_fn exists
>anyway.

Tejun,

Yep, I hope this will not cause problems on a large NUMA machine when
cpu_distance_fn is not NULL.

>
>The patch is just extremely marginal.  Ah well... why not?  I'll apply
>it once -rc1 drops.
>
>Thanks.
>
>-- 
>tejun

-- 
Richard Yang
Help you, Help me


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/3] percpu: stop the loop when a cpu belongs to a new group
  2013-11-20  5:51             ` Tejun Heo
@ 2013-11-22 23:04               ` Tejun Heo
  -1 siblings, 0 replies; 32+ messages in thread
From: Tejun Heo @ 2013-11-22 23:04 UTC (permalink / raw)
  To: Wei Yang; +Cc: cl, linux-mm, linux-kernel

Hello,

On Wed, Nov 20, 2013 at 12:51:21AM -0500, Tejun Heo wrote:
> The patch is just extremely marginal.  Ah well... why not?  I'll apply
> it once -rc1 drops.

So, I was about to apply this patch but decided against it.  It
doesn't really make anything better and the code looks worse
afterwards.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/3] percpu: stop the loop when a cpu belongs to a new group
  2013-11-22 23:04               ` Tejun Heo
@ 2013-11-24  1:48                 ` Wei Yang
  -1 siblings, 0 replies; 32+ messages in thread
From: Wei Yang @ 2013-11-24  1:48 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Wei Yang, cl, linux-mm, linux-kernel

On Fri, Nov 22, 2013 at 06:04:00PM -0500, Tejun Heo wrote:
>Hello,
>
>On Wed, Nov 20, 2013 at 12:51:21AM -0500, Tejun Heo wrote:
>> The patch is just extremely marginal.  Ah well... why not?  I'll apply
>> it once -rc1 drops.
>
>So, I was about to apply this patch but decided against it.  It
>doesn't really make anything better and the code looks worse
>afterwards.

Ok, that's fine. Maybe we could find a better way :-)

>
>Thanks.
>
>-- 
>tejun

-- 
Richard Yang
Help you, Help me


^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2013-11-24  1:49 UTC | newest]

Thread overview: 32+ messages
2013-10-21  8:58 [PATCH 1/3] percpu: stop the loop when a cpu belongs to a new group Wei Yang
2013-10-21  8:58 ` Wei Yang
2013-10-21  8:58 ` [PATCH 2/3] percpu: merge two loops when setting up group info Wei Yang
2013-10-21  8:58   ` Wei Yang
2013-10-27 12:35   ` Tejun Heo
2013-10-27 12:35     ` Tejun Heo
2013-10-28  2:37     ` Wei Yang
2013-10-28  2:37       ` Wei Yang
2013-10-21  8:58 ` [PATCH 3/3] percpu: little optimization on calculating pcpu_unit_size Wei Yang
2013-10-21  8:58   ` Wei Yang
2013-10-27 12:36   ` Tejun Heo
2013-10-27 12:36     ` Tejun Heo
2013-10-28  2:43     ` Wei Yang
2013-10-28  2:43       ` Wei Yang
2013-10-27 12:30 ` [PATCH 1/3] percpu: stop the loop when a cpu belongs to a new group Tejun Heo
2013-10-27 12:30   ` Tejun Heo
2013-10-28  3:00   ` Wei Yang
2013-10-28  3:00     ` Wei Yang
2013-10-28 11:31     ` Tejun Heo
2013-10-28 11:31       ` Tejun Heo
2013-10-28 15:17       ` Wei Yang
2013-10-28 15:17         ` Wei Yang
2013-11-20  3:00         ` Wei Yang
2013-11-20  3:00           ` Wei Yang
2013-11-20  5:51           ` Tejun Heo
2013-11-20  5:51             ` Tejun Heo
2013-11-20  6:58             ` Wei Yang
2013-11-20  6:58               ` Wei Yang
2013-11-22 23:04             ` Tejun Heo
2013-11-22 23:04               ` Tejun Heo
2013-11-24  1:48               ` Wei Yang
2013-11-24  1:48                 ` Wei Yang
