linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 3/9] net: xfrm: use __this_cpu_read per-cpu helper
@ 2012-11-13  1:52 Shan Wei
  2012-11-13  7:21 ` Steffen Klassert
  0 siblings, 1 reply; 6+ messages in thread
From: Shan Wei @ 2012-11-13  1:52 UTC (permalink / raw)
  To: steffen.klassert, David Miller, NetDev, Herbert Xu,
	Kernel-Maillist, cl, Shan Wei

From: Shan Wei <davidshan@tencent.com>


Signed-off-by: Shan Wei <davidshan@tencent.com>
---
v4:
  derefrence pointer before reading to avoid compile warning.
---
 net/xfrm/xfrm_ipcomp.c |    8 +++-----
 1 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/net/xfrm/xfrm_ipcomp.c b/net/xfrm/xfrm_ipcomp.c
index e5246fb..2906d52 100644
--- a/net/xfrm/xfrm_ipcomp.c
+++ b/net/xfrm/xfrm_ipcomp.c
@@ -276,18 +276,16 @@ static struct crypto_comp * __percpu *ipcomp_alloc_tfms(const char *alg_name)
 	struct crypto_comp * __percpu *tfms;
 	int cpu;
 
-	/* This can be any valid CPU ID so we don't need locking. */
-	cpu = raw_smp_processor_id();
 
 	list_for_each_entry(pos, &ipcomp_tfms_list, list) {
 		struct crypto_comp *tfm;
 
-		tfms = pos->tfms;
-		tfm = *per_cpu_ptr(tfms, cpu);
+		/* This can be any valid CPU ID so we don't need locking. */
+		tfm = __this_cpu_read(*pos->tfms);
 
 		if (!strcmp(crypto_comp_name(tfm), alg_name)) {
 			pos->users++;
-			return tfms;
+			return pos->tfms;
 		}
 	}
 
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v4 3/9] net: xfrm: use __this_cpu_read per-cpu helper
  2012-11-13  1:52 [PATCH v4 3/9] net: xfrm: use __this_cpu_read per-cpu helper Shan Wei
@ 2012-11-13  7:21 ` Steffen Klassert
  2012-11-13  9:33   ` Shan Wei
  0 siblings, 1 reply; 6+ messages in thread
From: Steffen Klassert @ 2012-11-13  7:21 UTC (permalink / raw)
  To: Shan Wei; +Cc: David Miller, NetDev, Herbert Xu, Kernel-Maillist, cl

On Tue, Nov 13, 2012 at 09:52:09AM +0800, Shan Wei wrote:
> From: Shan Wei <davidshan@tencent.com>
> 

Please add a proper commit message, explaining why you do this change.

> 
> Signed-off-by: Shan Wei <davidshan@tencent.com>
> ---
> v4:
>   derefrence pointer before reading to avoid compile warning.
> ---
>  net/xfrm/xfrm_ipcomp.c |    8 +++-----
>  1 files changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/net/xfrm/xfrm_ipcomp.c b/net/xfrm/xfrm_ipcomp.c
> index e5246fb..2906d52 100644
> --- a/net/xfrm/xfrm_ipcomp.c
> +++ b/net/xfrm/xfrm_ipcomp.c
> @@ -276,18 +276,16 @@ static struct crypto_comp * __percpu *ipcomp_alloc_tfms(const char *alg_name)
>  	struct crypto_comp * __percpu *tfms;
>  	int cpu;
>  
> -	/* This can be any valid CPU ID so we don't need locking. */
> -	cpu = raw_smp_processor_id();
>  
>  	list_for_each_entry(pos, &ipcomp_tfms_list, list) {
>  		struct crypto_comp *tfm;
>  
> -		tfms = pos->tfms;
> -		tfm = *per_cpu_ptr(tfms, cpu);
> +		/* This can be any valid CPU ID so we don't need locking. */
> +		tfm = __this_cpu_read(*pos->tfms);

This should just fetch the tfm pointer, so why exactly __this_cpu_read
is better than __this_cpu_ptr? Please keep in mind that performance is
not the most important thing here. It's much more important that it
works in any case.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v4 3/9] net: xfrm: use __this_cpu_read per-cpu helper
  2012-11-13  7:21 ` Steffen Klassert
@ 2012-11-13  9:33   ` Shan Wei
  2012-11-13 10:48     ` Steffen Klassert
  0 siblings, 1 reply; 6+ messages in thread
From: Shan Wei @ 2012-11-13  9:33 UTC (permalink / raw)
  To: Steffen Klassert, Christoph Lameter
  Cc: David Miller, NetDev, Herbert Xu, Kernel-Maillist

Steffen Klassert said, at 2012/11/13 15:21:
> 
> This should just fetch the tfm pointer, so why exactly __this_cpu_read
> is better than __this_cpu_ptr? Please keep in mind that performance is
> not the most important thing here. It's much more important that it
> works in any case.
 
[0/9] describes why we should submit this series patches.
you are not included in original mail.

this_cpu_ptr relocates and address. this_cpu_read() relocates the address
and performs the fetch. If you want to operate on rda(defined as per_cpu) 
then you can only use this_cpu_ptr. this_cpu_read() saves you more instructions
since it can do the relocation and the fetch in one instruction.

More info, please refer to http://www.spinics.net/lists/kernel/msg1435153.html


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v4 3/9] net: xfrm: use __this_cpu_read per-cpu helper
  2012-11-13  9:33   ` Shan Wei
@ 2012-11-13 10:48     ` Steffen Klassert
  2012-11-13 12:36       ` [PATCH v5 " Shan Wei
  0 siblings, 1 reply; 6+ messages in thread
From: Steffen Klassert @ 2012-11-13 10:48 UTC (permalink / raw)
  To: Shan Wei
  Cc: Christoph Lameter, David Miller, NetDev, Herbert Xu, Kernel-Maillist

On Tue, Nov 13, 2012 at 05:33:19PM +0800, Shan Wei wrote:
> Steffen Klassert said, at 2012/11/13 15:21:
> > 
> > This should just fetch the tfm pointer, so why exactly __this_cpu_read
> > is better than __this_cpu_ptr? Please keep in mind that performance is
> > not the most important thing here. It's much more important that it
> > works in any case.
>  
> [0/9] describes why we should submit this series patches.
> you are not included in original mail.
> 
> this_cpu_ptr relocates and address. this_cpu_read() relocates the address
> and performs the fetch. If you want to operate on rda(defined as per_cpu) 
> then you can only use this_cpu_ptr. this_cpu_read() saves you more instructions
> since it can do the relocation and the fetch in one instruction.
> 

Ok, so please add a commit message to describe your changes.

Thanks.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v5 3/9] net: xfrm: use __this_cpu_read per-cpu helper
  2012-11-13 10:48     ` Steffen Klassert
@ 2012-11-13 12:36       ` Shan Wei
  2012-11-14  8:43         ` Steffen Klassert
  0 siblings, 1 reply; 6+ messages in thread
From: Shan Wei @ 2012-11-13 12:36 UTC (permalink / raw)
  To: Steffen Klassert
  Cc: Christoph Lameter, David Miller, NetDev, Herbert Xu, Kernel-Maillist

Steffen Klassert said, at 2012/11/13 18:48:
> 
> Ok, so please add a commit message to describe your changes.
> 
> Thanks.
> 

[PATCH v5] net: xfrm: use __this_cpu_read per-cpu helper

this_cpu_ptr/this_cpu_read is faster than per_cpu_ptr(p, smp_processor_id()) 
and can reduce  memory accesses.
The latter helper needs to find the offset for current cpu,
and needs more assembler instructions which objdump shows in following. 

this_cpu_ptr relocates and address. this_cpu_read() relocates the address
and performs the fetch. this_cpu_read() saves you more instructions
since it can do the relocation and the fetch in one instruction.

per_cpu_ptr(p, smp_processor_id()):
  1e:   65 8b 04 25 00 00 00 00         mov    %gs:0x0,%eax
  26:   48 98                           cltq
  28:   31 f6                           xor    %esi,%esi
  2a:   48 c7 c7 00 00 00 00            mov    $0x0,%rdi
  31:   48 8b 04 c5 00 00 00 00         mov    0x0(,%rax,8),%rax
  39:   c7 44 10 04 14 00 00 00         movl   $0x14,0x4(%rax,%rdx,1)

this_cpu_ptr(p)
  1e:   65 48 03 14 25 00 00 00 00      add    %gs:0x0,%rdx
  27:   31 f6                           xor    %esi,%esi
  29:   c7 42 04 14 00 00 00            movl   $0x14,0x4(%rdx)
  30:   48 c7 c7 00 00 00 00            mov    $0x0,%rdi

Signed-off-by: Shan Wei <davidshan@tencent.com>
---
 net/xfrm/xfrm_ipcomp.c |    8 +++-----
 1 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/net/xfrm/xfrm_ipcomp.c b/net/xfrm/xfrm_ipcomp.c
index e5246fb..2906d52 100644
--- a/net/xfrm/xfrm_ipcomp.c
+++ b/net/xfrm/xfrm_ipcomp.c
@@ -276,18 +276,16 @@ static struct crypto_comp * __percpu *ipcomp_alloc_tfms(const char *alg_name)
 	struct crypto_comp * __percpu *tfms;
 	int cpu;
 
-	/* This can be any valid CPU ID so we don't need locking. */
-	cpu = raw_smp_processor_id();
 
 	list_for_each_entry(pos, &ipcomp_tfms_list, list) {
 		struct crypto_comp *tfm;
 
-		tfms = pos->tfms;
-		tfm = *per_cpu_ptr(tfms, cpu);
+		/* This can be any valid CPU ID so we don't need locking. */
+		tfm = __this_cpu_read(*pos->tfms);
 
 		if (!strcmp(crypto_comp_name(tfm), alg_name)) {
 			pos->users++;
-			return tfms;
+			return pos->tfms;
 		}
 	}
 
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v5 3/9] net: xfrm: use __this_cpu_read per-cpu helper
  2012-11-13 12:36       ` [PATCH v5 " Shan Wei
@ 2012-11-14  8:43         ` Steffen Klassert
  0 siblings, 0 replies; 6+ messages in thread
From: Steffen Klassert @ 2012-11-14  8:43 UTC (permalink / raw)
  To: Shan Wei
  Cc: Christoph Lameter, David Miller, NetDev, Herbert Xu, Kernel-Maillist

On Tue, Nov 13, 2012 at 08:36:00PM +0800, Shan Wei wrote:
> Steffen Klassert said, at 2012/11/13 18:48:
> > 
> > Ok, so please add a commit message to describe your changes.
> > 
> > Thanks.
> > 
> 
> [PATCH v5] net: xfrm: use __this_cpu_read per-cpu helper
> 
> this_cpu_ptr/this_cpu_read is faster than per_cpu_ptr(p, smp_processor_id()) 
> and can reduce  memory accesses.
> The latter helper needs to find the offset for current cpu,
> and needs more assembler instructions which objdump shows in following. 
> 
> this_cpu_ptr relocates and address. this_cpu_read() relocates the address
> and performs the fetch. this_cpu_read() saves you more instructions
> since it can do the relocation and the fetch in one instruction.
> 
> per_cpu_ptr(p, smp_processor_id()):
>   1e:   65 8b 04 25 00 00 00 00         mov    %gs:0x0,%eax
>   26:   48 98                           cltq
>   28:   31 f6                           xor    %esi,%esi
>   2a:   48 c7 c7 00 00 00 00            mov    $0x0,%rdi
>   31:   48 8b 04 c5 00 00 00 00         mov    0x0(,%rax,8),%rax
>   39:   c7 44 10 04 14 00 00 00         movl   $0x14,0x4(%rax,%rdx,1)
> 
> this_cpu_ptr(p)
>   1e:   65 48 03 14 25 00 00 00 00      add    %gs:0x0,%rdx
>   27:   31 f6                           xor    %esi,%esi
>   29:   c7 42 04 14 00 00 00            movl   $0x14,0x4(%rdx)
>   30:   48 c7 c7 00 00 00 00            mov    $0x0,%rdi
> 
> Signed-off-by: Shan Wei <davidshan@tencent.com>

Applied to ipsec-next, thanks!

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2012-11-14  8:44 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-11-13  1:52 [PATCH v4 3/9] net: xfrm: use __this_cpu_read per-cpu helper Shan Wei
2012-11-13  7:21 ` Steffen Klassert
2012-11-13  9:33   ` Shan Wei
2012-11-13 10:48     ` Steffen Klassert
2012-11-13 12:36       ` [PATCH v5 " Shan Wei
2012-11-14  8:43         ` Steffen Klassert

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).