* [PATCH v2 6/9] rcu: use __this_cpu_read helper instead of per_cpu_ptr(p, raw_smp_processor_id())
@ 2012-11-02 16:01 Shan Wei
2012-11-02 17:46 ` Christoph Lameter
2012-11-02 18:10 ` Paul E. McKenney
0 siblings, 2 replies; 9+ messages in thread
From: Shan Wei @ 2012-11-02 16:01 UTC (permalink / raw)
To: dipankar, paulmck, Kernel-Maillist, cl, Shan Wei
From: Shan Wei <davidshan@tencent.com>
Signed-off-by: Shan Wei <davidshan@tencent.com>
---
kernel/rcutree.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 74df86b..441b945 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1960,7 +1960,7 @@ static void force_quiescent_state(struct rcu_state *rsp)
 	struct rcu_node *rnp_old = NULL;
 
 	/* Funnel through hierarchy to reduce memory contention. */
-	rnp = per_cpu_ptr(rsp->rda, raw_smp_processor_id())->mynode;
+	rnp = __this_cpu_read(rsp->rda->mynode);
 	for (; rnp != NULL; rnp = rnp->parent) {
 		ret = (ACCESS_ONCE(rsp->gp_flags) & RCU_GP_FLAG_FQS) ||
 		      !raw_spin_trylock(&rnp->fqslock);
--
1.7.1
* Re: [PATCH v2 6/9] rcu: use __this_cpu_read helper instead of per_cpu_ptr(p, raw_smp_processor_id())
2012-11-02 16:01 [PATCH v2 6/9] rcu: use __this_cpu_read helper instead of per_cpu_ptr(p, raw_smp_processor_id()) Shan Wei
@ 2012-11-02 17:46 ` Christoph Lameter
2012-11-02 18:10 ` Paul E. McKenney
1 sibling, 0 replies; 9+ messages in thread
From: Christoph Lameter @ 2012-11-02 17:46 UTC (permalink / raw)
To: Shan Wei; +Cc: dipankar, paulmck, Kernel-Maillist
On Sat, 3 Nov 2012, Shan Wei wrote:
>
> /* Funnel through hierarchy to reduce memory contention. */
> - rnp = per_cpu_ptr(rsp->rda, raw_smp_processor_id())->mynode;
> + rnp = __this_cpu_read(rsp->rda->mynode);
> for (; rnp != NULL; rnp = rnp->parent) {
Reviewed-by: Christoph Lameter <cl@linux.com>
* Re: [PATCH v2 6/9] rcu: use __this_cpu_read helper instead of per_cpu_ptr(p, raw_smp_processor_id())
2012-11-02 16:01 [PATCH v2 6/9] rcu: use __this_cpu_read helper instead of per_cpu_ptr(p, raw_smp_processor_id()) Shan Wei
2012-11-02 17:46 ` Christoph Lameter
@ 2012-11-02 18:10 ` Paul E. McKenney
2012-11-02 20:19 ` Christoph Lameter
1 sibling, 1 reply; 9+ messages in thread
From: Paul E. McKenney @ 2012-11-02 18:10 UTC (permalink / raw)
To: Shan Wei; +Cc: dipankar, Kernel-Maillist, cl
On Sat, Nov 03, 2012 at 12:01:47AM +0800, Shan Wei wrote:
> From: Shan Wei <davidshan@tencent.com>
>
> Signed-off-by: Shan Wei <davidshan@tencent.com>
> ---
> kernel/rcutree.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index 74df86b..441b945 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -1960,7 +1960,7 @@ static void force_quiescent_state(struct rcu_state *rsp)
> struct rcu_node *rnp_old = NULL;
>
> /* Funnel through hierarchy to reduce memory contention. */
> - rnp = per_cpu_ptr(rsp->rda, raw_smp_processor_id())->mynode;
> + rnp = __this_cpu_read(rsp->rda->mynode);
OK, I'll bite... Why this instead of:
rnp = __this_cpu_read(rsp->rda)->mynode;
Thanx, Paul
> for (; rnp != NULL; rnp = rnp->parent) {
> ret = (ACCESS_ONCE(rsp->gp_flags) & RCU_GP_FLAG_FQS) ||
> !raw_spin_trylock(&rnp->fqslock);
> --
> 1.7.1
* Re: [PATCH v2 6/9] rcu: use __this_cpu_read helper instead of per_cpu_ptr(p, raw_smp_processor_id())
2012-11-02 18:10 ` Paul E. McKenney
@ 2012-11-02 20:19 ` Christoph Lameter
2012-11-03 9:19 ` Paul E. McKenney
0 siblings, 1 reply; 9+ messages in thread
From: Christoph Lameter @ 2012-11-02 20:19 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: Shan Wei, dipankar, Kernel-Maillist
On Fri, 2 Nov 2012, Paul E. McKenney wrote:
> On Sat, Nov 03, 2012 at 12:01:47AM +0800, Shan Wei wrote:
> > From: Shan Wei <davidshan@tencent.com>
> >
> > Signed-off-by: Shan Wei <davidshan@tencent.com>
> > ---
> > kernel/rcutree.c | 2 +-
> > 1 files changed, 1 insertions(+), 1 deletions(-)
> >
> > diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> > index 74df86b..441b945 100644
> > --- a/kernel/rcutree.c
> > +++ b/kernel/rcutree.c
> > @@ -1960,7 +1960,7 @@ static void force_quiescent_state(struct rcu_state *rsp)
> > struct rcu_node *rnp_old = NULL;
> >
> > /* Funnel through hierarchy to reduce memory contention. */
> > - rnp = per_cpu_ptr(rsp->rda, raw_smp_processor_id())->mynode;
> > + rnp = __this_cpu_read(rsp->rda->mynode);
>
> OK, I'll bite... Why this instead of:
>
> rnp = __this_cpu_read(rsp->rda)->mynode;
Because this_cpu_read fetches a data word from an address. The address is
relocated using a segment prefix (which contains the offset of the
current per cpu area).
And the address needed here is the address of the mynode field
within a structure that has a per cpu address.
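The relocation described above can be modelled in user space. This is a minimal sketch, not kernel code: the per-cpu areas are a plain array, current_cpu stands in for the base that the gs: segment prefix supplies on x86-64, and struct percpu_area, its pad field, and every model_* name are assumptions made for illustration. It shows that relocating the address of the mynode field and fetching in one step yields the same value as relocating the structure first and dereferencing afterwards.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4

struct rcu_node { int id; };
struct rcu_data { int cpu; struct rcu_node *mynode; };

/* Hypothetical model: one private area per CPU, each holding one rcu_data. */
struct percpu_area { long pad; struct rcu_data rda; };

/* In this model a "per-cpu address" is an offset, identical on every CPU. */
#define RDA_OFFSET offsetof(struct percpu_area, rda)

static struct percpu_area areas[NR_CPUS];
static struct rcu_node nodes[NR_CPUS];
static int current_cpu; /* stands in for the CPU whose base sits in gs: */

static void model_init(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        nodes[cpu].id = 100 + cpu;
        areas[cpu].rda.cpu = cpu;
        areas[cpu].rda.mynode = &nodes[cpu];
    }
}

static void model_set_cpu(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    current_cpu = cpu;
}

/* Base of the current CPU's area: what the segment prefix contributes. */
static unsigned char *model_base(void)
{
    return (unsigned char *)&areas[current_cpu];
}

/* Analogue of this_cpu_ptr(): relocate only, hand back an ordinary pointer. */
static struct rcu_data *model_this_cpu_ptr(size_t off)
{
    return (struct rcu_data *)(model_base() + off);
}

/* Analogue of this_cpu_read(rda->mynode): relocate the field address, fetch. */
static struct rcu_node *model_this_cpu_read_mynode(size_t off)
{
    size_t field = off + offsetof(struct rcu_data, mynode);

    return *(struct rcu_node **)(model_base() + field);
}
```

After model_init() and model_set_cpu(2), model_this_cpu_read_mynode(RDA_OFFSET) and model_this_cpu_ptr(RDA_OFFSET)->mynode both point at nodes[2]. The difference the thread is about is only in how many steps the real hardware needs to get there.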
* Re: [PATCH v2 6/9] rcu: use __this_cpu_read helper instead of per_cpu_ptr(p, raw_smp_processor_id())
2012-11-02 20:19 ` Christoph Lameter
@ 2012-11-03 9:19 ` Paul E. McKenney
2012-11-04 10:38 ` Shan Wei
2012-11-05 15:23 ` Christoph Lameter
0 siblings, 2 replies; 9+ messages in thread
From: Paul E. McKenney @ 2012-11-03 9:19 UTC (permalink / raw)
To: Christoph Lameter; +Cc: Shan Wei, dipankar, Kernel-Maillist
On Fri, Nov 02, 2012 at 08:19:04PM +0000, Christoph Lameter wrote:
> On Fri, 2 Nov 2012, Paul E. McKenney wrote:
>
> > On Sat, Nov 03, 2012 at 12:01:47AM +0800, Shan Wei wrote:
> > > From: Shan Wei <davidshan@tencent.com>
> > >
> > > Signed-off-by: Shan Wei <davidshan@tencent.com>
> > > ---
> > > kernel/rcutree.c | 2 +-
> > > 1 files changed, 1 insertions(+), 1 deletions(-)
> > >
> > > diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> > > index 74df86b..441b945 100644
> > > --- a/kernel/rcutree.c
> > > +++ b/kernel/rcutree.c
> > > @@ -1960,7 +1960,7 @@ static void force_quiescent_state(struct rcu_state *rsp)
> > > struct rcu_node *rnp_old = NULL;
> > >
> > > /* Funnel through hierarchy to reduce memory contention. */
> > > - rnp = per_cpu_ptr(rsp->rda, raw_smp_processor_id())->mynode;
> > > + rnp = __this_cpu_read(rsp->rda->mynode);
> >
> > OK, I'll bite... Why this instead of:
> >
> > rnp = __this_cpu_read(rsp->rda)->mynode;
>
> Because this_cpu_read fetches a data word from an address. The address is
> relocated using a segment prefix (which contains the offset of the
> current per cpu area).
>
> And the address needed here is the address of the mynode field
> within a structure that has a per cpu address.
OK, I do understand why it happens to work. My question is instead why
it is considered a good idea. After all, it is the ->rda field that is
marked __percpu, not the ->mynode field. So in the interest of
mechanical checking and general readability, it seems to me that it
would be way better to apply __this_cpu_read() to rsp->rda rather than
to rsp->rda->mynode.
Thanx, Paul
* Re: [PATCH v2 6/9] rcu: use __this_cpu_read helper instead of per_cpu_ptr(p, raw_smp_processor_id())
2012-11-03 9:19 ` Paul E. McKenney
@ 2012-11-04 10:38 ` Shan Wei
2012-11-05 14:49 ` Christoph Lameter
2012-11-05 15:23 ` Christoph Lameter
1 sibling, 1 reply; 9+ messages in thread
From: Shan Wei @ 2012-11-04 10:38 UTC (permalink / raw)
To: paulmck; +Cc: Christoph Lameter, dipankar, Kernel-Maillist
Paul E. McKenney said, at 2012/11/3 17:19:
> OK, I do understand why it happens to work. My question is instead why
> it is considered a good idea.
Maybe objdump gives the answer.
Using __this_cpu_read to read a member through a per-cpu pointer
saves two instructions on x86-64.
*test code:*
struct eater_state {
	u32 state;
	struct eater __percpu *eater_info;
};

struct eater {
	char name[4];
	u32 age;
};

static u32 test_func(struct eater_state *tstas)
{
	struct eater *aeater;

	//aeater = __this_cpu_ptr(tstas->eater_info); <-----------------1
	//return aeater->age;
	return __this_cpu_read(tstas->eater_info->age); <-----------------2
}

static int __init demo_init(void)
{
	int ret = 0;
	int age;
	struct eater_state as;
	struct eater david;

	as.state = 1;
	as.eater_info = &david;
	age = test_func(&as);
	return ret;
}
__this_cpu_ptr <-----------------1
0000000000000000 <init_module>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: 48 83 ec 10 sub $0x10,%rsp
8: 48 8d 45 f0 lea -0x10(%rbp),%rax
c: 65 48 03 04 25 00 00 00 00 add %gs:0x0,%rax
15: 31 c0 xor %eax,%eax
17: c9 leaveq
18: c3 retq
__this_cpu_read<-----------------2
0000000000000000 <init_module>:
0: 55 push %rbp
1: 31 c0 xor %eax,%eax
3: 48 89 e5 mov %rsp,%rbp
6: 48 83 ec 10 sub $0x10,%rsp
a: c9 leaveq
b: c3 retq
* Re: [PATCH v2 6/9] rcu: use __this_cpu_read helper instead of per_cpu_ptr(p, raw_smp_processor_id())
2012-11-04 10:38 ` Shan Wei
@ 2012-11-05 14:49 ` Christoph Lameter
[not found] ` <CAPYxyx+eJUWxDJrbOHVRtCchszmj8+BgSkNhpH3gGBJK87OikA@mail.gmail.com>
0 siblings, 1 reply; 9+ messages in thread
From: Christoph Lameter @ 2012-11-05 14:49 UTC (permalink / raw)
To: Shan Wei; +Cc: paulmck, dipankar, Kernel-Maillist
On Sun, 4 Nov 2012, Shan Wei wrote:
> __this_cpu_read<-----------------2
> 0000000000000000 <init_module>:
> 0: 55 push %rbp
> 1: 31 c0 xor %eax,%eax
> 3: 48 89 e5 mov %rsp,%rbp
> 6: 48 83 ec 10 sub $0x10,%rsp
> a: c9 leaveq
> b: c3 retq
?? There should be an operation using gs: here. This does not look
like code that includes a __this_cpu_read().
* Re: [PATCH v2 6/9] rcu: use __this_cpu_read helper instead of per_cpu_ptr(p, raw_smp_processor_id())
2012-11-03 9:19 ` Paul E. McKenney
2012-11-04 10:38 ` Shan Wei
@ 2012-11-05 15:23 ` Christoph Lameter
1 sibling, 0 replies; 9+ messages in thread
From: Christoph Lameter @ 2012-11-05 15:23 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: Shan Wei, dipankar, Kernel-Maillist
On Sat, 3 Nov 2012, Paul E. McKenney wrote:
> OK, I do understand why it happens to work. My question is instead why
> it is considered a good idea. After all, it is the ->rda field that is
> marked __percpu, not the ->mynode field. So in the interest of
> mechanical checking and general readability, it seems to me that it
> would be way better to apply __this_cpu_read() to rsp->rda rather than
> to rsp->rda->mynode.
mynode is part of the structure reached via rda.
Using it on rsp->rda does not work, since the offset of mynode must be added to
rda before a fetch relative to the current cpu's per cpu address can be
done.
this_cpu_ptr relocates an address. this_cpu_read() relocates the address
and performs the fetch. If you want to operate on rda then you can only
use this_cpu_ptr. this_cpu_read() saves you more instructions since it can
do the relocation and the fetch in one instruction.
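To make the point about the operand concrete, here is a small user-space sketch (an illustration only, not kernel code). Memory is a flat array of words, and the layout constants, the model_* helpers, and the token values 1000 + cpu standing in for the mynode pointers are all invented for the example. It contrasts three forms: read applied to the field, ptr followed by an ordinary load, and read applied to rsp->rda itself, which relocates the address of the word that merely holds the per-cpu pointer.

```c
#include <assert.h>
#include <stddef.h>

enum { MEM_WORDS = 64, NR_CPUS = 2, AREA_WORDS = 16 };

/* Hypothetical flat memory: a word address is an index into mem[]. */
static long mem[MEM_WORDS];

/* Layout assumptions of this model only. */
enum {
    RSP_RDA   = 3,  /* ordinary address of the rsp->rda field            */
    RDA_OFF   = 4,  /* per-cpu address stored in rsp->rda                */
    F_MYNODE  = 1,  /* word offset of ->mynode inside rcu_data           */
    AREA_BASE = 32  /* CPU n's area starts at AREA_BASE + n * AREA_WORDS */
};

/* Start of a CPU's private area: what the segment prefix contributes. */
static size_t cpu_base(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return AREA_BASE + (size_t)cpu * AREA_WORDS;
}

static void model_setup(void)
{
    mem[RSP_RDA] = RDA_OFF;
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        mem[cpu_base(cpu) + RDA_OFF + F_MYNODE] = 1000 + cpu;
}

/* Analogue of this_cpu_ptr(): relocate a per-cpu address, no fetch. */
static size_t model_this_cpu_ptr(int cpu, size_t pcpu_addr)
{
    return cpu_base(cpu) + pcpu_addr;
}

/* Analogue of this_cpu_read(): relocate and fetch in a single step. */
static long model_this_cpu_read(int cpu, size_t pcpu_addr)
{
    return mem[cpu_base(cpu) + pcpu_addr];
}

/* Form used by the patch: the field offset is added before relocation. */
static long read_field(int cpu)
{
    return model_this_cpu_read(cpu, (size_t)mem[RSP_RDA] + F_MYNODE);
}

/* Relocate the structure first, then do an ordinary load of the field. */
static long ptr_then_load(int cpu)
{
    size_t rdp = model_this_cpu_ptr(cpu, (size_t)mem[RSP_RDA]);

    return mem[rdp + F_MYNODE];
}

/* Read applied to rsp->rda itself: relocates the wrong address. */
static long read_of_rda(int cpu)
{
    return model_this_cpu_read(cpu, RSP_RDA);
}
```

In this model read_field() and ptr_then_load() return the same token for a given CPU, while read_of_rda() fetches an unrelated word from inside the per-cpu area. That matches the statement above that rda can only be handed to this_cpu_ptr, and that the read form has to name the field.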
* Re: [PATCH v2 6/9] rcu: use __this_cpu_read helper instead of per_cpu_ptr(p, raw_smp_processor_id())
[not found] ` <CAPYxyx+eJUWxDJrbOHVRtCchszmj8+BgSkNhpH3gGBJK87OikA@mail.gmail.com>
@ 2012-11-05 15:55 ` Christoph Lameter
0 siblings, 0 replies; 9+ messages in thread
From: Christoph Lameter @ 2012-11-05 15:55 UTC (permalink / raw)
To: 单卫; +Cc: paulmck, dipankar, Kernel-Maillist
On Mon, 5 Nov 2012, 单卫 wrote:
> I guarantee that x86-64 doesn't use the gs register here; I ran the test again.
> Maybe there is some optimization for the __this_cpu_read call on x86-64, not
> sure.
There is no optimization that I know of unless the compiler eliminated the
__this_cpu_read completely. gs: is necessary to perform the implied
relocation in this_cpu_read().