linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] x86:pvclock: add missing barriers
@ 2016-06-08 18:11 Roman Kagan
  2016-06-08 19:45 ` Borislav Petkov
  0 siblings, 1 reply; 5+ messages in thread
From: Roman Kagan @ 2016-06-08 18:11 UTC (permalink / raw)
  To: linux-kernel
  Cc: Denis V. Lunev, Roman Kagan, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86, Andy Lutomirski, Borislav Petkov,
	Paolo Bonzini, stable

Gradual removal of excessive barriers in pvclock reading functions
(commits 502dfeff239e8313bfbe906ca0a1a6827ac8481b,
a3eb97bd80134ba07864ca00747466c02118aca1) ended up removing too much:
although rdtsc is now orderd WRT other loads, there's no protection
against the compiler reordering the loads of ->version with the loads of
other fields.

E.g. on my system gcc-5.3.1 generates code which loads ->system_time and
->flags outside of the ->version test loop.

(Re)introduce the compiler barriers around accesses to the contents of
pvclock.  While at this, make the function a bit more compact by
removing unnecessary local variables.

Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
---
 arch/x86/include/asm/pvclock.h | 17 +++++------------
 1 file changed, 5 insertions(+), 12 deletions(-)

diff --git a/arch/x86/include/asm/pvclock.h b/arch/x86/include/asm/pvclock.h
index fdcc040..65c4de2 100644
--- a/arch/x86/include/asm/pvclock.h
+++ b/arch/x86/include/asm/pvclock.h
@@ -80,18 +80,11 @@ static __always_inline
 unsigned __pvclock_read_cycles(const struct pvclock_vcpu_time_info *src,
 			       cycle_t *cycles, u8 *flags)
 {
-	unsigned version;
-	cycle_t ret, offset;
-	u8 ret_flags;
-
-	version = src->version;
-
-	offset = pvclock_get_nsec_offset(src);
-	ret = src->system_time + offset;
-	ret_flags = src->flags;
-
-	*cycles = ret;
-	*flags = ret_flags;
+	unsigned version = src->version;
+	barrier();
+	*cycles = src->system_time + pvclock_get_nsec_offset(src);
+	*flags = src->flags;
+	barrier();
 	return version;
 }
 
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] x86:pvclock: add missing barriers
  2016-06-08 18:11 [PATCH] x86:pvclock: add missing barriers Roman Kagan
@ 2016-06-08 19:45 ` Borislav Petkov
  2016-06-08 21:01   ` Roman Kagan
  2016-06-12 10:02   ` Minfei Huang
  0 siblings, 2 replies; 5+ messages in thread
From: Borislav Petkov @ 2016-06-08 19:45 UTC (permalink / raw)
  To: Roman Kagan, Minfei Huang
  Cc: linux-kernel, Denis V. Lunev, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86, Andy Lutomirski, Paolo Bonzini

On Wed, Jun 08, 2016 at 09:11:39PM +0300, Roman Kagan wrote:
> Gradual removal of excessive barriers in pvclock reading functions
> (commits 502dfeff239e8313bfbe906ca0a1a6827ac8481b,
> a3eb97bd80134ba07864ca00747466c02118aca1) ended up removing too much:
> although rdtsc is now orderd WRT other loads, there's no protection
> against the compiler reordering the loads of ->version with the loads of
> other fields.
> 
> E.g. on my system gcc-5.3.1 generates code which loads ->system_time and
> ->flags outside of the ->version test loop.
> 
> (Re)introduce the compiler barriers around accesses to the contents of
> pvclock.  While at this, make the function a bit more compact by
> removing unnecessary local variables.
> 
> Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: x86@kernel.org
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: stable@vger.kernel.org
> ---
>  arch/x86/include/asm/pvclock.h | 17 +++++------------
>  1 file changed, 5 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/x86/include/asm/pvclock.h b/arch/x86/include/asm/pvclock.h
> index fdcc040..65c4de2 100644
> --- a/arch/x86/include/asm/pvclock.h
> +++ b/arch/x86/include/asm/pvclock.h
> @@ -80,18 +80,11 @@ static __always_inline
>  unsigned __pvclock_read_cycles(const struct pvclock_vcpu_time_info *src,
>  			       cycle_t *cycles, u8 *flags)
>  {
> -	unsigned version;
> -	cycle_t ret, offset;
> -	u8 ret_flags;
> -
> -	version = src->version;
> -
> -	offset = pvclock_get_nsec_offset(src);
> -	ret = src->system_time + offset;
> -	ret_flags = src->flags;
> -
> -	*cycles = ret;
> -	*flags = ret_flags;
> +	unsigned version = src->version;
> +	barrier();
> +	*cycles = src->system_time + pvclock_get_nsec_offset(src);
> +	*flags = src->flags;
> +	barrier();
>  	return version;

I have a similar patchset in my mbox starting here:

https://lkml.kernel.org/r/1464329832-4638-1-git-send-email-mnghuan@gmail.com

Care to take a look?

Thanks.


-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] x86:pvclock: add missing barriers
  2016-06-08 19:45 ` Borislav Petkov
@ 2016-06-08 21:01   ` Roman Kagan
  2016-06-09 10:29     ` Roman Kagan
  2016-06-12 10:02   ` Minfei Huang
  1 sibling, 1 reply; 5+ messages in thread
From: Roman Kagan @ 2016-06-08 21:01 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Minfei Huang, linux-kernel, Denis V. Lunev, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, x86, Andy Lutomirski, Paolo Bonzini

On Wed, Jun 08, 2016 at 09:45:09PM +0200, Borislav Petkov wrote:
> On Wed, Jun 08, 2016 at 09:11:39PM +0300, Roman Kagan wrote:
> > Gradual removal of excessive barriers in pvclock reading functions
> > (commits 502dfeff239e8313bfbe906ca0a1a6827ac8481b,
> > a3eb97bd80134ba07864ca00747466c02118aca1) ended up removing too much:
> > although rdtsc is now orderd WRT other loads, there's no protection
> > against the compiler reordering the loads of ->version with the loads of
> > other fields.
> > 
> > E.g. on my system gcc-5.3.1 generates code which loads ->system_time and
> > ->flags outside of the ->version test loop.
> > 
> > (Re)introduce the compiler barriers around accesses to the contents of
> > pvclock.  While at this, make the function a bit more compact by
> > removing unnecessary local variables.
> > 
> > Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: "H. Peter Anvin" <hpa@zytor.com>
> > Cc: x86@kernel.org
> > Cc: Andy Lutomirski <luto@kernel.org>
> > Cc: Borislav Petkov <bp@suse.de>
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: stable@vger.kernel.org
> > ---
> >  arch/x86/include/asm/pvclock.h | 17 +++++------------
> >  1 file changed, 5 insertions(+), 12 deletions(-)
> > 
> > diff --git a/arch/x86/include/asm/pvclock.h b/arch/x86/include/asm/pvclock.h
> > index fdcc040..65c4de2 100644
> > --- a/arch/x86/include/asm/pvclock.h
> > +++ b/arch/x86/include/asm/pvclock.h
> > @@ -80,18 +80,11 @@ static __always_inline
> >  unsigned __pvclock_read_cycles(const struct pvclock_vcpu_time_info *src,
> >  			       cycle_t *cycles, u8 *flags)
> >  {
> > -	unsigned version;
> > -	cycle_t ret, offset;
> > -	u8 ret_flags;
> > -
> > -	version = src->version;
> > -
> > -	offset = pvclock_get_nsec_offset(src);
> > -	ret = src->system_time + offset;
> > -	ret_flags = src->flags;
> > -
> > -	*cycles = ret;
> > -	*flags = ret_flags;
> > +	unsigned version = src->version;
> > +	barrier();
> > +	*cycles = src->system_time + pvclock_get_nsec_offset(src);
> > +	*flags = src->flags;
> > +	barrier();
> >  	return version;
> 
> I have a similar patchset in my mbox starting here:
> 
> https://lkml.kernel.org/r/1464329832-4638-1-git-send-email-mnghuan@gmail.com
> 
> Care to take a look?

Just did, thanks for the link.

The difference is whether you want the reader to see consistent view of
the pvclock data (as in my patch) or also the most up to date one
(as in Minfei Huang's patch) at the cost of extra lfence instructions
(on my machine this is 30% slowdown).

I'm not sure if the latter is really necessary.  If it is, then the
lfence or mfence in rdtsc_ordered() becomes excessive.  Perhaps we'd
have to revert to rdtsc_barrier() to surround pvclock data access, and
plain rdtsc() instead of rdtsc_ordered().

Roman.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] x86:pvclock: add missing barriers
  2016-06-08 21:01   ` Roman Kagan
@ 2016-06-09 10:29     ` Roman Kagan
  0 siblings, 0 replies; 5+ messages in thread
From: Roman Kagan @ 2016-06-09 10:29 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Minfei Huang, linux-kernel, Denis V. Lunev, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, x86, Andy Lutomirski, Paolo Bonzini

On Thu, Jun 09, 2016 at 12:01:13AM +0300, Roman Kagan wrote:
> On Wed, Jun 08, 2016 at 09:45:09PM +0200, Borislav Petkov wrote:
> > On Wed, Jun 08, 2016 at 09:11:39PM +0300, Roman Kagan wrote:
> > > --- a/arch/x86/include/asm/pvclock.h
> > > +++ b/arch/x86/include/asm/pvclock.h
> > > @@ -80,18 +80,11 @@ static __always_inline
> > >  unsigned __pvclock_read_cycles(const struct pvclock_vcpu_time_info *src,
> > >  			       cycle_t *cycles, u8 *flags)
> > >  {
> > > -	unsigned version;
> > > -	cycle_t ret, offset;
> > > -	u8 ret_flags;
> > > -
> > > -	version = src->version;
> > > -
> > > -	offset = pvclock_get_nsec_offset(src);
> > > -	ret = src->system_time + offset;
> > > -	ret_flags = src->flags;
> > > -
> > > -	*cycles = ret;
> > > -	*flags = ret_flags;
> > > +	unsigned version = src->version;
> > > +	barrier();
> > > +	*cycles = src->system_time + pvclock_get_nsec_offset(src);
> > > +	*flags = src->flags;
> > > +	barrier();
> > >  	return version;
> > 
> > I have a similar patchset in my mbox starting here:
> > 
> > https://lkml.kernel.org/r/1464329832-4638-1-git-send-email-mnghuan@gmail.com
> > 
> > Care to take a look?
> 
> Just did, thanks for the link.
> 
> The difference is whether you want the reader to see consistent view of
> the pvclock data (as in my patch) or also the most up to date one
> (as in Minfei Huang's patch) at the cost of extra lfence instructions
> (on my machine this is 30% slowdown).

Sorry, I should have looked better.  Minfei's patch inserts smb_rmb()-s
which on x86 are just barrier()-s, so that patch results in the code
equivalent to mine.  So I'll jump onto that thread instead of pursuing
this one.

Roman.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] x86:pvclock: add missing barriers
  2016-06-08 19:45 ` Borislav Petkov
  2016-06-08 21:01   ` Roman Kagan
@ 2016-06-12 10:02   ` Minfei Huang
  1 sibling, 0 replies; 5+ messages in thread
From: Minfei Huang @ 2016-06-12 10:02 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Roman Kagan, linux-kernel, Denis V. Lunev, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, x86, Andy Lutomirski, Paolo Bonzini

On 06/08/16 at 09:45P, Borislav Petkov wrote:
> On Wed, Jun 08, 2016 at 09:11:39PM +0300, Roman Kagan wrote:
> > Gradual removal of excessive barriers in pvclock reading functions
> > (commits 502dfeff239e8313bfbe906ca0a1a6827ac8481b,
> > a3eb97bd80134ba07864ca00747466c02118aca1) ended up removing too much:
> > although rdtsc is now orderd WRT other loads, there's no protection
> > against the compiler reordering the loads of ->version with the loads of
> > other fields.
> > 
> > E.g. on my system gcc-5.3.1 generates code which loads ->system_time and
> > ->flags outside of the ->version test loop.
> > 
> > (Re)introduce the compiler barriers around accesses to the contents of
> > pvclock.  While at this, make the function a bit more compact by
> > removing unnecessary local variables.
> > 
> > Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: "H. Peter Anvin" <hpa@zytor.com>
> > Cc: x86@kernel.org
> > Cc: Andy Lutomirski <luto@kernel.org>
> > Cc: Borislav Petkov <bp@suse.de>
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: stable@vger.kernel.org
> > ---
> >  arch/x86/include/asm/pvclock.h | 17 +++++------------
> >  1 file changed, 5 insertions(+), 12 deletions(-)
> > 
> > diff --git a/arch/x86/include/asm/pvclock.h b/arch/x86/include/asm/pvclock.h
> > index fdcc040..65c4de2 100644
> > --- a/arch/x86/include/asm/pvclock.h
> > +++ b/arch/x86/include/asm/pvclock.h
> > @@ -80,18 +80,11 @@ static __always_inline
> >  unsigned __pvclock_read_cycles(const struct pvclock_vcpu_time_info *src,
> >  			       cycle_t *cycles, u8 *flags)
> >  {
> > -	unsigned version;
> > -	cycle_t ret, offset;
> > -	u8 ret_flags;
> > -
> > -	version = src->version;
> > -
> > -	offset = pvclock_get_nsec_offset(src);
> > -	ret = src->system_time + offset;
> > -	ret_flags = src->flags;
> > -
> > -	*cycles = ret;
> > -	*flags = ret_flags;
> > +	unsigned version = src->version;
> > +	barrier();
> > +	*cycles = src->system_time + pvclock_get_nsec_offset(src);
> > +	*flags = src->flags;
> > +	barrier();
> >  	return version;
> 
> I have a similar patchset in my mbox starting here:
> 
> https://lkml.kernel.org/r/1464329832-4638-1-git-send-email-mnghuan@gmail.com
> 
> Care to take a look?

Hi, Boris.

I had a vocation last several days, and sorry for replying it
later.

As Roman said in this thread, both his and mine are similar way
to fix this inconsistent issue, and I have a wrapper function
smp_rmb in my patch which the root function is barriers in x86
as well.


Thanks
Minfei

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-06-12 10:02 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-08 18:11 [PATCH] x86:pvclock: add missing barriers Roman Kagan
2016-06-08 19:45 ` Borislav Petkov
2016-06-08 21:01   ` Roman Kagan
2016-06-09 10:29     ` Roman Kagan
2016-06-12 10:02   ` Minfei Huang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).