All of lore.kernel.org
 help / color / mirror / Atom feed
From: Will Deacon <will.deacon@arm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, ard.biesheuvel@linaro.org,
	catalin.marinas@arm.com, rml@tech9.net, tglx@linutronix.de,
	schwidefsky@de.ibm.com
Subject: Re: [PATCH 0/2] arm64: Only call into preempt_schedule() if need_resched()
Date: Wed, 28 Nov 2018 12:04:24 +0000	[thread overview]
Message-ID: <20181128120423.GA24868@arm.com> (raw)
In-Reply-To: <20181128090146.GF2149@hirez.programming.kicks-ass.net>

On Wed, Nov 28, 2018 at 10:01:46AM +0100, Peter Zijlstra wrote:
> On Wed, Nov 28, 2018 at 09:56:40AM +0100, Peter Zijlstra wrote:
> > On Tue, Nov 27, 2018 at 07:45:00PM +0000, Will Deacon wrote:
> > > This pair of patches improves our preempt_enable() implementation slightly
> > > on arm64 by making the resulting call to preempt_schedule() conditional
> > > on need_resched(), which is tracked in bit 32 of the preempt count. The
> > > logic is inverted so that we can detect the preempt count going to zero
> > > and need_resched being set with a single CBZ instruction.
> > 
> > >   40:   a9bf7bfd        stp     x29, x30, [sp, #-16]!
> > >   44:   910003fd        mov     x29, sp
> > >   48:   d5384101        mrs     x1, sp_el0
> > >   4c:   f9400820        ldr     x0, [x1, #16]
> > 
> > We load x0 which is a u64, right?
> > 
> > >   50:   d1000400        sub     x0, x0, #0x1
> > >   54:   b9001020        str     w0, [x1, #16]
> > 
> > But we store w0, which is the low u32, such as to not touch the high
> > word which contains the preempt bit.
> > 
> > >   58:   b4000060        cbz     x0, 64 <will+0x24>
> > >   5c:   a8c17bfd        ldp     x29, x30, [sp], #16
> > >   60:   d65f03c0        ret
> > >   64:   94000000        bl      0 <preempt_schedule>
> > >   68:   a8c17bfd        ldp     x29, x30, [sp], #16
> > >   6c:   d65f03c0        ret
> > 
> > Why not?
> > 
> >    58:   b4000060        cbnz    x0, 60 <will+0x24>
> >    5c:   94000000        bl      0 <preempt_schedule>
> >    60:   a8c17bfd        ldp     x29, x30, [sp], #16
> >    64:   d65f03c0        ret
> > 
> > which seems shorter.
> > 
> > 
> > So it's still early, and I haven't finished (or really even started) my
> > pot 'o tea, but what about:
> > 
> > 
> > 	ldr x0, [x1, #16]	// seees the high bit set -- no preempt needed
> > 	sub x0, x0, #1
> > 
> > 	<interrupt>
> > 	  ...
> > 	  resched_curr()
> > 	    set_tsk_need_resched();
> > 	    set_preempt_need_resched();
> > 	</interrupt> // sees preempt_count != 0, does not preempt
> > 
> > 	str w0, [x1, #16] // stores preempt_count == 0
> > 	cbnz x0, 1f // taken, we still observe the high word from before
> > 
> > 1:	ret
> > 
> > 
> > Which then ends with preempt_count==0, need_resched==0 and no actual
> > preemption afaict.
> > 
> > Can you use mis-matched ll x0 / sc w0 to do this same thing and detector
> > the intermediate write on the high word?
> 
> That is, something along these here lines:
> 
> 1:	ldxr x0, [x1, #16]
> 	sub  x0, x0, #1
> 	stxr w1, w0, [x1, #16]

^^ This guy needs a different encoding but, to be honest, I reckon we're
better off just reloading the need_resched flag in the case where the count
has hit zero. I'll have a play. The assembly I posted is all generated by
GCC, so I can't comment on why it didn't chose your shorter sequence :)

Will

WARNING: multiple messages have this Message-ID (diff)
From: will.deacon@arm.com (Will Deacon)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH 0/2] arm64: Only call into preempt_schedule() if need_resched()
Date: Wed, 28 Nov 2018 12:04:24 +0000	[thread overview]
Message-ID: <20181128120423.GA24868@arm.com> (raw)
In-Reply-To: <20181128090146.GF2149@hirez.programming.kicks-ass.net>

On Wed, Nov 28, 2018 at 10:01:46AM +0100, Peter Zijlstra wrote:
> On Wed, Nov 28, 2018 at 09:56:40AM +0100, Peter Zijlstra wrote:
> > On Tue, Nov 27, 2018 at 07:45:00PM +0000, Will Deacon wrote:
> > > This pair of patches improves our preempt_enable() implementation slightly
> > > on arm64 by making the resulting call to preempt_schedule() conditional
> > > on need_resched(), which is tracked in bit 32 of the preempt count. The
> > > logic is inverted so that we can detect the preempt count going to zero
> > > and need_resched being set with a single CBZ instruction.
> > 
> > >   40:   a9bf7bfd        stp     x29, x30, [sp, #-16]!
> > >   44:   910003fd        mov     x29, sp
> > >   48:   d5384101        mrs     x1, sp_el0
> > >   4c:   f9400820        ldr     x0, [x1, #16]
> > 
> > We load x0 which is a u64, right?
> > 
> > >   50:   d1000400        sub     x0, x0, #0x1
> > >   54:   b9001020        str     w0, [x1, #16]
> > 
> > But we store w0, which is the low u32, such as to not touch the high
> > word which contains the preempt bit.
> > 
> > >   58:   b4000060        cbz     x0, 64 <will+0x24>
> > >   5c:   a8c17bfd        ldp     x29, x30, [sp], #16
> > >   60:   d65f03c0        ret
> > >   64:   94000000        bl      0 <preempt_schedule>
> > >   68:   a8c17bfd        ldp     x29, x30, [sp], #16
> > >   6c:   d65f03c0        ret
> > 
> > Why not?
> > 
> >    58:   b4000060        cbnz    x0, 60 <will+0x24>
> >    5c:   94000000        bl      0 <preempt_schedule>
> >    60:   a8c17bfd        ldp     x29, x30, [sp], #16
> >    64:   d65f03c0        ret
> > 
> > which seems shorter.
> > 
> > 
> > So it's still early, and I haven't finished (or really even started) my
> > pot 'o tea, but what about:
> > 
> > 
> > 	ldr x0, [x1, #16]	// seees the high bit set -- no preempt needed
> > 	sub x0, x0, #1
> > 
> > 	<interrupt>
> > 	  ...
> > 	  resched_curr()
> > 	    set_tsk_need_resched();
> > 	    set_preempt_need_resched();
> > 	</interrupt> // sees preempt_count != 0, does not preempt
> > 
> > 	str w0, [x1, #16] // stores preempt_count == 0
> > 	cbnz x0, 1f // taken, we still observe the high word from before
> > 
> > 1:	ret
> > 
> > 
> > Which then ends with preempt_count==0, need_resched==0 and no actual
> > preemption afaict.
> > 
> > Can you use mis-matched ll x0 / sc w0 to do this same thing and detector
> > the intermediate write on the high word?
> 
> That is, something along these here lines:
> 
> 1:	ldxr x0, [x1, #16]
> 	sub  x0, x0, #1
> 	stxr w1, w0, [x1, #16]

^^ This guy needs a different encoding but, to be honest, I reckon we're
better off just reloading the need_resched flag in the case where the count
has hit zero. I'll have a play. The assembly I posted is all generated by
GCC, so I can't comment on why it didn't chose your shorter sequence :)

Will

  reply	other threads:[~2018-11-28 12:04 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-27 19:45 [PATCH 0/2] arm64: Only call into preempt_schedule() if need_resched() Will Deacon
2018-11-27 19:45 ` Will Deacon
2018-11-27 19:45 ` [PATCH 1/2] preempt: Move PREEMPT_NEED_RESCHED definition into arch code Will Deacon
2018-11-27 19:45   ` Will Deacon
2018-11-27 19:45 ` [PATCH 2/2] arm64: preempt: Provide our own implementation of asm/preempt.h Will Deacon
2018-11-27 19:45   ` Will Deacon
2018-11-28 15:35   ` Ard Biesheuvel
2018-11-28 15:35     ` Ard Biesheuvel
2018-11-28 16:40     ` Peter Zijlstra
2018-11-28 16:40       ` Peter Zijlstra
2018-11-28 16:45       ` Ard Biesheuvel
2018-11-28 16:45         ` Ard Biesheuvel
2018-11-28  8:56 ` [PATCH 0/2] arm64: Only call into preempt_schedule() if need_resched() Peter Zijlstra
2018-11-28  8:56   ` Peter Zijlstra
2018-11-28  9:01   ` Peter Zijlstra
2018-11-28  9:01     ` Peter Zijlstra
2018-11-28 12:04     ` Will Deacon [this message]
2018-11-28 12:04       ` Will Deacon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181128120423.GA24868@arm.com \
    --to=will.deacon@arm.com \
    --cc=ard.biesheuvel@linaro.org \
    --cc=catalin.marinas@arm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=rml@tech9.net \
    --cc=schwidefsky@de.ibm.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.