From: Peter Zijlstra <peterz@infradead.org>
To: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, ard.biesheuvel@linaro.org,
catalin.marinas@arm.com, rml@tech9.net, tglx@linutronix.de,
schwidefsky@de.ibm.com
Subject: Re: [PATCH 0/2] arm64: Only call into preempt_schedule() if need_resched()
Date: Wed, 28 Nov 2018 10:01:46 +0100 [thread overview]
Message-ID: <20181128090146.GF2149@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <20181128085640.GX2131@hirez.programming.kicks-ass.net>
On Wed, Nov 28, 2018 at 09:56:40AM +0100, Peter Zijlstra wrote:
> On Tue, Nov 27, 2018 at 07:45:00PM +0000, Will Deacon wrote:
> > Hi all,
> >
> > This pair of patches improves our preempt_enable() implementation slightly
> > on arm64 by making the resulting call to preempt_schedule() conditional
> > on need_resched(), which is tracked in bit 32 of the preempt count. The
> > logic is inverted so that we can detect the preempt count going to zero
> > and need_resched being set with a single CBZ instruction.
>
> > 40: a9bf7bfd stp x29, x30, [sp, #-16]!
> > 44: 910003fd mov x29, sp
> > 48: d5384101 mrs x1, sp_el0
> > 4c: f9400820 ldr x0, [x1, #16]
>
> We load x0 which is a u64, right?
>
> > 50: d1000400 sub x0, x0, #0x1
> > 54: b9001020 str w0, [x1, #16]
>
> But we store w0, which is the low u32, such as to not touch the high
> word which contains the preempt bit.
>
> > 58: b4000060 cbz x0, 64 <will+0x24>
> > 5c: a8c17bfd ldp x29, x30, [sp], #16
> > 60: d65f03c0 ret
> > 64: 94000000 bl 0 <preempt_schedule>
> > 68: a8c17bfd ldp x29, x30, [sp], #16
> > 6c: d65f03c0 ret
>
> Why not?
>
> 58: b4000060 cbnz x0, 60 <will+0x24>
> 5c: 94000000 bl 0 <preempt_schedule>
> 60: a8c17bfd ldp x29, x30, [sp], #16
> 64: d65f03c0 ret
>
> which seems shorter.
>
>
> So it's still early, and I haven't finished (or really even started) my
> pot 'o tea, but what about:
>
>
> ldr x0, [x1, #16] // seees the high bit set -- no preempt needed
> sub x0, x0, #1
>
> <interrupt>
> ...
> resched_curr()
> set_tsk_need_resched();
> set_preempt_need_resched();
> </interrupt> // sees preempt_count != 0, does not preempt
>
> str w0, [x1, #16] // stores preempt_count == 0
> cbnz x0, 1f // taken, we still observe the high word from before
>
> 1: ret
>
>
> Which then ends with preempt_count==0, need_resched==0 and no actual
> preemption afaict.
>
> Can you use mis-matched ll x0 / sc w0 to do this same thing and detector
> the intermediate write on the high word?
That is, something along these here lines:
1: ldxr x0, [x1, #16]
sub x0, x0, #1
stxr w1, w0, [x1, #16]
cbnz w1, 1b
cbnz x0, 2f
bl preempt_schedule
2: ret
can that work?
WARNING: multiple messages have this Message-ID (diff)
From: peterz@infradead.org (Peter Zijlstra)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH 0/2] arm64: Only call into preempt_schedule() if need_resched()
Date: Wed, 28 Nov 2018 10:01:46 +0100 [thread overview]
Message-ID: <20181128090146.GF2149@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <20181128085640.GX2131@hirez.programming.kicks-ass.net>
On Wed, Nov 28, 2018 at 09:56:40AM +0100, Peter Zijlstra wrote:
> On Tue, Nov 27, 2018 at 07:45:00PM +0000, Will Deacon wrote:
> > Hi all,
> >
> > This pair of patches improves our preempt_enable() implementation slightly
> > on arm64 by making the resulting call to preempt_schedule() conditional
> > on need_resched(), which is tracked in bit 32 of the preempt count. The
> > logic is inverted so that we can detect the preempt count going to zero
> > and need_resched being set with a single CBZ instruction.
>
> > 40: a9bf7bfd stp x29, x30, [sp, #-16]!
> > 44: 910003fd mov x29, sp
> > 48: d5384101 mrs x1, sp_el0
> > 4c: f9400820 ldr x0, [x1, #16]
>
> We load x0 which is a u64, right?
>
> > 50: d1000400 sub x0, x0, #0x1
> > 54: b9001020 str w0, [x1, #16]
>
> But we store w0, which is the low u32, such as to not touch the high
> word which contains the preempt bit.
>
> > 58: b4000060 cbz x0, 64 <will+0x24>
> > 5c: a8c17bfd ldp x29, x30, [sp], #16
> > 60: d65f03c0 ret
> > 64: 94000000 bl 0 <preempt_schedule>
> > 68: a8c17bfd ldp x29, x30, [sp], #16
> > 6c: d65f03c0 ret
>
> Why not?
>
> 58: b4000060 cbnz x0, 60 <will+0x24>
> 5c: 94000000 bl 0 <preempt_schedule>
> 60: a8c17bfd ldp x29, x30, [sp], #16
> 64: d65f03c0 ret
>
> which seems shorter.
>
>
> So it's still early, and I haven't finished (or really even started) my
> pot 'o tea, but what about:
>
>
> ldr x0, [x1, #16] // seees the high bit set -- no preempt needed
> sub x0, x0, #1
>
> <interrupt>
> ...
> resched_curr()
> set_tsk_need_resched();
> set_preempt_need_resched();
> </interrupt> // sees preempt_count != 0, does not preempt
>
> str w0, [x1, #16] // stores preempt_count == 0
> cbnz x0, 1f // taken, we still observe the high word from before
>
> 1: ret
>
>
> Which then ends with preempt_count==0, need_resched==0 and no actual
> preemption afaict.
>
> Can you use mis-matched ll x0 / sc w0 to do this same thing and detector
> the intermediate write on the high word?
That is, something along these here lines:
1: ldxr x0, [x1, #16]
sub x0, x0, #1
stxr w1, w0, [x1, #16]
cbnz w1, 1b
cbnz x0, 2f
bl preempt_schedule
2: ret
can that work?
next prev parent reply other threads:[~2018-11-28 9:02 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-11-27 19:45 [PATCH 0/2] arm64: Only call into preempt_schedule() if need_resched() Will Deacon
2018-11-27 19:45 ` Will Deacon
2018-11-27 19:45 ` [PATCH 1/2] preempt: Move PREEMPT_NEED_RESCHED definition into arch code Will Deacon
2018-11-27 19:45 ` Will Deacon
2018-11-27 19:45 ` [PATCH 2/2] arm64: preempt: Provide our own implementation of asm/preempt.h Will Deacon
2018-11-27 19:45 ` Will Deacon
2018-11-28 15:35 ` Ard Biesheuvel
2018-11-28 15:35 ` Ard Biesheuvel
2018-11-28 16:40 ` Peter Zijlstra
2018-11-28 16:40 ` Peter Zijlstra
2018-11-28 16:45 ` Ard Biesheuvel
2018-11-28 16:45 ` Ard Biesheuvel
2018-11-28 8:56 ` [PATCH 0/2] arm64: Only call into preempt_schedule() if need_resched() Peter Zijlstra
2018-11-28 8:56 ` Peter Zijlstra
2018-11-28 9:01 ` Peter Zijlstra [this message]
2018-11-28 9:01 ` Peter Zijlstra
2018-11-28 12:04 ` Will Deacon
2018-11-28 12:04 ` Will Deacon
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20181128090146.GF2149@hirez.programming.kicks-ass.net \
--to=peterz@infradead.org \
--cc=ard.biesheuvel@linaro.org \
--cc=catalin.marinas@arm.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=rml@tech9.net \
--cc=schwidefsky@de.ibm.com \
--cc=tglx@linutronix.de \
--cc=will.deacon@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.