All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, ard.biesheuvel@linaro.org,
	catalin.marinas@arm.com, rml@tech9.net, tglx@linutronix.de,
	schwidefsky@de.ibm.com
Subject: Re: [PATCH 0/2] arm64: Only call into preempt_schedule() if need_resched()
Date: Wed, 28 Nov 2018 09:56:40 +0100	[thread overview]
Message-ID: <20181128085640.GX2131@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <1543347902-21170-1-git-send-email-will.deacon@arm.com>

On Tue, Nov 27, 2018 at 07:45:00PM +0000, Will Deacon wrote:
> Hi all,
> 
> This pair of patches improves our preempt_enable() implementation slightly
> on arm64 by making the resulting call to preempt_schedule() conditional
> on need_resched(), which is tracked in bit 32 of the preempt count. The
> logic is inverted so that we can detect the preempt count going to zero
> and need_resched being set with a single CBZ instruction.

>   40:   a9bf7bfd        stp     x29, x30, [sp, #-16]!
>   44:   910003fd        mov     x29, sp
>   48:   d5384101        mrs     x1, sp_el0
>   4c:   f9400820        ldr     x0, [x1, #16]

We load x0 which is a u64, right?

>   50:   d1000400        sub     x0, x0, #0x1
>   54:   b9001020        str     w0, [x1, #16]

But we store w0, which is the low u32, such as to not touch the high
word which contains the preempt bit.

>   58:   b4000060        cbz     x0, 64 <will+0x24>
>   5c:   a8c17bfd        ldp     x29, x30, [sp], #16
>   60:   d65f03c0        ret
>   64:   94000000        bl      0 <preempt_schedule>
>   68:   a8c17bfd        ldp     x29, x30, [sp], #16
>   6c:   d65f03c0        ret

Why not?

   58:   b4000060        cbnz    x0, 60 <will+0x24>
   5c:   94000000        bl      0 <preempt_schedule>
   60:   a8c17bfd        ldp     x29, x30, [sp], #16
   64:   d65f03c0        ret

which seems shorter.


So it's still early, and I haven't finished (or really even started) my
pot 'o tea, but what about:


	ldr x0, [x1, #16]	// seees the high bit set -- no preempt needed
	sub x0, x0, #1

	<interrupt>
	  ...
	  resched_curr()
	    set_tsk_need_resched();
	    set_preempt_need_resched();
	</interrupt> // sees preempt_count != 0, does not preempt

	str w0, [x1, #16] // stores preempt_count == 0
	cbnz x0, 1f // taken, we still observe the high word from before

1:	ret


Which then ends with preempt_count==0, need_resched==0 and no actual
preemption afaict.

Can you use mis-matched ll x0 / sc w0 to do this same thing and detector
the intermediate write on the high word?

WARNING: multiple messages have this Message-ID (diff)
From: peterz@infradead.org (Peter Zijlstra)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH 0/2] arm64: Only call into preempt_schedule() if need_resched()
Date: Wed, 28 Nov 2018 09:56:40 +0100	[thread overview]
Message-ID: <20181128085640.GX2131@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <1543347902-21170-1-git-send-email-will.deacon@arm.com>

On Tue, Nov 27, 2018 at 07:45:00PM +0000, Will Deacon wrote:
> Hi all,
> 
> This pair of patches improves our preempt_enable() implementation slightly
> on arm64 by making the resulting call to preempt_schedule() conditional
> on need_resched(), which is tracked in bit 32 of the preempt count. The
> logic is inverted so that we can detect the preempt count going to zero
> and need_resched being set with a single CBZ instruction.

>   40:   a9bf7bfd        stp     x29, x30, [sp, #-16]!
>   44:   910003fd        mov     x29, sp
>   48:   d5384101        mrs     x1, sp_el0
>   4c:   f9400820        ldr     x0, [x1, #16]

We load x0 which is a u64, right?

>   50:   d1000400        sub     x0, x0, #0x1
>   54:   b9001020        str     w0, [x1, #16]

But we store w0, which is the low u32, such as to not touch the high
word which contains the preempt bit.

>   58:   b4000060        cbz     x0, 64 <will+0x24>
>   5c:   a8c17bfd        ldp     x29, x30, [sp], #16
>   60:   d65f03c0        ret
>   64:   94000000        bl      0 <preempt_schedule>
>   68:   a8c17bfd        ldp     x29, x30, [sp], #16
>   6c:   d65f03c0        ret

Why not?

   58:   b4000060        cbnz    x0, 60 <will+0x24>
   5c:   94000000        bl      0 <preempt_schedule>
   60:   a8c17bfd        ldp     x29, x30, [sp], #16
   64:   d65f03c0        ret

which seems shorter.


So it's still early, and I haven't finished (or really even started) my
pot 'o tea, but what about:


	ldr x0, [x1, #16]	// seees the high bit set -- no preempt needed
	sub x0, x0, #1

	<interrupt>
	  ...
	  resched_curr()
	    set_tsk_need_resched();
	    set_preempt_need_resched();
	</interrupt> // sees preempt_count != 0, does not preempt

	str w0, [x1, #16] // stores preempt_count == 0
	cbnz x0, 1f // taken, we still observe the high word from before

1:	ret


Which then ends with preempt_count==0, need_resched==0 and no actual
preemption afaict.

Can you use mis-matched ll x0 / sc w0 to do this same thing and detector
the intermediate write on the high word?

  parent reply	other threads:[~2018-11-28  8:56 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-27 19:45 [PATCH 0/2] arm64: Only call into preempt_schedule() if need_resched() Will Deacon
2018-11-27 19:45 ` Will Deacon
2018-11-27 19:45 ` [PATCH 1/2] preempt: Move PREEMPT_NEED_RESCHED definition into arch code Will Deacon
2018-11-27 19:45   ` Will Deacon
2018-11-27 19:45 ` [PATCH 2/2] arm64: preempt: Provide our own implementation of asm/preempt.h Will Deacon
2018-11-27 19:45   ` Will Deacon
2018-11-28 15:35   ` Ard Biesheuvel
2018-11-28 15:35     ` Ard Biesheuvel
2018-11-28 16:40     ` Peter Zijlstra
2018-11-28 16:40       ` Peter Zijlstra
2018-11-28 16:45       ` Ard Biesheuvel
2018-11-28 16:45         ` Ard Biesheuvel
2018-11-28  8:56 ` Peter Zijlstra [this message]
2018-11-28  8:56   ` [PATCH 0/2] arm64: Only call into preempt_schedule() if need_resched() Peter Zijlstra
2018-11-28  9:01   ` Peter Zijlstra
2018-11-28  9:01     ` Peter Zijlstra
2018-11-28 12:04     ` Will Deacon
2018-11-28 12:04       ` Will Deacon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181128085640.GX2131@hirez.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=ard.biesheuvel@linaro.org \
    --cc=catalin.marinas@arm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rml@tech9.net \
    --cc=schwidefsky@de.ibm.com \
    --cc=tglx@linutronix.de \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.