From: Thomas Gleixner <tglx@linutronix.de>
To: Nicolas Saenz Julienne <nsaenzju@redhat.com>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	linux-arm-kernel <linux-arm-kernel@lists.infradead.org>,
	rcu@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	paulmck@kernel.org, mtosatti <mtosatti@redhat.com>,
	frederic <frederic@kernel.org>
Subject: Re: Question WRT early IRQ/NMI entry code
Date: Tue, 30 Nov 2021 14:47:01 +0100
Message-ID: <875ys9dacq.ffs@tglx>
In-Reply-To: <8719ad46cc29a2c5d7baac3c35770e5460ab8d5c.camel@redhat.com>

On Tue, Nov 30 2021 at 12:28, Nicolas Saenz Julienne wrote:
> while going over the IRQ/NMI entry code I've found a small 'inconsistency':
> while in the IRQ entry path, we inform RCU of the context change *before*
> incrementing the preempt counter, the opposite happens for the NMI entry
> path. This applies to both arm64 and x86[1].
>
> Actually, rcu_nmi_enter() — which is also the main RCU context switch function
> for the IRQ entry path — uses the preempt counter to verify it's not in NMI
> context. So it would make sense to assume all callers have the same updated
> view of the preempt count, which isn't true ATM.
>
> I'm sure there's an obscure/non-obvious reason for this, right?

There is.

> IRQ path:
>   -> x86_64 asm (entry_64.S)
>   -> irqentry_enter() -> rcu_irq_enter() -> *rcu_nmi_enter()*
>   -> run_irq_on_irqstack_cond() -> irq_enter_rcu() -> *preempt_count_add(HARDIRQ_OFFSET)*
>   -> // Run IRQ...
>
> NMI path:
>   -> x86_64 asm (entry_64.S)
>   -> irqentry_nmi_enter() -> __nmi_enter() -> *__preempt_count_add(NMI_OFFSET + HARDIRQ_OFFSET)*
>                           -> *rcu_nmi_enter()*

The reason is symmetry vs. returning from interrupt / exception:

 irqentry_enter()
      exit_rcu = false;

      if (user_mode(regs)) {
          irqentry_enter_from_user_mode(regs)
            __enter_from_user_mode(regs)
              user_exit_irqoff();       <- RCU handling for NOHZ full

      } else if (is_idle_task(current)) {
            rcu_irq_enter()
            exit_rcu = true;
      }

 irq_enter_rcu()
     __irq_enter_raw()
     preempt_count_add(HARDIRQ_OFFSET);

 irq_handler()

 irq_exit_rcu()
     preempt_count_sub(HARDIRQ_OFFSET);
     if (!in_interrupt() && local_softirq_pending())
     	 invoke_softirq();

 irqentry_exit(regs, exit_rcu)

     if (user_mode(regs)) {
         irqentry_exit_to_user_mode(regs)
           user_enter_irqoff();     <- RCU handling for NOHZ full
     } else if (irqs_enabled(regs)) {
           if (exit_rcu) {          <- Idle task special case
               rcu_irq_exit();
           } else {
              irqentry_exit_cond_resched();
           }

     } else if (exit_rcu) {
         rcu_irq_exit();
     }
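
The pairing of exit_rcu between irqentry_enter() and irqentry_exit() in the outline above can be modelled in plain userspace C. This is only a sketch: struct model_regs, the model_* functions and enum exit_action are hypothetical names made up for illustration, not kernel interfaces, and the real code does considerably more work in each branch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the interrupted context (not struct pt_regs). */
struct model_regs {
    bool user_mode;     /* interrupt hit user space */
    bool irqs_enabled;  /* interrupted context had interrupts enabled */
};

/* Which branch of the exit path is taken; names are illustrative only. */
enum exit_action { EXIT_TO_USER, EXIT_RCU, EXIT_COND_RESCHED, EXIT_NOTHING };

/* Returns the exit_rcu flag that has to be handed to the exit side. */
bool model_irqentry_enter(const struct model_regs *regs, bool idle_task)
{
    assert(regs);
    if (regs->user_mode)
        return false;   /* RCU is handled by the user-mode exit path */
    return idle_task;   /* RCU is only informed when the idle task was hit */
}

/* Mirrors the conditionals of the exit side shown in the outline. */
enum exit_action model_irqentry_exit(const struct model_regs *regs, bool exit_rcu)
{
    assert(regs);
    if (regs->user_mode)
        return EXIT_TO_USER;
    if (regs->irqs_enabled)
        return exit_rcu ? EXIT_RCU : EXIT_COND_RESCHED;
    return exit_rcu ? EXIT_RCU : EXIT_NOTHING;
}
```

Feeding the flag returned by model_irqentry_enter() into model_irqentry_exit() for the same model_regs shows that the RCU exit branch is taken exactly when the entry side informed RCU.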

On return from interrupt, HARDIRQ_OFFSET has to be removed _before_
handling soft interrupts. It's also required that the preempt count is
back in its original state _before_ reaching irqentry_exit(), which
might schedule if the interrupt/exception hit user space or kernel space
with interrupts enabled.

So doing it symmetrically makes sense.
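
The ordering constraint on the exit side can be reproduced with a small userspace model. This is a sketch under assumptions: the bit layout below is illustrative, preempt_count and softirq_pending are plain variables standing in for per-CPU state, in_interrupt() is simplified to the hardirq and softirq bits, and the model_* functions are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit layout; treat the exact values as assumptions. */
#define SOFTIRQ_OFFSET  (1u << 8)
#define SOFTIRQ_MASK    (0xffu << 8)
#define HARDIRQ_OFFSET  (1u << 16)
#define HARDIRQ_MASK    (0xfu << 16)

static unsigned int preempt_count;  /* stand-in for the per-CPU preempt count */
static bool softirq_pending;        /* stand-in for local_softirq_pending() */
static int softirqs_run;            /* how often the softirq step ran */

/* Simplified: only looks at the hardirq and softirq bits. */
static bool in_interrupt(void)
{
    return (preempt_count & (HARDIRQ_MASK | SOFTIRQ_MASK)) != 0;
}

static void model_irq_enter_rcu(void)
{
    preempt_count += HARDIRQ_OFFSET;
}

static void model_irq_exit_rcu(void)
{
    assert((preempt_count & HARDIRQ_MASK) != 0);  /* catch unbalanced exit */
    preempt_count -= HARDIRQ_OFFSET;              /* has to happen first */
    if (!in_interrupt() && softirq_pending) {
        softirq_pending = false;
        softirqs_run++;                           /* models invoke_softirq() */
    }
}

/* Runs one modelled interrupt; returns the count seen after irq_exit_rcu(). */
unsigned int model_irq(unsigned int count_at_entry)
{
    preempt_count = count_at_entry;
    softirq_pending = true;   /* pretend the handler raised a softirq */
    softirqs_run = 0;
    model_irq_enter_rcu();
    model_irq_exit_rcu();
    return preempt_count;
}

int model_softirqs_run(void)
{
    return softirqs_run;
}
```

Calling model_irq(0) leaves the count at its entry value and runs the softirq step once. Calling model_irq(SOFTIRQ_OFFSET), which models an interrupt that hit softirq processing, also restores the entry value but does not run the softirq step, because in_interrupt() is still true after the subtraction.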

For NMIs the above conditionals do not apply at all and we just do

    __nmi_enter()
        preempt_count_add(NMI_OFFSET + HARDIRQ_OFFSET);
    rcu_nmi_enter();

    handle_nmi();

    rcu_nmi_exit();
    __nmi_exit()
        preempt_count_sub(NMI_OFFSET + HARDIRQ_OFFSET);

The reason why preempt count is incremented before invoking
rcu_nmi_enter() is simply that RCU has to know about being in NMI
context, i.e. in_nmi() has to return the correct answer.
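
A userspace sketch makes the in_nmi() point concrete. The bit values are illustrative assumptions, preempt_count is a plain variable rather than per-CPU state, and model_rcu_nmi_enter() and model_nmi_entry() are hypothetical helpers that only record what in_nmi() returns at the moment the RCU hook runs.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit layout; treat the exact values as assumptions. */
#define HARDIRQ_OFFSET  (1u << 16)
#define NMI_OFFSET      (1u << 20)
#define NMI_MASK        (0xfu << 20)

static unsigned int preempt_count;  /* stand-in for the per-CPU preempt count */
static bool rcu_saw_nmi;            /* what the modelled RCU hook observed */

static bool in_nmi(void)
{
    return (preempt_count & NMI_MASK) != 0;
}

/* Models only the part of the RCU hook that asks whether this is NMI context. */
static void model_rcu_nmi_enter(void)
{
    rcu_saw_nmi = in_nmi();
}

/*
 * count_first == true models the NMI entry order described above,
 * false models the reversed order. Returns what the RCU hook observed.
 */
bool model_nmi_entry(bool count_first)
{
    bool seen;

    preempt_count = 0;
    if (count_first) {
        preempt_count += NMI_OFFSET + HARDIRQ_OFFSET;
        model_rcu_nmi_enter();
    } else {
        model_rcu_nmi_enter();
        preempt_count += NMI_OFFSET + HARDIRQ_OFFSET;
    }
    seen = rcu_saw_nmi;

    /* Exit undoes the count change, so the count is back to zero. */
    preempt_count -= NMI_OFFSET + HARDIRQ_OFFSET;
    assert(preempt_count == 0);
    return seen;
}
```

With the count raised first, model_nmi_entry(true) reports that the hook saw NMI context. With the order reversed, model_nmi_entry(false) reports that it did not.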

Thanks,

        tglx
