Linux-kselftest Archive on lore.kernel.org
 help / color / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: Ira Weiny <ira.weiny@intel.com>, "Paul E. McKenney" <paulmck@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Andy Lutomirski <luto@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	x86@kernel.org, Dave Hansen <dave.hansen@linux.intel.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Fenghua Yu <fenghua.yu@intel.com>,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-nvdimm@lists.01.org, linux-mm@kvack.org,
	linux-kselftest@vger.kernel.org
Subject: Re: [PATCH 06/10] x86/entry: Move nmi entry/exit into common code
Date: Tue, 27 Oct 2020 15:18:50 +0100
Message-ID: <87pn5321md.fsf@nanos.tec.linutronix.de> (raw)
In-Reply-To: <20201027070750.GM534324@iweiny-DESK2.sc.intel.com>

On Tue, Oct 27 2020 at 00:07, Ira Weiny wrote:
> On Fri, Oct 23, 2020 at 11:50:11PM +0200, Thomas Gleixner wrote:
>> >  #ifndef irqentry_state
>> >  typedef struct irqentry_state {
>> > -	bool	exit_rcu;
>> > +	union {
>> > +		bool	exit_rcu;
>> > +		bool	lockdep;
>> > +	};
>> >  } irqentry_state_t;
>> >  #endif
>> 
>>   -E_NO_KERNELDOC
>
> Adding: Paul McKenney
>
> I'm happy to write something but I'm very unfamiliar with this code.  So I'm
> getting confused what exactly exit_rcu is flagging.
>
> I can see that exit_rcu is a bad name for the state used in
> irqentry_nmi_[enter|exit]().  Furthermore, I see why 'lockdep' is a better
> name.  But similar lockdep handling is used in irqentry_exit() if exit_rcu is
> true...

No, it's not similar at all. Lockdep state vs. interrupts and regular
exceptions is always consistent.

In the NMI case, that's not guaranteed because of

       local_irq_disable()
         arch_local_irq_disable()
                                        <- NMI race window
         trace_hardirqs_off()

same the other way round

       local_irq_enable()
         trace_hardirqs_on()
                                        <- NMI race window
         arch_local_irq_enable()

IOW, the hardware state and the lockdep state are not consistent.

> /**
>  * struct irqentry_state - Opaque object for exception state storage
>  * @exit_rcu: Used exclusively in the irqentry_*() calls; tracks if the
>  *            exception hit the idle task which requires special handling,
>  *            including calling rcu_irq_exit(), when the exception
>  exits.

calls; signals whether the exit path has to invoke rcu_irq_exit().

>  * @lockdep: Used exclusively in the irqentry_nmi_*() calls; ensures lockdep
>  *           tracking is maintained if hardirqs were already enabled

   ensures that lockdep state is restored correctly on exit from nmi.

>  *
>  * This opaque object is filled in by the irqentry_*_enter() functions and
>  * should be passed back into the corresponding irqentry_*_exit()
>  functions

s/should/must/

>  * when the exception is complete.
>  *
>  * Callers of irqentry_*_[enter|exit]() should consider this structure
>  opaque

s/should/must/

>  * and all members private.  Descriptions of the members are provided to aid in
>  * the maintenance of the irqentry_*() functions.
>  */
>
> Perhaps Paul can enlighten me on how exit_rcu is used beyond just flagging a
> call to rcu_irq_exit()?

I can do that as well :) The only purpose is to invoke rcu_irq_exit()
conditionally.

> Why do we call lockdep_hardirqs_off() only when in the idle task?  That implies
> that regs_irqs_disabled() can only be false if we were in the idle task to
> match up the lockdep on/off calls.

You're reading the code slightly wrong.

> This does not make sense to me because why do we need the extra check
> for exit_rcu?  I'm still trying to understand when regs_irqs_disabled() is false.

It's false when the interrupted context had interrupts enabled.

So we have the following scenarios:

 Usermode   Idletask   irqs enabled  RCU entry  RCU exit
   Y           N           Y		Y          Y

   N           N           Y            N          N 
   N           N           N            N          N
   N           Y           Y            Y          Y
   N           Y           N            Y          Y                       

Now you might wonder about irqs enabled/disabled. This code is not only
used for interrupts (device, ipi, local timer...) where interrupts are
obviously enabled, it's also used for exception entry/exit. You can have
e.g. pagefaults in interrupt disabled regions.

> Also, the comment in irqentry_enter() refers to irq_enter_from_user_mode() which
> does not seem to exist anymore.  So I'm not sure what careful sequence it is
> referring to.

That was renamed to irqentry_enter_from_user_mode() and the comment was
not updated. Sorry for leaving this hard to solve puzzle around.

Thanks,

        tglx

  reply index

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-22 22:26 [PATCH 00/10] PKS: Add Protection Keys Supervisor (PKS) support ira.weiny
2020-10-22 22:26 ` [PATCH 01/10] x86/pkeys: Create pkeys_common.h ira.weiny
2020-10-22 22:26 ` [PATCH 02/10] x86/fpu: Refactor arch_set_user_pkey_access() for PKS support ira.weiny
2020-10-22 22:26 ` [PATCH 03/10] x86/pks: Enable Protection Keys Supervisor (PKS) ira.weiny
2020-10-22 22:26 ` [PATCH 04/10] x86/pks: Preserve the PKRS MSR on context switch ira.weiny
2020-10-22 22:26 ` [PATCH 05/10] x86/pks: Add PKS kernel API ira.weiny
2020-10-22 22:26 ` [PATCH 06/10] x86/entry: Move nmi entry/exit into common code ira.weiny
2020-10-23 21:50   ` Thomas Gleixner
2020-10-27  7:07     ` Ira Weiny
2020-10-27 14:18       ` Thomas Gleixner [this message]
2020-10-22 22:26 ` [PATCH 07/10] x86/entry: Pass irqentry_state_t by reference ira.weiny
2020-10-23 21:56   ` Thomas Gleixner
2020-10-27  7:11     ` Ira Weiny
2020-10-22 22:26 ` [PATCH 08/10] x86/entry: Preserve PKRS MSR across exceptions ira.weiny
2020-10-22 22:27 ` [PATCH 09/10] x86/fault: Report the PKRS state on fault ira.weiny
2020-10-22 22:27 ` [PATCH 10/10] x86/pks: Add PKS test code ira.weiny

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87pn5321md.fsf@nanos.tec.linutronix.de \
    --to=tglx@linutronix.de \
    --cc=akpm@linux-foundation.org \
    --cc=bp@alien8.de \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=fenghua.yu@intel.com \
    --cc=ira.weiny@intel.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-kselftest Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-kselftest/0 linux-kselftest/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-kselftest linux-kselftest/ https://lore.kernel.org/linux-kselftest \
		linux-kselftest@vger.kernel.org
	public-inbox-index linux-kselftest

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kselftest


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git