From: "Jürgen Groß" <jgross@suse.com>
To: Andy Lutomirski <luto@kernel.org>, Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
LKML <linux-kernel@vger.kernel.org>, X86 ML <x86@kernel.org>,
"Paul E. McKenney" <paulmck@kernel.org>,
Alexandre Chartre <alexandre.chartre@oracle.com>,
Frederic Weisbecker <frederic@kernel.org>,
Paolo Bonzini <pbonzini@redhat.com>,
Sean Christopherson <sean.j.christopherson@intel.com>,
Masami Hiramatsu <mhiramat@kernel.org>,
Petr Mladek <pmladek@suse.com>,
Steven Rostedt <rostedt@goodmis.org>,
Joel Fernandes <joel@joelfernandes.org>,
Boris Ostrovsky <boris.ostrovsky@oracle.com>,
Brian Gerst <brgerst@gmail.com>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Josh Poimboeuf <jpoimboe@redhat.com>,
Will Deacon <will@kernel.org>,
Tom Lendacky <thomas.lendacky@amd.com>,
Wei Liu <wei.liu@kernel.org>,
Michael Kelley <mikelley@microsoft.com>,
Jason Chen CJ <jason.cj.chen@intel.com>,
Zhao Yakui <yakui.zhao@intel.com>,
"Peter Zijlstra (Intel)" <peterz@infradead.org>
Subject: Re: [patch V6 10/37] x86/entry: Switch XEN/PV hypercall entry to IDTENTRY
Date: Wed, 20 May 2020 10:06:32 +0200
Message-ID: <3dd0e972-1b80-cd6b-6490-5b745ada68c8@suse.com>
In-Reply-To: <CALCETrW4BxfTVzv8mXntNXiAPnKxqdMEv7djUknGZcrno2WJHg@mail.gmail.com>
On 19.05.20 21:44, Andy Lutomirski wrote:
> On Tue, May 19, 2020 at 11:58 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>>
>> Andy Lutomirski <luto@kernel.org> writes:
>>> On Fri, May 15, 2020 at 5:10 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>>>> @@ -573,6 +578,16 @@ static __always_inline void __idtentry_exit(struct pt_regs *regs)
>>>>                          instrumentation_end();
>>>>                          return;
>>>>                  }
>>>> +        } else if (IS_ENABLED(CONFIG_XEN_PV)) {
>>>> +                if (preempt_hcall) {
>>>> +                        /* See CONFIG_PREEMPTION above */
>>>> +                        instrumentation_begin();
>>>> +                        rcu_irq_exit_preempt();
>>>> +                        xen_maybe_preempt_hcall();
>>>> +                        trace_hardirqs_on();
>>>> +                        instrumentation_end();
>>>> +                        return;
>>>> +                }
>>>
>>> Ewwwww! This shouldn't be taken as a NAK -- it's just an expression
>>> of disgust.
>>
>> I'm really not proud of it, but that was the least horrible thing I
>> could come up with.
>>
>>> Shouldn't this be:
>>>
>>>   instrumentation_begin();
>>>   if (!irq_needs_irq_stack(...))
>>>     __blah();
>>>   else
>>>     run_on_irqstack(__blah, NULL);
>>>   instrumentation_end();
>>>
>>> or even:
>>>
>>> instrumentation_begin();
>>> run_on_irqstack_if_needed(__blah, NULL);
>>> instrumentation_end();
>>
>> Yeah. In that case the instrumentation markers are not required as they
>> will be inside the run....() function.
>>
>>> ****** BUT *******
>>>
>>> I think this is all arse-backwards. This is a giant mess designed to
>>> pretend we support preemption and to emulate normal preemption in a
>>> non-preemptible kernel. I propose one to two massive cleanups:
>>>
>>> A: Just delete all of this code. Preemptible hypercalls on
>>> non-preempt kernels will still process interrupts but won't get
>>> preempted. If you want preemption, compile with preemption.
>>
>> I'm happy to do so, but the XEN folks might have opinions on that :)
Indeed. :-)
>>
>>> B: Turn this thing around. Specifically, in the one and only case we
>>> care about, we know pretty much exactly what context we got this entry
>>> in: we're running in a schedulable context doing an explicitly
>>> preemptible hypercall, and we have RIP pointing at a SYSCALL
>>> instruction (presumably, but we shouldn't bet on it) in the hypercall
>>> page. Ideally we would change the Xen PV ABI so the hypercall would
>>> return something like EAGAIN instead of auto-restarting and we could
>>> ditch this mess entirely. But the ABI seems to be set in stone or at
>>> least in molasses, so how about just:
>>>
>>>   idt_entry(exit(regs));
>>>   if (inhcall && need_resched())
>>>     schedule();
>>
>> Which brings you into the situation that you call schedule() from the
>> point where we just moved it out. If we would go there we'd need to
>> ensure that RCU is watching as well. idtentry_exit() might have it
>> turned off ....
>
> I don't think this is possible. Once you untangle all the wrappers,
> the call sites are effectively:
>
> __this_cpu_write(xen_in_preemptible_hcall, true);
> CALL_NOSPEC to the hypercall page
> __this_cpu_write(xen_in_preemptible_hcall, false);
>
> I think IF=1 when this happens, but I won't swear to it. RCU had
> better be watching.
Preemptible hypercalls are never done with interrupts off. To be more
precise: they are only ever done during ioctl() processing.
I can add an ASSERT() to xen_preemptible_hcall_begin() if you want.
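
For illustration only, such a check could be modelled in userspace roughly as below. This is a sketch, not kernel code: the flag name follows this thread, `irqs_enabled` is a made-up stand-in for the real interrupt-state query, and the helper returns false where the kernel would warn or assert, so the model can be exercised outside the kernel.

```c
#include <assert.h>	/* used when exercising the model */
#include <stdbool.h>

/* Stand-in for the per-CPU flag xen_in_preemptible_hcall (one CPU only). */
static bool xen_in_preemptible_hcall;

/* Hypothetical stand-in for the CPU interrupt flag; not a kernel API. */
static bool irqs_enabled = true;

/*
 * Model of xen_preemptible_hcall_begin() with the proposed sanity check.
 * Returns false, leaving the flag untouched, if interrupts are off.
 */
static bool model_preemptible_hcall_begin(void)
{
	if (!irqs_enabled)
		return false;	/* the kernel would warn/assert here */
	xen_in_preemptible_hcall = true;
	return true;
}

/* Model of xen_preemptible_hcall_end(): just clears the flag. */
static void model_preemptible_hcall_end(void)
{
	xen_in_preemptible_hcall = false;
}
```

With interrupts on, begin sets the flag and end clears it again; with `irqs_enabled` set to false, begin refuses and leaves the flag clear.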
>
> As I understand it, the one and only situation Xen wants to handle is
> that an interrupt gets delivered during the hypercall. The hypervisor
> is too clever for its own good and deals with this by rewinding RIP to
> the beginning of whatever instruction did the hypercall and delivers
> the interrupt, and we end up in this handler. So, if this happens,
> the idea is to not only handle the interrupt but to schedule if
> scheduling would be useful.
Correct. More precisely: the hypercalls in question can run for a very
long time (up to several seconds), so they need to be interruptible. As
said before: the interface for doing this is horrible. :-(
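
The restart behaviour described above can be pictured with a toy userspace model. Everything in it is an assumption made for illustration (the struct, the function names, and the "budget" that stands for how much work gets done before an event becomes pending); it does not reflect the real hypervisor interface.

```c
#include <assert.h>	/* used when exercising the model */
#include <stdbool.h>

/* Toy state of one long-running hypercall (illustrative names only). */
struct toy_hcall {
	unsigned int done;	/* units of work completed so far */
	unsigned int total;	/* units of work requested */
};

/*
 * Toy "hypervisor": works until finished or until the budget is used up,
 * which models an event becoming pending. Returns true when the call
 * completed, false when the guest has to re-execute the instruction.
 */
static bool toy_hypercall(struct toy_hcall *h, unsigned int budget)
{
	while (h->done < h->total) {
		if (budget == 0)
			return false;	/* models: rewind RIP, deliver event */
		h->done++;
		budget--;
	}
	return true;
}

/*
 * Toy "guest": re-issues the call until it completes. Every restart is a
 * point where the event is handled and where scheduling would be useful.
 * Returns the number of restarts.
 */
static unsigned int toy_guest_run(struct toy_hcall *h, unsigned int budget)
{
	unsigned int restarts = 0;

	if (budget == 0)
		budget = 1;	/* guarantee forward progress in the model */
	while (!toy_hypercall(h, budget))
		restarts++;	/* event handled here; may also schedule */
	return restarts;
}
```

Progress is kept across restarts in `done`, so re-executing the call resumes the work instead of starting over.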
>
> So I don't think we need all this RCU magic. This really ought to be
> able to be simplified to:
>
>   idtentry_exit();
>
>   if (appropriate condition)
>     schedule();
>
> Obviously we don't want to schedule if this is a nested entry, but we
> should be able to rule that out by checking that regs->flags &
> X86_EFLAGS_IF and by handling the percpu variable a little more
> intelligently. So maybe the right approach is:
>
> bool in_preemptible_hcall = __this_cpu_read(xen_in_preemptible_hcall);
> __this_cpu_write(xen_in_preemptible_hcall, false);
> idtentry_enter(...);
>
> do the actual work;
>
> idtentry_exit(...);
>
> if (in_preemptible_hcall) {
>   assert regs->flags & X86_EFLAGS_IF;
>   assert that RCU is watching;
>   assert that we're on the thread stack;
>   assert whatever else we feel like asserting;
>   if (need_resched())
>     schedule();
> }
>
> __this_cpu_write(xen_in_preemptible_hcall, in_preemptible_hcall);
>
> And now we don't have a special idtentry_exit() case just for Xen, and
> all the mess is entirely contained in the Xen PV code. And we need to
> mark all the preemptible hypercalls noinstr. Does this seem
> reasonable?
From my point of view this sounds fine.
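
To make the save/clear/restore handling of the flag concrete, here is a small userspace model of the flow quoted above. It is only a sketch under stated assumptions: all `toy_*` names, `need_resched_flag`, `schedule_calls` and `flag_seen_in_handler` are made up for the model, and the RCU and stack assertions are left out because they have no userspace equivalent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static bool xen_in_preemptible_hcall;	/* models the per-CPU flag */
static bool need_resched_flag;		/* models need_resched() */
static unsigned int schedule_calls;	/* counts modelled schedule() calls */
static bool flag_seen_in_handler;	/* flag value observed by the handler */

/* Models the one bit of pt_regs used here: regs->flags & X86_EFLAGS_IF. */
struct toy_regs {
	bool irqs_were_on;
};

static void toy_schedule(void)
{
	schedule_calls++;
	need_resched_flag = false;
}

/* Sample event handler: records the flag and requests a reschedule. */
static void toy_event_handler(void)
{
	flag_seen_in_handler = xen_in_preemptible_hcall;
	need_resched_flag = true;
}

/* Model of the upcall entry following the quoted proposal. */
static void toy_xen_upcall(const struct toy_regs *regs, void (*handler)(void))
{
	bool in_preemptible_hcall = xen_in_preemptible_hcall;

	/* Clear the flag so a nested entry never sees it set. */
	xen_in_preemptible_hcall = false;

	if (handler != NULL)
		handler();	/* "do the actual work" */

	if (in_preemptible_hcall) {
		assert(regs->irqs_were_on);
		if (need_resched_flag)
			toy_schedule();
	}

	/* Restore the flag for the interrupted hypercall. */
	xen_in_preemptible_hcall = in_preemptible_hcall;
}
```

In this model a reschedule happens only when the interrupted context had the flag set, and the handler itself always runs with the flag cleared.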
>
> That being said, right now, with or without your patch, I think we're
> toast if the preemptible hypercall code gets traced. So maybe the
> right thing is to just drop all the magic preemption stuff from your
> patch and let the Xen maintainers submit something new (maybe like
> what I suggest above) if they want magic preemption back.
>
I'd prefer not to break preemptible hypercalls in the meantime.
IMO the patch should be modified along the lines of your suggestion.
I'd be happy to test it.
Juergen