From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751203AbbBWFW1 (ORCPT <rfc822;w@1wt.eu>);
	Mon, 23 Feb 2015 00:22:27 -0500
Received: from mail-la0-f42.google.com ([209.85.215.42]:40287 "EHLO
	mail-la0-f42.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750797AbbBWFW0 (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Mon, 23 Feb 2015 00:22:26 -0500
MIME-Version: 1.0
In-Reply-To: <54EA8641.6040609@redhat.com>
References: <b0ba174ea882ed36cf7011e872baf427c23b7e09.1424458621.git.luto@amacapital.net>
 <20150221093150.GA27841@gmail.com> <20150221163840.GA32073@pd.tnic>
 <20150221172914.GB32073@pd.tnic> <alpine.LFD.2.11.1502212328210.11588@eddie.linux-mips.org>
 <CALCETrU=9Kvq82fBRfw9RLxzyj=LhnLzGV+vWtH+etpqypLatg@mail.gmail.com>
 <20150222110629.GB7529@pd.tnic> <54EA8641.6040609@redhat.com>
From: Andy Lutomirski <luto@amacapital.net>
Date: Sun, 22 Feb 2015 21:22:04 -0800
Message-ID: <CALCETrUU_V+Hyp2Fz=DtEzdEFAOtuXWL4OS+rP-yKWBFJ413HA@mail.gmail.com>
Subject: Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs
To: Rik van Riel <riel@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>, "Maciej W. Rozycki" <macro@linux-mips.org>,
        Ingo Molnar <mingo@kernel.org>, Oleg Nesterov <oleg@redhat.com>,
        X86 ML <x86@kernel.org>,
        "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
        Linus Torvalds <torvalds@linux-foundation.org>
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Feb 22, 2015 at 5:45 PM, Rik van Riel <riel@redhat.com> wrote:
>
> On 02/22/2015 06:06 AM, Borislav Petkov wrote:
>> On Sat, Feb 21, 2015 at 06:18:01PM -0800, Andy Lutomirski wrote:
>>> That's true.  The question is whether there are enough of them,
>>> and whether twiddling TS is fast enough, that it's worth it.
>>
>> Yes, and let me make it clear what I'm trying to do here: I want to
>> make sure that eager FPU handling (both allocation and switching -
>> and no, I'm not confusing the concepts) *doesn't* *hurt* any
>> relevant workload. If it does, then we'll stop wasting time right
>> there.
>>
>> But(!), if the CR0.TS lazy dance turns out to be really slow and
>> the eager handling doesn't bring a noticeable slowdown, in
>> comparison, we should consider doing the eager thing by default.
>> After running a lot more benchmarks, of course.
>>
>> Which brings me to the main reason why we're doing this: code
>> simplification. If we switch to eager, we'll kill a lot of
>> non-trivial code and the FPU handling in the kernel becomes dumb
>> and nice again.
>
> Currently the FPU handling does something really dumb for
> KVM VCPU threads.  Specifically, every time we enter a
> KVM guest, we save the userspace FPU state of the VCPU
> thread, and every time we leave the KVM guest, we load
> the userspace FPU state of the VCPU thread.
>
> This is done for a thread that hardly ever exits to
> userspace, and generally just switches between kernel
> and guest mode.
>
> The reason for these acrobatics is that at every
> context switch time, the userspace FPU state is
> saved & loaded.
>
> I am working on a patch series to avoid that needless
> FPU save & restore, by moving the point at which the
> user FPU state is loaded out to the point where we
> return to userspace, in do_notify_resume.
>
> One implication of this is that in kernel mode, we
> can no longer just assume that the user space FPU
> state is always loaded, and we need to check for that
> (like the lazy FPU code does today).  I would really
> like to keep that code around, for obvious reasons :)

I like that approach, except that it still has code that depends on
whether we're in eager or lazy mode, even though eager becomes a
little less eager with your patches.  Ideally I'd like to see your
patches applied *and* lazy mode removed.

--Andy