From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1751203AbbBWFW1 (ORCPT ); Mon, 23 Feb 2015 00:22:27 -0500
Received: from mail-la0-f42.google.com ([209.85.215.42]:40287 "EHLO
 mail-la0-f42.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1750797AbbBWFW0 (ORCPT ); Mon, 23 Feb 2015 00:22:26 -0500
MIME-Version: 1.0
In-Reply-To: <54EA8641.6040609@redhat.com>
References: <20150221093150.GA27841@gmail.com>
 <20150221163840.GA32073@pd.tnic>
 <20150221172914.GB32073@pd.tnic>
 <20150222110629.GB7529@pd.tnic>
 <54EA8641.6040609@redhat.com>
From: Andy Lutomirski
Date: Sun, 22 Feb 2015 21:22:04 -0800
Message-ID:
Subject: Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs
To: Rik van Riel
Cc: Borislav Petkov, "Maciej W. Rozycki", Ingo Molnar, Oleg Nesterov,
 X86 ML, "linux-kernel@vger.kernel.org", Linus Torvalds
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Feb 22, 2015 at 5:45 PM, Rik van Riel wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 02/22/2015 06:06 AM, Borislav Petkov wrote:
>> On Sat, Feb 21, 2015 at 06:18:01PM -0800, Andy Lutomirski wrote:
>>> That's true. The question is whether there are enough of them,
>>> and whether twiddling TS is fast enough, that it's worth it.
>>
>> Yes, and let me make it clear what I'm trying to do here: I want to
>> make sure that eager FPU handling (both allocation and switching -
>> and no, I'm not confusing the concepts) *doesn't* *hurt* any
>> relevant workload. If it does, then we'll stop wasting time right
>> there.
>>
>> But(!), if the CR0.TS lazy dance turns out to be really slow and
>> the eager handling doesn't bring a noticeable slowdown, in
>> comparison, we should consider doing the eager thing by default.
>> After running a lot more benchmarks, of course.
>>
>> Which brings me to the main reason why we're doing this: code
>> simplification. If we switch to eager, we'll kill a lot of
>> non-trivial code and the FPU handling in the kernel becomes dumb
>> and nice again.
>
> Currently the FPU handling does something really dumb for
> KVM VCPU threads. Specifically, every time we enter a
> KVM guest, we save the userspace FPU state of the VCPU
> thread, and every time we leave the KVM guest, we load
> the userspace FPU state of the VCPU thread.
>
> This is done for a thread that hardly ever exits to
> userspace, and generally just switches between kernel
> and guest mode.
>
> The reason for this acrobatics is that at every
> context switch time, the userspace FPU state is
> saved & loaded.
>
> I am working on a patch series to avoid that needless
> FPU save & restore, by moving the point at which the
> user FPU state is loaded out to the point where we
> return to userspace, in do_notify_resume.
>
> One implication of this is that in kernel mode, we
> can no longer just assume that the user space FPU
> state is always loaded, and we need to check for that
> (like the lazy FPU code does today). I would really
> like to keep that code around, for obvious reasons :)

I like that stuff, except for the fact that it still has code that
depends on whether we're in eager or lazy mode, even though eager is a
little less eager with your patches. Ideally I'd like to see your
patches applied *and* lazy mode removed.

--Andy
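
For illustration only, here is a toy model of the trade-off Rik describes
for a KVM VCPU thread: it counts user FPU state saves and loads when the
state is reloaded on every guest exit, versus when the load is deferred
until the thread actually returns to userspace. This is a sketch, not
kernel code. The function names, the `user_state_loaded` flag and the
workload shape (a fixed number of guest runs per exit to userspace) are
all assumptions made up for the example, and guest FPU state handling is
left out because it is needed in both schemes.

```python
# Toy model only: hypothetical names, not taken from the kernel or the patches.

def current_scheme(guest_runs_per_user_exit, user_exits):
    """Save user FPU state on every guest entry, reload it on every guest exit."""
    saves = loads = 0
    for _ in range(user_exits):
        for _ in range(guest_runs_per_user_exit):
            saves += 1  # guest entry: save the userspace FPU state
            loads += 1  # guest exit: load the userspace FPU state again
        # Return to userspace: the state is already loaded, nothing to do.
    return saves, loads


def deferred_scheme(guest_runs_per_user_exit, user_exits):
    """Defer the user FPU state load until the thread returns to userspace."""
    user_state_loaded = True  # assumed starting point: thread came from userspace
    saves = loads = 0
    for _ in range(user_exits):
        for _ in range(guest_runs_per_user_exit):
            # Guest entry: save only if the user state is live in the registers.
            if user_state_loaded:
                saves += 1
                user_state_loaded = False
            # Guest exit: leave the registers alone. Kernel code that wants the
            # user state must now check the flag instead of assuming it is loaded.
        # Return to userspace: load the user state if it is not there.
        if not user_state_loaded:
            loads += 1
            user_state_loaded = True
    return saves, loads


if __name__ == "__main__":
    print("current :", current_scheme(1000, 3))
    print("deferred:", deferred_scheme(1000, 3))
```

In this model the deferred variant does one save and one load per exit
to userspace instead of one pair per guest entry, at the cost of the
extra "is the user state loaded?" check mentioned above.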