From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754725AbbCaRcX (ORCPT <rfc822;w@1wt.eu>);
	Tue, 31 Mar 2015 13:32:23 -0400
Received: from mx1.redhat.com ([209.132.183.28]:39056 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1753282AbbCaRcW (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
	Tue, 31 Mar 2015 13:32:22 -0400
Message-ID: <551ADA0A.7050701@redhat.com>
Date: Tue, 31 Mar 2015 19:31:54 +0200
From: Denys Vlasenko <dvlasenk@redhat.com>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0
MIME-Version: 1.0
To: Andy Lutomirski <luto@amacapital.net>, Ingo Molnar <mingo@kernel.org>
CC: Denys Vlasenko <vda.linux@googlemail.com>, Brian Gerst <brgerst@gmail.com>,
        Borislav Petkov <bp@alien8.de>,
        the arch/x86 maintainers <x86@kernel.org>,
        Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
        Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH] x86/asm/entry/64: better check for canonical address
References: <20150327081141.GA9526@gmail.com> <551534B1.6090908@redhat.com> <20150327111738.GA8749@gmail.com> <CAMzpN2hZrL9WLAHrxM4dsa8wkNHWtX6NVGyhRz_MoiED6Q5X8A@mail.gmail.com> <20150327113430.GC14778@gmail.com> <551549AF.50808@redhat.com> <20150327121645.GC15631@gmail.com> <55154DB3.9000008@redhat.com> <20150328091106.GA5361@gmail.com> <CAK1hOcOOQAq1sWdnK+nTZBightFaJWiTeQCmCPzhG0-y-UYvOw@mail.gmail.com> <20150331164337.GA8462@gmail.com> <CALCETrVHMEM7kRnhu7aVu8UFMSjiMLZBo=7vNfgK4fQx2oZMmg@mail.gmail.com>
In-Reply-To: <CALCETrVHMEM7kRnhu7aVu8UFMSjiMLZBo=7vNfgK4fQx2oZMmg@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 03/31/2015 07:08 PM, Andy Lutomirski wrote:
> On Tue, Mar 31, 2015 at 9:43 AM, Ingo Molnar <mingo@kernel.org> wrote:
>>
>> * Denys Vlasenko <vda.linux@googlemail.com> wrote:
>>
>>>> I guess they could optimize it by adding a single "I am a modern
>>>> OS executing regular userspace" flag to the descriptor [or
>>>> expressing the same as a separate instruction], to avoid all that
>>>> legacy crap that won't trigger on like 99.999999% of systems ...
>>>
>>> Yes, that would be a useful addition. Interrupt servicing on x86
>>> takes a non-negligible hit because of IRET slowness.
>>
>> But ... to react to your other patch: detecting the common easy case
>> and doing a POPF+RET ourselves ought to be pretty good as well?
>>
>> But only if ptregs->rip != the magic RET itself, to avoid recursion.
>>
>> Even with all those extra checks it should still be much faster.
>>
> 
> I have a smallish preference for doing sti;ret instead, because that
> keeps the funny special case entirely localized to the NMI code
> instead of putting it in the IRQ exit path.  I suspect that the
> performance loss is at most a cycle or two (we're adding a branch, but
> sti itself is quite fast).
> 
> That being said, I could easily be convinced otherwise.

Let me try to convince you. sti is 6 cycles.

The patch atop your code would be:

 	movq RIP-ARGOFFSET(%rsp), %rcx
+	cmp $magic_ret, %rcx
+	je  real_iret
-	btr $9, %rdi
 	movq %rdi, (%rsi)
 	movq %rcx, 8(%rsi)
 	movq %rsi, ORIG_RAX-ARGOFFSET(%rsp)
 	popq_cfi %r11
 	popq_cfi %r10
 	popq_cfi %r9
 	popq_cfi %r8
 	popq_cfi %rax
 	popq_cfi %rcx
 	popq_cfi %rdx
 	popq_cfi %rsi
 	popq_cfi %rdi
 	popq %rsp
-	jc 1f
 	popfq_cfi
+magic_ret:
 	retq
-1:
-	popfq_cfi
-	sti
-	retq

It's a clear (albeit small) win: the added branch is almost never taken,
and we no longer need sti.
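
To spell out the intent of the added cmp/je, here is a toy model of the
decision in Python. It is only a sketch of the check as I read the patch,
not kernel code: MAGIC_RET is a made-up address standing in for the
magic_ret label, and choose_return_path() is a hypothetical helper name.
The reasoning is that popfq may re-enable interrupts, so an interrupt can
land exactly on the following retq; an exit from that interrupt must not
take the fast path again and goes through the real IRET instead.

```python
# Toy model of the exit-path choice; not kernel code.
# MAGIC_RET is a hypothetical address standing in for the magic_ret label.
MAGIC_RET = 0xFFFFFFFF81000100


def choose_return_path(saved_rip: int, magic_ret: int = MAGIC_RET) -> str:
    """Pick the return path for an interrupt exit.

    saved_rip models the RIP value stored in pt_regs, i.e. the address
    of the instruction that was interrupted.
    """
    # Mirrors "cmp $magic_ret, %rcx; je real_iret": if we interrupted the
    # magic RET itself, fall back to the slow but always-safe IRET.
    if saved_rip == magic_ret:
        return "iret"
    # Common case: restore flags with POPF and return with a plain RET.
    return "popf_ret"


if __name__ == "__main__":
    for rip in (0x400000, MAGIC_RET):
        print(hex(rip), choose_return_path(rip))
```

Only an exact match on the magic RET address takes the slow path, which
is why the extra branch should be almost never taken in practice.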