From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([140.186.70.92]:44884) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1R8G7A-0002ce-GI for qemu-devel@nongnu.org; Mon, 26 Sep 2011 14:41:17 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1R8G79-0007Wt-6m for qemu-devel@nongnu.org; Mon, 26 Sep 2011 14:41:16 -0400 Received: from mail-qy0-f180.google.com ([209.85.216.180]:61909) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1R8G79-0007Wl-34 for qemu-devel@nongnu.org; Mon, 26 Sep 2011 14:41:15 -0400 Received: by qyc1 with SMTP id 1so6385606qyc.4 for ; Mon, 26 Sep 2011 11:41:14 -0700 (PDT) MIME-Version: 1.0 In-Reply-To: <4E80C38F.9020207@siemens.com> References: <4E802DDD.8090100@siemens.com> <4E805923.50806@siemens.com> <4E806573.6000207@siemens.com> <4E80B768.8010000@siemens.com> <4E80C38F.9020207@siemens.com> From: Blue Swirl Date: Mon, 26 Sep 2011 18:40:54 +0000 Message-ID: Content-Type: text/plain; charset=UTF-8 Subject: Re: [Qemu-devel] [PATCH v2] tcg: Remove stack protection from helper functions List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Jan Kiszka Cc: Laurent Desnogues , Peter Maydell , Anthony Liguori , Mulyadi Santosa , qemu-devel On Mon, Sep 26, 2011 at 6:25 PM, Jan Kiszka wrote: > On 2011-09-26 20:20, Blue Swirl wrote: >> On Mon, Sep 26, 2011 at 5:33 PM, Jan Kiszka wrote: >>> On 2011-09-26 19:22, Blue Swirl wrote: >>>> On Mon, Sep 26, 2011 at 11:56 AM, Peter Maydell >>>> wrote: >>>>> On 26 September 2011 12:43, Jan Kiszka wrote: >>>>>> On 2011-09-26 13:33, Peter Maydell wrote: >>>>>>> On 26 September 2011 11:51, Jan Kiszka wrote: >>>>>>>> This increases the overhead of frequently executed helpers. We need to >>>>>>>> move rule past QEMU_CFLAGS assignment to ensure that the required simple >>>>>>>> assignment picks up all bits. The signal workaround is moved just for >>>>>>>> the sake of consistency. >>>>>>> >>>>>>>> +# NOTE: Must be after the last QEMU_CFLAGS assignment >>>>>>>> +op_helper.o user-exec.o: QEMU_CFLAGS := $(subst -fstack-protector-all,,$(QEMU_CFLAGS)) $(HELPER_CFLAGS) >>>>>>> >>>>>>> Why also user-exec.o ? >>>>>> >>>>>> That's a good question. It doesn't look like it's deserving this. >>>>>> >>>>>>> Why not the other source files with helpers in? >>>>>> >>>>>> Name them and I add them. >>>>> >>>>> target-*/*helper.c ? >>>>> >>>>> But mostly I think what I'm trying to say is that this is making >>>>> a tradeoff between safety and speed, so it ought to come with a >>>>> rationale for why it is OK to remove the safety checks for these >>>>> files. Given that rationale you can identify other files that are >>>>> also safe/worthwhile to flip the flag for. >>>> >>>> I wouldn't remove -fstack-protector-all by default. Especially op code >>>> interfaces with the guest. >>> >>> I'd love to have some function attribute for this, because a stack >>> protector for rather simple arithmetic operations or something like >>> helper_cli/sti are pointlessly burned cycles. >> >> In order to avoid burning the cycles, there is a certain kernel module >> which gives almost native performance. > > Well, even in 2011 there are cases remaining where VT-x/AMV-V is not > available to your favorite hypervisor. > >> >>> Maybe we can introduce op_helper_simple.c. >>> >>>> >>>> For max performance version, I'd check if -fomit-frame-pointer and -O3 >>>> makes sense. See also this article: >>>> https://www.debian-administration.org/article/672/Optimizing_code_via_compiler_flags >>> >>> We already run without frame pointers, -O3 might be worth exploring in >>> addition. Still, that won't take the protector overhead away. >> >> It would be interesting to have some benchmarks. I'd expect that most >> of the run time is spent within generated code, the next largest item >> should be the translator and any helpers should be marginal. > > At least we've a rather static, long-running guest, not much is > recompiled once the system has settled. Optimizing the translator or improving TCG optimizer could also improve performance. By the way, -fomit-frame-pointer is not enabled globally for some reason, it should be the default for non-debug builds.