From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755405AbdGXMJr (ORCPT ); Mon, 24 Jul 2017 08:09:47 -0400
Received: from ozlabs.org ([103.22.144.67]:50257 "EHLO ozlabs.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751482AbdGXMJh (ORCPT ); Mon, 24 Jul 2017 08:09:37 -0400
From: Michael Ellerman
To: Peter Zijlstra
Cc: Kees Cook , Andrew Morton , Ingo Molnar , Josh Poimboeuf ,
	Christoph Hellwig , "Eric W. Biederman" , Jann Horn , Eric Biggers ,
	Elena Reshetova , Hans Liljestrand , Greg KH , Alexey Dobriyan ,
	"Serge E. Hallyn" , arozansk@redhat.com, Davidlohr Bueso ,
	Manfred Spraul , "axboe\@kernel.dk" , James Bottomley ,
	"x86\@kernel.org" , Arnd Bergmann , "David S. Miller" , Rik van Riel ,
	LKML , linux-arch , "kernel-hardening\@lists.openwall.com"
Subject: Re: [PATCH v6 0/2] x86: Implement fast refcount overflow protection
In-Reply-To: <20170724084455.o5asa55fckgyjri2@hirez.programming.kicks-ass.net>
References: <1500422614-94821-1-git-send-email-keescook@chromium.org>
	<20170720091106.kigtr6zy7pjgk2s6@gmail.com>
	<20170721142255.586224f0db9cf0714e654859@linux-foundation.org>
	<87zibujq5t.fsf@concordia.ellerman.id.au>
	<20170724084455.o5asa55fckgyjri2@hirez.programming.kicks-ass.net>
User-Agent: Notmuch/0.21 (https://notmuchmail.org)
Date: Mon, 24 Jul 2017 22:09:32 +1000
Message-ID: <87k22yjatf.fsf@concordia.ellerman.id.au>
MIME-Version: 1.0
Content-Type: text/plain
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Peter Zijlstra writes:
> On Mon, Jul 24, 2017 at 04:38:06PM +1000, Michael Ellerman wrote:
>
>> What I'm not entirely clear on is what the best trade off is in terms of
>> overhead vs checks. The summary of behaviour between the fast and full
>> versions you promised Ingo will help there I think.
>
> That's something that's probably completely different for PPC than it is
> for x86.

Yeah definitely.
I guess I see the x86 version as a lower bound on the semantics we'd need
to implement and still claim to implement the refcount stuff.

> Both because your primitive is LL/SC and thus the saturation
> semantics we need a cmpxchg loop for are more natural in your case

Yay!

> anyway, and the fact that your LL/SC is horrendously slow in any case.

Boo :/

Just kidding. I suspect you're right that we can probably pack a
reasonable amount of tests in the body of the LL/SC and not notice.

> Also, I still haven't seen an actual benchmark where our cmpxchg loop
> actually regresses anything, just a lot of yelling about potential
> regressions :/

Heh yeah. Though I have looked at the code it generates on PPC and it's
not sleek, though I guess that's not a benchmark is it :)

cheers
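
For anyone reading this thread without the patches to hand, here is a rough
userspace sketch of the kind of saturating cmpxchg loop being discussed. It
is not the kernel code and not taken from the series: it uses C11 atomics
rather than the kernel's atomic API, the names sketch_refcount,
SKETCH_SATURATED and sketch_refcount_inc_not_zero are made up for
illustration, and saturating at UINT_MAX is an assumption of the sketch.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcount; NOT the kernel's refcount_t. */
struct sketch_refcount {
	atomic_uint refs;
};

/* Assumed saturation value for this sketch. */
#define SKETCH_SATURATED UINT_MAX

/*
 * Take a reference unless the count is already zero.
 * Once the count reaches SKETCH_SATURATED it stays pinned there
 * instead of wrapping around to zero.
 */
static bool sketch_refcount_inc_not_zero(struct sketch_refcount *r)
{
	unsigned int old = atomic_load_explicit(&r->refs, memory_order_relaxed);

	for (;;) {
		if (old == 0)
			return false;	/* object is already on its way out */
		if (old == SKETCH_SATURATED)
			return true;	/* saturated: leave the value alone */

		/* On failure 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak_explicit(&r->refs, &old, old + 1,
							  memory_order_relaxed,
							  memory_order_relaxed))
			return true;
	}
}
```

The zero and saturation checks sit between the load and the
compare-and-swap, so on an LL/SC machine they would naturally fall between
the load-reserve and the store-conditional, which is the sense in which
extra tests can be packed into the loop body.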