From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751878AbdJ0N4r (ORCPT ); Fri, 27 Oct 2017 09:56:47 -0400 Received: from merlin.infradead.org ([205.233.59.134]:56186 "EHLO merlin.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751298AbdJ0N4q (ORCPT ); Fri, 27 Oct 2017 09:56:46 -0400 Date: Fri, 27 Oct 2017 15:56:24 +0200 From: Peter Zijlstra To: "Reshetova, Elena" Cc: "linux-kernel@vger.kernel.org" , "gregkh@linuxfoundation.org" , "keescook@chromium.org" , "tglx@linutronix.de" , "mingo@redhat.com" , "ishkamiel@gmail.com" , Will Deacon Subject: Re: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t Message-ID: <20171027135624.GY3165@worktop.lehotels.local> References: <1508756984-18980-1-git-send-email-elena.reshetova@intel.com> <20171023131224.GC3165@worktop.lehotels.local> <2236FBA76BA1254E88B949DDB74E612B802B6A2E@IRSMSX102.ger.corp.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <2236FBA76BA1254E88B949DDB74E612B802B6A2E@IRSMSX102.ger.corp.intel.com> User-Agent: Mutt/1.5.22.1 (2013-10-16) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Oct 27, 2017 at 06:49:55AM +0000, Reshetova, Elena wrote: > Could we possibly have a bit more elaborate discussion on this? > > Or alternatively, what then should be the correct way for a certain > variable (that behaves like a standard refcounter) to check if the > relaxed memory ordering is ok? The refcount_t includes sufficient ordering for normal reference counting. It provides a happens-before relation wrt. dropping the reference, such that all object accesses happen before we drop the reference; because at that point the object can go away and any further access would be UAF. So if the thing being replaced is purely a reference count, all should be well. > This is what Thomas was asking me to do for core kernel conversions > and this is what I don't know how to do correctly. Thomas asked you to verify that nothing relies on the stronger ordering provided by atomic_dec_and_test(). This means you have to audit the code with an eye towards memory ordering. Now the futex one is actually OK. Thomas just wants you to mention that refcount_dec_and_test is weaker and explain that in this case that is fine since pi_state::refcount is in fact a pure refcount and put_pi_state() does not in fact rely on further ordering. Just mentioning the above gives the maintainer: 1) the information he needs to perform review, mentioning memory ordering changes means he needs to look at it. 2) you identify the place where it changes (put_pi_state) which again aids him in review by limiting the amount of code he needs to double check. 3) that you actually looked at the problem; giving more trust you actually know wth you're doing. > Also, I got exactly the same question from xfs maintainer, > so if we provide an API that we hope will be used correctly in the > future, we should have a way for people to verify it. I actually replied to Dave (but have not checked if there's further email on the thread). I object to the claim that the code is correct if he cannot explain the memory ordering. Aside from that; if/when he explain the ordering requirements of those sites and those do indeed require more than refcount_dec_and_test() provides we should add a comment exactly explaining the ordering along with additional barriers, such as smp_mb__after_atomic(). > Maybe it is just me, but I would think that having a way to verify > that your code is ok with this refcounter-specific relaxed memory > ordering applies to any memory ordering requirements, refcounters are > just a subset with certain properties, so is then the full answer is > to figure out how to verify any memory ordering requirement of your > code? Yes; and this can be quite tricky, but undocumented ordering requirements are a bug that need to be fixed. Mostly the code is 'simple' and refcount really does the right and sufficient thing. If the maintainer knows or suspects more ordering is required he can help out; but for that to happen you have to, at the very least, do 1&2 above. > We can also document this logic in more docs or even maybe try to > create a better documentation for current memory ordering bits since > it is not the most easiest read at the moment. Otherwise this might be > just bad enough reason for many people to avoid refcount_t type if it > is just easier to tell "let's take just old atomic, we knew it somehow > worked before".... That's just voodoo programming; and while that is rife I have absolutely no sympathy for it. Memory ordering is hard, but that doesn't mean we should just blindly sprinkle memory barriers around until it 'works'. So while memory-barriers.txt is a longish document, it is readable with a bit of time and effort. There are also tools being developed that can help people validate their code. There are enough people around that actually get this stuff (it really is not _that_ hard, but it does require a functional brain) so if you can describe the problem with sufficient detail we can help out getting the ordering right (or state its broken :-).