From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:33778)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from )
	id 1gDW6n-00027S-Lp
	for qemu-devel@nongnu.org; Fri, 19 Oct 2018 10:50:38 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from )
	id 1gDW6m-0001Su-Ox
	for qemu-devel@nongnu.org; Fri, 19 Oct 2018 10:50:37 -0400
Date: Fri, 19 Oct 2018 10:50:18 -0400
From: "Emilio G. Cota"
Message-ID: <20181019145018.GB7279@flamenco>
References: <20181019010625.25294-1-cota@braap.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
Subject: Re: [Qemu-devel] [RFC v3 0/56] per-CPU locks
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Paolo Bonzini
Cc: qemu-devel@nongnu.org, Aleksandar Markovic, Alexander Graf,
	Alistair Francis, Andrzej Zaborowski, Anthony Green,
	Artyom Tarasenko, Aurelien Jarno, Bastian Koppelmann,
	Christian Borntraeger, Chris Wulff, Cornelia Huck, David Gibson,
	David Hildenbrand, "Edgar E. Iglesias", Eduardo Habkost,
	Fabien Chouteau, Guan Xuetao, James Hogan, Laurent Vivier,
	Marek Vasut, Mark Cave-Ayland, Max Filippov, Michael Clark,
	Michael Walle, Palmer Dabbelt, Pavel Dovgalyuk, Peter Crosthwaite,
	Peter Maydell, qemu-arm@nongnu.org, qemu-ppc@nongnu.org,
	qemu-s390x@nongnu.org, Richard Henderson, Sagar Karandikar,
	Stafford Horne

On Fri, Oct 19, 2018 at 08:59:24 +0200, Paolo Bonzini wrote:
> On 19/10/2018 03:05, Emilio G. Cota wrote:
> > I'm calling this series a v3 because it supersedes the two series
> > I previously sent about using atomics for interrupt_request:
> >   https://lists.gnu.org/archive/html/qemu-devel/2018-09/msg02013.html
> > The approach in that series cannot work reliably; using (locked) atomics
> > to set interrupt_request but not using (locked) atomics to read it
> > can lead to missed updates.
>
> The idea here was that changes to protected fields are all followed by
> kick. That may not have been the case, granted, but I wonder if the
> plan is unworkable.

I suspect that the cpu->interrupt_request+kick mechanism is not the issue,
otherwise master should not work--we do atomic_read(cpu->interrupt_request)
and only if that read != 0 we take the BQL.

My guess is that the problem is with other reads of cpu->interrupt_request,
e.g. those in cpu_has_work. Currently those reads happen with the BQL
held, and updates to cpu->interrupt_request take the BQL. If we drop the
BQL from the setters to instead use locked atomics (like in the
aforementioned series), those BQL-protected readers might miss updates.

Given that we need a per-CPU lock anyway to remove the BQL from the
CPU loop, extending this lock to protect cpu->interrupt_request is
a simple solution that keeps the current logic and allows for
greater scalability.

Thanks,

		Emilio
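
A minimal sketch of the per-CPU lock idea described above, written against
plain pthreads rather than QEMU's own primitives. SketchCPU and every
sketch_* helper are hypothetical names made up for illustration; they are
not the helpers from the series. The point it shows is only that setters
and readers of interrupt_request go through the same per-CPU lock, so a
reader such as a cpu_has_work-style check cannot race with a setter that
no longer holds the BQL.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for a CPU state; only the fields needed here. */
typedef struct SketchCPU {
    pthread_mutex_t lock;        /* per-CPU lock (assumed name) */
    uint32_t interrupt_request;  /* protected by 'lock' */
} SketchCPU;

static void sketch_cpu_init(SketchCPU *cpu)
{
    int rc = pthread_mutex_init(&cpu->lock, NULL);
    assert(rc == 0);
    (void)rc;
    cpu->interrupt_request = 0;
}

/* Setter: raise interrupt bits under the per-CPU lock, not a global lock. */
static void sketch_cpu_interrupt_or(SketchCPU *cpu, uint32_t mask)
{
    pthread_mutex_lock(&cpu->lock);
    cpu->interrupt_request |= mask;
    pthread_mutex_unlock(&cpu->lock);
}

/* Setter: clear interrupt bits under the same lock. */
static void sketch_cpu_interrupt_clear(SketchCPU *cpu, uint32_t mask)
{
    pthread_mutex_lock(&cpu->lock);
    cpu->interrupt_request &= ~mask;
    pthread_mutex_unlock(&cpu->lock);
}

/* Reader: take the same lock, so it is ordered against every setter. */
static uint32_t sketch_cpu_interrupt_request(SketchCPU *cpu)
{
    uint32_t ret;

    pthread_mutex_lock(&cpu->lock);
    ret = cpu->interrupt_request;
    pthread_mutex_unlock(&cpu->lock);
    return ret;
}

/* Simplified cpu_has_work-style check built on the locked reader. */
static int sketch_cpu_has_work(SketchCPU *cpu)
{
    return sketch_cpu_interrupt_request(cpu) != 0;
}
```

In this sketch a thread that raises an interrupt would call
sketch_cpu_interrupt_or() (followed by whatever kick mechanism is in use),
and the CPU loop would call sketch_cpu_has_work() without needing a global
lock for this field.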