From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752466AbbFVWXS (ORCPT ); Mon, 22 Jun 2015 18:23:18 -0400 Received: from mx1.redhat.com ([209.132.183.28]:45140 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751107AbbFVWXJ (ORCPT ); Mon, 22 Jun 2015 18:23:09 -0400 Date: Tue, 23 Jun 2015 00:21:52 +0200 From: Oleg Nesterov To: Peter Zijlstra Cc: paulmck@linux.vnet.ibm.com, tj@kernel.org, mingo@redhat.com, linux-kernel@vger.kernel.org, der.herr@hofr.at, dave@stgolabs.net, riel@redhat.com, viro@ZenIV.linux.org.uk, torvalds@linux-foundation.org Subject: Re: [RFC][PATCH 12/13] stop_machine: Remove lglock Message-ID: <20150622222152.GA4460@redhat.com> References: <20150622121623.291363374@infradead.org> <20150622122256.765619039@infradead.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150622122256.765619039@infradead.org> User-Agent: Mutt/1.5.18 (2008-05-17) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 06/22, Peter Zijlstra wrote: > > By having stop_two_cpus() acquire two cpu_stopper::locks we gain full > order against the global stop_machine which takes each of these locks > in order. Yes, but stop_machine() locks/unlocs cpu_stopper->lock sequentially, it never holds more than 1 ->lock, so > +static void cpu_stop_queue_work(unsigned int cpu, struct cpu_stop_work *work) > +{ > + struct cpu_stopper *stopper = &per_cpu(cpu_stopper, cpu); > + unsigned long flags; > + > + spin_lock_irqsave(&stopper->lock, flags); > + __cpu_stop_queue_work(cpu, work); > spin_unlock_irqrestore(&stopper->lock, flags); > } ... > int stop_two_cpus(unsigned int cpu1, unsigned int cpu2, cpu_stop_fn_t fn, void *arg) > { > - struct cpu_stop_done done; > + struct cpu_stopper *stopper1, *stopper2; > struct cpu_stop_work work1, work2; > struct multi_stop_data msdata; > + struct cpu_stop_done done; > + unsigned long flags; > + > + if (cpu2 < cpu1) > + swap(cpu1, cpu2); ... > + stopper1 = per_cpu_ptr(&cpu_stopper, cpu1); > + stopper2 = per_cpu_ptr(&cpu_stopper, cpu2); > + > + spin_lock_irqsave(&stopper1->lock, flags); > + spin_lock(&stopper2->lock); > + > + __cpu_stop_queue_work(cpu1, &work1); > + __cpu_stop_queue_work(cpu2, &work2); Suppose that stop_two_cpus(cpu1 => 0, cpu2 => 1) races with stop_machine(). - stop_machine takes the lock on CPU 0, adds the work and drops the lock - cpu_stop_queue_work() queues both works - stop_machine takes the lock on CPU 1, etc In this case both CPU 0 and 1 will run multi_cpu_stop() but they will use different multi_stop_data's, so they will wait for each other forever? Oleg. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in Please read the FAQ at http://www.tux.org/lkml/