From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752371Ab2H1R4U (ORCPT ); Tue, 28 Aug 2012 13:56:20 -0400 Received: from mx1.redhat.com ([209.132.183.28]:7608 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751167Ab2H1R4S (ORCPT ); Tue, 28 Aug 2012 13:56:18 -0400 Date: Tue, 28 Aug 2012 20:57:16 +0300 From: "Michael S. Tsirkin" To: Rafael Aquini Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, virtualization@lists.linux-foundation.org, Rusty Russell , Rik van Riel , Mel Gorman , Andi Kleen , Andrew Morton , Konrad Rzeszutek Wilk , Minchan Kim , Peter Zijlstra , "Paul E. McKenney" Subject: Re: [PATCH v9 3/5] virtio_balloon: introduce migration primitives to balloon pages Message-ID: <20120828175716.GA4595@redhat.com> References: <20120826074244.GC19551@redhat.com> <20120827194713.GA6517@t510.redhat.com> <20120828155410.GE2903@redhat.com> <20120828173713.GA1750@t510.redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20120828173713.GA1750@t510.redhat.com> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Aug 28, 2012 at 02:37:13PM -0300, Rafael Aquini wrote: > On Tue, Aug 28, 2012 at 06:54:10PM +0300, Michael S. Tsirkin wrote: > > On Mon, Aug 27, 2012 at 04:47:13PM -0300, Rafael Aquini wrote: > > > On Sun, Aug 26, 2012 at 10:42:44AM +0300, Michael S. Tsirkin wrote: > > > > > > > > Reading two atomics and doing math? Result can even be negative. > > > > I did not look at use closely but it looks suspicious. > > > Doc on atomic_read says: > > > " > > > The read is atomic in that the return value is guaranteed to be one of the > > > values initialized or modified with the interface operations if a proper > > > implicit or explicit memory barrier is used after possible runtime > > > initialization by any other thread and the value is modified only with the > > > interface operations. > > > " > > > > > > There's no runtime init by other thread than balloon's itself at device register, > > > and the operations (inc, dec) are made by the proper interface operations > > > only when protected by the spinlock pages_lock. It does not look suspicious, IMHO. > > > > Any use of multiple atomics is suspicious. > > Please just avoid it if you can. What's wrong with locking? > > > > > I'm failing to see how it could become a negative on that case, since you cannot > > > isolate more pages than what was previoulsy inflated to balloon's list. > > > > There is no order guarantee. So in > > A - B you can read B long after both A and B has been incremented. > > Maybe it is safe in this case but it needs careful documentation > > to explain how ordering works. Much easier to keep it all simple. > > > > > > > > > It's already the case everywhere except __wait_on_isolated_pages, > > > > so just fix that, and then we can keep using int instead of atomics. > > > > > > > Sorry, I quite didn't get you here. fix what? > > > > It's in the text you removed above. Access values under lock. > > > > So, you prefer this way: > > /* > * __wait_on_isolated_pages - check if leak_balloon() must wait on isolated > * pages before proceeding with the page release. > * @vb : pointer to the struct virtio_balloon describing this device. > * @leak_target: how many pages we are attempting to release this round. > */ > static inline void __wait_on_isolated_pages(struct virtio_balloon *vb, > size_t leak_target) > { > unsigned int num_pages, isolated_pages; > spin_lock(&vb->pages_lock); > num_pages = vb->num_pages; > isolated_pages = vb->num_isolated_pages; > spin_unlock(&vb->pages_lock); > /* > * If isolated pages are making our leak target bigger than the > * total pages that we can release this round. Let's wait for > * migration returning enough pages back to balloon's list. > */ > wait_event(vb->config_change, > (!isolated_pages || > leak_target <= (num_pages - isolated_pages))); This logic looks strange too - it does not 100% match the comment. > } > > ? Except that it does not work. You need to do the lock/unlock dance and retest within wait_event. > > > > > > > That's 1K on stack - and can become more if we increase > > > > VIRTIO_BALLOON_ARRAY_PFNS_MAX. Probably too much - this is the reason > > > > we use vb->pfns. > > > > > > > If we want to use vb->pfns we'll have to make leak_balloon mutual exclusive with > > > page migration (as it was before), but that will inevictably bring us back to > > > the discussion on breaking the loop when isolated pages make leak_balloon find > > > less pages than it wants to release at each leak round. > > > > > > > I don't think this is an issue. The issue was busy waiting in that case. > > > But, in fact, it is. > As we couldn't drop the mutex that prevents migration from happening, otherwise > the migration threads would screw up with our vb->pfns array, there will be no point > on keep waiting for isolated pages being reinserted on balloon's list, cause the > migration threads that will accomplish that task are also waiting on us dropping > the mutex. > > You may argue that we could flag virtballoon_migratepage() to give up and return > before even trying to aquire the mutex, if a leak is ongoing -- deferring work > to virtballoon_putbackpage(). However, I'm eager to think that for this case, > the CPU time we spent isolating pages for compaction would be simply wasted and, > perhaps, no effective compaction was even reached. > And that makes me think it would have been better to stick with the old logics of > breaking the loop since leak_balloon(), originally, also remains busy waiting > while pursuing its target, anyway. > > That's the trade here, IMO. If one really wants to wait on potentially isolated > pages getting back to the list before proceeding, we'll have to burn a little > more stack space with local variables, unfortunately. Sorry I do not understand what you are saying here. So find a different locking strategy. For example something like: wait_event(vb->config_change, ({ lock if (target <= (num_pages - isolated_pages)) leak balloon cond = target <= (num_pages - isolated_pages)); unlock; cond; }) ) seems to have no issues? -- MST