Subject: Re: [PATCH RFC 0/2] kvm: Improving undercommit,overcommit scenarios in PLE handler
From: Andrew Theurer
Reply-To: habanero@linux.vnet.ibm.com
To: Avi Kivity
Cc: Raghavendra K T, Peter Zijlstra, "H. Peter Anvin", Marcelo Tosatti, Ingo Molnar, Rik van Riel, Srikar, "Nikunj A. Dadhania", KVM, Jiannan Ouyang, chegu vinod, LKML, Srivatsa Vaddagiri, Gleb Natapov, Andrew Jones
Date: Thu, 27 Sep 2012 07:25:41 -0500

On Thu, 2012-09-27 at 14:03 +0200, Avi Kivity wrote:
> On 09/27/2012 01:23 PM, Raghavendra K T wrote:
> >>
> >> This gives us a good case for tracking preemption on a per-vm basis. As
> >> long as we aren't preempted, we can keep the PLE window high, and also
> >> return immediately from the handler without looking for candidates.
> >
> > 1) So do you think deferring preemption (the patch Vatsa was mentioning
> > long back) is also worth trying, so that we reduce the chance of LHP?
>
> Yes, we have to keep it in mind. It will be useful for fine-grained
> locks, not so much for coarse locks or IPIs.
>
> I would still of course prefer a PLE solution, but if we can't get it to
> work we can consider preemption deferral.
>
> >
> > IIRC, with defer preemption:
> > we will have a hook in the spinlock/unlock path to measure the depth of
> > locks held, shared with the host scheduler (maybe via MSRs now).
> > The host scheduler 'prefers' not to preempt a lock-holding vcpu (or
> > rather gives it, say, one chance).
>
> A downside is that we have to do that even when undercommitted.
>
> Also there may be a lot of false positives (deferred preemptions even
> when there is no contention).
>
> >
> > 2) Looking at the result (comparing A & C), I do feel we have
> > significant overhead in iterating over vcpus (when compared to even
> > vmexit), so we would still need the undercommit fix suggested by PeterZ
> > (improving by 140%)?
>
> Looking only at the current runqueue? My worry is that it misses a lot
> of cases. Maybe try the current runqueue first and then others.
>
> Or were you referring to something else?
>
> >
> > So looking back at the threads/discussions so far, I am trying to
> > summarize them. I feel at least these are the potential candidates to
> > go in:
> >
> > 1) Avoiding double runqueue lock overhead (Andrew Theurer/PeterZ)
> > 2) Dynamically changing PLE window (Avi/Andrew/Chegu)
> > 3) preempt_notify handler to identify preempted VCPUs (Avi)
> > 4) Avoiding iterating over VCPUs in undercommit scenario (Raghu/PeterZ)
> > 5) Avoiding unnecessary spinning in overcommit scenario (Raghu/Rik)
> > 6) Pv spinlock
> > 7) Jiannan's proposed improvements
> > 8) Defer preemption patches
> >
> > Did we miss anything (or add anything extra)?
> >
> > So here are my action items:
> > - I plan to repost this series with what PeterZ and Rik suggested,
> >   along with performance analysis.
> > - I'll go back and explore (3) and (6).
> >
> > Please let me know.

> Undoubtedly we'll think of more stuff. But this looks like a good start.

9) lazy gang-like scheduling with PLE to cover the non-gang-like
exceptions (/me runs and hides from scheduler folks)

-Andrew Theurer