From: Jim Mattson
Date: Fri, 11 Feb 2022 10:08:51 -0800
Subject: Re: [PATCH kvm/queue v2 2/3] perf: x86/core: Add interface to query perfmon_event_map[] directly
To: "Liang, Kan"
Cc: David Dunn, Dave Hansen, Peter Zijlstra, Like Xu, Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Like Xu, Stephane Eranian
In-Reply-To: <6afcec02-fb44-7b72-e527-6517a94855d4@linux.intel.com>
References: <20220117085307.93030-1-likexu@tencent.com> <20220117085307.93030-3-likexu@tencent.com> <20220202144308.GB20638@worktop.programming.kicks-ass.net> <69c0fc41-a5bd-fea9-43f6-4724368baf66@intel.com> <67a731dd-53ba-0eb8-377f-9707e5c9be1b@intel.com> <7b5012d8-6ae1-7cde-a381-e82685dfed4f@linux.intel.com> <6afcec02-fb44-7b72-e527-6517a94855d4@linux.intel.com>
List-ID: linux-kernel@vger.kernel.org

On Fri, Feb 11, 2022 at 6:11 AM Liang, Kan wrote:
>
> On 2/10/2022 2:55 PM, David Dunn wrote:
> > Kan,
> >
> > On Thu, Feb 10, 2022 at 11:46 AM Liang, Kan wrote:
> >
> >> No, we don't, at least for Linux, because the host owns everything. It
> >> doesn't need the MSR to tell which one is in use. We track it in an SW way.
> >>
> >> For the new request from the guest to own a counter, I guess maybe it is
> >> worth implementing it.
> >> But yes, the existing/legacy guest never checks the MSR.
> >
> > This is the expectation of all software that uses the PMU in every
> > guest. It isn't just the Linux perf system.
> >
> > The KVM vPMU model we have today results in PMU-utilizing software
> > simply not working properly in a guest. The only case that can
> > consistently "work" today is not giving the guest a PMU at all.
> >
> > And that's why you are hearing requests to gift the entire PMU to the
> > guest while it is running. All existing PMU software knows about the
> > various constraints on exactly how each MSR must be used to get sane
> > data. And by gifting the entire PMU, it allows that software to work
> > properly. But that has to be controlled by policy at the host level,
> > such that the owner of the host knows that they are not going to have
> > PMU visibility into guests that have control of the PMU.
> >
>
> I think here is how a guest event works today with KVM and the perf
> subsystem:
> - The guest creates an event A.
> - The guest kernel assigns a guest counter M to event A and configures
>   the related MSRs of the guest counter M.
> - KVM intercepts the MSR access and creates a host event B. (The
>   host event B is based on the settings of the guest counter M. As I
>   said, at least for Linux, some SW config impacts the counter
>   assignment. KVM never knows it. Event B can only be a similar event
>   to A.)
> - The Linux perf subsystem assigns a physical counter N to host event
>   B according to event B's constraints. (N may not be the same as M,
>   because A and B may have different event constraints.)
>
> As you can see, even if the entire PMU is given to the guest, we still
> cannot guarantee that the physical counter M can be assigned to the
> guest event A.

All we know about the guest is that it has programmed virtual counter
M. It seems obvious to me that we can satisfy that request by giving it
physical counter M.
If, for whatever reason, we give it physical counter N instead, and M
and N are not completely fungible, then we have failed.

> How to fix it? The only thing I can imagine is "passthrough". Let KVM
> directly assign the counter M to the guest. So, to me, this policy
> sounds like letting KVM replace perf in controlling the whole PMU
> resources, which we then hand over to our guest. Is that what we want?

We want PMU virtualization to work. There are at least two ways of
doing that:

1) Cede the entire PMU to the guest while it's running.

2) Introduce a new "ultimate" priority level in the host perf
subsystem. Only KVM can request events at the ultimate priority, and
these requests supersede any other requests.

Other solutions are welcome.

> Thanks,
> Kan
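For readers following along, the counter-mismatch problem Kan describes can be sketched as a toy model. This is plain Python, not KVM or perf code; the function and variable names are made up for illustration, and the constraint sets are hypothetical:

```python
# Toy model of the mismatch discussed above: the guest programs its
# virtual counter M, but the host perf subsystem picks a physical
# counter N according to the *host* event's constraint, so M != N
# is possible even on an otherwise idle PMU.

def host_assign(event_constraint, busy):
    """Pick the lowest free physical counter permitted by the constraint.

    event_constraint: set of physical counter indices the event may use.
    busy: set of counters already claimed by other events.
    """
    for ctr in sorted(event_constraint):
        if ctr not in busy:
            return ctr
    return None  # no counter available; the event cannot be scheduled

# The guest programs virtual counter 0. On the host, the mirrored
# event carries a (hypothetical) constraint restricting it to
# counters {2, 3} -- a constraint the guest knows nothing about.
guest_counter_M = 0
host_constraint = {2, 3}

physical_counter_N = host_assign(host_constraint, busy=set())
print(guest_counter_M, physical_counter_N)  # prints: 0 2
```

The point of the sketch: even with no contention (`busy` is empty), the host-side constraint alone forces N != M, which is exactly why "gift the whole PMU" or a passthrough/priority scheme comes up in the thread.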