From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D010EC433DB for ; Mon, 15 Mar 2021 22:32:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id AAA0564F0F for ; Mon, 15 Mar 2021 22:32:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232940AbhCOWba (ORCPT ); Mon, 15 Mar 2021 18:31:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43066 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232957AbhCOWb3 (ORCPT ); Mon, 15 Mar 2021 18:31:29 -0400 Received: from mail-lj1-x233.google.com (mail-lj1-x233.google.com [IPv6:2a00:1450:4864:20::233]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0E870C061763 for ; Mon, 15 Mar 2021 15:31:29 -0700 (PDT) Received: by mail-lj1-x233.google.com with SMTP id 15so18177863ljj.0 for ; Mon, 15 Mar 2021 15:31:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=t48JfE/I+ydsOOfkocl4ExwOYD4Cwrny9/Xf4iuVVeE=; b=YzOHwW9gU+2CCoT9hKyxcZQ9g6o779atEw1QSJLPBRAVTj7Gc0vDffISoB/Cd9mLvk keJsigpwHmgy2X42PpmONt7PB65J85QkeCx9fHcaIGkNovhredA86swM4cAXZwH+PSi8 lRrPBaNDuKAYAs9HX13mcFnjNF4HVRXNtJQ97EfCuIjTjC3+fdoJesfZFgSy75Qsy3h5 3KMoc/MoTsK7c1+mpTCTSVgVngYK9wFNZHp/tgw8qVex1AQ+P9QYvrUzbu9XpmNTz0v7 2GyGh2j9ZBeGQ8E+jCJEwyRovr3E+WurRkyCDBICw0iCS9XJ4oKdl+9A7ZLiVshUP9R/ VTaw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=t48JfE/I+ydsOOfkocl4ExwOYD4Cwrny9/Xf4iuVVeE=; b=RfaHq3Ihl/fajHGhrFLzGsZXi6EP0BFjFJ483o0M/JEXeWsSZTcGnMmymcYD/EUhrB RtDqVT4rox+RnCsRI3y6YBW0MzoEubIqGZGFnu7pFuevgi9uRH4wH0LkIgNw7GzPvBTU vtiUMFNNldt+qTsdAGm0D65X9LwHvmmTZ5HP7scM8TtJPVN0gZm5uizTL8qXQ8KJriEd U0zAj89d0vjR7eLI1Dle3/83dn1+7+4FOxT6JTtHG3dfdMGyX7nu4O9sFLrOv9MbEkf3 UEHJ7KXqufgdiIElC8KawRCliIA0C0zV+DM7RPlVoW+TJA2pijn983QyK910ES1mPKgr E6VA== X-Gm-Message-State: AOAM531/lKlNdK1Bvh3nhhMd4GLV+Zcmx5Ps96PktNS3hdUCjcF47Ku0 GtoRqrUkUwf9S0zON56rMlRcy3Csw9BpWdlhtcFXdQ== X-Google-Smtp-Source: ABdhPJxjr/et45iqQeRMTErI0LQhEQm42DqVbpHRSpcjhEUlWFO96pLBoZaf9saYhs4q/r0gUqfHvBityBsZA51vteE= X-Received: by 2002:a2e:988f:: with SMTP id b15mr751964ljj.394.1615847486669; Mon, 15 Mar 2021 15:31:26 -0700 (PDT) MIME-Version: 1.0 References: <20210310003024.2026253-1-jingzhangos@google.com> <20210310003024.2026253-4-jingzhangos@google.com> In-Reply-To: From: Jing Zhang Date: Mon, 15 Mar 2021 17:31:15 -0500 Message-ID: Subject: Re: [RFC PATCH 3/4] KVM: stats: Add ioctl commands to pull statistics in binary format To: Paolo Bonzini Cc: KVM , KVM ARM , Linux MIPS , KVM PPC , Linux S390 , Linux kselftest , Marc Zyngier , James Morse , Julien Thierry , Suzuki K Poulose , Will Deacon , Huacai Chen , Aleksandar Markovic , Thomas Bogendoerfer , Paul Mackerras , Christian Borntraeger , Janosch Frank , David Hildenbrand , Cornelia Huck , Claudio Imbrenda , Sean Christopherson , Vitaly Kuznetsov , Jim Mattson , Peter Shier , Oliver Upton , David Rientjes , Emanuele Giuseppe Esposito Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Hi Paolo, On Wed, Mar 10, 2021 at 8:55 AM Paolo Bonzini wrote: > > On 10/03/21 01:30, Jing Zhang wrote: > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > > index 383df23514b9..87dd62516c8b 100644 > > --- a/virt/kvm/kvm_main.c > > +++ b/virt/kvm/kvm_main.c > > @@ -3464,6 +3464,51 @@ static long kvm_vcpu_ioctl(struct file *filp, > > r = kvm_arch_vcpu_ioctl_set_fpu(vcpu, fpu); > > break; > > } > > + case KVM_STATS_GET_INFO: { > > + struct kvm_stats_info stats_info; > > + > > + r = -EFAULT; > > + stats_info.num_stats = VCPU_STAT_COUNT; > > + if (copy_to_user(argp, &stats_info, sizeof(stats_info))) > > + goto out; > > + r = 0; > > + break; > > + } > > + case KVM_STATS_GET_NAMES: { > > + struct kvm_stats_names stats_names; > > + > > + r = -EFAULT; > > + if (copy_from_user(&stats_names, argp, sizeof(stats_names))) > > + goto out; > > + r = -EINVAL; > > + if (stats_names.size < VCPU_STAT_COUNT * KVM_STATS_NAME_LEN) > > + goto out; > > + > > + r = -EFAULT; > > + if (copy_to_user(argp + sizeof(stats_names), > > + kvm_vcpu_stat_strings, > > + VCPU_STAT_COUNT * KVM_STATS_NAME_LEN)) > > The only reason to separate the strings in patch 1 is to pass them here. > But this is a poor API because it imposes a limit on the length of the > statistics, and makes that length part of the binary interface. > > I would prefer a completely different interface, where you have a file > descriptor that can be created and associated to a vCPU or VM (or even > to /dev/kvm). Having a file descriptor is important because the fd can We are considering about how to create the file descriptor. It might be risky to create an extra fd for every vCPU. It will easily hit the fd limit for the process or the system for machines running a ton of small VMs. Looks like creating an extra file descriptor for every VM is a better option. And then we can check per vCPU stats through Ioctl of this VM fd by passing the vCPU index. What do you think? > be passed to a less-privileged process that takes care of gathering the > metrics > > The result of reading the file descriptor could be either ASCII or > binary. IMO the real cost lies in opening and reading a multitude of > files rather than in the ASCII<->binary conversion. > > The format could be one of the following: > > * binary: > > 4 bytes flags (always zero) > 4 bytes number of statistics > 4 bytes offset of the first stat description > 4 bytes offset of the first stat value > stat descriptions: > - 4 bytes for the type (for now always zero: uint64_t) > - 4 bytes for the flags (for now always zero) > - length of name > - name > statistics in 64-bit format > > * text: > > stat1_name uint64 123 > stat2_name uint64 456 > ... > > What do you think? > > Paolo >