From: Jing Zhang
Date: Fri, 12 Mar 2021 16:27:08 -0600
Subject: Re: [RFC PATCH 3/4] KVM: stats: Add ioctl commands to pull statistics in binary format
To: Paolo Bonzini
Cc: KVM, KVM ARM, Linux MIPS, KVM PPC, Linux S390, Linux kselftest, Marc Zyngier, James Morse, Julien Thierry, Suzuki K Poulose, Will Deacon, Huacai Chen, Aleksandar Markovic, Thomas Bogendoerfer, Paul Mackerras, Christian Borntraeger, Janosch Frank, David Hildenbrand, Cornelia Huck, Claudio Imbrenda, Sean Christopherson, Vitaly Kuznetsov, Jim Mattson, Peter Shier, Oliver Upton, David Rientjes, Emanuele Giuseppe Esposito
References: <20210310003024.2026253-1-jingzhangos@google.com> <20210310003024.2026253-4-jingzhangos@google.com>
X-Mailing-List: kvm@vger.kernel.org

Hi Paolo,

On Fri, Mar 12, 2021 at 12:11 PM Paolo Bonzini wrote:
>
> On 10/03/21 22:41, Jing Zhang wrote:
> >> I would prefer a completely different interface, where you have a file
> >> descriptor that can be created and associated to a vCPU or VM (or even
> >> to /dev/kvm). Having a file descriptor is important because the fd can
> >> be passed to a less-privileged process that takes care of gathering the
> >> metrics
> > A separate file descriptor solution is very tempting. We are still considering it
> > seriously. Our biggest concern is that the metrics gathering/handling process
> > is not necessarily running on the same node as the one the file descriptor belongs to.
> > It scales better to pass metrics data directly than to pass file descriptors.
>
> If you want to pass metrics data directly, you can just read the file
> descriptor from your VMM, just like you're using the ioctls now.
> However the file descriptor also allows a privilege-separated same-host
> interface.

That makes sense.

>
> >> 4 bytes flags (always zero)

Could you give a potential use for these flags?

> >> 4 bytes number of statistics
> >> 4 bytes offset of the first stat description
> >> 4 bytes offset of the first stat value
> >> stat descriptions:
> >> - 4 bytes for the type (for now always zero: uint64_t)

What is the potential use for this type? Should we move it outside the
descriptor, since all stats probably have the same size?

> >> - 4 bytes for the flags (for now always zero)

And a potential use for these flags?

> >> - length of name
> >> - name
> >> statistics in 64-bit format
> >
> > The binary format presented above is very flexible. I understand why it is
> > organized this way.
> > In our situation, the metrics data could be pulled periodically at intervals
> > as short as half a second. They are used by different kinds of
> > monitors/triggers/alerts.
> > To enhance efficiency and reduce the traffic caused by passing metrics, we
> > treat all metrics info/data as two kinds. One is immutable information,
> > which doesn't change within a given system boot. The other is mutable
> > data (statistics data), which is pulled/transferred periodically at a high
> > frequency.
>
> The format allows placing the values before the descriptions. So you
> could use pread to only read the first part of the file descriptor, and
> the file_operations implementation would then skip the work of building
> the immutable data. It doesn't have to be implemented from the
> beginning like that, but the above format supports it.

Good point!
I'll be working on the new fd-based interface and will come back with a new
patchset.

>
> Paolo
>

Thanks,
Jing
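
For illustration only, here is a rough user-space sketch of the layout quoted
above, with the values placed before the descriptions so that a poller can
pread just the header and the values. It is not code from the patch series.
Several details are assumptions because the thread does not specify them:
little-endian byte order, a 4-byte name length, ASCII names, and no padding
between records. The helper names (build_blob, parse_names, read_values_only,
demo) and the two stat names are hypothetical, and a regular temporary file
stands in for the proposed stats fd.

```python
import os
import struct
import tempfile

# Assumed encoding (not specified in the thread): little-endian,
# 4-byte name length, ASCII names, no padding between records.
HEADER = struct.Struct("<IIII")     # flags, count, desc offset, value offset
DESC_FIXED = struct.Struct("<III")  # type, flags, name length


def build_blob(stats):
    """Encode {name: value} with the values placed before the descriptions."""
    names = list(stats)
    values = struct.pack(f"<{len(names)}Q", *(stats[n] for n in names))
    descs = b"".join(
        DESC_FIXED.pack(0, 0, len(n)) + n.encode("ascii") for n in names
    )
    value_offset = HEADER.size
    desc_offset = value_offset + len(values)
    header = HEADER.pack(0, len(names), desc_offset, value_offset)
    return header + values + descs


def parse_names(blob):
    """Walk the description records (the immutable part)."""
    _flags, count, desc_offset, _value_offset = HEADER.unpack_from(blob, 0)
    names, pos = [], desc_offset
    for _ in range(count):
        _type, _dflags, name_len = DESC_FIXED.unpack_from(blob, pos)
        pos += DESC_FIXED.size
        names.append(blob[pos:pos + name_len].decode("ascii"))
        pos += name_len
    return names


def read_values_only(fd):
    """Fetch just the header and the 64-bit values with pread."""
    header = os.pread(fd, HEADER.size, 0)
    _flags, count, _desc_offset, value_offset = HEADER.unpack(header)
    raw = os.pread(fd, 8 * count, value_offset)
    return list(struct.unpack(f"<{count}Q", raw))


def demo():
    # A regular temporary file stands in for the proposed stats fd.
    blob = build_blob({"halt_exits": 7, "mmio_exits": 42})
    with tempfile.TemporaryFile() as f:
        f.write(blob)
        f.flush()
        names = parse_names(blob)              # immutable part, read once
        values = read_values_only(f.fileno())  # mutable part, read per poll
    return dict(zip(names, values))
```

In this sketch the names would be parsed once per boot, while each periodic
poll only touches the header plus 8 bytes per statistic, which matches the
immutable/mutable split described earlier in the thread.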