linuxppc-dev.lists.ozlabs.org archive mirror
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
To: David Rientjes <rientjes@google.com>
Cc: linux-s390@vger.kernel.org, kvm@vger.kernel.org,
	David Hildenbrand <david@redhat.com>,
	Cornelia Huck <cohuck@redhat.com>,
	Emanuele Giuseppe Esposito <e.emanuelegiuseppe@gmail.com>,
	linux-kernel@vger.kernel.org, kvm-ppc@vger.kernel.org,
	Jonathan Adams <jwadams@google.com>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	linux-fsdevel@vger.kernel.org,
	Paolo Bonzini <pbonzini@redhat.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	Jim Mattson <jmattson@google.com>
Subject: Re: [PATCH v2 0/5] Statsfs: a new ram-based file system for Linux kernel statistics
Date: Tue, 5 May 2020 11:18:39 +0200
Message-ID: <f2654143-b8e5-5a1f-8bd0-0cb0df2cd638@redhat.com>
In-Reply-To: <alpine.DEB.2.22.394.2005041429210.224786@chino.kir.corp.google.com>



On 5/4/20 11:37 PM, David Rientjes wrote:
> On Mon, 4 May 2020, Emanuele Giuseppe Esposito wrote:
> 
>>
>> In this patch series I introduce statsfs, a synthetic ram-based virtual
>> filesystem that takes care of gathering and displaying statistics for the
>> Linux kernel subsystems.
>>
> 
> This is exciting, we have been looking in the same area recently.  Adding
> Jonathan Adams <jwadams@google.com>.
> 
> In your diffstat, one thing I notice that is omitted: an update to
> Documentation/* :)  Any chance of getting some proposed Documentation/
> updates with structure of the fs, the per subsystem breakdown, and best
> practices for managing the stats from the kernel level?

Yes, I will write some documentation. Thank you for the suggestion.

>>
>> Values represent quantities that are gathered by the statsfs user. Examples
>> of values include the number of vm exits of a given kind, the amount of
>> memory used by some data structure, the length of the longest hash table
>> chain, or anything like that. Values are defined with the
>> statsfs_source_add_values function. Each value is defined by a struct
>> statsfs_value; the same statsfs_value can be added to many different
>> sources. A value can be considered "simple" if it fetches data from a
>> user-provided location, or "aggregate" if it groups all values in the
>> subordinate sources that include the same statsfs_value.
>>
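
The distinction between "simple" and "aggregate" values quoted above can be
modelled in a few lines of userspace code. This is only an illustrative
sketch, not the kernel API: the Source class, its fields and its methods are
hypothetical names, and summing is just one assumed way of grouping the
values found in subordinate sources.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Source:
    """Hypothetical stand-in for a stats source (not the kernel struct)."""
    name: str
    # "Simple" values: each name maps to a getter that fetches the data
    # from a user-provided location.
    simple: Dict[str, Callable[[], int]] = field(default_factory=dict)
    subordinates: List["Source"] = field(default_factory=list)

    def get_simple(self, value_name: str) -> int:
        """Fetch one simple value directly from its getter."""
        return self.simple[value_name]()

    def get_aggregate(self, value_name: str) -> int:
        """Sum the value over every subordinate source that includes it."""
        total = 0
        for sub in self.subordinates:
            if value_name in sub.simple:
                total += sub.get_simple(value_name)
            # Recurse so that deeper levels are counted as well.
            total += sub.get_aggregate(value_name)
        return total


if __name__ == "__main__":
    # Assumed example hierarchy: one VM source with two vCPU sources.
    vcpu0 = Source("vcpu0", simple={"exits": lambda: 3})
    vcpu1 = Source("vcpu1", simple={"exits": lambda: 5})
    vm = Source("vm", subordinates=[vcpu0, vcpu1])
    print(vm.get_aggregate("exits"))
```

In this model the same value name ("exits") is attached to several sources,
and the parent computes its aggregate on demand rather than storing it.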
> 
> This seems like it could have a lot of overhead if we wanted to
> periodically track the totality of subsystem stats as a form of telemetry
> gathering from userspace.  To collect telemetry for 1,000 different stats,
> do we need to issue lseek()+read() syscalls for each of them individually
> (or, worse, open()+read()+close())?
> 
> Any thoughts on how that can be optimized?  A couple of ideas:
> 
>   - an interface that allows gathering of all stats for a particular
>     interface through a single file that would likely be encoded in binary
>     and the responsibility of userspace to disseminate, or
> 
>   - an interface that extends beyond this proposal and allows the reader to
>     specify which stats they are interested in collecting and then the
>     kernel will only provide these stats in a well formed structure and
>     also be binary encoded.

Are you thinking of another file, containing all the stats for the 
directory in binary format?
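
To make the per-stat cost concrete, here is a minimal userspace sketch that
compares open()+read()+close() for every sample with a single pread() on a
file descriptor that is kept open. The helper names are made up for
illustration, and the demo reads an ordinary temporary file; whether a given
pseudo-filesystem file returns a fresh value for pread() at offset 0 depends
on its implementation, so treat that part as an assumption.

```python
import os
import tempfile


def read_stat_open_close(path: str) -> int:
    """Three syscalls per sample: open(), read(), close()."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return int(os.read(fd, 64).decode().strip())
    finally:
        os.close(fd)


def read_stat_pread(fd: int) -> int:
    """One syscall per sample: pread() at offset 0 on an fd kept open.

    pread() does not move the file offset, so no lseek() is needed
    between samples.
    """
    return int(os.pread(fd, 64, 0).decode().strip())


if __name__ == "__main__":
    # Stand-in for a one-value-per-file stat; not a real statsfs path.
    with tempfile.NamedTemporaryFile("w", delete=False) as f:
        f.write("42\n")
        path = f.name
    fd = os.open(path, os.O_RDONLY)
    try:
        print(read_stat_open_close(path), read_stat_pread(fd))
    finally:
        os.close(fd)
        os.unlink(path)
```

Even in the best case this still costs one syscall per stat and one open
file descriptor per stat, which is the overhead a single binary file would
avoid.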

> We've found that the one-file-per-stat method is pretty much a show
> stopper from the performance view and we always must execute at least two
> syscalls to obtain a single stat.
> 
> Since this is becoming a generic API (good!!), maybe we can discuss
> possible ways to optimize gathering of stats en masse?

Sure, the idea of a binary format was considered from the beginning in 
[1], and it can be done either together with the current filesystem, or 
as a replacement via different mount options.
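
As a rough illustration of what reading many stats from one binary file
could look like from userspace, the sketch below packs and unpacks
fixed-size records. The record layout (a 32-byte NUL-padded name followed by
an unsigned 64-bit little-endian value) and the function names are purely
hypothetical; nothing in this series defines such a format yet.

```python
import struct
from typing import Dict

# Hypothetical record layout, NOT defined by the patch series:
# 32-byte NUL-padded name + unsigned 64-bit little-endian value.
RECORD = struct.Struct("<32sQ")


def encode_stats(stats: Dict[str, int]) -> bytes:
    """Pack every (name, value) pair into one contiguous buffer."""
    return b"".join(
        RECORD.pack(name.encode(), value) for name, value in stats.items()
    )


def decode_stats(blob: bytes) -> Dict[str, int]:
    """Parse a buffer obtained with a single read() back into a dict."""
    return {
        name.rstrip(b"\0").decode(): value
        for name, value in RECORD.iter_unpack(blob)
    }


if __name__ == "__main__":
    blob = encode_stats({"exits": 1000, "halt_wakeup": 7})
    print(len(blob), decode_stats(blob))
```

With a layout like this, a collector would need one read() for all stats in
a directory instead of at least one syscall per stat, at the price of a
format that userspace has to parse and that must be kept stable.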

Thank you,
Emanuele

>> [1] https://lore.kernel.org/kvm/5d6cdcb1-d8ad-7ae6-7351-3544e2fa366d@redhat.com/


>>
>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>>
>> v1->v2: remove unnecessary list_foreach_safe loops, fix wrong indentation,
>> rename statsfs to stats_fs
>>
>> Emanuele Giuseppe Esposito (5):
>>    refcount, kref: add dec-and-test wrappers for rw_semaphores
>>    stats_fs API: create, add and remove stats_fs sources and values
>>    kunit: tests for stats_fs API
>>    stats_fs fs: virtual fs to show stats to the end-user
>>    kvm_main: replace debugfs with stats_fs
>>
>>   MAINTAINERS                     |    7 +
>>   arch/arm64/kvm/Kconfig          |    1 +
>>   arch/arm64/kvm/guest.c          |    2 +-
>>   arch/mips/kvm/Kconfig           |    1 +
>>   arch/mips/kvm/mips.c            |    2 +-
>>   arch/powerpc/kvm/Kconfig        |    1 +
>>   arch/powerpc/kvm/book3s.c       |    6 +-
>>   arch/powerpc/kvm/booke.c        |    8 +-
>>   arch/s390/kvm/Kconfig           |    1 +
>>   arch/s390/kvm/kvm-s390.c        |   16 +-
>>   arch/x86/include/asm/kvm_host.h |    2 +-
>>   arch/x86/kvm/Kconfig            |    1 +
>>   arch/x86/kvm/Makefile           |    2 +-
>>   arch/x86/kvm/debugfs.c          |   64 --
>>   arch/x86/kvm/stats_fs.c         |   56 ++
>>   arch/x86/kvm/x86.c              |    6 +-
>>   fs/Kconfig                      |   12 +
>>   fs/Makefile                     |    1 +
>>   fs/stats_fs/Makefile            |    6 +
>>   fs/stats_fs/inode.c             |  337 ++++++++++
>>   fs/stats_fs/internal.h          |   35 +
>>   fs/stats_fs/stats_fs-tests.c    | 1088 +++++++++++++++++++++++++++++++
>>   fs/stats_fs/stats_fs.c          |  773 ++++++++++++++++++++++
>>   include/linux/kref.h            |   11 +
>>   include/linux/kvm_host.h        |   39 +-
>>   include/linux/refcount.h        |    2 +
>>   include/linux/stats_fs.h        |  304 +++++++++
>>   include/uapi/linux/magic.h      |    1 +
>>   lib/refcount.c                  |   32 +
>>   tools/lib/api/fs/fs.c           |   21 +
>>   virt/kvm/arm/arm.c              |    2 +-
>>   virt/kvm/kvm_main.c             |  314 ++-------
>>   32 files changed, 2772 insertions(+), 382 deletions(-)
>>   delete mode 100644 arch/x86/kvm/debugfs.c
>>   create mode 100644 arch/x86/kvm/stats_fs.c
>>   create mode 100644 fs/stats_fs/Makefile
>>   create mode 100644 fs/stats_fs/inode.c
>>   create mode 100644 fs/stats_fs/internal.h
>>   create mode 100644 fs/stats_fs/stats_fs-tests.c
>>   create mode 100644 fs/stats_fs/stats_fs.c
>>   create mode 100644 include/linux/stats_fs.h
>>
>> -- 
>> 2.25.2
>>

