kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ricardo Koller <ricarkol@google.com>
To: Jing Zhang <jingzhangos@google.com>
Cc: KVM <kvm@vger.kernel.org>, KVMARM <kvmarm@lists.cs.columbia.edu>,
	LinuxMIPS <linux-mips@vger.kernel.org>,
	KVMPPC <kvm-ppc@vger.kernel.org>,
	LinuxS390 <linux-s390@vger.kernel.org>,
	Linuxkselftest <linux-kselftest@vger.kernel.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Marc Zyngier <maz@kernel.org>, James Morse <james.morse@arm.com>,
	Julien Thierry <julien.thierry.kdev@gmail.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Will Deacon <will@kernel.org>,
	Huacai Chen <chenhuacai@kernel.org>,
	Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>,
	Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
	Paul Mackerras <paulus@ozlabs.org>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Janosch Frank <frankja@linux.ibm.com>,
	David Hildenbrand <david@redhat.com>,
	Cornelia Huck <cohuck@redhat.com>,
	Claudio Imbrenda <imbrenda@linux.ibm.com>,
	Sean Christopherson <seanjc@google.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Jim Mattson <jmattson@google.com>,
	Peter Shier <pshier@google.com>, Oliver Upton <oupton@google.com>,
	David Rientjes <rientjes@google.com>,
	Emanuele Giuseppe Esposito <eesposit@redhat.com>
Subject: Re: [PATCH v5 2/4] KVM: stats: Add fd-based API to read binary stats data
Date: Thu, 20 May 2021 13:50:56 -0700	[thread overview]
Message-ID: <YKbLsICiX4iI+0AB@google.com> (raw)
In-Reply-To: <CAAdAUtiZxw9_qQ4eszsxSQmmL7j46LRhixL5zYttcOzL0da42Q@mail.gmail.com>

On Thu, May 20, 2021 at 02:46:41PM -0500, Jing Zhang wrote:
> Hi Ricardo,
> 
> On Thu, May 20, 2021 at 1:58 PM Ricardo Koller <ricarkol@google.com> wrote:
> >
> > On Thu, May 20, 2021 at 12:37:59PM -0500, Jing Zhang wrote:
> > > Hi Ricardo,
> > >
> > > On Wed, May 19, 2021 at 11:21 PM Ricardo Koller <ricarkol@google.com> wrote:
> > > >
> > > > On Mon, May 17, 2021 at 02:53:12PM +0000, Jing Zhang wrote:
> > > > > Provides a file descriptor per VM to read VM stats info/data.
> > > > > Provides a file descriptor per vCPU to read vCPU stats info/data.
> > > > >
> > > > > Signed-off-by: Jing Zhang <jingzhangos@google.com>
> > > > > ---
> > > > >  arch/arm64/kvm/guest.c    |  26 +++++
> > > > >  arch/mips/kvm/mips.c      |  52 +++++++++
> > > > >  arch/powerpc/kvm/book3s.c |  52 +++++++++
> > > > >  arch/powerpc/kvm/booke.c  |  45 ++++++++
> > > > >  arch/s390/kvm/kvm-s390.c  | 117 ++++++++++++++++++++
> > > > >  arch/x86/kvm/x86.c        |  53 +++++++++
> > > > >  include/linux/kvm_host.h  | 127 ++++++++++++++++++++++
> > > > >  include/uapi/linux/kvm.h  |  50 +++++++++
> > > > >  virt/kvm/kvm_main.c       | 223 ++++++++++++++++++++++++++++++++++++++
> > > > >  9 files changed, 745 insertions(+)
> > > > >
> > > >
> > > > > +static ssize_t kvm_vcpu_stats_read(struct file *file, char __user *user_buffer,
> > > > > +                           size_t size, loff_t *offset)
> > > > > +{
> > > > > +     char id[KVM_STATS_ID_MAXLEN];
> > > > > +     struct kvm_vcpu *vcpu = file->private_data;
> > > > > +     ssize_t copylen, len, remain = size;
> > > > > +     size_t size_header, size_desc, size_stats;
> > > > > +     loff_t pos = *offset;
> > > > > +     char __user *dest = user_buffer;
> > > > > +     void *src;
> > > >
> > > > Nit. Better to do pointer arithmetic on a "char *".  Note that gcc and
> > > > clang will do the expected thing.
> > > >
> > > > > +
> > > > > +     snprintf(id, sizeof(id), "kvm-%d/vcpu-%d",
> > > > > +                     task_pid_nr(current), vcpu->vcpu_id);
> > > > > +     size_header = sizeof(kvm_vcpu_stats_header);
> > > > > +     size_desc =
> > > > > +             kvm_vcpu_stats_header.count * sizeof(struct _kvm_stats_desc);
> > > > > +     size_stats = sizeof(vcpu->stat);
> > > > > +
> > > > > +     len = sizeof(id) + size_header + size_desc + size_stats - pos;
> > > > > +     len = min(len, remain);
> > > > > +     if (len <= 0)
> > > > > +             return 0;
> > > > > +     remain = len;
> > > >
> > > > If 'desc_offset' is not right after the header, then the 'len'
> > > > calculation is missing the gap into account. For example, assuming there
> > > > is a gap of 0x1000000 between the header and the descriptors:
> > > >
> > > >         desc_offset = sizeof(id) + size_header + 0x1000000
> > > >
> > > > and the user calls the ioctl with enough space for the whole file,
> > > > including the gap:
> > > >
> > > >         *offset = 0
> > > >         size = sizeof(id) + size_header + size_desc + size_stats + 0x1000000
> > > >
> > > > then 'remain' gets the wrong size:
> > > >
> > > >         remain = sizeof(id) + size_header + size_desc + size_stats
> > > >
> > > > and ... (more below)
> > > >
> > > > > +
> > > > > +     /* Copy kvm vcpu stats header id string */
> > > > > +     copylen = sizeof(id) - pos;
> > > > > +     copylen = min(copylen, remain);
> > > > > +     if (copylen > 0) {
> > > > > +             src = (void *)id + pos;
> > > > > +             if (copy_to_user(dest, src, copylen))
> > > > > +                     return -EFAULT;
> > > > > +             remain -= copylen;
> > > > > +             pos += copylen;
> > > > > +             dest += copylen;
> > > > > +     }
> > > > > +     /* Copy kvm vcpu stats header */
> > > > > +     copylen = sizeof(id) + size_header - pos;
> > > > > +     copylen = min(copylen, remain);
> > > > > +     if (copylen > 0) {
> > > > > +             src = (void *)&kvm_vcpu_stats_header;
> > > > > +             src += pos - sizeof(id);
> > > > > +             if (copy_to_user(dest, src, copylen))
> > > > > +                     return -EFAULT;
> > > > > +             remain -= copylen;
> > > > > +             pos += copylen;
> > > > > +             dest += copylen;
> > > > > +     }
> > > > > +     /* Copy kvm vcpu stats descriptors */
> > > > > +     copylen = kvm_vcpu_stats_header.desc_offset + size_desc - pos;
> > > >
> > > > This would be the state at this point:
> > > >
> > > >         pos     = sizeof(id) + size_header
> > > >         copylen = sizeof(id) + size_header + 0x1000000 + size_desc - (sizeof(id) + size_header)
> > > >                 = 0x1000000 + size_desc
> > > >         remain  = size_desc + size_stats
> > > >
> > > > > +     copylen = min(copylen, remain);
> > > >
> > > >         copylen = size_desc + size_stats
> > > >
> > > > which is not enough to copy the descriptors (and the data).
> > > >
> > > > > +     if (copylen > 0) {
> > > > > +             src = (void *)&kvm_vcpu_stats_desc;
> > > > > +             src += pos - kvm_vcpu_stats_header.desc_offset;
> > > >
> > > > Moreover, src also needs to take the gap into account.
> > > >
> > > >         src     = &kvm_vcpu_stats_desc + (sizeof(id) + size_header) - (sizeof(id) + size_header + 0x1000000)
> > > >                 = &kvm_vcpu_stats_desc - 0x1000000
> > > >
> > > > Otherwise, src ends up pointing at the wrong place.
> > > >
> > > > > +             if (copy_to_user(dest, src, copylen))
> > > > > +                     return -EFAULT;
> > > > > +             remain -= copylen;
> > > > > +             pos += copylen;
> > > > > +             dest += copylen;
> > > > > +     }
> > > > > +     /* Copy kvm vcpu stats values */
> > > > > +     copylen = kvm_vcpu_stats_header.data_offset + size_stats - pos;
> > > >
> > > > The same problem occurs here. There is a potential gap before
> > > > data_offset that needs to be taken into account for src and len.
> > > >
> > > > Would it be possible to just ensure that there is no gap? maybe even
> > > > remove data_offset and desc_offset and always place them adjacent, and
> > > > have the descriptors right after the header.
> > > >
> > > I guess I didn't make it clear about the offset fields in the header block.
> > > We don't create any gap here. In this implementation, kernel knows that
> > > descriptor block is right after header block and data block is right after
> > > descriptor block.
> > > The reason we have offset fields for descriptor block and data block is
> > > for flexibility and future potential extension. e.g. we might add another
> > > block between header block and descriptor block in the future for some
> > > other metadata information.
> > > I think we are good here.
> >
> > Hi Jing,
> >
> > I realize they are adjacent right now, as the function wouldn't work if
> > they weren't. My comment was more about code maintenance, what happens
> > if the layout changes. This function depends on an asumption about a
> > layout defined somewhere else. For example,
> >
> >         copylen = kvm_vm_stats_header.desc_offset + size_desc - pos;
> >
> > makes an assumption about desc_offset being set to:
> >
> >         .desc_offset = sizeof(struct kvm_stats_header),
> >
> > and if desc_offset is not exactly that, then the function doesn't
> > explicitely fail and instead does unexpected things (probably undetected
> > by tests).
> >
> > I think the solution is to just check the assumptions. Either an assert
> > or just bail out with a warning:
> >
> >         /* This function currently depends on the following layout. */
> >         if (kvm_vm_stats_header.desc_offset != sizeof(struct kvm_stats_header) ||
> >                         kvm_vm_stats_header.data_offset != sizeof(struct kvm_stats_header) +
> >                         sizeof(kvm_vm_stats_desc)) {
> >                 warning(...);
> >                 return 0;
> >         }
> >
> I understand your concern. But whenever layout changes, the read function needs
> to be updated anyway. The read function is actually the place to cook
> the data layout
> of the anonymous file.

Could it be a good idea for header.data_offset and header.desc_offset to
be set here (in the function)? so the function has full control of the
file layout.

> If the vm/vcpu stats header has an incorrect
> offset value that is
> defined in the read function, the test will complain about wrong stats
> descriptor field
> values usually.
> Anyway, I will add more sanity tests in the selftest to cover the
> potential risks.
> Thanks.
> > > > > +     copylen = min(copylen, remain);
> > > > > +     if (copylen > 0) {
> > > > > +             src = (void *)&vcpu->stat;
> > > > > +             src += pos - kvm_vcpu_stats_header.data_offset;
> > > > > +             if (copy_to_user(dest, src, copylen))
> > > > > +                     return -EFAULT;
> > > > > +             remain -= copylen;
> > > > > +             pos += copylen;
> > > > > +             dest += copylen;
> > > > > +     }
> > > > > +
> > > > > +     *offset = pos;
> > > > > +     return len;
> > > > > +}
> > > > > +
> > > > >
> > > >
> > > >
> > > >
> > > > > +static ssize_t kvm_vm_stats_read(struct file *file, char __user *user_buffer,
> > > > > +                           size_t size, loff_t *offset)
> > > > > +{
> > > >
> > > > Consider moving the common code between kvm_vcpu_stats_read and this one
> > > > into some function that takes pointers to header, desc, and data. Unless
> > > > there is something vcpu or vm specific besides that.
> > > >
> > > Will do that, thanks.
> > > > > +     char id[KVM_STATS_ID_MAXLEN];
> > > > > +     struct kvm *kvm = file->private_data;
> > > > > +     ssize_t copylen, len, remain = size;
> > > > > +     size_t size_header, size_desc, size_stats;
> > > > > +     loff_t pos = *offset;
> > > > > +     char __user *dest = user_buffer;
> > > > > +     void *src;
> > > > > +
> > > > > +     snprintf(id, sizeof(id), "kvm-%d", task_pid_nr(current));
> > > > > +     size_header = sizeof(kvm_vm_stats_header);
> > > > > +     size_desc = kvm_vm_stats_header.count * sizeof(struct _kvm_stats_desc);
> > > > > +     size_stats = sizeof(kvm->stat);
> > > > > +
> > > > > +     len = sizeof(id) + size_header + size_desc + size_stats - pos;
> > > > > +     len = min(len, remain);
> > > > > +     if (len <= 0)
> > > > > +             return 0;
> > > > > +     remain = len;
> > > > > +
> > > > > +     /* Copy kvm vm stats header id string */
> > > > > +     copylen = sizeof(id) - pos;
> > > > > +     copylen = min(copylen, remain);
> > > > > +     if (copylen > 0) {
> > > > > +             src = (void *)id + pos;
> > > > > +             if (copy_to_user(dest, src, copylen))
> > > > > +                     return -EFAULT;
> > > > > +             remain -= copylen;
> > > > > +             pos += copylen;
> > > > > +             dest += copylen;
> > > > > +     }
> > > > > +     /* Copy kvm vm stats header */
> > > > > +     copylen = sizeof(id) + size_header - pos;
> > > > > +     copylen = min(copylen, remain);
> > > > > +     if (copylen > 0) {
> > > > > +             src = (void *)&kvm_vm_stats_header;
> > > > > +             src += pos - sizeof(id);
> > > > > +             if (copy_to_user(dest, src, copylen))
> > > > > +                     return -EFAULT;
> > > > > +             remain -= copylen;
> > > > > +             pos += copylen;
> > > > > +             dest += copylen;
> > > > > +     }
> > > > > +     /* Copy kvm vm stats descriptors */
> > > > > +     copylen = kvm_vm_stats_header.desc_offset + size_desc - pos;
> > > > > +     copylen = min(copylen, remain);
> > > > > +     if (copylen > 0) {
> > > > > +             src = (void *)&kvm_vm_stats_desc;
> > > > > +             src += pos - kvm_vm_stats_header.desc_offset;
> > > > > +             if (copy_to_user(dest, src, copylen))
> > > > > +                     return -EFAULT;
> > > > > +             remain -= copylen;
> > > > > +             pos += copylen;
> > > > > +             dest += copylen;
> > > > > +     }
> > > > > +     /* Copy kvm vm stats values */
> > > > > +     copylen = kvm_vm_stats_header.data_offset + size_stats - pos;
> > > > > +     copylen = min(copylen, remain);
> > > > > +     if (copylen > 0) {
> > > > > +             src = (void *)&kvm->stat;
> > > > > +             src += pos - kvm_vm_stats_header.data_offset;
> > > > > +             if (copy_to_user(dest, src, copylen))
> > > > > +                     return -EFAULT;
> > > > > +             remain -= copylen;
> > > > > +             pos += copylen;
> > > > > +             dest += copylen;
> > > > > +     }
> > > > > +
> > > > > +     *offset = pos;
> > > > > +     return len;
> > > > > +}
> > > > > +
> > > > > --
> > > > > 2.31.1.751.gd2f1c929bd-goog
> > > > >
> > > > > _______________________________________________
> > > > > kvmarm mailing list
> > > > > kvmarm@lists.cs.columbia.edu
> > > > > https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
> 
> Jing

  reply	other threads:[~2021-05-20 20:51 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-17 14:53 [PATCH v5 0/4] KVM statistics data fd-based binary interface Jing Zhang
2021-05-17 14:53 ` [PATCH v5 1/4] KVM: stats: Separate common stats from architecture specific ones Jing Zhang
2021-05-17 23:39   ` David Matlack
2021-05-18  0:10     ` Jing Zhang
2021-05-18 16:27       ` David Matlack
2021-05-18 17:25         ` Jing Zhang
2021-05-18 18:40           ` Krish Sadhukhan
2021-05-21 19:04             ` Jing Zhang
2021-05-17 14:53 ` [PATCH v5 2/4] KVM: stats: Add fd-based API to read binary stats data Jing Zhang
2021-05-19 17:12   ` David Matlack
2021-05-19 19:02     ` Jing Zhang
2021-05-20  4:21   ` Ricardo Koller
2021-05-20 17:37     ` Jing Zhang
2021-05-20 18:58       ` Ricardo Koller
2021-05-20 19:46         ` Jing Zhang
2021-05-20 20:50           ` Ricardo Koller [this message]
2021-05-20 21:14             ` Jing Zhang
2021-05-17 14:53 ` [PATCH v5 3/4] KVM: stats: Add documentation for statistics data binary interface Jing Zhang
2021-05-19 16:57   ` David Matlack
2021-05-19 19:29     ` Jing Zhang
2021-05-19 20:30       ` Jing Zhang
2021-05-19 17:02   ` David Matlack
2021-05-19 19:30     ` Jing Zhang
2021-05-17 14:53 ` [PATCH v5 4/4] KVM: selftests: Add selftest for KVM " Jing Zhang
2021-05-19 17:21   ` David Matlack
2021-05-19 17:58     ` Jing Zhang
2021-05-19 22:00   ` Ricardo Koller
2021-05-19 22:54     ` Jing Zhang
2021-05-20 21:30     ` Jing Zhang
2021-05-17 14:55 ` [PATCH v5 0/4] KVM statistics data fd-based " Jing Zhang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YKbLsICiX4iI+0AB@google.com \
    --to=ricarkol@google.com \
    --cc=aleksandar.qemu.devel@gmail.com \
    --cc=borntraeger@de.ibm.com \
    --cc=chenhuacai@kernel.org \
    --cc=cohuck@redhat.com \
    --cc=david@redhat.com \
    --cc=eesposit@redhat.com \
    --cc=frankja@linux.ibm.com \
    --cc=imbrenda@linux.ibm.com \
    --cc=james.morse@arm.com \
    --cc=jingzhangos@google.com \
    --cc=jmattson@google.com \
    --cc=julien.thierry.kdev@gmail.com \
    --cc=kvm-ppc@vger.kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=oupton@google.com \
    --cc=paulus@ozlabs.org \
    --cc=pbonzini@redhat.com \
    --cc=pshier@google.com \
    --cc=rientjes@google.com \
    --cc=seanjc@google.com \
    --cc=suzuki.poulose@arm.com \
    --cc=tsbogend@alpha.franken.de \
    --cc=vkuznets@redhat.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).