* [RFC PATCH 0/2] Add a new acquire resource to query vcpu statistics
@ 2022-05-17 14:33 Matias Ezequiel Vara Larsen
  2022-05-17 14:33 ` [RFC PATCH 1/2] xen/memory : Add stats_table resource type Matias Ezequiel Vara Larsen
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Matias Ezequiel Vara Larsen @ 2022-05-17 14:33 UTC (permalink / raw)
  To: xen-devel
  Cc: Matias Ezequiel Vara Larsen, Andrew Cooper, George Dunlap,
	Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu,
	Dario Faggioli, Anthony PERARD

Hello all,

The purpose of this RFC is to get feedback about a new acquire resource that
exposes vcpu statistics for a given domain. The current mechanism to get those
statistics is to query the hypervisor. That mechanism relies on a hypercall
and holds the domctl spinlock during its execution. When a pv tool like xcp-rrdd
periodically samples these counters, it ends up affecting other paths that share
that spinlock. By using acquire resources, the pv tool only requires a few
hypercalls to set up the shared memory region, and samples are then obtained
without issuing any further hypercalls. The original idea was suggested by
Andrew Cooper, with whom I have been discussing how to implement the current
PoC. You can find the RFC patch series at [1]. The series is rebased on top of
stable-4.15.

I am currently a bit blocked on 1) what to expose and 2) how to expose it. For
1), I decided to expose what xcp-rrdd is querying, i.e., what
XEN_DOMCTL_getvcpuinfo returns. More precisely, xcp-rrdd reads
runstate.time[RUNSTATE_running]. This is a uint64_t counter. However, the time
spent in the other runstates may be interesting too. Regarding 2), I am not
sure whether simply using an array of uint64_t is enough or whether a different
interface should be exposed. The remaining question is when to update the
values. For the moment, I am updating this counter during
vcpu_runstate_change().

The current series includes a simple pv tool that shows how this new interface is
used. This tool maps the counter and periodically samples it.

Any feedback/help would be appreciated.

Thanks, Matias.

[1] https://github.com/MatiasVara/xen/tree/feature_stats

Matias Ezequiel Vara Larsen (2):
  xen/memory : Add stats_table resource type
  tools/misc: Add xen-stats tool

 tools/misc/Makefile         |  5 +++
 tools/misc/xen-stats.c      | 83 +++++++++++++++++++++++++++++++++++++
 xen/common/domain.c         | 42 +++++++++++++++++++
 xen/common/memory.c         | 29 +++++++++++++
 xen/common/sched/core.c     |  5 +++
 xen/include/public/memory.h |  1 +
 xen/include/xen/sched.h     |  5 +++
 7 files changed, 170 insertions(+)
 create mode 100644 tools/misc/xen-stats.c

-- 
2.25.1



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [RFC PATCH 1/2] xen/memory : Add stats_table resource type
  2022-05-17 14:33 [RFC PATCH 0/2] Add a new acquire resource to query vcpu statistics Matias Ezequiel Vara Larsen
@ 2022-05-17 14:33 ` Matias Ezequiel Vara Larsen
  2022-06-17 20:54   ` George Dunlap
  2022-06-22  8:59   ` Jan Beulich
  2022-05-17 14:33 ` [RFC PATCH 2/2] tools/misc: Add xen-stats tool Matias Ezequiel Vara Larsen
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 14+ messages in thread
From: Matias Ezequiel Vara Larsen @ 2022-05-17 14:33 UTC (permalink / raw)
  To: xen-devel
  Cc: Matias Ezequiel Vara Larsen, Andrew Cooper, George Dunlap,
	Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu,
	Dario Faggioli

Allow mapping vcpu stats using acquire_resource().

Signed-off-by: Matias Ezequiel Vara Larsen <matias.vara@vates.fr>
---
 xen/common/domain.c         | 42 +++++++++++++++++++++++++++++++++++++
 xen/common/memory.c         | 29 +++++++++++++++++++++++++
 xen/common/sched/core.c     |  5 +++++
 xen/include/public/memory.h |  1 +
 xen/include/xen/sched.h     |  5 +++++
 5 files changed, 82 insertions(+)

diff --git a/xen/common/domain.c b/xen/common/domain.c
index 17cc32fde3..ddd9f88874 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -132,6 +132,42 @@ static void vcpu_info_reset(struct vcpu *v)
     v->vcpu_info_mfn = INVALID_MFN;
 }
 
+static void stats_free_buffer(struct vcpu * v)
+{
+    struct page_info *pg = v->stats.pg;
+
+    if ( !pg )
+        return;
+
+    v->stats.va = NULL;
+
+    if ( v->stats.va )
+        unmap_domain_page_global(v->stats.va);
+
+    v->stats.va = NULL;
+
+    free_domheap_page(pg);
+}
+
+static int stats_alloc_buffer(struct vcpu *v)
+{
+    struct domain *d = v->domain;
+    struct page_info *pg;
+
+    pg = alloc_domheap_page(d, MEMF_no_refcount);
+
+    if ( !pg )
+        return -ENOMEM;
+
+    v->stats.va = __map_domain_page_global(pg);
+    if ( !v->stats.va )
+        return -ENOMEM;
+
+    v->stats.pg = pg;
+    clear_page(v->stats.va);
+    return 0;
+}
+
 static void vmtrace_free_buffer(struct vcpu *v)
 {
     const struct domain *d = v->domain;
@@ -203,6 +239,9 @@ static int vmtrace_alloc_buffer(struct vcpu *v)
  */
 static int vcpu_teardown(struct vcpu *v)
 {
+
+    stats_free_buffer(v);
+
     vmtrace_free_buffer(v);
 
     return 0;
@@ -269,6 +308,9 @@ struct vcpu *vcpu_create(struct domain *d, unsigned int vcpu_id)
     if ( vmtrace_alloc_buffer(v) != 0 )
         goto fail_wq;
 
+    if ( stats_alloc_buffer(v) != 0 )
+        goto fail_wq;
+
     if ( arch_vcpu_create(v) != 0 )
         goto fail_sched;
 
diff --git a/xen/common/memory.c b/xen/common/memory.c
index 297b98a562..39de6d9d05 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -1099,6 +1099,10 @@ static unsigned int resource_max_frames(const struct domain *d,
     case XENMEM_resource_vmtrace_buf:
         return d->vmtrace_size >> PAGE_SHIFT;
 
+    // WIP: to figure out the correct size of the resource
+    case XENMEM_resource_stats_table:
+        return 1;
+
     default:
         return -EOPNOTSUPP;
     }
@@ -1162,6 +1166,28 @@ static int acquire_vmtrace_buf(
     return nr_frames;
 }
 
+static int acquire_stats_table(struct domain *d,
+                                unsigned int id,
+                                unsigned int frame,
+                                unsigned int nr_frames,
+                                xen_pfn_t mfn_list[])
+{
+    const struct vcpu *v = domain_vcpu(d, id);
+    mfn_t mfn;
+
+    if ( !v )
+        return -ENOENT;
+
+    if ( !v->stats.pg )
+        return -EINVAL;
+
+    mfn = page_to_mfn(v->stats.pg);
+    mfn_list[0] = mfn_x(mfn);
+
+    printk("acquire_perf_table: id: %d, nr_frames: %d, %p, domainid: %d\n", id, nr_frames, v->stats.pg, d->domain_id);
+    return 1;
+}
+
 /*
  * Returns -errno on error, or positive in the range [1, nr_frames] on
  * success.  Returning less than nr_frames contitutes a request for a
@@ -1182,6 +1208,9 @@ static int _acquire_resource(
     case XENMEM_resource_vmtrace_buf:
         return acquire_vmtrace_buf(d, id, frame, nr_frames, mfn_list);
 
+    case XENMEM_resource_stats_table:
+        return acquire_stats_table(d, id, frame, nr_frames, mfn_list);
+
     default:
         return -EOPNOTSUPP;
     }
diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
index 8f4b1ca10d..2a8b534977 100644
--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -264,6 +264,7 @@ static inline void vcpu_runstate_change(
 {
     s_time_t delta;
     struct sched_unit *unit = v->sched_unit;
+    uint64_t * runstate;
 
     ASSERT(spin_is_locked(get_sched_res(v->processor)->schedule_lock));
     if ( v->runstate.state == new_state )
@@ -287,6 +288,10 @@ static inline void vcpu_runstate_change(
     }
 
     v->runstate.state = new_state;
+
+    // WIP: use a different interface
+    runstate = (uint64_t*)v->stats.va;
+    memcpy(runstate, &v->runstate.time[0], sizeof(v->runstate.time[0]));
 }
 
 void sched_guest_idle(void (*idle) (void), unsigned int cpu)
diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
index 50e73eef98..752fd0be0f 100644
--- a/xen/include/public/memory.h
+++ b/xen/include/public/memory.h
@@ -626,6 +626,7 @@ struct xen_mem_acquire_resource {
 #define XENMEM_resource_ioreq_server 0
 #define XENMEM_resource_grant_table 1
 #define XENMEM_resource_vmtrace_buf 2
+#define XENMEM_resource_stats_table 3
 
     /*
      * IN - a type-specific resource identifier, which must be zero
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 5485d08afb..bc99adea7e 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -284,6 +284,11 @@ struct vcpu
         struct page_info *pg; /* One contiguous allocation of d->vmtrace_size */
     } vmtrace;
 
+    struct {
+        struct page_info *pg;
+        void * va;
+    } stats;
+
     struct arch_vcpu arch;
 
 #ifdef CONFIG_IOREQ_SERVER
-- 
2.25.1




* [RFC PATCH 2/2] tools/misc: Add xen-stats tool
  2022-05-17 14:33 [RFC PATCH 0/2] Add a new acquire resource to query vcpu statistics Matias Ezequiel Vara Larsen
  2022-05-17 14:33 ` [RFC PATCH 1/2] xen/memory : Add stats_table resource type Matias Ezequiel Vara Larsen
@ 2022-05-17 14:33 ` Matias Ezequiel Vara Larsen
  2022-05-31 11:16   ` Anthony PERARD
  2022-06-17  3:26 ` [RFC PATCH 0/2] Add a new acquire resource to query vcpu statistics Henry Wang
  2022-06-17 19:54 ` George Dunlap
  3 siblings, 1 reply; 14+ messages in thread
From: Matias Ezequiel Vara Larsen @ 2022-05-17 14:33 UTC (permalink / raw)
  To: xen-devel; +Cc: Matias Ezequiel Vara Larsen, Wei Liu, Anthony PERARD

Add a demonstration tool that uses the stats_table resource to
query vcpu time for a DomU.

Signed-off-by: Matias Ezequiel Vara Larsen <matias.vara@vates.fr>
---
 tools/misc/Makefile    |  5 +++
 tools/misc/xen-stats.c | 83 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 88 insertions(+)
 create mode 100644 tools/misc/xen-stats.c

diff --git a/tools/misc/Makefile b/tools/misc/Makefile
index 2b683819d4..b510e3aceb 100644
--- a/tools/misc/Makefile
+++ b/tools/misc/Makefile
@@ -135,4 +135,9 @@ xencov: xencov.o
 xen-ucode: xen-ucode.o
 	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
 
+xen-stats.o: CFLAGS += $(CFLAGS_libxenforeginmemory)
+
+xen-stats: xen-stats.o
+	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(LDLIBS_libxenforeignmemory) $(APPEND_LDFLAGS)
+
 -include $(DEPS_INCLUDE)
diff --git a/tools/misc/xen-stats.c b/tools/misc/xen-stats.c
new file mode 100644
index 0000000000..5d4a3239cc
--- /dev/null
+++ b/tools/misc/xen-stats.c
@@ -0,0 +1,83 @@
+#include <err.h>
+#include <errno.h>
+#include <error.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/mman.h>
+#include <signal.h>
+
+#include <xenctrl.h>
+#include <xenforeignmemory.h>
+#include <xen-tools/libs.h>
+
+static sig_atomic_t interrupted;
+static void close_handler(int signum)
+{
+    interrupted = 1;
+}
+
+int main(int argc, char **argv)
+{
+    xenforeignmemory_handle *fh;
+    xenforeignmemory_resource_handle *res;
+    size_t size;
+    int rc, nr_frames, domid, frec, vcpu;
+    uint64_t * info;
+    struct sigaction act;
+
+    if (argc != 4 ) {
+        fprintf(stderr, "Usage: %s <domid> <vcpu> <period>\n", argv[0]);
+        return 1;
+    }
+
+    // TODO: this depends on the resource
+    nr_frames = 1;
+
+    domid = atoi(argv[1]);
+    frec = atoi(argv[3]);
+    vcpu = atoi(argv[2]);
+
+    act.sa_handler = close_handler;
+    act.sa_flags = 0;
+    sigemptyset(&act.sa_mask);
+    sigaction(SIGHUP,  &act, NULL);
+    sigaction(SIGTERM, &act, NULL);
+    sigaction(SIGINT,  &act, NULL);
+    sigaction(SIGALRM, &act, NULL);
+
+    fh = xenforeignmemory_open(NULL, 0);
+
+    if ( !fh )
+        err(1, "xenforeignmemory_open");
+
+    rc = xenforeignmemory_resource_size(
+        fh, domid, XENMEM_resource_stats_table,
+        vcpu, &size);
+
+    if ( rc )
+        err(1, "    Fail: Get size: %d - %s\n", errno, strerror(errno));
+
+    if ( (size >> XC_PAGE_SHIFT) != nr_frames )
+        err(1, "    Fail: Get size: expected %u frames, got %zu\n",
+                    nr_frames, size >> XC_PAGE_SHIFT);
+
+    res = xenforeignmemory_map_resource(
+        fh, domid, XENMEM_resource_stats_table,
+        vcpu, 0, size >> XC_PAGE_SHIFT,
+        (void **)&info, PROT_READ, 0);
+
+    if ( !res )
+        err(1, "    Fail: Map %d - %s\n", errno, strerror(errno));
+
+    while ( !interrupted ) {
+        sleep(frec);
+        printf("running cpu_time: %ld\n", *info);
+    }
+
+    rc = xenforeignmemory_unmap_resource(fh, res);
+    if ( rc )
+        err(1, "    Fail: Unmap %d - %s\n", errno, strerror(errno));
+
+    return 0;
+}
-- 
2.25.1




* Re: [RFC PATCH 2/2] tools/misc: Add xen-stats tool
  2022-05-17 14:33 ` [RFC PATCH 2/2] tools/misc: Add xen-stats tool Matias Ezequiel Vara Larsen
@ 2022-05-31 11:16   ` Anthony PERARD
  2022-06-03 11:08     ` Matias Ezequiel Vara Larsen
  0 siblings, 1 reply; 14+ messages in thread
From: Anthony PERARD @ 2022-05-31 11:16 UTC (permalink / raw)
  To: Matias Ezequiel Vara Larsen
  Cc: xen-devel, Matias Ezequiel Vara Larsen, Wei Liu

Hi Matias,

On Tue, May 17, 2022 at 04:33:15PM +0200, Matias Ezequiel Vara Larsen wrote:
> Add a demonstration tool that uses the stats_table resource to
> query vcpu time for a DomU.
> 
> Signed-off-by: Matias Ezequiel Vara Larsen <matias.vara@vates.fr>
> ---
> diff --git a/tools/misc/Makefile b/tools/misc/Makefile
> index 2b683819d4..b510e3aceb 100644
> --- a/tools/misc/Makefile
> +++ b/tools/misc/Makefile
> @@ -135,4 +135,9 @@ xencov: xencov.o
>  xen-ucode: xen-ucode.o
>  	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
>  
> +xen-stats.o: CFLAGS += $(CFLAGS_libxenforeginmemory)
> +
> +xen-stats: xen-stats.o

The tool seems to be only about vcpus, so maybe `xen-stats` is a bit too
generic. Would `xen-vcpus-stats`, or maybe something with `time` in the
name, be better?

Also, is it a tool that could be useful enough to be installed by
default? Should we at least build it by default so it doesn't rot? (By
adding it only to $(TARGETS).)

> +	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(LDLIBS_libxenforeignmemory) $(APPEND_LDFLAGS)
> +
>  -include $(DEPS_INCLUDE)
> diff --git a/tools/misc/xen-stats.c b/tools/misc/xen-stats.c
> new file mode 100644
> index 0000000000..5d4a3239cc
> --- /dev/null
> +++ b/tools/misc/xen-stats.c
> @@ -0,0 +1,83 @@
> +#include <err.h>
> +#include <errno.h>
> +#include <error.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +#include <sys/mman.h>
> +#include <signal.h>
> +
> +#include <xenctrl.h>

It seems overkill to use this header when the tool only uses the
xenforeignmemory interface. But I don't know how to replace
XC_PAGE_SHIFT, so I guess that's OK.

> +#include <xenforeignmemory.h>
> +#include <xen-tools/libs.h>

What do you use these headers for? Are they left over?

> +static sig_atomic_t interrupted;
> +static void close_handler(int signum)
> +{
> +    interrupted = 1;
> +}
> +
> +int main(int argc, char **argv)
> +{
> +    xenforeignmemory_handle *fh;
> +    xenforeignmemory_resource_handle *res;
> +    size_t size;
> +    int rc, nr_frames, domid, frec, vcpu;
> +    uint64_t * info;
> +    struct sigaction act;
> +
> +    if (argc != 4 ) {
> +        fprintf(stderr, "Usage: %s <domid> <vcpu> <period>\n", argv[0]);
> +        return 1;
> +    }
> +
> +    // TODO: this depends on the resource
> +    nr_frames = 1;
> +
> +    domid = atoi(argv[1]);
> +    frec = atoi(argv[3]);
> +    vcpu = atoi(argv[2]);

Can you swap the last two lines? I think it is better if the order is the same
as on the command line.

> +
> +    act.sa_handler = close_handler;
> +    act.sa_flags = 0;
> +    sigemptyset(&act.sa_mask);
> +    sigaction(SIGHUP,  &act, NULL);
> +    sigaction(SIGTERM, &act, NULL);
> +    sigaction(SIGINT,  &act, NULL);
> +    sigaction(SIGALRM, &act, NULL);
> +
> +    fh = xenforeignmemory_open(NULL, 0);
> +
> +    if ( !fh )
> +        err(1, "xenforeignmemory_open");
> +
> +    rc = xenforeignmemory_resource_size(
> +        fh, domid, XENMEM_resource_stats_table,
> +        vcpu, &size);
> +
> +    if ( rc )
> +        err(1, "    Fail: Get size: %d - %s\n", errno, strerror(errno));

It seems that err() already prints strerror() and adds a "\n", so
why print them again? Also, if we have strerror(), what's the point of
printing "errno"?

Also, I'm not sure the extra indentation in the error message is really
useful, but that doesn't really matter.

> +
> +    if ( (size >> XC_PAGE_SHIFT) != nr_frames )
> +        err(1, "    Fail: Get size: expected %u frames, got %zu\n",
> +                    nr_frames, size >> XC_PAGE_SHIFT);

err() prints strerror(errno), maybe errx() is better here.


Thanks,

-- 
Anthony PERARD



* Re: [RFC PATCH 2/2] tools/misc: Add xen-stats tool
  2022-05-31 11:16   ` Anthony PERARD
@ 2022-06-03 11:08     ` Matias Ezequiel Vara Larsen
  2022-06-20  8:56       ` Matias Ezequiel Vara Larsen
  0 siblings, 1 reply; 14+ messages in thread
From: Matias Ezequiel Vara Larsen @ 2022-06-03 11:08 UTC (permalink / raw)
  To: Anthony PERARD; +Cc: xen-devel, Matias Ezequiel Vara Larsen, Wei Liu

Hello Anthony and thanks for your comments. I addressed them below:

On Tue, May 31, 2022 at 12:16:02PM +0100, Anthony PERARD wrote:
> Hi Matias,
> 
> On Tue, May 17, 2022 at 04:33:15PM +0200, Matias Ezequiel Vara Larsen wrote:
> > Add a demonstration tool that uses the stats_table resource to
> > query vcpu time for a DomU.
> > 
> > Signed-off-by: Matias Ezequiel Vara Larsen <matias.vara@vates.fr>
> > ---
> > diff --git a/tools/misc/Makefile b/tools/misc/Makefile
> > index 2b683819d4..b510e3aceb 100644
> > --- a/tools/misc/Makefile
> > +++ b/tools/misc/Makefile
> > @@ -135,4 +135,9 @@ xencov: xencov.o
> >  xen-ucode: xen-ucode.o
> >  	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
> >  
> > +xen-stats.o: CFLAGS += $(CFLAGS_libxenforeginmemory)
> > +
> > +xen-stats: xen-stats.o
> 
> The tool seems to be only about vcpus, so maybe `xen-stats` is a bit too
> generic. Would `xen-vcpus-stats`, or maybe something with `time` in the
> name, be better?

Do you think `xen-vcpus-stats` would be good enough?

> Also, is it a tool that could be useful enough to be installed by
> default? Should we at least build it by default so it doesn't rot? (By
> adding it only to $(TARGETS).)

I will build this tool by default in the next version of the patches.
 
> > +	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(LDLIBS_libxenforeignmemory) $(APPEND_LDFLAGS)
> > +
> >  -include $(DEPS_INCLUDE)
> > diff --git a/tools/misc/xen-stats.c b/tools/misc/xen-stats.c
> > new file mode 100644
> > index 0000000000..5d4a3239cc
> > --- /dev/null
> > +++ b/tools/misc/xen-stats.c
> > @@ -0,0 +1,83 @@
> > +#include <err.h>
> > +#include <errno.h>
> > +#include <error.h>
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <string.h>
> > +#include <sys/mman.h>
> > +#include <signal.h>
> > +
> > +#include <xenctrl.h>
> 
> It seems overkill to use this header when the tool only uses the
> xenforeignmemory interface. But I don't know how to replace
> XC_PAGE_SHIFT, so I guess that's OK.
> 
> > +#include <xenforeignmemory.h>
> > +#include <xen-tools/libs.h>
> 
> What do you use these headers for? Are they left over?

`xenforeignmemory.h` is used for `xenforeignmemory_*` functions.
`xen-tools/libs.h` is a leftover, so I will remove it in the next version.

> > +static sig_atomic_t interrupted;
> > +static void close_handler(int signum)
> > +{
> > +    interrupted = 1;
> > +}
> > +
> > +int main(int argc, char **argv)
> > +{
> > +    xenforeignmemory_handle *fh;
> > +    xenforeignmemory_resource_handle *res;
> > +    size_t size;
> > +    int rc, nr_frames, domid, frec, vcpu;
> > +    uint64_t * info;
> > +    struct sigaction act;
> > +
> > +    if (argc != 4 ) {
> > +        fprintf(stderr, "Usage: %s <domid> <vcpu> <period>\n", argv[0]);
> > +        return 1;
> > +    }
> > +
> > +    // TODO: this depends on the resource
> > +    nr_frames = 1;
> > +
> > +    domid = atoi(argv[1]);
> > +    frec = atoi(argv[3]);
> > +    vcpu = atoi(argv[2]);
> 
> Can you swap the last two lines? I think it is better if the order is the same
> as on the command line.

Yes, I can.

> > +
> > +    act.sa_handler = close_handler;
> > +    act.sa_flags = 0;
> > +    sigemptyset(&act.sa_mask);
> > +    sigaction(SIGHUP,  &act, NULL);
> > +    sigaction(SIGTERM, &act, NULL);
> > +    sigaction(SIGINT,  &act, NULL);
> > +    sigaction(SIGALRM, &act, NULL);
> > +
> > +    fh = xenforeignmemory_open(NULL, 0);
> > +
> > +    if ( !fh )
> > +        err(1, "xenforeignmemory_open");
> > +
> > +    rc = xenforeignmemory_resource_size(
> > +        fh, domid, XENMEM_resource_stats_table,
> > +        vcpu, &size);
> > +
> > +    if ( rc )
> > +        err(1, "    Fail: Get size: %d - %s\n", errno, strerror(errno));
> 
> It seems that err() already prints strerror() and adds a "\n", so
> why print them again? Also, if we have strerror(), what's the point of
> printing "errno"?

I will remove errno, strerror(errno), and the extra "\n".

> Also, I'm not sure the extra indentation in the error message is really
> useful, but that doesn't really matter.

I will remove the indentation.

> > +
> > +    if ( (size >> XC_PAGE_SHIFT) != nr_frames )
> > +        err(1, "    Fail: Get size: expected %u frames, got %zu\n",
> > +                    nr_frames, size >> XC_PAGE_SHIFT);
> 
> err() prints strerror(errno), maybe errx() is better here. 

I will use errx().

Thanks,
 
> 
> Thanks,
> 
> -- 
> Anthony PERARD



* RE: [RFC PATCH 0/2] Add a new acquire resource to query vcpu statistics
  2022-05-17 14:33 [RFC PATCH 0/2] Add a new acquire resource to query vcpu statistics Matias Ezequiel Vara Larsen
  2022-05-17 14:33 ` [RFC PATCH 1/2] xen/memory : Add stats_table resource type Matias Ezequiel Vara Larsen
  2022-05-17 14:33 ` [RFC PATCH 2/2] tools/misc: Add xen-stats tool Matias Ezequiel Vara Larsen
@ 2022-06-17  3:26 ` Henry Wang
  2022-06-17 19:54 ` George Dunlap
  3 siblings, 0 replies; 14+ messages in thread
From: Henry Wang @ 2022-06-17  3:26 UTC (permalink / raw)
  To: Matias Ezequiel Vara Larsen, xen-devel
  Cc: Matias Ezequiel Vara Larsen, Andrew Cooper, George Dunlap,
	Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu,
	Dario Faggioli, Anthony PERARD

Hi,

It seems that this series has been stale for more than a month, with no
comments from maintainers on patch #1 [1] and actions needed from the author
on patch #2 [2]. So I am sending this email as a gentle reminder. Thanks!

[1] https://patchwork.kernel.org/project/xen-devel/patch/d0afb6657b1e78df4857ad7bcc875982e9c022b4.1652797713.git.matias.vara@vates.fr/
[2] https://patchwork.kernel.org/project/xen-devel/patch/e233c4f60c6fe97b93b3adf9affeb0404c554130.1652797713.git.matias.vara@vates.fr/

Kind regards,
Henry

> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of
> Matias Ezequiel Vara Larsen
> Subject: [RFC PATCH 0/2] Add a new acquire resource to query vcpu
> statistics
> 
> Hello all,
> 
> The purpose of this RFC is to get feedback about a new acquire resource
> that exposes vcpu statistics for a given domain. The current mechanism to
> get those statistics is to query the hypervisor. That mechanism relies on
> a hypercall and holds the domctl spinlock during its execution. When a pv
> tool like xcp-rrdd periodically samples these counters, it ends up
> affecting other paths that share that spinlock. By using acquire
> resources, the pv tool only requires a few hypercalls to set up the
> shared memory region, and samples are then obtained without issuing any
> further hypercalls. The original idea was suggested by Andrew Cooper,
> with whom I have been discussing how to implement the current PoC. You
> can find the RFC patch series at [1]. The series is rebased on top of
> stable-4.15.
> 
> I am currently a bit blocked on 1) what to expose and 2) how to expose
> it. For 1), I decided to expose what xcp-rrdd is querying, i.e., what
> XEN_DOMCTL_getvcpuinfo returns. More precisely, xcp-rrdd reads
> runstate.time[RUNSTATE_running]. This is a uint64_t counter. However, the
> time spent in the other runstates may be interesting too. Regarding 2), I
> am not sure whether simply using an array of uint64_t is enough or
> whether a different interface should be exposed. The remaining question
> is when to update the values. For the moment, I am updating this counter
> during vcpu_runstate_change().
> 
> The current series includes a simple pv tool that shows how this new
> interface is used. This tool maps the counter and periodically samples
> it.
> 
> Any feedback/help would be appreciated.
> 
> Thanks, Matias.
> 
> [1] https://github.com/MatiasVara/xen/tree/feature_stats
> 
> Matias Ezequiel Vara Larsen (2):
>   xen/memory : Add stats_table resource type
>   tools/misc: Add xen-stats tool
> 
>  tools/misc/Makefile         |  5 +++
>  tools/misc/xen-stats.c      | 83 +++++++++++++++++++++++++++++++++++++
>  xen/common/domain.c         | 42 +++++++++++++++++++
>  xen/common/memory.c         | 29 +++++++++++++
>  xen/common/sched/core.c     |  5 +++
>  xen/include/public/memory.h |  1 +
>  xen/include/xen/sched.h     |  5 +++
>  7 files changed, 170 insertions(+)
>  create mode 100644 tools/misc/xen-stats.c
> 
> --
> 2.25.1
> 




* Re: [RFC PATCH 0/2] Add a new acquire resource to query vcpu statistics
  2022-05-17 14:33 [RFC PATCH 0/2] Add a new acquire resource to query vcpu statistics Matias Ezequiel Vara Larsen
                   ` (2 preceding siblings ...)
  2022-06-17  3:26 ` [RFC PATCH 0/2] Add a new acquire resource to query vcpu statistics Henry Wang
@ 2022-06-17 19:54 ` George Dunlap
  2022-06-24 13:06   ` Matias Ezequiel Vara Larsen
  3 siblings, 1 reply; 14+ messages in thread
From: George Dunlap @ 2022-06-17 19:54 UTC (permalink / raw)
  To: Matias Ezequiel Vara Larsen
  Cc: xen-devel, Matias Ezequiel Vara Larsen, Andrew Cooper,
	Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu,
	Dario Faggioli, Anthony Perard




> On 17 May 2022, at 15:33, Matias Ezequiel Vara Larsen <matiasevara@gmail.com> wrote:
> 
> Hello all,
> 
> The purpose of this RFC is to get feedback about a new acquire resource that
> exposes vcpu statistics for a given domain. The current mechanism to get those
> statistics is to query the hypervisor. That mechanism relies on a hypercall
> and holds the domctl spinlock during its execution. When a pv tool like xcp-rrdd
> periodically samples these counters, it ends up affecting other paths that share
> that spinlock. By using acquire resources, the pv tool only requires a few
> hypercalls to set up the shared memory region, and samples are then obtained
> without issuing any further hypercalls. The original idea was suggested by
> Andrew Cooper, with whom I have been discussing how to implement the current
> PoC. You can find the RFC patch series at [1]. The series is rebased on top of
> stable-4.15.
> 
> I am currently a bit blocked on 1) what to expose and 2) how to expose it. For
> 1), I decided to expose what xcp-rrdd is querying, i.e., what
> XEN_DOMCTL_getvcpuinfo returns. More precisely, xcp-rrdd reads
> runstate.time[RUNSTATE_running]. This is a uint64_t counter. However, the time
> spent in the other runstates may be interesting too. Regarding 2), I am not
> sure whether simply using an array of uint64_t is enough or whether a different
> interface should be exposed. The remaining question is when to update the
> values. For the moment, I am updating this counter during vcpu_runstate_change().
> 
> The current series includes a simple pv tool that shows how this new interface is
> used. This tool maps the counter and periodically samples it.
> 
> Any feedback/help would be appreciated.

Hey Matias,

Sorry it’s taken so long to get back to you.  My day-to-day job has shifted away from technical things to community management; this has been on my radar but I never made time to dig into it.

There are some missing details I’ve had to try to piece together about the situation, so let me sketch things out a bit further and see if I understand the situation:

* xcp-rrdd currently wants (at minimum) to record runstate.time[RUNSTATE_running] for each vcpu.  Currently that means calling XEN_DOMCTL_getvcpuinfo, which has to hold a single global domctl_lock (!) for the entire hypercall, and which of course must be issued for every vcpu in the system on every update.

* VCPUOP_get_runstate_info copies out a vcpu_runstate_info struct, which contains information on the other runstates.  Additionally, VCPUOP_register_runstate_memory_area already does something similar to what you want: it passes a virtual address to Xen, which Xen maps, and copies information about the various vcpus into (in update_runstate_area()).

* However, the above assumes a domain of “current->domain”: that is, a domain can call VCPUOP_get_runstate_info on one of its own vcpus, but dom0 cannot call it to get information about the vcpus of other domains.

* Additionally, VCPUOP_register_runstate_memory_area registers by *virtual address*; this is actually problematic even for guest kernels looking at their own vcpus; but would be completely inappropriate for a dom0 userspace application, which is what you’re looking at.

Your solution is to expose things via the xenforeignmemory interface instead, modelled after the vmtrace_buf functionality.

Does that all sound right?

I think at a high level that’s probably the right way to go.

As you say, my default would be to expose similar information as VCPUOP_get_runstate_info.  I’d even consider just using vcpu_runstate_info_t.

The other option would be to try to make the page a more general “foreign vcpu info” page, which we could expand with more information as we find it useful.

In this patch, you’re allocating 4k *per vcpu on the entire system* to hold a single 64-bit value; even if you decide to use vcpu_runstate_info_t, there’s still quite a large wastage.  Would it make sense rather to have this pass back an array of MFNs designed to be mapped contiguously, with the vcpus listed as an array? This seems to be what XENMEM_resource_ioreq_server does.

The advantage of making the structure extensible is that we wouldn’t need to add another interface, and potentially another full page, if we wanted to add more functionality that we wanted to export.  On the other hand, every new functionality that we add may require adding code to copy things into it; making it so that such code is added bit by bit as it’s requested might be better.

I have some more comments I’ll give on the 1/2 patch.

 -George






> 
> Thanks, Matias.
> 
> [1] https://github.com/MatiasVara/xen/tree/feature_stats
> 
> Matias Ezequiel Vara Larsen (2):
>  xen/memory : Add stats_table resource type
>  tools/misc: Add xen-stats tool
> 
> tools/misc/Makefile         |  5 +++
> tools/misc/xen-stats.c      | 83 +++++++++++++++++++++++++++++++++++++
> xen/common/domain.c         | 42 +++++++++++++++++++
> xen/common/memory.c         | 29 +++++++++++++
> xen/common/sched/core.c     |  5 +++
> xen/include/public/memory.h |  1 +
> xen/include/xen/sched.h     |  5 +++
> 7 files changed, 170 insertions(+)
> create mode 100644 tools/misc/xen-stats.c
> 
> --
> 2.25.1
> 




* Re: [RFC PATCH 1/2] xen/memory : Add stats_table resource type
  2022-05-17 14:33 ` [RFC PATCH 1/2] xen/memory : Add stats_table resource type Matias Ezequiel Vara Larsen
@ 2022-06-17 20:54   ` George Dunlap
  2022-06-29  9:44     ` Matias Ezequiel Vara Larsen
  2022-06-22  8:59   ` Jan Beulich
  1 sibling, 1 reply; 14+ messages in thread
From: George Dunlap @ 2022-06-17 20:54 UTC (permalink / raw)
  To: Matias Ezequiel Vara Larsen
  Cc: xen-devel, Matias Ezequiel Vara Larsen, Andrew Cooper,
	Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu,
	Dario Faggioli



Preface: It looks like this may be one of your first hypervisor patches.  So thank you!  FYI there’s a lot that needs fixing up here; please don’t read a tone of annoyance into it, I’m just trying to tell you what needs fixing in the most efficient manner. :-)

> On 17 May 2022, at 15:33, Matias Ezequiel Vara Larsen <matiasevara@gmail.com> wrote:
> 
> Allow to map vcpu stats using acquire_resource().

This needs a lot more expansion in terms of what it is that you’re doing in this patch and why.

> 
> Signed-off-by: Matias Ezequiel Vara Larsen <matias.vara@vates.fr>
> ---
> xen/common/domain.c         | 42 +++++++++++++++++++++++++++++++++++++
> xen/common/memory.c         | 29 +++++++++++++++++++++++++
> xen/common/sched/core.c     |  5 +++++
> xen/include/public/memory.h |  1 +
> xen/include/xen/sched.h     |  5 +++++
> 5 files changed, 82 insertions(+)
> 
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index 17cc32fde3..ddd9f88874 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -132,6 +132,42 @@ static void vcpu_info_reset(struct vcpu *v)
>     v->vcpu_info_mfn = INVALID_MFN;
> }
> 
> +static void stats_free_buffer(struct vcpu * v)
> +{
> +    struct page_info *pg = v->stats.pg;
> +
> +    if ( !pg )
> +        return;
> +
> +    v->stats.va = NULL;
> +
> +    if ( v->stats.va )
> +        unmap_domain_page_global(v->stats.va);
> +
> +    v->stats.va = NULL;

Looks like you meant to delete the first `v->stats.va = NULL` but forgot?

> +
> +    free_domheap_page(pg);

Pretty sure this will crash.

Unfortunately page allocation and freeing is somewhat complicated and requires a bit of boilerplate.  You can look at the vmtrace_alloc_buffer() code for a template, but the general sequence you want is as follows:

* On the allocate side:

1. Allocate the page

   pg = alloc_domheap_page(d, MEMF_no_refcount);

This will allocate a page with the PGC_allocated bit set and a single reference count.  (If you pass a page with PGC_allocated set to free_domheap_page(), it will crash; which is why I said the above.)

2.  Grab a general reference count for the vcpu struct’s reference to it; as well as a writable type count, so that it doesn’t get used as a special page

if (get_page_and_type(pg, d, PGT_writable_page)) {
  put_page_alloc_ref(p);
  /* failure path */
}

* On the free side, don’t call free_domheap_pages() directly.  Rather, drop the allocation, then drop your own type count, thus:

v->stats.va = NULL;

put_page_alloc_ref(pg);
put_page_and_type(pg);

The issue here is that we can’t free the page until all references have dropped; and the whole point of this exercise is to allow guest userspace in dom0 to gain a reference to the page.  We can’t actually free the page until *all* references are gone, including the userspace one in dom0.  The put_page() which brings the reference count to 0 will automatically free the page.


> +}
> +
> +static int stats_alloc_buffer(struct vcpu *v)
> +{
> +    struct domain *d = v->domain;
> +    struct page_info *pg;
> +
> +    pg = alloc_domheap_page(d, MEMF_no_refcount);
> +
> +    if ( !pg )
> +        return -ENOMEM;
> +
> +    v->stats.va = __map_domain_page_global(pg);
> +    if ( !v->stats.va )
> +        return -ENOMEM;
> +
> +    v->stats.pg = pg;
> +    clear_page(v->stats.va);
> +    return 0;
> +}

The other thing to say about this is that the memory is being allocated unconditionally, even if nobody is planning to read it.  The vast majority of Xen users will not be running xcp-rrdd, so it will be pointless overhead.

At a basic level, you want to follow suit with the vmtrace buffers, which are only allocated if the proper domctl flag is set on domain creation.  (We could consider instead, or in addition, having a global Xen command-line parameter which would enable this feature for all domains.)

> +
> static void vmtrace_free_buffer(struct vcpu *v)
> {
>     const struct domain *d = v->domain;
> @@ -203,6 +239,9 @@ static int vmtrace_alloc_buffer(struct vcpu *v)
>  */
> static int vcpu_teardown(struct vcpu *v)
> {
> +
> +    stats_free_buffer(v);
> +
>     vmtrace_free_buffer(v);
> 
>     return 0;
> @@ -269,6 +308,9 @@ struct vcpu *vcpu_create(struct domain *d, unsigned int vcpu_id)
>     if ( vmtrace_alloc_buffer(v) != 0 )
>         goto fail_wq;
> 
> +    if ( stats_alloc_buffer(v) != 0 )
> +        goto fail_wq;
> +
>     if ( arch_vcpu_create(v) != 0 )
>         goto fail_sched;
> 
> diff --git a/xen/common/memory.c b/xen/common/memory.c
> index 297b98a562..39de6d9d05 100644
> --- a/xen/common/memory.c
> +++ b/xen/common/memory.c
> @@ -1099,6 +1099,10 @@ static unsigned int resource_max_frames(const struct domain *d,
>     case XENMEM_resource_vmtrace_buf:
>         return d->vmtrace_size >> PAGE_SHIFT;
> 
> +    // WIP: to figure out the correct size of the resource
> +    case XENMEM_resource_stats_table:
> +        return 1;
> +
>     default:
>         return -EOPNOTSUPP;
>     }
> @@ -1162,6 +1166,28 @@ static int acquire_vmtrace_buf(
>     return nr_frames;
> }
> 
> +static int acquire_stats_table(struct domain *d,
> +                                unsigned int id,
> +                                unsigned int frame,
> +                                unsigned int nr_frames,
> +                                xen_pfn_t mfn_list[])
> +{
> +    const struct vcpu *v = domain_vcpu(d, id);
> +    mfn_t mfn;
> +

Maybe I’m paranoid, but I might add an “ASSERT(nr_frames == 1)” here

> +    if ( !v )
> +        return -ENOENT;
> +
> +    if ( !v->stats.pg )
> +        return -EINVAL;
> +
> +    mfn = page_to_mfn(v->stats.pg);
> +    mfn_list[0] = mfn_x(mfn);
> +
> +    printk("acquire_perf_table: id: %d, nr_frames: %d, %p, domainid: %d\n", id, nr_frames, v->stats.pg, d->domain_id);
> +    return 1;
> +}
> +
> /*
>  * Returns -errno on error, or positive in the range [1, nr_frames] on
>  * success.  Returning less than nr_frames contitutes a request for a
> @@ -1182,6 +1208,9 @@ static int _acquire_resource(
>     case XENMEM_resource_vmtrace_buf:
>         return acquire_vmtrace_buf(d, id, frame, nr_frames, mfn_list);
> 
> +    case XENMEM_resource_stats_table:
> +        return acquire_stats_table(d, id, frame, nr_frames, mfn_list);
> +
>     default:
>         return -EOPNOTSUPP;
>     }
> diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
> index 8f4b1ca10d..2a8b534977 100644
> --- a/xen/common/sched/core.c
> +++ b/xen/common/sched/core.c
> @@ -264,6 +264,7 @@ static inline void vcpu_runstate_change(
> {
>     s_time_t delta;
>     struct sched_unit *unit = v->sched_unit;
> +    uint64_t * runstate;
> 
>     ASSERT(spin_is_locked(get_sched_res(v->processor)->schedule_lock));
>     if ( v->runstate.state == new_state )
> @@ -287,6 +288,10 @@ static inline void vcpu_runstate_change(
>     }
> 
>     v->runstate.state = new_state;
> +
> +    // WIP: use a different interface
> +    runstate = (uint64_t*)v->stats.va;

I think you should look at xen.git/xen/include/public/hvm/ioreq.h:shared_iopage_t for inspiration.  Basically, you cast the void pointer to that type, and then just access `iopage->vcpu_ioreq[vcpuid]`.   Put it in a public header, and then both the userspace consumer and Xen can use the same structure.

As I said in my response to the cover letter, I think vcpu_runstate_info_t would be something to look at and gain inspiration from.

The other thing to say here is that this is a hot path; we don’t want to be copying lots of information around if it’s not going to be used.  So you should only allocate the buffers if specifically enabled, and you should only copy the information over if v->stats.va != NULL.

I think that should be enough to start with; we can nail down more when you send v1.

Peace,
 -George





* Re: [RFC PATCH 2/2] tools/misc: Add xen-stats tool
  2022-06-03 11:08     ` Matias Ezequiel Vara Larsen
@ 2022-06-20  8:56       ` Matias Ezequiel Vara Larsen
  0 siblings, 0 replies; 14+ messages in thread
From: Matias Ezequiel Vara Larsen @ 2022-06-20  8:56 UTC (permalink / raw)
  To: Anthony PERARD; +Cc: xen-devel, Matias Ezequiel Vara Larsen, Wei Liu

Hello Anthony, 

On Fri, Jun 03, 2022 at 01:08:20PM +0200, Matias Ezequiel Vara Larsen wrote:
> Hello Anthony and thanks for your comments. I addressed them below:
> 
> On Tue, May 31, 2022 at 12:16:02PM +0100, Anthony PERARD wrote:
> > Hi Matias,
> > 
> > On Tue, May 17, 2022 at 04:33:15PM +0200, Matias Ezequiel Vara Larsen wrote:
> > > Add a demonstration tool that uses the stats_table resource to
> > > query vcpu time for a DomU.
> > > 
> > > Signed-off-by: Matias Ezequiel Vara Larsen <matias.vara@vates.fr>
> > > ---
> > > diff --git a/tools/misc/Makefile b/tools/misc/Makefile
> > > index 2b683819d4..b510e3aceb 100644
> > > --- a/tools/misc/Makefile
> > > +++ b/tools/misc/Makefile
> > > @@ -135,4 +135,9 @@ xencov: xencov.o
> > >  xen-ucode: xen-ucode.o
> > >  	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
> > >  
> > > +xen-stats.o: CFLAGS += $(CFLAGS_libxenforeginmemory)
> > > +
> > > +xen-stats: xen-stats.o
> > 
> > The tools seems to be only about vcpu, maybe `xen-stats` is a bit too
> > generic. Would `xen-vcpus-stats`, or maybe something with `time` would
> > be better?
> 
> Do you think `xen-vcpus-stats` would be good enough?
> 

I will pick up `xen-vcpus-stats` for v1 if you are not against it.

Thanks,

Matias

> > Also, is it a tools that could be useful enough to be installed by
> > default? Should we at least build it by default so it doesn't rot? (By
> > adding it to only $(TARGETS).)
> 
> I will add to build this tool by default in the next patches' version.
>  
> > > +	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(LDLIBS_libxenforeignmemory) $(APPEND_LDFLAGS)
> > > +
> > >  -include $(DEPS_INCLUDE)
> > > diff --git a/tools/misc/xen-stats.c b/tools/misc/xen-stats.c
> > > new file mode 100644
> > > index 0000000000..5d4a3239cc
> > > --- /dev/null
> > > +++ b/tools/misc/xen-stats.c
> > > @@ -0,0 +1,83 @@
> > > +#include <err.h>
> > > +#include <errno.h>
> > > +#include <error.h>
> > > +#include <stdio.h>
> > > +#include <stdlib.h>
> > > +#include <string.h>
> > > +#include <sys/mman.h>
> > > +#include <signal.h>
> > > +
> > > +#include <xenctrl.h>
> > 
> > It seems overkill to use this header when the tool only use
> > xenforeignmemory interface. But I don't know how to replace
> > XC_PAGE_SHIFT, so I guess that's ok.
> > 
> > > +#include <xenforeignmemory.h>
> > > +#include <xen-tools/libs.h>
> > 
> > What do you use this headers for? Is it left over?
> 
> `xenforeignmemory.h` is used for `xenforeignmemory_*` functions.
> `xen-tools/libs.h` is left over so I will remove it in next version.
> 
> > > +static sig_atomic_t interrupted;
> > > +static void close_handler(int signum)
> > > +{
> > > +    interrupted = 1;
> > > +}
> > > +
> > > +int main(int argc, char **argv)
> > > +{
> > > +    xenforeignmemory_handle *fh;
> > > +    xenforeignmemory_resource_handle *res;
> > > +    size_t size;
> > > +    int rc, nr_frames, domid, frec, vcpu;
> > > +    uint64_t * info;
> > > +    struct sigaction act;
> > > +
> > > +    if (argc != 4 ) {
> > > +        fprintf(stderr, "Usage: %s <domid> <vcpu> <period>\n", argv[0]);
> > > +        return 1;
> > > +    }
> > > +
> > > +    // TODO: this depends on the resource
> > > +    nr_frames = 1;
> > > +
> > > +    domid = atoi(argv[1]);
> > > +    frec = atoi(argv[3]);
> > > +    vcpu = atoi(argv[2]);
> > 
> > Can you swap the last two line? I think it is better if the order is the same
> > as on the command line.
> 
> Yes, I can.
> 
> > > +
> > > +    act.sa_handler = close_handler;
> > > +    act.sa_flags = 0;
> > > +    sigemptyset(&act.sa_mask);
> > > +    sigaction(SIGHUP,  &act, NULL);
> > > +    sigaction(SIGTERM, &act, NULL);
> > > +    sigaction(SIGINT,  &act, NULL);
> > > +    sigaction(SIGALRM, &act, NULL);
> > > +
> > > +    fh = xenforeignmemory_open(NULL, 0);
> > > +
> > > +    if ( !fh )
> > > +        err(1, "xenforeignmemory_open");
> > > +
> > > +    rc = xenforeignmemory_resource_size(
> > > +        fh, domid, XENMEM_resource_stats_table,
> > > +        vcpu, &size);
> > > +
> > > +    if ( rc )
> > > +        err(1, "    Fail: Get size: %d - %s\n", errno, strerror(errno));
> > 
> > It seems that err() already does print strerror(), and add a "\n", so
> > why print it again? Also, if we have strerror(), what the point of
> > printing "errno"?
> 
> I will remove errno, strerror(errno), and the extra "\n".
> 
> > Also, I'm not sure the extra indentation in the error message is really
> > useful, but that doesn't really matter.
> 
> I will remove the indentation.
> 
> > > +
> > > +    if ( (size >> XC_PAGE_SHIFT) != nr_frames )
> > > +        err(1, "    Fail: Get size: expected %u frames, got %zu\n",
> > > +                    nr_frames, size >> XC_PAGE_SHIFT);
> > 
> > err() prints strerror(errno), maybe errx() is better here. 
> 
> I will use errx().
> 
> Thanks,
>  
> > 
> > Thanks,
> > 
> > -- 
> > Anthony PERARD



* Re: [RFC PATCH 1/2] xen/memory : Add stats_table resource type
  2022-05-17 14:33 ` [RFC PATCH 1/2] xen/memory : Add stats_table resource type Matias Ezequiel Vara Larsen
  2022-06-17 20:54   ` George Dunlap
@ 2022-06-22  8:59   ` Jan Beulich
  2022-06-22 10:05     ` George Dunlap
  1 sibling, 1 reply; 14+ messages in thread
From: Jan Beulich @ 2022-06-22  8:59 UTC (permalink / raw)
  To: Matias Ezequiel Vara Larsen
  Cc: Matias Ezequiel Vara Larsen, Andrew Cooper, George Dunlap,
	Julien Grall, Stefano Stabellini, Wei Liu, Dario Faggioli,
	xen-devel

On 17.05.2022 16:33, Matias Ezequiel Vara Larsen wrote:
> @@ -287,6 +288,10 @@ static inline void vcpu_runstate_change(
>      }
>  
>      v->runstate.state = new_state;
> +
> +    // WIP: use a different interface
> +    runstate = (uint64_t*)v->stats.va;
> +    memcpy(runstate, &v->runstate.time[0], sizeof(v->runstate.time[0]));
>  }

One remark on top of what George has said: By exposing this information the
way you do, you allow updating and reading of it to be fully asynchronous.
That way a consumer may fetch inconsistent (partially updated) state (and
this would be even more so when further fields would be added). For the
data to be useful, you need to add a mechanism for consumers to know when
an update is in progress, so they can wait and then retry. You'll find a
number of instances of such in the code base.

In general I also have to admit that I'm not sure the exposed data really
qualifies as a "resource", and hence I'm not really convinced of your use
of the resource mapping interface as a vehicle.

Jan



* Re: [RFC PATCH 1/2] xen/memory : Add stats_table resource type
  2022-06-22  8:59   ` Jan Beulich
@ 2022-06-22 10:05     ` George Dunlap
  2022-06-22 10:09       ` Jan Beulich
  0 siblings, 1 reply; 14+ messages in thread
From: George Dunlap @ 2022-06-22 10:05 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Matias Ezequiel Vara Larsen, Matias Ezequiel Vara Larsen,
	Andrew Cooper, Julien Grall, Stefano Stabellini, Wei Liu,
	Dario Faggioli, xen-devel




> On 22 Jun 2022, at 09:59, Jan Beulich <jbeulich@suse.com> wrote:
> 
[snip]
> In general I also have to admit that I'm not sure the exposed data really
> qualifies as a "resource", and hence I'm not really convinced of your use
> of the resource mapping interface as a vehicle.

I’m not sure if I’d call any of the things currently mappable via that interface (ioreq pages, vmtrace buffers, etc) “resources”.  I’m not sure why the name was chosen, except perhaps that it was meant to be a more generalized form of “page” or “pages”.

The alternate is to try to plumb through a new ad-hoc hypercall.  Andy suggested Matias use this interface specifically to avoid having to do that; and it sounds like he believes the interface was designed specifically for this kind of thing.

 -George



* Re: [RFC PATCH 1/2] xen/memory : Add stats_table resource type
  2022-06-22 10:05     ` George Dunlap
@ 2022-06-22 10:09       ` Jan Beulich
  0 siblings, 0 replies; 14+ messages in thread
From: Jan Beulich @ 2022-06-22 10:09 UTC (permalink / raw)
  To: George Dunlap
  Cc: Matias Ezequiel Vara Larsen, Matias Ezequiel Vara Larsen,
	Andrew Cooper, Julien Grall, Stefano Stabellini, Wei Liu,
	Dario Faggioli, xen-devel

On 22.06.2022 12:05, George Dunlap wrote:
>> On 22 Jun 2022, at 09:59, Jan Beulich <jbeulich@suse.com> wrote:
>>
> [snip]
>> In general I also have to admit that I'm not sure the exposed data really
>> qualifies as a "resource", and hence I'm not really convinced of your use
>> of the resource mapping interface as a vehicle.
> 
> I’m not sure if I’d call any of the things currently mappable via that interface (ioreq pages, vmcall buffers, etc) “resources”.  I’m not sure why the name was chosen, except perhaps that it was meant to be a more generalized form of “page” or “pages".
> 
> The alternate is to try to plumb through a new ad-hoc hypercall.  Andy suggested Matias use this interface specifically to avoid having to do that; and it sounds like he believes the interface was designed specifically for this kind of thing.

Okay. I guess I wasn't aware of that suggestion.

Jan



* Re: [RFC PATCH 0/2] Add a new acquire resource to query vcpu statistics
  2022-06-17 19:54 ` George Dunlap
@ 2022-06-24 13:06   ` Matias Ezequiel Vara Larsen
  0 siblings, 0 replies; 14+ messages in thread
From: Matias Ezequiel Vara Larsen @ 2022-06-24 13:06 UTC (permalink / raw)
  To: George Dunlap
  Cc: xen-devel, Matias Ezequiel Vara Larsen, Andrew Cooper,
	Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu,
	Dario Faggioli, Anthony Perard

Hello George and thanks for the review! you will find my comments below.

On Fri, Jun 17, 2022 at 07:54:32PM +0000, George Dunlap wrote:
> 
> 
> > On 17 May 2022, at 15:33, Matias Ezequiel Vara Larsen <matiasevara@gmail.com> wrote:
> > 
> > Hello all,
> > 
> > The purpose of this RFC is to get feedback about a new acquire resource that
> > exposes vcpu statistics for a given domain. The current mechanism to get those
> > statistics is by querying the hypervisor. This mechanism relies on a hypercall
> > and holds the domctl spinlock during its execution. When a pv tool like xcp-rrdd
> > periodically samples these counters, it ends up affecting other paths that share
> > that spinlock. By using acquire resources, the pv tool only requires a few
> > hypercalls to set the shared memory region and samples are got without issuing
> > any other hypercall. The original idea has been suggested by Andrew Cooper to
> > which I have been discussing about how to implement the current PoC. You can
> > find the RFC patch series at [1]. The series is rebased on top of stable-4.15.
> > 
> > I am currently a bit blocked on 1) what to expose and 2) how to expose it. For
> > 1), I decided to expose what xcp-rrdd is querying, e.g., XEN_DOMCTL_getvcpuinfo.
> > More precisely, xcp-rrd gets runstate.time[RUNSTATE_running]. This is a uint64_t
> > counter. However, the time spent in other states may be interesting too.
> > Regarding 2), I am not sure if simply using an array of uint64_t is enough or if
> > a different interface should be exposed. The remaining question is when to get
> > new values. For the moment, I am updating this counter during
> > vcpu_runstate_change().
> > 
> > The current series includes a simple pv tool that shows how this new interface is
> > used. This tool maps the counter and periodically samples it.
> > 
> > Any feedback/help would be appreciated.
> 
> Hey Matias,
> 
> Sorry it’s taken so long to get back to you.  My day-to-day job has shifted away from technical things to community management; this has been on my radar but I never made time to dig into it.
> 
> There are some missing details I’ve had to try to piece together about the situation, so let me sketch things out a bit further and see if I understand the situation:
> 
> * xcp-rrdd currently wants (at minimum) to record runstate.time[RUNSTATE_running] for each vcpu.  Currently that means calling XEN_DOMCTL_getvcpuinfo, which has to hold a single global domctl_lock (!) for the entire hypercall; and of course must be iterated over every vcpu in the system for every update.
> 

For example, in xcp-ng, xcp-rrdd is sampling all the VCPUs of the system every 5
seconds. Also, xcp-rrdd queries other counters like XEN_DOMCTL_getdomaininfo. 

Out of curiosity, do you know of any benchmark to measure the impact of this
querying? My guess is that the time of domctl-based operations would increase
with the number of vcpus. In such a scenario, the time to query all vcpus
grows with the number of vcpus, so the domctl_lock is held longer. However,
this would only be observable on a large enough system.

> * VCPUOP_get_runstate_info copies out a vcpu_runstate_info struct, which contains information on the other runstates.  Additionally, VCPUOP_register_runstate_memory_area already does something similar to what you want: it passes a virtual address to Xen, which Xen maps, and copies information about the various vcpus into (in update_runstate_area()).
> 
> * However, the above assumes a domain of “current->domain”: That is a domain can call VCPUOP_get_runstate_info on one of its own vcpus, but dom0 cannot call it to get information about the vcpus of other domains.
> 
> * Additionally, VCPUOP_register_runstate_memory_area registers by *virtual address*; this is actually problematic even for guest kernels looking at their own vcpus; but would be completely inappropriate for a dom0 userspace application, which is what you’re looking at.
> 

I just learned about VCPUOP_register_runstate_memory_area a few days ago. I did
not know that it is only for current->domain. Thanks for pointing it out.

> Your solution is to expose things via the xenforeignmemory interface instead, modelled after the vmtrace_buf functionality.
> 
> Does that all sound right?
> 

That's correct. I used vmtrace_buf functionality for inspiration.

> I think at a high level that’s probably the right way to go.
> 
> As you say, my default would be to expose similar information as VCPUOP_get_runstate_info.  I’d even consider just using vcpu_runstate_info_t.
> 
> The other option would be to try to make the page a more general “foreign vcpu info” page, which we could expand with more information as we find it useful.
> 
> In this patch, you’re allocating 4k *per vcpu on the entire system* to hold a single 64-bit value; even if you decide to use vcpu_runstate_info_t, there’s still quite a large wastage.  Would it make sense rather to have this pass back an array of MFNs designed to be mapped contiguously, with the vcpus listed as an array? This seems to be what XENMEM_resource_ioreq_server does.
> 
> The advantage of making the structure extensible is that we wouldn’t need to add another interface, and potentially another full page, if we wanted to add more functionality that we wanted to export.  On the other hand, every new functionality that we add may require adding code to copy things into it; making it so that such code is added bit by bit as it’s requested might be better.
> 

The current PoC is indeed a waste of memory in two senses:
1) data structures are not well chosen 
2) memory is being allocated unconditionally

For 1), you propose to use an extensible structure on top of an array of MFNs. I
checked xen.git/xen/include/public/hvm/ioreq.h, which defines the following
structure:

struct shared_iopage {
    struct ioreq vcpu_ioreq[1];
};

And then, it accesses it as:

p->vcpu_ioreq[v->vcpu_id];

I could have similar structures; let me sketch them here and then write down a
design document. The extensible structures could look like:

struct vcpu_stats {
    uint64_t runstate_running_time;
    // potentially other runstate-time counters
};

struct shared_statspages {
    // potentially other counters, e.g., domain-info
    struct vcpu_stats vcpu_info[1];
};

The shared_statspages structure would be mapped on top of an array of contiguous
MFNs, with the vcpus listed as an array. I think this structure could be extended
to record per-domain counters by defining them just before vcpu_info[1].

What do you think?

For 2), you propose a domctl flag on domain creation to enable/disable the
allocation and use of these buffers. I think that is the right way to go for the
moment.

There is also a third point, regarding what Jan suggested about how to ensure that
the consumed data is consistent. I do not have a response for that yet; I will
think about it.

I will address these points and submit v1 in the next few weeks.   

Thanks, Matias.

> I have some more comments I’ll give on the 1/2 patch.
> 
>  -George
> 
> 
> 
> 
> 
> 
> > 
> > Thanks, Matias.
> > 
> > [1] https://github.com/MatiasVara/xen/tree/feature_stats
> > 
> > Matias Ezequiel Vara Larsen (2):
> >  xen/memory : Add stats_table resource type
> >  tools/misc: Add xen-stats tool
> > 
> > tools/misc/Makefile         |  5 +++
> > tools/misc/xen-stats.c      | 83 +++++++++++++++++++++++++++++++++++++
> > xen/common/domain.c         | 42 +++++++++++++++++++
> > xen/common/memory.c         | 29 +++++++++++++
> > xen/common/sched/core.c     |  5 +++
> > xen/include/public/memory.h |  1 +
> > xen/include/xen/sched.h     |  5 +++
> > 7 files changed, 170 insertions(+)
> > create mode 100644 tools/misc/xen-stats.c
> > 
> > --
> > 2.25.1
> > 
> 





* Re: [RFC PATCH 1/2] xen/memory : Add stats_table resource type
  2022-06-17 20:54   ` George Dunlap
@ 2022-06-29  9:44     ` Matias Ezequiel Vara Larsen
  0 siblings, 0 replies; 14+ messages in thread
From: Matias Ezequiel Vara Larsen @ 2022-06-29  9:44 UTC (permalink / raw)
  To: George Dunlap
  Cc: xen-devel, Matias Ezequiel Vara Larsen, Andrew Cooper,
	Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu,
	Dario Faggioli

On Fri, Jun 17, 2022 at 08:54:34PM +0000, George Dunlap wrote:
> Preface: It looks like this may be one of your first hypervisor patches.  So thank you!  FYI there’s a lot that needs fixing up here; please don’t read a tone of annoyance into it, I’m just trying to tell you what needs fixing in the most efficient manner. :-)
> 

Thanks for the comments :) I have addressed some of them already in the response to the
cover letter.

> > On 17 May 2022, at 15:33, Matias Ezequiel Vara Larsen <matiasevara@gmail.com> wrote:
> > 
> > Allow to map vcpu stats using acquire_resource().
> 
> This needs a lot more expansion in terms of what it is that you’re doing in this patch and why.
> 

Got it. I'll improve the commit message in v1.

> > 
> > Signed-off-by: Matias Ezequiel Vara Larsen <matias.vara@vates.fr>
> > ---
> > xen/common/domain.c         | 42 +++++++++++++++++++++++++++++++++++++
> > xen/common/memory.c         | 29 +++++++++++++++++++++++++
> > xen/common/sched/core.c     |  5 +++++
> > xen/include/public/memory.h |  1 +
> > xen/include/xen/sched.h     |  5 +++++
> > 5 files changed, 82 insertions(+)
> > 
> > diff --git a/xen/common/domain.c b/xen/common/domain.c
> > index 17cc32fde3..ddd9f88874 100644
> > --- a/xen/common/domain.c
> > +++ b/xen/common/domain.c
> > @@ -132,6 +132,42 @@ static void vcpu_info_reset(struct vcpu *v)
> >     v->vcpu_info_mfn = INVALID_MFN;
> > }
> > 
> > +static void stats_free_buffer(struct vcpu * v)
> > +{
> > +    struct page_info *pg = v->stats.pg;
> > +
> > +    if ( !pg )
> > +        return;
> > +
> > +    v->stats.va = NULL;
> > +
> > +    if ( v->stats.va )
> > +        unmap_domain_page_global(v->stats.va);
> > +
> > +    v->stats.va = NULL;
> 
> Looks like you meant to delete the first `v->stats.va = NULL` but forgot?
> 

Apologies for this, I completely missed.

> > +
> > +    free_domheap_page(pg);
> 
> Pretty sure this will crash.
> 
> Unfortunately page allocation and freeing is somewhat complicated and requires a bit of boilerplate.  You can look at the vmtrace_alloc_buffer() code for a template, but the general sequence you want is as follows:
> 
> * On the allocate side:
> 
> 1. Allocate the page
> 
>    pg = alloc_domheap_page(d, MEMF_no_refcount);
> 
> This will allocate a page with the PGC_allocated bit set and a single reference count.  (If you pass a page with PGC_allocated set to free_domheap_page(), it will crash; which is why I said the above.)
> 
> 2.  Grab a general reference count for the vcpu struct’s reference to it; as well as a writable type count, so that it doesn’t get used as a special page
> 
> if (get_page_and_type(pg, d, PGT_writable_page)) {
>   put_page_alloc_ref(p);
>   /* failure path */
> }
> 
> * On the free side, don’t call free_domheap_pages() directly.  Rather, drop the allocation, then drop your own type count, thus:
> 
> v->stats.va = NULL;
> 
> put_page_alloc_ref(pg);
> put_page_and_type(pg);
> 
> The issue here is that we can’t free the page until all references have dropped; and the whole point of this exercise is to allow guest userspace in dom0 to gain a reference to the page.  We can’t actually free the page until *all* references are gone, including the userspace one in dom0.  The put_page() which brings the reference count to 0 will automatically free the page.
> 

Thanks for the explanation. For some reason, this is not crashing. I will
reimplement the allocation and freeing, and then update the documentation that I
proposed at
https://lists.xenproject.org/archives/html/xen-devel/2022-05/msg01963.html. The
idea of this reference document is to guide someone who would like to export a
new resource by relying on the acquire-resource interface.

> 
> > +}
> > +
> > +static int stats_alloc_buffer(struct vcpu *v)
> > +{
> > +    struct domain *d = v->domain;
> > +    struct page_info *pg;
> > +
> > +    pg = alloc_domheap_page(d, MEMF_no_refcount);
> > +
> > +    if ( !pg )
> > +        return -ENOMEM;
> > +
> > +    v->stats.va = __map_domain_page_global(pg);
> > +    if ( !v->stats.va )
> > +        return -ENOMEM;
> > +
> > +    v->stats.pg = pg;
> > +    clear_page(v->stats.va);
> > +    return 0;
> > +}
> 
> The other thing to say about this is that the memory is being allocated unconditionally, even if nobody is planning to read it.  The vast majority of Xen users will not be running xcp-rrdd, so it will be pointless overhead.
> 
> At a basic level, you want to follow suit with the vmtrace buffers, which are only allocated if the proper domctl flag is set on domain creation.  (We could consider instead, or in addition, having a global Xen command-line parameter which would enable this feature for all domains.)
>

I agree. I will add a domctl flag on domain creation to enable the allocation of
these buffers.
 
> > +
> > static void vmtrace_free_buffer(struct vcpu *v)
> > {
> >     const struct domain *d = v->domain;
> > @@ -203,6 +239,9 @@ static int vmtrace_alloc_buffer(struct vcpu *v)
> >  */
> > static int vcpu_teardown(struct vcpu *v)
> > {
> > +
> > +    stats_free_buffer(v);
> > +
> >     vmtrace_free_buffer(v);
> > 
> >     return 0;
> > @@ -269,6 +308,9 @@ struct vcpu *vcpu_create(struct domain *d, unsigned int vcpu_id)
> >     if ( vmtrace_alloc_buffer(v) != 0 )
> >         goto fail_wq;
> > 
> > +    if ( stats_alloc_buffer(v) != 0 )
> > +        goto fail_wq;
> > +
> >     if ( arch_vcpu_create(v) != 0 )
> >         goto fail_sched;
> > 
> > diff --git a/xen/common/memory.c b/xen/common/memory.c
> > index 297b98a562..39de6d9d05 100644
> > --- a/xen/common/memory.c
> > +++ b/xen/common/memory.c
> > @@ -1099,6 +1099,10 @@ static unsigned int resource_max_frames(const struct domain *d,
> >     case XENMEM_resource_vmtrace_buf:
> >         return d->vmtrace_size >> PAGE_SHIFT;
> > 
> > +    // WIP: to figure out the correct size of the resource
> > +    case XENMEM_resource_stats_table:
> > +        return 1;
> > +
> >     default:
> >         return -EOPNOTSUPP;
> >     }
> > @@ -1162,6 +1166,28 @@ static int acquire_vmtrace_buf(
> >     return nr_frames;
> > }
> > 
> > +static int acquire_stats_table(struct domain *d,
> > +                                unsigned int id,
> > +                                unsigned int frame,
> > +                                unsigned int nr_frames,
> > +                                xen_pfn_t mfn_list[])
> > +{
> > +    const struct vcpu *v = domain_vcpu(d, id);
> > +    mfn_t mfn;
> > +
> 
> Maybe I’m paranoid, but I might add an “ASSERT(nr_frames == 1)” here
>

Thanks, I will keep this in mind for v1.
 
> > +    if ( !v )
> > +        return -ENOENT;
> > +
> > +    if ( !v->stats.pg )
> > +        return -EINVAL;
> > +
> > +    mfn = page_to_mfn(v->stats.pg);
> > +    mfn_list[0] = mfn_x(mfn);
> > +
> > +    printk("acquire_perf_table: id: %d, nr_frames: %d, %p, domainid: %d\n", id, nr_frames, v->stats.pg, d->domain_id);
> > +    return 1;
> > +}
> > +
> > /*
> >  * Returns -errno on error, or positive in the range [1, nr_frames] on
> > * success.  Returning less than nr_frames constitutes a request for a
> > @@ -1182,6 +1208,9 @@ static int _acquire_resource(
> >     case XENMEM_resource_vmtrace_buf:
> >         return acquire_vmtrace_buf(d, id, frame, nr_frames, mfn_list);
> > 
> > +    case XENMEM_resource_stats_table:
> > +        return acquire_stats_table(d, id, frame, nr_frames, mfn_list);
> > +
> >     default:
> >         return -EOPNOTSUPP;
> >     }
> > diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
> > index 8f4b1ca10d..2a8b534977 100644
> > --- a/xen/common/sched/core.c
> > +++ b/xen/common/sched/core.c
> > @@ -264,6 +264,7 @@ static inline void vcpu_runstate_change(
> > {
> >     s_time_t delta;
> >     struct sched_unit *unit = v->sched_unit;
> > +    uint64_t * runstate;
> > 
> >     ASSERT(spin_is_locked(get_sched_res(v->processor)->schedule_lock));
> >     if ( v->runstate.state == new_state )
> > @@ -287,6 +288,10 @@ static inline void vcpu_runstate_change(
> >     }
> > 
> >     v->runstate.state = new_state;
> > +
> > +    // WIP: use a different interface
> > +    runstate = (uint64_t*)v->stats.va;
> 
> I think you should look at xen.git/xen/include/public/hvm/ioreq.h:shared_iopage_t for inspiration.  Basically, you cast the void pointer to that type, and then just access `iopage->vcpu_ioreq[vcpuid]`.   Put it in a public header, and then both the userspace consumer and Xen can use the same structure.
> 

Thanks for pointing it out. I've addressed this comment in my response to the
cover letter.

> As I said in my response to the cover letter, I think vcpu_runstate_info_t would be something to look at and gain inspiration from.
> 
> The other thing to say here is that this is a hot path; we don’t want to be copying lots of information around if it’s not going to be used.  So you should only allocate the buffers if specifically enabled, and you should only copy the information over if v->stats.va != NULL.
> 
> I think that should be enough to start with; we can nail down more when you send v1.
> 

Thanks for the comments, I will tackle them in v1.
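
Concretely, I imagine the hot-path update ends up looking something like the
sketch below; the struct type and the write barrier are my assumptions for
now, pending the interface discussion:

```c
/* In vcpu_runstate_change(), after v->runstate.state is updated. */
if ( v->stats.va )
{
    struct vcpu_stats *stats = v->stats.va;

    /* Publish the running time; the consumer maps this page read-only. */
    stats->runstate_running_time = v->runstate.time[RUNSTATE_running];
    smp_wmb();
}
```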

Matias

> Peace,
>  -George
> 





Thread overview: 14+ messages
2022-05-17 14:33 [RFC PATCH 0/2] Add a new acquire resource to query vcpu statistics Matias Ezequiel Vara Larsen
2022-05-17 14:33 ` [RFC PATCH 1/2] xen/memory : Add stats_table resource type Matias Ezequiel Vara Larsen
2022-06-17 20:54   ` George Dunlap
2022-06-29  9:44     ` Matias Ezequiel Vara Larsen
2022-06-22  8:59   ` Jan Beulich
2022-06-22 10:05     ` George Dunlap
2022-06-22 10:09       ` Jan Beulich
2022-05-17 14:33 ` [RFC PATCH 2/2] tools/misc: Add xen-stats tool Matias Ezequiel Vara Larsen
2022-05-31 11:16   ` Anthony PERARD
2022-06-03 11:08     ` Matias Ezequiel Vara Larsen
2022-06-20  8:56       ` Matias Ezequiel Vara Larsen
2022-06-17  3:26 ` [RFC PATCH 0/2] Add a new acquire resource to query vcpu statistics Henry Wang
2022-06-17 19:54 ` George Dunlap
2022-06-24 13:06   ` Matias Ezequiel Vara Larsen
