From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Yuval Shaia <yuval.shaia@oracle.com>
Cc: marcel.apfelbaum@gmail.com, armbru@redhat.com, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH v4 4/9] {hmp, hw/pvrdma}: Expose device internals via monitor interface
Date: Wed, 6 Mar 2019 12:20:49 +0000
Message-ID: <20190306122048.GE2727@work-vm>
In-Reply-To: <20190306102218.GB7486@lap1>

* Yuval Shaia (yuval.shaia@oracle.com) wrote:
> On Sun, Mar 03, 2019 at 10:33:40PM +0200, Yuval Shaia wrote:
> > Allow interrogating device internals through HMP interface.
> > The exposed indicators can be used for troubleshooting by developers or
> > sysadmins.
> > There is no need to expose these attributes to a management system (e.g.
> > libvirt) because (1) most of them are not "device-management" related
> > info and (2) there is no guarantee the interface is stable.
> > 
> > Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
> > ---
> >  hmp-commands-info.hx       | 14 ++++++++
> >  hmp.c                      | 27 +++++++++++++++
> >  hmp.h                      |  1 +
> >  hw/rdma/Makefile.objs      |  2 +-
> >  hw/rdma/rdma_backend.c     | 70 +++++++++++++++++++++++++++++---------
> >  hw/rdma/rdma_hmp.c         | 30 ++++++++++++++++
> >  hw/rdma/rdma_rm.c          | 60 ++++++++++++++++++++++++++++++++
> >  hw/rdma/rdma_rm.h          |  1 +
> >  hw/rdma/rdma_rm_defs.h     | 27 ++++++++++++++-
> >  hw/rdma/vmw/pvrdma.h       |  5 +++
> >  hw/rdma/vmw/pvrdma_main.c  | 21 ++++++++++++
> >  include/hw/rdma/rdma_hmp.h | 40 ++++++++++++++++++++++
> >  12 files changed, 279 insertions(+), 19 deletions(-)
> >  create mode 100644 hw/rdma/rdma_hmp.c
> >  create mode 100644 include/hw/rdma/rdma_hmp.h
> > 
> > diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
> > index cbee8b944d..c59444c461 100644
> > --- a/hmp-commands-info.hx
> > +++ b/hmp-commands-info.hx
> > @@ -202,6 +202,20 @@ STEXI
> >  @item info pic
> >  @findex info pic
> >  Show PIC state.
> > +ETEXI
> > +
> > +    {
> > +        .name       = "rdma",
> > +        .args_type  = "",
> > +        .params     = "",
> > +        .help       = "show RDMA state",
> > +        .cmd        = hmp_info_rdma,
> > +    },
> > +
> > +STEXI
> > +@item info rdma
> > +@findex info rdma
> > +Show RDMA state.
> >  ETEXI
> >  
> >      {
> > diff --git a/hmp.c b/hmp.c
> > index 1e006eeb49..68511b2441 100644
> > --- a/hmp.c
> > +++ b/hmp.c
> > @@ -51,6 +51,7 @@
> >  #include "qemu/error-report.h"
> >  #include "exec/ramlist.h"
> >  #include "hw/intc/intc.h"
> > +#include "hw/rdma/rdma_hmp.h"
> >  #include "migration/snapshot.h"
> >  #include "migration/misc.h"
> >  
> > @@ -968,6 +969,32 @@ void hmp_info_pic(Monitor *mon, const QDict *qdict)
> >                                     hmp_info_pic_foreach, mon);
> >  }
> >  
> > +static int hmp_info_rdma_foreach(Object *obj, void *opaque)
> > +{
> > +    RdmaStatsProvider *rdma;
> > +    RdmaStatsProviderClass *k;
> > +    Monitor *mon = opaque;
> > +
> > +    if (object_dynamic_cast(obj, TYPE_RDMA_STATS_PROVIDER)) {
> > +        rdma = RDMA_STATS_PROVIDER(obj);
> > +        k = RDMA_STATS_PROVIDER_GET_CLASS(obj);
> > +        if (k->print_statistics) {
> > +            k->print_statistics(mon, rdma);
> > +        } else {
> > +            monitor_printf(mon, "RDMA statistics not available for %s.\n",
> > +                           object_get_typename(obj));
> > +        }
> > +    }
> > +
> > +    return 0;
> > +}
> > +
> > +void hmp_info_rdma(Monitor *mon, const QDict *qdict)
> > +{
> > +    object_child_foreach_recursive(object_get_root(),
> > +                                   hmp_info_rdma_foreach, mon);
> > +}
> > +
> 
> Hi Markus and David,
> Is this implementation acceptable to you?

I think from the HMP side I'm fine, so:

Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

I would say that there are really two parts to this patch: all the places
where you change your code to gather the stats, and the separate pieces
that wire it into HMP. If you have to repost for another reason, I'd
split the patch like that.
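
To illustrate the separation between the two parts, here is a minimal,
self-contained sketch. It is not QEMU code: DemoStats, DemoDevice and the
demo_* functions are hypothetical names made up for this example, and a
plain FILE * stands in for the monitor. The first half only updates
counters; the second half only reports them through an optional callback,
much like the print_statistics hook in the patch.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Part 1: counters owned and updated by the device code (hypothetical). */
typedef struct DemoStats {
    uint64_t tx;
    uint64_t tx_err;
} DemoStats;

/* Count one send attempt; rc != 0 means the send failed. */
static void demo_post_send(DemoStats *stats, int rc)
{
    if (rc) {
        stats->tx_err++;
    } else {
        stats->tx++;
    }
}

/* Part 2: reporting, wired in through an optional callback. */
typedef struct DemoDevice DemoDevice;
typedef void (*DemoPrintFn)(FILE *out, const DemoDevice *dev);

struct DemoDevice {
    const char *name;
    DemoStats stats;
    DemoPrintFn print_statistics; /* may be NULL */
};

/* Device-specific printer; it only reads the counters. */
static void demo_print_statistics(FILE *out, const DemoDevice *dev)
{
    fprintf(out, "%s\n", dev->name);
    fprintf(out, "\ttx     : %" PRIu64 "\n", dev->stats.tx);
    fprintf(out, "\ttx_err : %" PRIu64 "\n", dev->stats.tx_err);
}

/*
 * Generic "info" entry point: call the device's printer if it has one,
 * otherwise say that no statistics are available.
 * Returns 1 if statistics were printed, 0 otherwise.
 */
static int demo_info(FILE *out, const DemoDevice *dev)
{
    assert(out != NULL && dev != NULL);

    if (dev->print_statistics) {
        dev->print_statistics(out, dev);
        return 1;
    }

    fprintf(out, "statistics not available for %s.\n", dev->name);
    return 0;
}
```

With this layout the counter updates (demo_post_send) could go in one
patch and the reporting hook (demo_print_statistics, demo_info) in
another, since neither half needs to know how the other is implemented.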

Dave

> Thanks,
> Yuval
> 
> 
> >  void hmp_info_pci(Monitor *mon, const QDict *qdict)
> >  {
> >      PciInfoList *info_list, *info;
> > diff --git a/hmp.h b/hmp.h
> > index 5f1addcca2..666949afc3 100644
> > --- a/hmp.h
> > +++ b/hmp.h
> > @@ -36,6 +36,7 @@ void hmp_info_spice(Monitor *mon, const QDict *qdict);
> >  void hmp_info_balloon(Monitor *mon, const QDict *qdict);
> >  void hmp_info_irq(Monitor *mon, const QDict *qdict);
> >  void hmp_info_pic(Monitor *mon, const QDict *qdict);
> > +void hmp_info_rdma(Monitor *mon, const QDict *qdict);
> >  void hmp_info_pci(Monitor *mon, const QDict *qdict);
> >  void hmp_info_block_jobs(Monitor *mon, const QDict *qdict);
> >  void hmp_info_tpm(Monitor *mon, const QDict *qdict);
> > diff --git a/hw/rdma/Makefile.objs b/hw/rdma/Makefile.objs
> > index bd36cbf51c..dd59faff51 100644
> > --- a/hw/rdma/Makefile.objs
> > +++ b/hw/rdma/Makefile.objs
> > @@ -1,5 +1,5 @@
> >  ifeq ($(CONFIG_PVRDMA),y)
> > -obj-$(CONFIG_PCI) += rdma_utils.o rdma_backend.o rdma_rm.o
> > +obj-$(CONFIG_PCI) += rdma_utils.o rdma_backend.o rdma_rm.o rdma_hmp.o
> >  obj-$(CONFIG_PCI) += vmw/pvrdma_dev_ring.o vmw/pvrdma_cmd.o \
> >                       vmw/pvrdma_qp_ops.o vmw/pvrdma_main.o
> >  endif
> > diff --git a/hw/rdma/rdma_backend.c b/hw/rdma/rdma_backend.c
> > index 9679b842d1..bc2fefcf93 100644
> > --- a/hw/rdma/rdma_backend.c
> > +++ b/hw/rdma/rdma_backend.c
> > @@ -64,9 +64,9 @@ static inline void complete_work(enum ibv_wc_status status, uint32_t vendor_err,
> >      comp_handler(ctx, &wc);
> >  }
> >  
> > -static void rdma_poll_cq(RdmaDeviceResources *rdma_dev_res, struct ibv_cq *ibcq)
> > +static int rdma_poll_cq(RdmaDeviceResources *rdma_dev_res, struct ibv_cq *ibcq)
> >  {
> > -    int i, ne;
> > +    int i, ne, total_ne = 0;
> >      BackendCtx *bctx;
> >      struct ibv_wc wc[2];
> >  
> > @@ -89,12 +89,18 @@ static void rdma_poll_cq(RdmaDeviceResources *rdma_dev_res, struct ibv_cq *ibcq)
> >              rdma_rm_dealloc_cqe_ctx(rdma_dev_res, wc[i].wr_id);
> >              g_free(bctx);
> >          }
> > +        total_ne += ne;
> >      } while (ne > 0);
> > +    atomic_sub(&rdma_dev_res->stats.missing_cqe, total_ne);
> >      qemu_mutex_unlock(&rdma_dev_res->lock);
> >  
> >      if (ne < 0) {
> >          rdma_error_report("ibv_poll_cq fail, rc=%d, errno=%d", ne, errno);
> >      }
> > +
> > +    rdma_dev_res->stats.completions += total_ne;
> > +
> > +    return total_ne;
> >  }
> >  
> >  static void *comp_handler_thread(void *arg)
> > @@ -122,6 +128,9 @@ static void *comp_handler_thread(void *arg)
> >      while (backend_dev->comp_thread.run) {
> >          do {
> >              rc = qemu_poll_ns(pfds, 1, THR_POLL_TO * (int64_t)SCALE_MS);
> > +            if (!rc) {
> > +                backend_dev->rdma_dev_res->stats.poll_cq_ppoll_to++;
> > +            }
> >          } while (!rc && backend_dev->comp_thread.run);
> >  
> >          if (backend_dev->comp_thread.run) {
> > @@ -138,6 +147,7 @@ static void *comp_handler_thread(void *arg)
> >                                    errno);
> >              }
> >  
> > +            backend_dev->rdma_dev_res->stats.poll_cq_from_bk++;
> >              rdma_poll_cq(backend_dev->rdma_dev_res, ev_cq);
> >  
> >              ibv_ack_cq_events(ev_cq, 1);
> > @@ -271,7 +281,13 @@ int rdma_backend_query_port(RdmaBackendDev *backend_dev,
> >  
> >  void rdma_backend_poll_cq(RdmaDeviceResources *rdma_dev_res, RdmaBackendCQ *cq)
> >  {
> > -    rdma_poll_cq(rdma_dev_res, cq->ibcq);
> > +    int polled;
> > +
> > +    rdma_dev_res->stats.poll_cq_from_guest++;
> > +    polled = rdma_poll_cq(rdma_dev_res, cq->ibcq);
> > +    if (!polled) {
> > +        rdma_dev_res->stats.poll_cq_from_guest_empty++;
> > +    }
> >  }
> >  
> >  static GHashTable *ah_hash;
> > @@ -333,7 +349,7 @@ static void ah_cache_init(void)
> >  
> >  static int build_host_sge_array(RdmaDeviceResources *rdma_dev_res,
> >                                  struct ibv_sge *dsge, struct ibv_sge *ssge,
> > -                                uint8_t num_sge)
> > +                                uint8_t num_sge, uint64_t *total_length)
> >  {
> >      RdmaRmMR *mr;
> >      int ssge_idx;
> > @@ -349,6 +365,8 @@ static int build_host_sge_array(RdmaDeviceResources *rdma_dev_res,
> >          dsge->length = ssge[ssge_idx].length;
> >          dsge->lkey = rdma_backend_mr_lkey(&mr->backend_mr);
> >  
> > +        *total_length += dsge->length;
> > +
> >          dsge++;
> >      }
> >  
> > @@ -445,8 +463,10 @@ void rdma_backend_post_send(RdmaBackendDev *backend_dev,
> >              rc = mad_send(backend_dev, sgid_idx, sgid, sge, num_sge);
> >              if (rc) {
> >                  complete_work(IBV_WC_GENERAL_ERR, VENDOR_ERR_MAD_SEND, ctx);
> > +                backend_dev->rdma_dev_res->stats.mad_tx_err++;
> >              } else {
> >                  complete_work(IBV_WC_SUCCESS, 0, ctx);
> > +                backend_dev->rdma_dev_res->stats.mad_tx++;
> >              }
> >          }
> >          return;
> > @@ -458,20 +478,21 @@ void rdma_backend_post_send(RdmaBackendDev *backend_dev,
> >      rc = rdma_rm_alloc_cqe_ctx(backend_dev->rdma_dev_res, &bctx_id, bctx);
> >      if (unlikely(rc)) {
> >          complete_work(IBV_WC_GENERAL_ERR, VENDOR_ERR_NOMEM, ctx);
> > -        goto out_free_bctx;
> > +        goto err_free_bctx;
> >      }
> >  
> > -    rc = build_host_sge_array(backend_dev->rdma_dev_res, new_sge, sge, num_sge);
> > +    rc = build_host_sge_array(backend_dev->rdma_dev_res, new_sge, sge, num_sge,
> > +                              &backend_dev->rdma_dev_res->stats.tx_len);
> >      if (rc) {
> >          complete_work(IBV_WC_GENERAL_ERR, rc, ctx);
> > -        goto out_dealloc_cqe_ctx;
> > +        goto err_dealloc_cqe_ctx;
> >      }
> >  
> >      if (qp_type == IBV_QPT_UD) {
> >          wr.wr.ud.ah = create_ah(backend_dev, qp->ibpd, sgid_idx, dgid);
> >          if (!wr.wr.ud.ah) {
> >              complete_work(IBV_WC_GENERAL_ERR, VENDOR_ERR_FAIL_BACKEND, ctx);
> > -            goto out_dealloc_cqe_ctx;
> > +            goto err_dealloc_cqe_ctx;
> >          }
> >          wr.wr.ud.remote_qpn = dqpn;
> >          wr.wr.ud.remote_qkey = dqkey;
> > @@ -488,15 +509,19 @@ void rdma_backend_post_send(RdmaBackendDev *backend_dev,
> >          rdma_error_report("ibv_post_send fail, qpn=0x%x, rc=%d, errno=%d",
> >                            qp->ibqp->qp_num, rc, errno);
> >          complete_work(IBV_WC_GENERAL_ERR, VENDOR_ERR_FAIL_BACKEND, ctx);
> > -        goto out_dealloc_cqe_ctx;
> > +        goto err_dealloc_cqe_ctx;
> >      }
> >  
> > +    atomic_inc(&backend_dev->rdma_dev_res->stats.missing_cqe);
> > +    backend_dev->rdma_dev_res->stats.tx++;
> > +
> >      return;
> >  
> > -out_dealloc_cqe_ctx:
> > +err_dealloc_cqe_ctx:
> > +    backend_dev->rdma_dev_res->stats.tx_err++;
> >      rdma_rm_dealloc_cqe_ctx(backend_dev->rdma_dev_res, bctx_id);
> >  
> > -out_free_bctx:
> > +err_free_bctx:
> >      g_free(bctx);
> >  }
> >  
> > @@ -554,6 +579,9 @@ void rdma_backend_post_recv(RdmaBackendDev *backend_dev,
> >              rc = save_mad_recv_buffer(backend_dev, sge, num_sge, ctx);
> >              if (rc) {
> >                  complete_work(IBV_WC_GENERAL_ERR, rc, ctx);
> > +                rdma_dev_res->stats.mad_rx_bufs_err++;
> > +            } else {
> > +                rdma_dev_res->stats.mad_rx_bufs++;
> >              }
> >          }
> >          return;
> > @@ -565,13 +593,14 @@ void rdma_backend_post_recv(RdmaBackendDev *backend_dev,
> >      rc = rdma_rm_alloc_cqe_ctx(rdma_dev_res, &bctx_id, bctx);
> >      if (unlikely(rc)) {
> >          complete_work(IBV_WC_GENERAL_ERR, VENDOR_ERR_NOMEM, ctx);
> > -        goto out_free_bctx;
> > +        goto err_free_bctx;
> >      }
> >  
> > -    rc = build_host_sge_array(rdma_dev_res, new_sge, sge, num_sge);
> > +    rc = build_host_sge_array(rdma_dev_res, new_sge, sge, num_sge,
> > +                              &backend_dev->rdma_dev_res->stats.rx_bufs_len);
> >      if (rc) {
> >          complete_work(IBV_WC_GENERAL_ERR, rc, ctx);
> > -        goto out_dealloc_cqe_ctx;
> > +        goto err_dealloc_cqe_ctx;
> >      }
> >  
> >      wr.num_sge = num_sge;
> > @@ -582,15 +611,19 @@ void rdma_backend_post_recv(RdmaBackendDev *backend_dev,
> >          rdma_error_report("ibv_post_recv fail, qpn=0x%x, rc=%d, errno=%d",
> >                            qp->ibqp->qp_num, rc, errno);
> >          complete_work(IBV_WC_GENERAL_ERR, VENDOR_ERR_FAIL_BACKEND, ctx);
> > -        goto out_dealloc_cqe_ctx;
> > +        goto err_dealloc_cqe_ctx;
> >      }
> >  
> > +    atomic_inc(&backend_dev->rdma_dev_res->stats.missing_cqe);
> > +    rdma_dev_res->stats.rx_bufs++;
> > +
> >      return;
> >  
> > -out_dealloc_cqe_ctx:
> > +err_dealloc_cqe_ctx:
> > +    backend_dev->rdma_dev_res->stats.rx_bufs_err++;
> >      rdma_rm_dealloc_cqe_ctx(rdma_dev_res, bctx_id);
> >  
> > -out_free_bctx:
> > +err_free_bctx:
> >      g_free(bctx);
> >  }
> >  
> > @@ -929,12 +962,14 @@ static void process_incoming_mad_req(RdmaBackendDev *backend_dev,
> >      bctx = rdma_rm_get_cqe_ctx(backend_dev->rdma_dev_res, cqe_ctx_id);
> >      if (unlikely(!bctx)) {
> >          rdma_error_report("No matching ctx for req %ld", cqe_ctx_id);
> > +        backend_dev->rdma_dev_res->stats.mad_rx_err++;
> >          return;
> >      }
> >  
> >      mad = rdma_pci_dma_map(backend_dev->dev, bctx->sge.addr,
> >                             bctx->sge.length);
> >      if (!mad || bctx->sge.length < msg->umad_len + MAD_HDR_SIZE) {
> > +        backend_dev->rdma_dev_res->stats.mad_rx_err++;
> >          complete_work(IBV_WC_GENERAL_ERR, VENDOR_ERR_INV_MAD_BUFF,
> >                        bctx->up_ctx);
> >      } else {
> > @@ -949,6 +984,7 @@ static void process_incoming_mad_req(RdmaBackendDev *backend_dev,
> >          wc.byte_len = msg->umad_len;
> >          wc.status = IBV_WC_SUCCESS;
> >          wc.wc_flags = IBV_WC_GRH;
> > +        backend_dev->rdma_dev_res->stats.mad_rx++;
> >          comp_handler(bctx->up_ctx, &wc);
> >      }
> >  
> > diff --git a/hw/rdma/rdma_hmp.c b/hw/rdma/rdma_hmp.c
> > new file mode 100644
> > index 0000000000..c5814473c5
> > --- /dev/null
> > +++ b/hw/rdma/rdma_hmp.c
> > @@ -0,0 +1,30 @@
> > +/*
> > + * RDMA device: Human Monitor interface
> > + *
> > + * Copyright (C) 2018 Oracle
> > + * Copyright (C) 2018 Red Hat Inc
> > + *
> > + * Authors:
> > + *     Yuval Shaia <yuval.shaia@oracle.com>
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> > + * See the COPYING file in the top-level directory.
> > + *
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "hw/rdma/rdma_hmp.h"
> > +#include "qemu/module.h"
> > +
> > +static const TypeInfo rdma_hmp_info = {
> > +    .name = TYPE_RDMA_STATS_PROVIDER,
> > +    .parent = TYPE_INTERFACE,
> > +    .class_size = sizeof(RdmaStatsProviderClass),
> > +};
> > +
> > +static void rdma_hmp_register_types(void)
> > +{
> > +    type_register_static(&rdma_hmp_info);
> > +}
> > +
> > +type_init(rdma_hmp_register_types)
> > diff --git a/hw/rdma/rdma_rm.c b/hw/rdma/rdma_rm.c
> > index 14580ca379..e019de1a14 100644
> > --- a/hw/rdma/rdma_rm.c
> > +++ b/hw/rdma/rdma_rm.c
> > @@ -16,6 +16,7 @@
> >  #include "qemu/osdep.h"
> >  #include "qapi/error.h"
> >  #include "cpu.h"
> > +#include "monitor/monitor.h"
> >  
> >  #include "trace.h"
> >  #include "rdma_utils.h"
> > @@ -26,6 +27,58 @@
> >  #define PG_DIR_SZ { TARGET_PAGE_SIZE / sizeof(__u64) }
> >  #define PG_TBL_SZ { TARGET_PAGE_SIZE / sizeof(__u64) }
> >  
> > +void rdma_dump_device_counters(Monitor *mon, RdmaDeviceResources *dev_res)
> > +{
> > +    monitor_printf(mon, "\ttx               : %" PRId64 "\n",
> > +                   dev_res->stats.tx);
> > +    monitor_printf(mon, "\ttx_len           : %" PRId64 "\n",
> > +                   dev_res->stats.tx_len);
> > +    monitor_printf(mon, "\ttx_err           : %" PRId64 "\n",
> > +                   dev_res->stats.tx_err);
> > +    monitor_printf(mon, "\trx_bufs          : %" PRId64 "\n",
> > +                   dev_res->stats.rx_bufs);
> > +    monitor_printf(mon, "\trx_bufs_len      : %" PRId64 "\n",
> > +                   dev_res->stats.rx_bufs_len);
> > +    monitor_printf(mon, "\trx_bufs_err      : %" PRId64 "\n",
> > +                   dev_res->stats.rx_bufs_err);
> > +    monitor_printf(mon, "\tcomps            : %" PRId64 "\n",
> > +                   dev_res->stats.completions);
> > +    monitor_printf(mon, "\tmissing_comps    : %" PRId32 "\n",
> > +                   dev_res->stats.missing_cqe);
> > +    monitor_printf(mon, "\tpoll_cq (bk)     : %" PRId64 "\n",
> > +                   dev_res->stats.poll_cq_from_bk);
> > +    monitor_printf(mon, "\tpoll_cq_ppoll_to : %" PRId64 "\n",
> > +                   dev_res->stats.poll_cq_ppoll_to);
> > +    monitor_printf(mon, "\tpoll_cq (fe)     : %" PRId64 "\n",
> > +                   dev_res->stats.poll_cq_from_guest);
> > +    monitor_printf(mon, "\tpoll_cq_empty    : %" PRId64 "\n",
> > +                   dev_res->stats.poll_cq_from_guest_empty);
> > +    monitor_printf(mon, "\tmad_tx           : %" PRId64 "\n",
> > +                   dev_res->stats.mad_tx);
> > +    monitor_printf(mon, "\tmad_tx_err       : %" PRId64 "\n",
> > +                   dev_res->stats.mad_tx_err);
> > +    monitor_printf(mon, "\tmad_rx           : %" PRId64 "\n",
> > +                   dev_res->stats.mad_rx);
> > +    monitor_printf(mon, "\tmad_rx_err       : %" PRId64 "\n",
> > +                   dev_res->stats.mad_rx_err);
> > +    monitor_printf(mon, "\tmad_rx_bufs      : %" PRId64 "\n",
> > +                   dev_res->stats.mad_rx_bufs);
> > +    monitor_printf(mon, "\tmad_rx_bufs_err  : %" PRId64 "\n",
> > +                   dev_res->stats.mad_rx_bufs_err);
> > +    monitor_printf(mon, "\tPDs              : %" PRId32 "\n",
> > +                   dev_res->pd_tbl.used);
> > +    monitor_printf(mon, "\tMRs              : %" PRId32 "\n",
> > +                   dev_res->mr_tbl.used);
> > +    monitor_printf(mon, "\tUCs              : %" PRId32 "\n",
> > +                   dev_res->uc_tbl.used);
> > +    monitor_printf(mon, "\tQPs              : %" PRId32 "\n",
> > +                   dev_res->qp_tbl.used);
> > +    monitor_printf(mon, "\tCQs              : %" PRId32 "\n",
> > +                   dev_res->cq_tbl.used);
> > +    monitor_printf(mon, "\tCQE_CTXs         : %" PRId32 "\n",
> > +                   dev_res->cqe_ctx_tbl.used);
> > +}
> > +
> >  static inline void res_tbl_init(const char *name, RdmaRmResTbl *tbl,
> >                                  uint32_t tbl_sz, uint32_t res_sz)
> >  {
> > @@ -37,6 +90,7 @@ static inline void res_tbl_init(const char *name, RdmaRmResTbl *tbl,
> >      tbl->bitmap = bitmap_new(tbl_sz);
> >      tbl->tbl_sz = tbl_sz;
> >      tbl->res_sz = res_sz;
> > +    tbl->used = 0;
> >      qemu_mutex_init(&tbl->lock);
> >  }
> >  
> > @@ -76,6 +130,8 @@ static inline void *rdma_res_tbl_alloc(RdmaRmResTbl *tbl, uint32_t *handle)
> >  
> >      set_bit(*handle, tbl->bitmap);
> >  
> > +    tbl->used++;
> > +
> >      qemu_mutex_unlock(&tbl->lock);
> >  
> >      memset(tbl->tbl + *handle * tbl->res_sz, 0, tbl->res_sz);
> > @@ -93,6 +149,7 @@ static inline void rdma_res_tbl_dealloc(RdmaRmResTbl *tbl, uint32_t handle)
> >  
> >      if (handle < tbl->tbl_sz) {
> >          clear_bit(handle, tbl->bitmap);
> > +        tbl->used--;
> >      }
> >  
> >      qemu_mutex_unlock(&tbl->lock);
> > @@ -620,6 +677,9 @@ int rdma_rm_init(RdmaDeviceResources *dev_res, struct ibv_device_attr *dev_attr,
> >  
> >      qemu_mutex_init(&dev_res->lock);
> >  
> > +    memset(&dev_res->stats, 0, sizeof(dev_res->stats));
> > +    atomic_set(&dev_res->stats.missing_cqe, 0);
> > +
> >      return 0;
> >  }
> >  
> > diff --git a/hw/rdma/rdma_rm.h b/hw/rdma/rdma_rm.h
> > index 9ec87d2667..a527d306f6 100644
> > --- a/hw/rdma/rdma_rm.h
> > +++ b/hw/rdma/rdma_rm.h
> > @@ -20,6 +20,7 @@
> >  #include "rdma_backend_defs.h"
> >  #include "rdma_rm_defs.h"
> >  
> > +void rdma_dump_device_counters(Monitor *mon, RdmaDeviceResources *dev_res);
> >  int rdma_rm_init(RdmaDeviceResources *dev_res, struct ibv_device_attr *dev_attr,
> >                   Error **errp);
> >  void rdma_rm_fini(RdmaDeviceResources *dev_res, RdmaBackendDev *backend_dev,
> > diff --git a/hw/rdma/rdma_rm_defs.h b/hw/rdma/rdma_rm_defs.h
> > index f0ee1f3072..4b8d704cfe 100644
> > --- a/hw/rdma/rdma_rm_defs.h
> > +++ b/hw/rdma/rdma_rm_defs.h
> > @@ -34,7 +34,9 @@
> >  #define MAX_QP_INIT_RD_ATOM   16
> >  #define MAX_AH                64
> >  
> > -#define MAX_RM_TBL_NAME 16
> > +#define MAX_RM_TBL_NAME             16
> > +#define MAX_CONSEQ_EMPTY_POLL_CQ    4096 /* considered as error above this */
> > +
> >  typedef struct RdmaRmResTbl {
> >      char name[MAX_RM_TBL_NAME];
> >      QemuMutex lock;
> > @@ -42,6 +44,7 @@ typedef struct RdmaRmResTbl {
> >      size_t tbl_sz;
> >      size_t res_sz;
> >      void *tbl;
> > +    uint32_t used; /* number of used entries in the table */
> >  } RdmaRmResTbl;
> >  
> >  typedef struct RdmaRmPD {
> > @@ -96,6 +99,27 @@ typedef struct RdmaRmPort {
> >      enum ibv_port_state state;
> >  } RdmaRmPort;
> >  
> > +typedef struct RdmaRmStats {
> > +    uint64_t tx;
> > +    uint64_t tx_len;
> > +    uint64_t tx_err;
> > +    uint64_t rx_bufs;
> > +    uint64_t rx_bufs_len;
> > +    uint64_t rx_bufs_err;
> > +    uint64_t completions;
> > +    uint64_t mad_tx;
> > +    uint64_t mad_tx_err;
> > +    uint64_t mad_rx;
> > +    uint64_t mad_rx_err;
> > +    uint64_t mad_rx_bufs;
> > +    uint64_t mad_rx_bufs_err;
> > +    uint64_t poll_cq_from_bk;
> > +    uint64_t poll_cq_from_guest;
> > +    uint64_t poll_cq_from_guest_empty;
> > +    uint64_t poll_cq_ppoll_to;
> > +    uint32_t missing_cqe;
> > +} RdmaRmStats;
> > +
> >  typedef struct RdmaDeviceResources {
> >      RdmaRmPort port;
> >      RdmaRmResTbl pd_tbl;
> > @@ -106,6 +130,7 @@ typedef struct RdmaDeviceResources {
> >      RdmaRmResTbl cqe_ctx_tbl;
> >      GHashTable *qp_hash; /* Keeps mapping between real and emulated */
> >      QemuMutex lock;
> > +    RdmaRmStats stats;
> >  } RdmaDeviceResources;
> >  
> >  #endif
> > diff --git a/hw/rdma/vmw/pvrdma.h b/hw/rdma/vmw/pvrdma.h
> > index 0879224957..167706ec2c 100644
> > --- a/hw/rdma/vmw/pvrdma.h
> > +++ b/hw/rdma/vmw/pvrdma.h
> > @@ -70,6 +70,10 @@ typedef struct DSRInfo {
> >      PvrdmaRing cq;
> >  } DSRInfo;
> >  
> > +typedef struct PVRDMADevStats {
> > +    uint64_t commands;
> > +} PVRDMADevStats;
> > +
> >  typedef struct PVRDMADev {
> >      PCIDevice parent_obj;
> >      MemoryRegion msix;
> > @@ -89,6 +93,7 @@ typedef struct PVRDMADev {
> >      CharBackend mad_chr;
> >      VMXNET3State *func0;
> >      Notifier shutdown_notifier;
> > +    PVRDMADevStats stats;
> >  } PVRDMADev;
> >  #define PVRDMA_DEV(dev) OBJECT_CHECK(PVRDMADev, (dev), PVRDMA_HW_NAME)
> >  
> > diff --git a/hw/rdma/vmw/pvrdma_main.c b/hw/rdma/vmw/pvrdma_main.c
> > index b6061f4b6e..659331ac93 100644
> > --- a/hw/rdma/vmw/pvrdma_main.c
> > +++ b/hw/rdma/vmw/pvrdma_main.c
> > @@ -25,6 +25,8 @@
> >  #include "cpu.h"
> >  #include "trace.h"
> >  #include "sysemu/sysemu.h"
> > +#include "monitor/monitor.h"
> > +#include "hw/rdma/rdma_hmp.h"
> >  
> >  #include "../rdma_rm.h"
> >  #include "../rdma_backend.h"
> > @@ -55,6 +57,18 @@ static Property pvrdma_dev_properties[] = {
> >      DEFINE_PROP_END_OF_LIST(),
> >  };
> >  
> > +static void pvrdma_print_statistics(Monitor *mon, RdmaStatsProvider *obj)
> > +{
> > +    PVRDMADev *dev = PVRDMA_DEV(obj);
> > +    PCIDevice *pdev = PCI_DEVICE(dev);
> > +
> > +    monitor_printf(mon, "%s, %x.%x\n", pdev->name, PCI_SLOT(pdev->devfn),
> > +                   PCI_FUNC(pdev->devfn));
> > +    monitor_printf(mon, "\tcommands         : %" PRId64 "\n",
> > +                   dev->stats.commands);
> > +    rdma_dump_device_counters(mon, &dev->rdma_dev_res);
> > +}
> > +
> >  static void free_dev_ring(PCIDevice *pci_dev, PvrdmaRing *ring,
> >                            void *ring_state)
> >  {
> > @@ -394,6 +408,7 @@ static void pvrdma_regs_write(void *opaque, hwaddr addr, uint64_t val,
> >          if (val == 0) {
> >              trace_pvrdma_regs_write(addr, val, "REQUEST", "");
> >              pvrdma_exec_cmd(dev);
> > +            dev->stats.commands++;
> >          }
> >          break;
> >      default:
> > @@ -612,6 +627,8 @@ static void pvrdma_realize(PCIDevice *pdev, Error **errp)
> >          goto out;
> >      }
> >  
> > +    memset(&dev->stats, 0, sizeof(dev->stats));
> > +
> >      dev->shutdown_notifier.notify = pvrdma_shutdown_notifier;
> >      qemu_register_shutdown_notifier(&dev->shutdown_notifier);
> >  
> > @@ -631,6 +648,7 @@ static void pvrdma_class_init(ObjectClass *klass, void *data)
> >  {
> >      DeviceClass *dc = DEVICE_CLASS(klass);
> >      PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
> > +    RdmaStatsProviderClass *ir = RDMA_STATS_PROVIDER_CLASS(klass);
> >  
> >      k->realize = pvrdma_realize;
> >      k->exit = pvrdma_exit;
> > @@ -642,6 +660,8 @@ static void pvrdma_class_init(ObjectClass *klass, void *data)
> >      dc->desc = "RDMA Device";
> >      dc->props = pvrdma_dev_properties;
> >      set_bit(DEVICE_CATEGORY_NETWORK, dc->categories);
> > +
> > +    ir->print_statistics = pvrdma_print_statistics;
> >  }
> >  
> >  static const TypeInfo pvrdma_info = {
> > @@ -651,6 +671,7 @@ static const TypeInfo pvrdma_info = {
> >      .class_init = pvrdma_class_init,
> >      .interfaces = (InterfaceInfo[]) {
> >          { INTERFACE_CONVENTIONAL_PCI_DEVICE },
> > +        { TYPE_RDMA_STATS_PROVIDER },
> >          { }
> >      }
> >  };
> > diff --git a/include/hw/rdma/rdma_hmp.h b/include/hw/rdma/rdma_hmp.h
> > new file mode 100644
> > index 0000000000..dd23f2bc84
> > --- /dev/null
> > +++ b/include/hw/rdma/rdma_hmp.h
> > @@ -0,0 +1,40 @@
> > +/*
> > + * RDMA device: Human Monitor interface
> > + *
> > + * Copyright (C) 2019 Oracle
> > + * Copyright (C) 2019 Red Hat Inc
> > + *
> > + * Authors:
> > + *     Yuval Shaia <yuval.shaia@oracle.com>
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> > + * See the COPYING file in the top-level directory.
> > + *
> > + */
> > +
> > +#ifndef RDMA_HMP_H
> > +#define RDMA_HMP_H
> > +
> > +#include "qom/object.h"
> > +
> > +#define TYPE_RDMA_STATS_PROVIDER "rdma"
> > +
> > +#define RDMA_STATS_PROVIDER_CLASS(klass) \
> > +    OBJECT_CLASS_CHECK(RdmaStatsProviderClass, (klass), \
> > +                       TYPE_RDMA_STATS_PROVIDER)
> > +#define RDMA_STATS_PROVIDER_GET_CLASS(obj) \
> > +    OBJECT_GET_CLASS(RdmaStatsProviderClass, (obj), \
> > +                     TYPE_RDMA_STATS_PROVIDER)
> > +#define RDMA_STATS_PROVIDER(obj) \
> > +    INTERFACE_CHECK(RdmaStatsProvider, (obj), \
> > +                    TYPE_RDMA_STATS_PROVIDER)
> > +
> > +typedef struct RdmaStatsProvider RdmaStatsProvider;
> > +
> > +typedef struct RdmaStatsProviderClass {
> > +    InterfaceClass parent;
> > +
> > +    void (*print_statistics)(Monitor *mon, RdmaStatsProvider *obj);
> > +} RdmaStatsProviderClass;
> > +
> > +#endif
> > -- 
> > 2.17.2
> > 
> > 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
