[Qemu-devel] [PATCH v4 0/2] error-report: introduce {error|warn}_report

All of lore.kernel.org
 help / color / mirror / Atom feed

* [Qemu-devel] [PATCH v4 0/2] error-report: introduce {error|warn}_report_once
@ 2018-05-24  4:44 Peter Xu
  2018-05-24  4:44 ` [Qemu-devel] [PATCH v4 1/2] qemu-error: " Peter Xu
                   ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Peter Xu @ 2018-05-24  4:44 UTC (permalink / raw)
  To: qemu-devel
  Cc: Philippe Mathieu-Daudé,
	peterx, Jason Wang, Michael S . Tsirkin, Eric Blake,
	Markus Armbruster

v4:
- patch 2: pick r-b from Philippe
- patch 1: replace all __* variables into *_ [Eric]
- patch 1: enhance the commit message of patch 1, mention return code
  of macros [Markus]

v3:
- reindent in patch 2, dump more things [Philippe]

v2:
- for patch 1: replace tabs, add trivial comment [Markus]
  (I didn't add much comment otherwise I'll need to duplicate what's
   there in error_report())
- add patch 2

Patch 1 introduce the helpers.

Patch 2 use it to replace VT-d trace_vtd_err().

Please review.  Thanks.

Peter Xu (2):
  qemu-error: introduce {error|warn}_report_once
  intel-iommu: start to use error_report_once

 include/qemu/error-report.h | 32 ++++++++++++++++++++
 hw/i386/intel_iommu.c       | 59 +++++++++++++++++++++----------------
 hw/i386/trace-events        |  1 -
 3 files changed, 65 insertions(+), 27 deletions(-)

-- 
2.17.0

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH v4 1/2] qemu-error: introduce {error|warn}_report_once
  2018-05-24  4:44 [Qemu-devel] [PATCH v4 0/2] error-report: introduce {error|warn}_report_once Peter Xu
@ 2018-05-24  4:44 ` Peter Xu
  2018-05-29  9:30   ` Cornelia Huck
                     ` (2 more replies)
  2018-05-24  4:44 ` [Qemu-devel] [PATCH v4 2/2] intel-iommu: start to use error_report_once Peter Xu
  2018-08-15  5:58 ` [Qemu-devel] [PATCH v4 0/2] error-report: introduce {error|warn}_report_once Markus Armbruster
  2 siblings, 3 replies; 22+ messages in thread
From: Peter Xu @ 2018-05-24  4:44 UTC (permalink / raw)
  To: qemu-devel
  Cc: Philippe Mathieu-Daudé,
	peterx, Jason Wang, Michael S . Tsirkin, Eric Blake,
	Markus Armbruster

There are many error_report()s that can be used in frequently called
functions, especially on IO paths.  That can be unideal in that
malicious guest can try to trigger the error tons of time which might
use up the log space on the host (e.g., libvirt can capture the stderr
of QEMU and put it persistently onto disk).  In VT-d emulation code, we
have trace_vtd_error() tracer.  AFAIU all those places can be replaced
by something like error_report() but trace points are mostly used to
avoid the DDOS attack that mentioned above.  However using trace points
mean that errors are not dumped if trace not enabled.

It's not a big deal in most modern server managements since we have
things like logrotate to maintain the logs and make sure the quota is
expected.  However it'll still be nice that we just provide another way
to restrict message generations.  In most cases, this kind of
error_report()s will only provide valid information on the first message
sent, and all the rest of similar messages will be mostly talking about
the same thing.  This patch introduces *_report_once() helpers to allow
a message to be dumped only once during one QEMU process's life cycle.
It will make sure: (1) it's on by deffault, so we can even get something
without turning the trace on and reproducing, and (2) it won't be
affected by DDOS attack.

To implement it, I stole the printk_once() macro from Linux.

CC: Eric Blake <eblake@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/qemu/error-report.h | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/include/qemu/error-report.h b/include/qemu/error-report.h
index e1c8ae1a52..c7ec54cb97 100644
--- a/include/qemu/error-report.h
+++ b/include/qemu/error-report.h
@@ -44,6 +44,38 @@ void error_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
 void warn_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
 void info_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2);

+/*
+ * Similar to error_report(), but it only prints the message once.  It
+ * returns true when it prints the first time, otherwise false.
+ */
+#define error_report_once(fmt, ...)             \
+    ({                                          \
+        static bool print_once_;               \
+        bool ret_print_once_ = !print_once_;  \
+                                                \
+        if (!print_once_) {                    \
+            print_once_ = true;                \
+            error_report(fmt, ##__VA_ARGS__);   \
+        }                                       \
+        unlikely(ret_print_once_);             \
+    })
+
+/*
+ * Similar to warn_report(), but it only prints the message once.  It
+ * returns true when it prints the first time, otherwise false.
+ */
+#define warn_report_once(fmt, ...)              \
+    ({                                          \
+        static bool print_once_;               \
+        bool ret_print_once_ = !print_once_;  \
+                                                \
+        if (!print_once_) {                    \
+            print_once_ = true;                \
+            warn_report(fmt, ##__VA_ARGS__);   \
+        }                                       \
+        unlikely(ret_print_once_);             \
+    })
+
 const char *error_get_progname(void);
 extern bool enable_timestamp_msg;

-- 
2.17.0

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH v4 2/2] intel-iommu: start to use error_report_once
  2018-05-24  4:44 [Qemu-devel] [PATCH v4 0/2] error-report: introduce {error|warn}_report_once Peter Xu
  2018-05-24  4:44 ` [Qemu-devel] [PATCH v4 1/2] qemu-error: " Peter Xu
@ 2018-05-24  4:44 ` Peter Xu
  2018-06-13  8:05   ` Markus Armbruster
  2018-08-15  5:58 ` [Qemu-devel] [PATCH v4 0/2] error-report: introduce {error|warn}_report_once Markus Armbruster
  2 siblings, 1 reply; 22+ messages in thread
From: Peter Xu @ 2018-05-24  4:44 UTC (permalink / raw)
  To: qemu-devel
  Cc: Philippe Mathieu-Daudé,
	peterx, Jason Wang, Michael S . Tsirkin, Eric Blake,
	Markus Armbruster

Replace existing trace_vtd_err() with error_report_once() then stderr
will capture something if any of the error happens, meanwhile we don't
suffer from any DDOS.  Then remove the trace point.  Since at it,
provide more information where proper (now we can pass parameters into
the report function).

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 hw/i386/intel_iommu.c | 59 ++++++++++++++++++++++++-------------------
 hw/i386/trace-events  |  1 -
 2 files changed, 33 insertions(+), 27 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index fb31de9416..fa6df921ee 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -285,14 +285,14 @@ static void vtd_generate_fault_event(IntelIOMMUState *s, uint32_t pre_fsts)
 {
     if (pre_fsts & VTD_FSTS_PPF || pre_fsts & VTD_FSTS_PFO ||
         pre_fsts & VTD_FSTS_IQE) {
-        trace_vtd_err("There are previous interrupt conditions "
-                      "to be serviced by software, fault event "
-                      "is not generated.");
+        error_report_once("There are previous interrupt conditions "
+                          "to be serviced by software, fault event "
+                          "is not generated");
         return;
     }
     vtd_set_clear_mask_long(s, DMAR_FECTL_REG, 0, VTD_FECTL_IP);
     if (vtd_get_long_raw(s, DMAR_FECTL_REG) & VTD_FECTL_IM) {
-        trace_vtd_err("Interrupt Mask set, irq is not generated.");
+        error_report_once("Interrupt Mask set, irq is not generated");
     } else {
         vtd_generate_interrupt(s, DMAR_FEADDR_REG, DMAR_FEDATA_REG);
         vtd_set_clear_mask_long(s, DMAR_FECTL_REG, VTD_FECTL_IP, 0);
@@ -400,20 +400,20 @@ static void vtd_report_dmar_fault(IntelIOMMUState *s, uint16_t source_id,
     trace_vtd_dmar_fault(source_id, fault, addr, is_write);
 
     if (fsts_reg & VTD_FSTS_PFO) {
-        trace_vtd_err("New fault is not recorded due to "
-                      "Primary Fault Overflow.");
+        error_report_once("New fault is not recorded due to "
+                          "Primary Fault Overflow");
         return;
     }
 
     if (vtd_try_collapse_fault(s, source_id)) {
-        trace_vtd_err("New fault is not recorded due to "
-                      "compression of faults.");
+        error_report_once("New fault is not recorded due to "
+                          "compression of faults");
         return;
     }
 
     if (vtd_is_frcd_set(s, s->next_frcd_reg)) {
-        trace_vtd_err("Next Fault Recording Reg is used, "
-                      "new fault is not recorded, set PFO field.");
+        error_report_once("Next Fault Recording Reg is used, "
+                          "new fault is not recorded, set PFO field");
         vtd_set_clear_mask_long(s, DMAR_FSTS_REG, 0, VTD_FSTS_PFO);
         return;
     }
@@ -421,8 +421,8 @@ static void vtd_report_dmar_fault(IntelIOMMUState *s, uint16_t source_id,
     vtd_record_frcd(s, s->next_frcd_reg, source_id, addr, fault, is_write);
 
     if (fsts_reg & VTD_FSTS_PPF) {
-        trace_vtd_err("There are pending faults already, "
-                      "fault event is not generated.");
+        error_report_once("There are pending faults already, "
+                          "fault event is not generated");
         vtd_set_frcd_and_update_ppf(s, s->next_frcd_reg);
         s->next_frcd_reg++;
         if (s->next_frcd_reg == DMAR_FRCD_REG_NR) {
@@ -1339,7 +1339,8 @@ static uint64_t vtd_context_cache_invalidate(IntelIOMMUState *s, uint64_t val)
         break;
 
     default:
-        trace_vtd_err("Context cache invalidate type error.");
+        error_report_once("%s: invalid context: 0x%"PRIx64,
+                          __func__, val);
         caig = 0;
     }
     return caig;
@@ -1445,7 +1446,8 @@ static uint64_t vtd_iotlb_flush(IntelIOMMUState *s, uint64_t val)
         am = VTD_IVA_AM(addr);
         addr = VTD_IVA_ADDR(addr);
         if (am > VTD_MAMV) {
-            trace_vtd_err("IOTLB PSI flush: address mask overflow.");
+            error_report_once("%s: address mask overflow: 0x%"PRIx64,
+                              __func__, vtd_get_quad_raw(s, DMAR_IVA_REG));
             iaig = 0;
             break;
         }
@@ -1454,7 +1456,8 @@ static uint64_t vtd_iotlb_flush(IntelIOMMUState *s, uint64_t val)
         break;
 
     default:
-        trace_vtd_err("IOTLB flush: invalid granularity.");
+        error_report_once("%s: invalid granularity: 0x%"PRIx64,
+                          __func__, val);
         iaig = 0;
     }
     return iaig;
@@ -1604,8 +1607,8 @@ static void vtd_handle_ccmd_write(IntelIOMMUState *s)
     /* Context-cache invalidation request */
     if (val & VTD_CCMD_ICC) {
         if (s->qi_enabled) {
-            trace_vtd_err("Queued Invalidation enabled, "
-                          "should not use register-based invalidation");
+            error_report_once("Queued Invalidation enabled, "
+                              "should not use register-based invalidation");
             return;
         }
         ret = vtd_context_cache_invalidate(s, val);
@@ -1625,8 +1628,8 @@ static void vtd_handle_iotlb_write(IntelIOMMUState *s)
     /* IOTLB invalidation request */
     if (val & VTD_TLB_IVT) {
         if (s->qi_enabled) {
-            trace_vtd_err("Queued Invalidation enabled, "
-                          "should not use register-based invalidation.");
+            error_report_once("Queued Invalidation enabled, "
+                              "should not use register-based invalidation");
             return;
         }
         ret = vtd_iotlb_flush(s, val);
@@ -1644,7 +1647,7 @@ static bool vtd_get_inv_desc(dma_addr_t base_addr, uint32_t offset,
     dma_addr_t addr = base_addr + offset * sizeof(*inv_desc);
     if (dma_memory_read(&address_space_memory, addr, inv_desc,
         sizeof(*inv_desc))) {
-        trace_vtd_err("Read INV DESC failed.");
+        error_report_once("Read INV DESC failed");
         inv_desc->lo = 0;
         inv_desc->hi = 0;
         return false;
@@ -1999,7 +2002,8 @@ static uint64_t vtd_mem_read(void *opaque, hwaddr addr, unsigned size)
     trace_vtd_reg_read(addr, size);
 
     if (addr + size > DMAR_REG_SIZE) {
-        trace_vtd_err("Read MMIO over range.");
+        error_report_once("%s: MMIO over range: addr=0x%"PRIx64
+                          " size=0x%u", __func__, addr, size);
         return (uint64_t)-1;
     }
 
@@ -2050,7 +2054,8 @@ static void vtd_mem_write(void *opaque, hwaddr addr,
     trace_vtd_reg_write(addr, size, val);
 
     if (addr + size > DMAR_REG_SIZE) {
-        trace_vtd_err("Write MMIO over range.");
+        error_report_once("%s: MMIO over range: addr=0x%"PRIx64
+                          " size=0x%u", __func__, addr, size);
         return;
     }
 
@@ -2432,7 +2437,8 @@ static int vtd_irte_get(IntelIOMMUState *iommu, uint16_t index,
     addr = iommu->intr_root + index * sizeof(*entry);
     if (dma_memory_read(&address_space_memory, addr, entry,
                         sizeof(*entry))) {
-        trace_vtd_err("Memory read failed for IRTE.");
+        error_report_once("%s: read failed: ind=0x%"PRIu16
+                          " addr=0x%"PRIx64, __func__, index, addr);
         return -VTD_FR_IR_ROOT_INVAL;
     }
 
@@ -2564,14 +2570,15 @@ static int vtd_interrupt_remap_msi(IntelIOMMUState *iommu,
     }
 
     if (origin->address & VTD_MSI_ADDR_HI_MASK) {
-        trace_vtd_err("MSI address high 32 bits non-zero when "
-                      "Interrupt Remapping enabled.");
+        error_report_once("%s: MSI address high 32 bits non-zero detected: "
+                          "address=0x%"PRIx64, __func__, origin->address);
         return -VTD_FR_IR_REQ_RSVD;
     }
 
     addr.data = origin->address & VTD_MSI_ADDR_LO_MASK;
     if (addr.addr.__head != 0xfee) {
-        trace_vtd_err("MSI addr low 32 bit invalid.");
+        error_report_once("%s: MSI address low 32 bit invalid: 0x%"PRIx32,
+                          __func__, addr.data);
         return -VTD_FR_IR_REQ_RSVD;
     }
 
diff --git a/hw/i386/trace-events b/hw/i386/trace-events
index 22d44648af..e08cf2a9a7 100644
--- a/hw/i386/trace-events
+++ b/hw/i386/trace-events
@@ -68,7 +68,6 @@ vtd_ir_remap_msi_req(uint64_t addr, uint64_t data) "addr 0x%"PRIx64" data 0x%"PR
 vtd_fsts_ppf(bool set) "FSTS PPF bit set to %d"
 vtd_fsts_clear_ip(void) ""
 vtd_frr_new(int index, uint64_t hi, uint64_t lo) "index %d high 0x%"PRIx64" low 0x%"PRIx64
-vtd_err(const char *str) "%s"
 vtd_err_dmar_iova_overflow(uint64_t iova) "iova 0x%"PRIx64
 vtd_err_dmar_slpte_read_error(uint64_t iova, int level) "iova 0x%"PRIx64" level %d"
 vtd_err_dmar_slpte_perm_error(uint64_t iova, int level, uint64_t slpte, bool is_write) "iova 0x%"PRIx64" level %d slpte 0x%"PRIx64" write %d"
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH v4 1/2] qemu-error: introduce {error|warn}_report_once
  2018-05-24  4:44 ` [Qemu-devel] [PATCH v4 1/2] qemu-error: " Peter Xu
@ 2018-05-29  9:30   ` Cornelia Huck
  2018-05-30  3:30     ` Peter Xu
  2018-05-30  4:47   ` Michael S. Tsirkin
  2018-06-13  8:01   ` Markus Armbruster
  2 siblings, 1 reply; 22+ messages in thread
From: Cornelia Huck @ 2018-05-29  9:30 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Michael S . Tsirkin, Jason Wang, Markus Armbruster,
	Philippe Mathieu-Daudé,
	Halil Pasic

On Thu, 24 May 2018 12:44:53 +0800
Peter Xu <peterx@redhat.com> wrote:

> There are many error_report()s that can be used in frequently called
> functions, especially on IO paths.  That can be unideal in that
> malicious guest can try to trigger the error tons of time which might
> use up the log space on the host (e.g., libvirt can capture the stderr
> of QEMU and put it persistently onto disk).  In VT-d emulation code, we
> have trace_vtd_error() tracer.  AFAIU all those places can be replaced
> by something like error_report() but trace points are mostly used to
> avoid the DDOS attack that mentioned above.  However using trace points
> mean that errors are not dumped if trace not enabled.
> 
> It's not a big deal in most modern server managements since we have
> things like logrotate to maintain the logs and make sure the quota is
> expected.  However it'll still be nice that we just provide another way
> to restrict message generations.  In most cases, this kind of
> error_report()s will only provide valid information on the first message
> sent, and all the rest of similar messages will be mostly talking about
> the same thing.  This patch introduces *_report_once() helpers to allow
> a message to be dumped only once during one QEMU process's life cycle.
> It will make sure: (1) it's on by deffault, so we can even get something
> without turning the trace on and reproducing, and (2) it won't be
> affected by DDOS attack.

This is good for something (sub-)system wide, where it is enough to
alert the user once; but we may want to print something e.g. once per
the device where it happens (see v3 of "vfio-ccw: add force unlimited
prefetch property" for an example).

> 
> To implement it, I stole the printk_once() macro from Linux.

Would something akin to printk_ratelimited() also make sense to avoid
log flooding?

> 
> CC: Eric Blake <eblake@redhat.com>
> CC: Markus Armbruster <armbru@redhat.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  include/qemu/error-report.h | 32 ++++++++++++++++++++++++++++++++
>  1 file changed, 32 insertions(+)
> 
> diff --git a/include/qemu/error-report.h b/include/qemu/error-report.h
> index e1c8ae1a52..c7ec54cb97 100644
> --- a/include/qemu/error-report.h
> +++ b/include/qemu/error-report.h
> @@ -44,6 +44,38 @@ void error_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
>  void warn_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
>  void info_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
>  
> +/*
> + * Similar to error_report(), but it only prints the message once.  It
> + * returns true when it prints the first time, otherwise false.
> + */
> +#define error_report_once(fmt, ...)             \
> +    ({                                          \
> +        static bool print_once_;               \
> +        bool ret_print_once_ = !print_once_;  \
> +                                                \
> +        if (!print_once_) {                    \
> +            print_once_ = true;                \
> +            error_report(fmt, ##__VA_ARGS__);   \
> +        }                                       \
> +        unlikely(ret_print_once_);             \
> +    })
> +
> +/*
> + * Similar to warn_report(), but it only prints the message once.  It
> + * returns true when it prints the first time, otherwise false.
> + */
> +#define warn_report_once(fmt, ...)              \
> +    ({                                          \
> +        static bool print_once_;               \
> +        bool ret_print_once_ = !print_once_;  \
> +                                                \
> +        if (!print_once_) {                    \
> +            print_once_ = true;                \
> +            warn_report(fmt, ##__VA_ARGS__);   \
> +        }                                       \
> +        unlikely(ret_print_once_);             \
> +    })
> +
>  const char *error_get_progname(void);
>  extern bool enable_timestamp_msg;
>  

The patch mentioned above implements printing a warning once via a
passed-in variable (in that case, a per-device property). That looks
like a superset of what this patch provides.

Would such a functionality be useful for other callers as well? If so,
we should probably implement it in the common error handling functions.
[I'm currently planning to queue the vfio-ccw patch as-is; we can
switch to a common interface later if it makes sense.]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH v4 1/2] qemu-error: introduce {error|warn}_report_once
  2018-05-29  9:30   ` Cornelia Huck
@ 2018-05-30  3:30     ` Peter Xu
  2018-05-30 13:36       ` Cornelia Huck
  0 siblings, 1 reply; 22+ messages in thread
From: Peter Xu @ 2018-05-30  3:30 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: qemu-devel, Michael S . Tsirkin, Jason Wang, Markus Armbruster,
	Philippe Mathieu-Daudé,
	Halil Pasic

On Tue, May 29, 2018 at 11:30:00AM +0200, Cornelia Huck wrote:
> On Thu, 24 May 2018 12:44:53 +0800
> Peter Xu <peterx@redhat.com> wrote:
> 
> > There are many error_report()s that can be used in frequently called
> > functions, especially on IO paths.  That can be unideal in that
> > malicious guest can try to trigger the error tons of time which might
> > use up the log space on the host (e.g., libvirt can capture the stderr
> > of QEMU and put it persistently onto disk).  In VT-d emulation code, we
> > have trace_vtd_error() tracer.  AFAIU all those places can be replaced
> > by something like error_report() but trace points are mostly used to
> > avoid the DDOS attack that mentioned above.  However using trace points
> > mean that errors are not dumped if trace not enabled.
> > 
> > It's not a big deal in most modern server managements since we have
> > things like logrotate to maintain the logs and make sure the quota is
> > expected.  However it'll still be nice that we just provide another way
> > to restrict message generations.  In most cases, this kind of
> > error_report()s will only provide valid information on the first message
> > sent, and all the rest of similar messages will be mostly talking about
> > the same thing.  This patch introduces *_report_once() helpers to allow
> > a message to be dumped only once during one QEMU process's life cycle.
> > It will make sure: (1) it's on by deffault, so we can even get something
> > without turning the trace on and reproducing, and (2) it won't be
> > affected by DDOS attack.
> 
> This is good for something (sub-)system wide, where it is enough to
> alert the user once; but we may want to print something e.g. once per
> the device where it happens (see v3 of "vfio-ccw: add force unlimited
> prefetch property" for an example).

I'm glad that we start to have more users of it, no matter which
implementation we'll choose.  At least it means it makes some sense to
have such a thing.

For me this patch works nicely enough.  Of course there can be
per-device errors, but AFAICT mostly we don't have any error at
all... and my debugging experience is that when multiple error happens
on different devices simutaneously, they'll very possible that it's
caused by the same problem (again, errors are rare after all), or, the
rest of the problems are caused by the first error only (so the first
error might cause a collapse of the rest).  That's why I wanted to
always debug with the first error I encounter, because that's mostly
always the root cause. In that sense, current patch works nicely for
me (note that we can have device ID embeded in the error message with
this one).

At the same time, I think this patch should be easier to use of course
- no extra variables to define, and very self contained. So I would
slightly prefer current patch.

However I'm also fine with the approach proposed in the vfio-ccw patch
too.  Though if so I would possibly drop the 2nd patch too since if
with the vfio-ccw patch I'll need to introduce one bool for every
trace_vtd_err() place... then I'd not bother with it any more but
instead live with that trace_*(). ;) Or I can define one per-IOMMU
error boolean and pass it in for each of the error_report_once(), but
it seems a bit awkward too.

> 
> > 
> > To implement it, I stole the printk_once() macro from Linux.
> 
> Would something akin to printk_ratelimited() also make sense to avoid
> log flooding?

Yes it will.  IMHO we can have that too as follow up if we want, and
it does not conflict with this print_once().  I'd say currently this
error_report_once() is good enough for me, especially lightweighted.
I suspect we'll need more lines to implement a ratelimited version.

> 
> > 
> > CC: Eric Blake <eblake@redhat.com>
> > CC: Markus Armbruster <armbru@redhat.com>
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >  include/qemu/error-report.h | 32 ++++++++++++++++++++++++++++++++
> >  1 file changed, 32 insertions(+)
> > 
> > diff --git a/include/qemu/error-report.h b/include/qemu/error-report.h
> > index e1c8ae1a52..c7ec54cb97 100644
> > --- a/include/qemu/error-report.h
> > +++ b/include/qemu/error-report.h
> > @@ -44,6 +44,38 @@ void error_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
> >  void warn_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
> >  void info_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
> >  
> > +/*
> > + * Similar to error_report(), but it only prints the message once.  It
> > + * returns true when it prints the first time, otherwise false.
> > + */
> > +#define error_report_once(fmt, ...)             \
> > +    ({                                          \
> > +        static bool print_once_;               \
> > +        bool ret_print_once_ = !print_once_;  \
> > +                                                \
> > +        if (!print_once_) {                    \
> > +            print_once_ = true;                \
> > +            error_report(fmt, ##__VA_ARGS__);   \
> > +        }                                       \
> > +        unlikely(ret_print_once_);             \
> > +    })
> > +
> > +/*
> > + * Similar to warn_report(), but it only prints the message once.  It
> > + * returns true when it prints the first time, otherwise false.
> > + */
> > +#define warn_report_once(fmt, ...)              \
> > +    ({                                          \
> > +        static bool print_once_;               \
> > +        bool ret_print_once_ = !print_once_;  \
> > +                                                \
> > +        if (!print_once_) {                    \
> > +            print_once_ = true;                \
> > +            warn_report(fmt, ##__VA_ARGS__);   \
> > +        }                                       \
> > +        unlikely(ret_print_once_);             \
> > +    })
> > +
> >  const char *error_get_progname(void);
> >  extern bool enable_timestamp_msg;
> >  
> 
> The patch mentioned above implements printing a warning once via a
> passed-in variable (in that case, a per-device property). That looks
> like a superset of what this patch provides.
> 
> Would such a functionality be useful for other callers as well? If so,
> we should probably implement it in the common error handling functions.
> [I'm currently planning to queue the vfio-ccw patch as-is; we can
> switch to a common interface later if it makes sense.]

Yeah; I left my thoughts above.  I'll be happy with either way.

Thanks,

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH v4 1/2] qemu-error: introduce {error|warn}_report_once
  2018-05-24  4:44 ` [Qemu-devel] [PATCH v4 1/2] qemu-error: " Peter Xu
  2018-05-29  9:30   ` Cornelia Huck
@ 2018-05-30  4:47   ` Michael S. Tsirkin
  2018-05-30  6:39     ` Peter Xu
  2018-05-30 15:15     ` Halil Pasic
  2018-06-13  8:01   ` Markus Armbruster
  2 siblings, 2 replies; 22+ messages in thread
From: Michael S. Tsirkin @ 2018-05-30  4:47 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Philippe Mathieu-Daudé,
	Jason Wang, Eric Blake, Markus Armbruster

On Thu, May 24, 2018 at 12:44:53PM +0800, Peter Xu wrote:
> There are many error_report()s that can be used in frequently called
> functions, especially on IO paths.  That can be unideal in that
> malicious guest can try to trigger the error tons of time which might
> use up the log space on the host (e.g., libvirt can capture the stderr
> of QEMU and put it persistently onto disk).

I think the problem is real enough but I think the API
isn't great as it stresses the mechanism. Which fundamentally does
not matter - we can print once or 10 times, or whatever.

What happens here is a guest bug as opposed to hypervisor
bug. So I think a better name would be guest_error.

Internally we can still have something similar to this
mechanism.

Another idea is to reset these guest error counters on guest reset.
Device reset too? I'm not 100% sure as guest can trigger device resets.

>  In VT-d emulation code, we
> have trace_vtd_error() tracer.  AFAIU all those places can be replaced
> by something like error_report() but trace points are mostly used to
> avoid the DDOS attack that mentioned above.  However using trace points
> mean that errors are not dumped if trace not enabled.
> 
> It's not a big deal in most modern server managements since we have
> things like logrotate to maintain the logs and make sure the quota is
> expected.  However it'll still be nice that we just provide another way
> to restrict message generations.  In most cases, this kind of
> error_report()s will only provide valid information on the first message
> sent, and all the rest of similar messages will be mostly talking about
> the same thing.  This patch introduces *_report_once() helpers to allow
> a message to be dumped only once during one QEMU process's life cycle.
> It will make sure: (1) it's on by deffault, so we can even get something
> without turning the trace on and reproducing, and (2) it won't be
> affected by DDOS attack.
> 
> To implement it, I stole the printk_once() macro from Linux.
> 
> CC: Eric Blake <eblake@redhat.com>
> CC: Markus Armbruster <armbru@redhat.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  include/qemu/error-report.h | 32 ++++++++++++++++++++++++++++++++
>  1 file changed, 32 insertions(+)
> 
> diff --git a/include/qemu/error-report.h b/include/qemu/error-report.h
> index e1c8ae1a52..c7ec54cb97 100644
> --- a/include/qemu/error-report.h
> +++ b/include/qemu/error-report.h
> @@ -44,6 +44,38 @@ void error_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
>  void warn_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
>  void info_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
>  
> +/*
> + * Similar to error_report(), but it only prints the message once.  It
> + * returns true when it prints the first time, otherwise false.
> + */
> +#define error_report_once(fmt, ...)             \
> +    ({                                          \
> +        static bool print_once_;               \
> +        bool ret_print_once_ = !print_once_;  \
> +                                                \
> +        if (!print_once_) {                    \
> +            print_once_ = true;                \
> +            error_report(fmt, ##__VA_ARGS__);   \
> +        }                                       \
> +        unlikely(ret_print_once_);             \
> +    })
> +
> +/*
> + * Similar to warn_report(), but it only prints the message once.  It
> + * returns true when it prints the first time, otherwise false.
> + */
> +#define warn_report_once(fmt, ...)              \
> +    ({                                          \
> +        static bool print_once_;               \
> +        bool ret_print_once_ = !print_once_;  \
> +                                                \
> +        if (!print_once_) {                    \
> +            print_once_ = true;                \
> +            warn_report(fmt, ##__VA_ARGS__);   \
> +        }                                       \
> +        unlikely(ret_print_once_);             \
> +    })
> +
>  const char *error_get_progname(void);
>  extern bool enable_timestamp_msg;
>  
> -- 
> 2.17.0

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH v4 1/2] qemu-error: introduce {error|warn}_report_once
  2018-05-30  4:47   ` Michael S. Tsirkin
@ 2018-05-30  6:39     ` Peter Xu
  2018-05-30 13:53       ` Cornelia Huck
  2018-05-30 15:15     ` Halil Pasic
  1 sibling, 1 reply; 22+ messages in thread
From: Peter Xu @ 2018-05-30  6:39 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: qemu-devel, Philippe Mathieu-Daudé,
	Jason Wang, Eric Blake, Markus Armbruster

On Wed, May 30, 2018 at 07:47:32AM +0300, Michael S. Tsirkin wrote:
> On Thu, May 24, 2018 at 12:44:53PM +0800, Peter Xu wrote:
> > There are many error_report()s that can be used in frequently called
> > functions, especially on IO paths.  That can be unideal in that
> > malicious guest can try to trigger the error tons of time which might
> > use up the log space on the host (e.g., libvirt can capture the stderr
> > of QEMU and put it persistently onto disk).
> 
> I think the problem is real enough but I think the API
> isn't great as it stresses the mechanism. Which fundamentally does
> not matter - we can print once or 10 times, or whatever.
> 
> What happens here is a guest bug as opposed to hypervisor
> bug. So I think a better name would be guest_error.

For me error_report_once() is okay since after all it's only a way to
dump something for the hypervisor management software (or the person
who manages the QEMU instance), and I don't have a strong opinion to
introduce a new guest_error() API.

> 
> Internally we can still have something similar to this
> mechanism.
> 
> Another idea is to reset these guest error counters on guest reset.
> Device reset too? I'm not 100% sure as guest can trigger device resets.

Yes maybe we can, but I don't know whether that's necessary.  If we
consider the possiblility of a malicious guest here, resetting the
counter after system reset might be dangerous too, since the guest can
still flush the host log by the sequence of system reset, trigger the
error, system reset, ...

Thanks,

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH v4 1/2] qemu-error: introduce {error|warn}_report_once
  2018-05-30  3:30     ` Peter Xu
@ 2018-05-30 13:36       ` Cornelia Huck
  2018-06-13  7:56         ` Markus Armbruster
  0 siblings, 1 reply; 22+ messages in thread
From: Cornelia Huck @ 2018-05-30 13:36 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Michael S . Tsirkin, Jason Wang, Markus Armbruster,
	Philippe Mathieu-Daudé,
	Halil Pasic

On Wed, 30 May 2018 11:30:45 +0800
Peter Xu <peterx@redhat.com> wrote:

> On Tue, May 29, 2018 at 11:30:00AM +0200, Cornelia Huck wrote:
> > On Thu, 24 May 2018 12:44:53 +0800
> > Peter Xu <peterx@redhat.com> wrote:
> >   
> > > There are many error_report()s that can be used in frequently called
> > > functions, especially on IO paths.  That can be unideal in that
> > > malicious guest can try to trigger the error tons of time which might
> > > use up the log space on the host (e.g., libvirt can capture the stderr
> > > of QEMU and put it persistently onto disk).  In VT-d emulation code, we
> > > have trace_vtd_error() tracer.  AFAIU all those places can be replaced
> > > by something like error_report() but trace points are mostly used to
> > > avoid the DDOS attack that mentioned above.  However using trace points
> > > mean that errors are not dumped if trace not enabled.
> > > 
> > > It's not a big deal in most modern server managements since we have
> > > things like logrotate to maintain the logs and make sure the quota is
> > > expected.  However it'll still be nice that we just provide another way
> > > to restrict message generations.  In most cases, this kind of
> > > error_report()s will only provide valid information on the first message
> > > sent, and all the rest of similar messages will be mostly talking about
> > > the same thing.  This patch introduces *_report_once() helpers to allow
> > > a message to be dumped only once during one QEMU process's life cycle.
> > > It will make sure: (1) it's on by deffault, so we can even get something
> > > without turning the trace on and reproducing, and (2) it won't be
> > > affected by DDOS attack.  
> > 
> > This is good for something (sub-)system wide, where it is enough to
> > alert the user once; but we may want to print something e.g. once per
> > the device where it happens (see v3 of "vfio-ccw: add force unlimited
> > prefetch property" for an example).  
> 
> I'm glad that we start to have more users of it, no matter which
> implementation we'll choose.  At least it means it makes some sense to
> have such a thing.
> 
> For me this patch works nicely enough.  Of course there can be
> per-device errors, but AFAICT mostly we don't have any error at
> all... and my debugging experience is that when multiple error happens
> on different devices simutaneously, they'll very possible that it's
> caused by the same problem (again, errors are rare after all), or, the
> rest of the problems are caused by the first error only (so the first
> error might cause a collapse of the rest).  That's why I wanted to
> always debug with the first error I encounter, because that's mostly
> always the root cause. In that sense, current patch works nicely for
> me (note that we can have device ID embeded in the error message with
> this one).

I think we have slightly different use cases here. In your case,
something (an error) happened and you don't really care about any
subsequent messages. In the vfio-ccw case, we want to log when a guest
tries to do things on a certain device, so we can possibly throw a
magic switch for that device. We still want messages after that so we
can catch further devices for which we may want to throw the magic
switch. (Similar for the other message, we want to give a hint why a
certain device may not work as expected.)

> 
> At the same time, I think this patch should be easier to use of course
> - no extra variables to define, and very self contained. So I would
> slightly prefer current patch.
> 
> However I'm also fine with the approach proposed in the vfio-ccw patch
> too.  Though if so I would possibly drop the 2nd patch too since if
> with the vfio-ccw patch I'll need to introduce one bool for every
> trace_vtd_err() place... then I'd not bother with it any more but
> instead live with that trace_*(). ;) Or I can define one per-IOMMU
> error boolean and pass it in for each of the error_report_once(), but
> it seems a bit awkward too.

I think we can have both the fine-grained control and convenience
macros for those cases where we really just want to print a message
once.

> 
> >   
> > > 
> > > To implement it, I stole the printk_once() macro from Linux.  
> > 
> > Would something akin to printk_ratelimited() also make sense to avoid
> > log flooding?  
> 
> Yes it will.  IMHO we can have that too as follow up if we want, and
> it does not conflict with this print_once().  I'd say currently this
> error_report_once() is good enough for me, especially lightweighted.
> I suspect we'll need more lines to implement a ratelimited version.

Yes, and I agree that wants to be a separate patch should we find a use
case for it.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH v4 1/2] qemu-error: introduce {error|warn}_report_once
  2018-05-30  6:39     ` Peter Xu
@ 2018-05-30 13:53       ` Cornelia Huck
  2018-06-13  7:59         ` Markus Armbruster
  0 siblings, 1 reply; 22+ messages in thread
From: Cornelia Huck @ 2018-05-30 13:53 UTC (permalink / raw)
  To: Peter Xu
  Cc: Michael S. Tsirkin, Markus Armbruster, Jason Wang, qemu-devel,
	Philippe Mathieu-Daudé,
	Halil Pasic

On Wed, 30 May 2018 14:39:55 +0800
Peter Xu <peterx@redhat.com> wrote:

> On Wed, May 30, 2018 at 07:47:32AM +0300, Michael S. Tsirkin wrote:
> > On Thu, May 24, 2018 at 12:44:53PM +0800, Peter Xu wrote:  
> > > There are many error_report()s that can be used in frequently called
> > > functions, especially on IO paths.  That can be unideal in that
> > > malicious guest can try to trigger the error tons of time which might
> > > use up the log space on the host (e.g., libvirt can capture the stderr
> > > of QEMU and put it persistently onto disk).  
> > 
> > I think the problem is real enough but I think the API
> > isn't great as it stresses the mechanism. Which fundamentally does
> > not matter - we can print once or 10 times, or whatever.
> > 
> > What happens here is a guest bug as opposed to hypervisor
> > bug. So I think a better name would be guest_error.  
> 
> For me error_report_once() is okay since after all it's only a way to
> dump something for the hypervisor management software (or the person
> who manages the QEMU instance), and I don't have a strong opinion to
> introduce a new guest_error() API.

If we go with that suggestion, guest_{error,warn} should also prefix
the message with "Guest:" or so. Otherwise, it does not offer that much
more benefit.

[And I think it should be a wrapper around the report_once
infrastructure.]

> 
> > 
> > Internally we can still have something similar to this
> > mechanism.
> > 
> > Another idea is to reset these guest error counters on guest reset.
> > Device reset too? I'm not 100% sure as guest can trigger device resets.  
> 
> Yes maybe we can, but I don't know whether that's necessary.  If we
> consider the possiblility of a malicious guest here, resetting the
> counter after system reset might be dangerous too, since the guest can
> still flush the host log by the sequence of system reset, trigger the
> error, system reset, ...

For device reset, we probably should not reset the counters for that
reason. System reset is debatable (we might have another guest kernel
or so running after a system reset, don't we?)

I think the same applies for the vfio-ccw use case referenced in the
other branch of this thread.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH v4 1/2] qemu-error: introduce {error|warn}_report_once
  2018-05-30  4:47   ` Michael S. Tsirkin
  2018-05-30  6:39     ` Peter Xu
@ 2018-05-30 15:15     ` Halil Pasic
  2018-05-30 15:19       ` Michael S. Tsirkin
  1 sibling, 1 reply; 22+ messages in thread
From: Halil Pasic @ 2018-05-30 15:15 UTC (permalink / raw)
  To: Michael S. Tsirkin, Peter Xu
  Cc: Markus Armbruster, Jason Wang, qemu-devel, Philippe Mathieu-Daudé

On 05/30/2018 06:47 AM, Michael S. Tsirkin wrote:
> On Thu, May 24, 2018 at 12:44:53PM +0800, Peter Xu wrote:
>> There are many error_report()s that can be used in frequently called
>> functions, especially on IO paths.  That can be unideal in that
>> malicious guest can try to trigger the error tons of time which might
>> use up the log space on the host (e.g., libvirt can capture the stderr
>> of QEMU and put it persistently onto disk).
> 
> I think the problem is real enough but I think the API
> isn't great as it stresses the mechanism. Which fundamentally does
> not matter - we can print once or 10 times, or whatever.
> 
> What happens here is a guest bug as opposed to hypervisor
> bug. So I think a better name would be guest_error.

I don't agree with your argument against the name report_once
Michael. In my reading the commit message describes one of use
cases for which the infrastructure introduced by this patch is
a supposed to be a good fit. But report_once is not restricted
to this example.

In my previous life in the userspace I had to debug problems
where the original error message got log-rotated away because of an
onslaught of error messages that were a consequence of the original
one, and not very helpful.

IMHO raising the issue of guest_error is a very sane thing to do,
but it is a different problem. I think guest_error is about how and
to whom the error is to be reported. IMHO report the error to the
ones that are affected by it and to the ones that can do something
about it (e.g. fix it) is a good rule of thumb. The latter may be
different for hypervisor and for guest bugs.

In my understanding this is really about spamming the log problem.
Of course one can try to solve/mitigate the problem at different
levels. It could be declared
1) a problem to be solved in the logging library more or less
transparently
2) a problem to be solved by the environment and it's admin (e.g.
log aggregation, filtering, and rotation)
3) a problem that the client code of the logging library has to
explicitly deal with

The once and rate_limited are 3).

To sum it up guest error or not and once or not are orthogonal
problems in my view.

Regards,
Halil

> 
> Internally we can still have something similar to this
> mechanism.
> 
> Another idea is to reset these guest error counters on guest reset.
> Device reset too? I'm not 100% sure as guest can trigger device resets.
> 

[..]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH v4 1/2] qemu-error: introduce {error|warn}_report_once
  2018-05-30 15:15     ` Halil Pasic
@ 2018-05-30 15:19       ` Michael S. Tsirkin
  2018-05-30 15:30         ` Halil Pasic
  0 siblings, 1 reply; 22+ messages in thread
From: Michael S. Tsirkin @ 2018-05-30 15:19 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Peter Xu, Markus Armbruster, Jason Wang, qemu-devel,
	Philippe Mathieu-Daudé

On Wed, May 30, 2018 at 05:15:19PM +0200, Halil Pasic wrote:
> 
> 
> On 05/30/2018 06:47 AM, Michael S. Tsirkin wrote:
> > On Thu, May 24, 2018 at 12:44:53PM +0800, Peter Xu wrote:
> > > There are many error_report()s that can be used in frequently called
> > > functions, especially on IO paths.  That can be unideal in that
> > > malicious guest can try to trigger the error tons of time which might
> > > use up the log space on the host (e.g., libvirt can capture the stderr
> > > of QEMU and put it persistently onto disk).
> > 
> > I think the problem is real enough but I think the API
> > isn't great as it stresses the mechanism. Which fundamentally does
> > not matter - we can print once or 10 times, or whatever.
> > 
> > What happens here is a guest bug as opposed to hypervisor
> > bug. So I think a better name would be guest_error.
> 
> I don't agree with your argument against the name report_once
> Michael. In my reading the commit message describes one of use
> cases for which the infrastructure introduced by this patch is
> a supposed to be a good fit. But report_once is not restricted
> to this example.

All I'm saying is that we should distinguish between
guest and host errors at code level.



> In my previous life in the userspace I had to debug problems
> where the original error message got log-rotated away because of an
> onslaught of error messages that were a consequence of the original
> one, and not very helpful.
> 
> IMHO raising the issue of guest_error is a very sane thing to do,
> but it is a different problem. I think guest_error is about how and
> to whom the error is to be reported. IMHO report the error to the
> ones that are affected by it and to the ones that can do something
> about it (e.g. fix it) is a good rule of thumb. The latter may be
> different for hypervisor and for guest bugs.
> 
> In my understanding this is really about spamming the log problem.
> Of course one can try to solve/mitigate the problem at different
> levels. It could be declared
> 1) a problem to be solved in the logging library more or less
> transparently
> 2) a problem to be solved by the environment and it's admin (e.g.
> log aggregation, filtering, and rotation)
> 3) a problem that the client code of the logging library has to
> explicitly deal with
> 
> The once and rate_limited are 3).
> 
> To sum it up guest error or not and once or not are orthogonal
> problems in my view.
> 
> Regards,
> Halil

Right. But as long as we are changing this code, I'd like
to see guest errors reported in a way that makes it
easy to distinguish them from host errors.

> > 
> > Internally we can still have something similar to this
> > mechanism.
> > 
> > Another idea is to reset these guest error counters on guest reset.
> > Device reset too? I'm not 100% sure as guest can trigger device resets.
> > 
> 
> 
> [..]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH v4 1/2] qemu-error: introduce {error|warn}_report_once
  2018-05-30 15:19       ` Michael S. Tsirkin
@ 2018-05-30 15:30         ` Halil Pasic
  0 siblings, 0 replies; 22+ messages in thread
From: Halil Pasic @ 2018-05-30 15:30 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Peter Xu, Markus Armbruster, Jason Wang, qemu-devel,
	Philippe Mathieu-Daudé



On 05/30/2018 05:19 PM, Michael S. Tsirkin wrote:
> On Wed, May 30, 2018 at 05:15:19PM +0200, Halil Pasic wrote:
>>
>>
>> On 05/30/2018 06:47 AM, Michael S. Tsirkin wrote:
>>> On Thu, May 24, 2018 at 12:44:53PM +0800, Peter Xu wrote:
>>>> There are many error_report()s that can be used in frequently called
>>>> functions, especially on IO paths.  That can be unideal in that
>>>> malicious guest can try to trigger the error tons of time which might
>>>> use up the log space on the host (e.g., libvirt can capture the stderr
>>>> of QEMU and put it persistently onto disk).
>>>
>>> I think the problem is real enough but I think the API
>>> isn't great as it stresses the mechanism. Which fundamentally does
>>> not matter - we can print once or 10 times, or whatever.
>>>
>>> What happens here is a guest bug as opposed to hypervisor
>>> bug. So I think a better name would be guest_error.
>>
>> I don't agree with your argument against the name report_once
>> Michael. In my reading the commit message describes one of use
>> cases for which the infrastructure introduced by this patch is
>> a supposed to be a good fit. But report_once is not restricted
>> to this example.
> 
> All I'm saying is that we should distinguish between
> guest and host errors at code level.
> 
> 
> 
>> In my previous life in the userspace I had to debug problems
>> where the original error message got log-rotated away because of an
>> onslaught of error messages that were a consequence of the original
>> one, and not very helpful.
>>
>> IMHO raising the issue of guest_error is a very sane thing to do,
>> but it is a different problem. I think guest_error is about how and
>> to whom the error is to be reported. IMHO report the error to the
>> ones that are affected by it and to the ones that can do something
>> about it (e.g. fix it) is a good rule of thumb. The latter may be
>> different for hypervisor and for guest bugs.
>>
>> In my understanding this is really about spamming the log problem.
>> Of course one can try to solve/mitigate the problem at different
>> levels. It could be declared
>> 1) a problem to be solved in the logging library more or less
>> transparently
>> 2) a problem to be solved by the environment and it's admin (e.g.
>> log aggregation, filtering, and rotation)
>> 3) a problem that the client code of the logging library has to
>> explicitly deal with
>>
>> The once and rate_limited are 3).
>>
>> To sum it up guest error or not and once or not are orthogonal
>> problems in my view.
>>
>> Regards,
>> Halil
> 
> Right. But as long as we are changing this code, I'd like
> to see guest errors reported in a way that makes it
> easy to distinguish them from host errors.
> 

I do second these ideas. My goal was to avoid mixing concerns.

Regards,
Halil

[..]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH v4 1/2] qemu-error: introduce {error|warn}_report_once
  2018-05-30 13:36       ` Cornelia Huck
@ 2018-06-13  7:56         ` Markus Armbruster
  0 siblings, 0 replies; 22+ messages in thread
From: Markus Armbruster @ 2018-06-13  7:56 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Peter Xu, Michael S . Tsirkin, qemu-devel, Jason Wang,
	Markus Armbruster, Philippe Mathieu-Daudé,
	Halil Pasic

Cornelia Huck <cohuck@redhat.com> writes:

> On Wed, 30 May 2018 11:30:45 +0800
> Peter Xu <peterx@redhat.com> wrote:
>
>> On Tue, May 29, 2018 at 11:30:00AM +0200, Cornelia Huck wrote:
>> > On Thu, 24 May 2018 12:44:53 +0800
>> > Peter Xu <peterx@redhat.com> wrote:
>> >   
>> > > There are many error_report()s that can be used in frequently called
>> > > functions, especially on IO paths.  That can be unideal in that
>> > > malicious guest can try to trigger the error tons of time which might
>> > > use up the log space on the host (e.g., libvirt can capture the stderr
>> > > of QEMU and put it persistently onto disk).  In VT-d emulation code, we
>> > > have trace_vtd_error() tracer.  AFAIU all those places can be replaced
>> > > by something like error_report() but trace points are mostly used to
>> > > avoid the DDOS attack that mentioned above.  However using trace points
>> > > mean that errors are not dumped if trace not enabled.
>> > > 
>> > > It's not a big deal in most modern server managements since we have
>> > > things like logrotate to maintain the logs and make sure the quota is
>> > > expected.  However it'll still be nice that we just provide another way
>> > > to restrict message generations.  In most cases, this kind of
>> > > error_report()s will only provide valid information on the first message
>> > > sent, and all the rest of similar messages will be mostly talking about
>> > > the same thing.  This patch introduces *_report_once() helpers to allow
>> > > a message to be dumped only once during one QEMU process's life cycle.
>> > > It will make sure: (1) it's on by deffault, so we can even get something
>> > > without turning the trace on and reproducing, and (2) it won't be
>> > > affected by DDOS attack.  
>> > 
>> > This is good for something (sub-)system wide, where it is enough to
>> > alert the user once; but we may want to print something e.g. once per
>> > the device where it happens (see v3 of "vfio-ccw: add force unlimited
>> > prefetch property" for an example).  
>> 
>> I'm glad that we start to have more users of it, no matter which
>> implementation we'll choose.  At least it means it makes some sense to
>> have such a thing.
>> 
>> For me this patch works nicely enough.  Of course there can be
>> per-device errors, but AFAICT mostly we don't have any error at
>> all... and my debugging experience is that when multiple error happens
>> on different devices simutaneously, they'll very possible that it's
>> caused by the same problem (again, errors are rare after all), or, the
>> rest of the problems are caused by the first error only (so the first
>> error might cause a collapse of the rest).  That's why I wanted to
>> always debug with the first error I encounter, because that's mostly
>> always the root cause. In that sense, current patch works nicely for
>> me (note that we can have device ID embeded in the error message with
>> this one).
>
> I think we have slightly different use cases here. In your case,
> something (an error) happened and you don't really care about any
> subsequent messages. In the vfio-ccw case, we want to log when a guest
> tries to do things on a certain device, so we can possibly throw a
> magic switch for that device. We still want messages after that so we
> can catch further devices for which we may want to throw the magic
> switch. (Similar for the other message, we want to give a hint why a
> certain device may not work as expected.)
>
>> 
>> At the same time, I think this patch should be easier to use of course
>> - no extra variables to define, and very self contained. So I would
>> slightly prefer current patch.
>> 
>> However I'm also fine with the approach proposed in the vfio-ccw patch
>> too.  Though if so I would possibly drop the 2nd patch too since if
>> with the vfio-ccw patch I'll need to introduce one bool for every
>> trace_vtd_err() place... then I'd not bother with it any more but
>> instead live with that trace_*(). ;) Or I can define one per-IOMMU
>> error boolean and pass it in for each of the error_report_once(), but
>> it seems a bit awkward too.
>
> I think we can have both the fine-grained control and convenience
> macros for those cases where we really just want to print a message
> once.

Yes.  Cornelia, feel free to post a followup patch that satisfies your
needs.

>> 
>> >   
>> > > 
>> > > To implement it, I stole the printk_once() macro from Linux.  
>> > 
>> > Would something akin to printk_ratelimited() also make sense to avoid
>> > log flooding?  
>> 
>> Yes it will.  IMHO we can have that too as follow up if we want, and
>> it does not conflict with this print_once().  I'd say currently this
>> error_report_once() is good enough for me, especially lightweighted.
>> I suspect we'll need more lines to implement a ratelimited version.
>
> Yes, and I agree that wants to be a separate patch should we find a use
> case for it.

Yes.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH v4 1/2] qemu-error: introduce {error|warn}_report_once
  2018-05-30 13:53       ` Cornelia Huck
@ 2018-06-13  7:59         ` Markus Armbruster
  0 siblings, 0 replies; 22+ messages in thread
From: Markus Armbruster @ 2018-06-13  7:59 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Peter Xu, Michael S. Tsirkin, Markus Armbruster, Jason Wang,
	qemu-devel, Philippe Mathieu-Daudé,
	Halil Pasic

Cornelia Huck <cohuck@redhat.com> writes:

> On Wed, 30 May 2018 14:39:55 +0800
> Peter Xu <peterx@redhat.com> wrote:
>
>> On Wed, May 30, 2018 at 07:47:32AM +0300, Michael S. Tsirkin wrote:
>> > On Thu, May 24, 2018 at 12:44:53PM +0800, Peter Xu wrote:  
>> > > There are many error_report()s that can be used in frequently called
>> > > functions, especially on IO paths.  That can be unideal in that
>> > > malicious guest can try to trigger the error tons of time which might
>> > > use up the log space on the host (e.g., libvirt can capture the stderr
>> > > of QEMU and put it persistently onto disk).  
>> > 
>> > I think the problem is real enough but I think the API
>> > isn't great as it stresses the mechanism. Which fundamentally does
>> > not matter - we can print once or 10 times, or whatever.
>> > 
>> > What happens here is a guest bug as opposed to hypervisor
>> > bug. So I think a better name would be guest_error.  
>> 
>> For me error_report_once() is okay since after all it's only a way to
>> dump something for the hypervisor management software (or the person
>> who manages the QEMU instance), and I don't have a strong opinion to
>> introduce a new guest_error() API.
>
> If we go with that suggestion, guest_{error,warn} should also prefix
> the message with "Guest:" or so. Otherwise, it does not offer that much
> more benefit.
>
> [And I think it should be a wrapper around the report_once
> infrastructure.]

I agree.  Keep error_report_once() as low-level function (okay to stress
mechanism there), then wrap whatever higher level functions we find
useful around them, in followup patches.

[...]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH v4 1/2] qemu-error: introduce {error|warn}_report_once
  2018-05-24  4:44 ` [Qemu-devel] [PATCH v4 1/2] qemu-error: " Peter Xu
  2018-05-29  9:30   ` Cornelia Huck
  2018-05-30  4:47   ` Michael S. Tsirkin
@ 2018-06-13  8:01   ` Markus Armbruster
  2018-06-13  9:08     ` Peter Xu
  2 siblings, 1 reply; 22+ messages in thread
From: Markus Armbruster @ 2018-06-13  8:01 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Michael S . Tsirkin, Jason Wang, Philippe Mathieu-Daudé

Peter Xu <peterx@redhat.com> writes:

> There are many error_report()s that can be used in frequently called
> functions, especially on IO paths.  That can be unideal in that
> malicious guest can try to trigger the error tons of time which might
> use up the log space on the host (e.g., libvirt can capture the stderr
> of QEMU and put it persistently onto disk).  In VT-d emulation code, we
> have trace_vtd_error() tracer.  AFAIU all those places can be replaced
> by something like error_report() but trace points are mostly used to
> avoid the DDOS attack that mentioned above.  However using trace points
> mean that errors are not dumped if trace not enabled.
>
> It's not a big deal in most modern server managements since we have
> things like logrotate to maintain the logs and make sure the quota is
> expected.  However it'll still be nice that we just provide another way
> to restrict message generations.  In most cases, this kind of
> error_report()s will only provide valid information on the first message
> sent, and all the rest of similar messages will be mostly talking about
> the same thing.  This patch introduces *_report_once() helpers to allow
> a message to be dumped only once during one QEMU process's life cycle.
> It will make sure: (1) it's on by deffault, so we can even get something

default

> without turning the trace on and reproducing, and (2) it won't be
> affected by DDOS attack.
>
> To implement it, I stole the printk_once() macro from Linux.
>
> CC: Eric Blake <eblake@redhat.com>
> CC: Markus Armbruster <armbru@redhat.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  include/qemu/error-report.h | 32 ++++++++++++++++++++++++++++++++
>  1 file changed, 32 insertions(+)
>
> diff --git a/include/qemu/error-report.h b/include/qemu/error-report.h
> index e1c8ae1a52..c7ec54cb97 100644
> --- a/include/qemu/error-report.h
> +++ b/include/qemu/error-report.h
> @@ -44,6 +44,38 @@ void error_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
>  void warn_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
>  void info_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
>  
> +/*
> + * Similar to error_report(), but it only prints the message once.  It
> + * returns true when it prints the first time, otherwise false.

I like to start function contracts with a single line stating the
function's purpose, and I prefer imperative mood, like this:

    * Similar to error_report(), but it only prints the message once.
    * Return true when it prints, false otherwise.

> + */
> +#define error_report_once(fmt, ...)             \
> +    ({                                          \
> +        static bool print_once_;               \
> +        bool ret_print_once_ = !print_once_;  \
> +                                                \
> +        if (!print_once_) {                    \
> +            print_once_ = true;                \
> +            error_report(fmt, ##__VA_ARGS__);   \
> +        }                                       \
> +        unlikely(ret_print_once_);             \
> +    })

Please align the backslashes, say with emacs command c-backslash-region,
bound to C-c C-\.

> +
> +/*
> + * Similar to warn_report(), but it only prints the message once.  It
> + * returns true when it prints the first time, otherwise false.
> + */
> +#define warn_report_once(fmt, ...)              \
> +    ({                                          \
> +        static bool print_once_;               \
> +        bool ret_print_once_ = !print_once_;  \
> +                                                \
> +        if (!print_once_) {                    \
> +            print_once_ = true;                \
> +            warn_report(fmt, ##__VA_ARGS__);   \
> +        }                                       \
> +        unlikely(ret_print_once_);             \
> +    })

Likewise.

> +
>  const char *error_get_progname(void);
>  extern bool enable_timestamp_msg;

With these nits addressed:
Reviewed-by: Markus Armbruster <armbru@redhat.com>

I can touch them up when I apply.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH v4 2/2] intel-iommu: start to use error_report_once
  2018-05-24  4:44 ` [Qemu-devel] [PATCH v4 2/2] intel-iommu: start to use error_report_once Peter Xu
@ 2018-06-13  8:05   ` Markus Armbruster
  2018-06-13  9:36     ` Auger Eric
  0 siblings, 1 reply; 22+ messages in thread
From: Markus Armbruster @ 2018-06-13  8:05 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Michael S . Tsirkin, Jason Wang, Markus Armbruster,
	Philippe Mathieu-Daudé

Peter Xu <peterx@redhat.com> writes:

> Replace existing trace_vtd_err() with error_report_once() then stderr
> will capture something if any of the error happens, meanwhile we don't
> suffer from any DDOS.  Then remove the trace point.  Since at it,
> provide more information where proper (now we can pass parameters into
> the report function).
>
> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  hw/i386/intel_iommu.c | 59 ++++++++++++++++++++++++-------------------
>  hw/i386/trace-events  |  1 -
>  2 files changed, 33 insertions(+), 27 deletions(-)

Michael, would you give your Reviewed-by or Acked-by?  I'd take the
series through my tree then.

[...]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH v4 1/2] qemu-error: introduce {error|warn}_report_once
  2018-06-13  8:01   ` Markus Armbruster
@ 2018-06-13  9:08     ` Peter Xu
  0 siblings, 0 replies; 22+ messages in thread
From: Peter Xu @ 2018-06-13  9:08 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: qemu-devel, Michael S . Tsirkin, Jason Wang, Philippe Mathieu-Daudé

On Wed, Jun 13, 2018 at 10:01:22AM +0200, Markus Armbruster wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > There are many error_report()s that can be used in frequently called
> > functions, especially on IO paths.  That can be unideal in that
> > malicious guest can try to trigger the error tons of time which might
> > use up the log space on the host (e.g., libvirt can capture the stderr
> > of QEMU and put it persistently onto disk).  In VT-d emulation code, we
> > have trace_vtd_error() tracer.  AFAIU all those places can be replaced
> > by something like error_report() but trace points are mostly used to
> > avoid the DDOS attack that mentioned above.  However using trace points
> > mean that errors are not dumped if trace not enabled.
> >
> > It's not a big deal in most modern server managements since we have
> > things like logrotate to maintain the logs and make sure the quota is
> > expected.  However it'll still be nice that we just provide another way
> > to restrict message generations.  In most cases, this kind of
> > error_report()s will only provide valid information on the first message
> > sent, and all the rest of similar messages will be mostly talking about
> > the same thing.  This patch introduces *_report_once() helpers to allow
> > a message to be dumped only once during one QEMU process's life cycle.
> > It will make sure: (1) it's on by deffault, so we can even get something
> 
> default
> 
> > without turning the trace on and reproducing, and (2) it won't be
> > affected by DDOS attack.
> >
> > To implement it, I stole the printk_once() macro from Linux.
> >
> > CC: Eric Blake <eblake@redhat.com>
> > CC: Markus Armbruster <armbru@redhat.com>
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >  include/qemu/error-report.h | 32 ++++++++++++++++++++++++++++++++
> >  1 file changed, 32 insertions(+)
> >
> > diff --git a/include/qemu/error-report.h b/include/qemu/error-report.h
> > index e1c8ae1a52..c7ec54cb97 100644
> > --- a/include/qemu/error-report.h
> > +++ b/include/qemu/error-report.h
> > @@ -44,6 +44,38 @@ void error_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
> >  void warn_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
> >  void info_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
> >  
> > +/*
> > + * Similar to error_report(), but it only prints the message once.  It
> > + * returns true when it prints the first time, otherwise false.
> 
> I like to start function contracts with a single line stating the
> function's purpose, and I prefer imperative mood, like this:
> 
>     * Similar to error_report(), but it only prints the message once.
>     * Return true when it prints, false otherwise.
> 
> > + */
> > +#define error_report_once(fmt, ...)             \
> > +    ({                                          \
> > +        static bool print_once_;               \
> > +        bool ret_print_once_ = !print_once_;  \
> > +                                                \
> > +        if (!print_once_) {                    \
> > +            print_once_ = true;                \
> > +            error_report(fmt, ##__VA_ARGS__);   \
> > +        }                                       \
> > +        unlikely(ret_print_once_);             \
> > +    })
> 
> Please align the backslashes, say with emacs command c-backslash-region,
> bound to C-c C-\.

I am with evil mode so mostly I'm using evil-indent.  It's strange why
the patches were not indented correctly.  Now indent will be fine
locally if I redo the evil-indent.  I must have done something wrong
before. :(

> 
> > +
> > +/*
> > + * Similar to warn_report(), but it only prints the message once.  It
> > + * returns true when it prints the first time, otherwise false.
> > + */
> > +#define warn_report_once(fmt, ...)              \
> > +    ({                                          \
> > +        static bool print_once_;               \
> > +        bool ret_print_once_ = !print_once_;  \
> > +                                                \
> > +        if (!print_once_) {                    \
> > +            print_once_ = true;                \
> > +            warn_report(fmt, ##__VA_ARGS__);   \
> > +        }                                       \
> > +        unlikely(ret_print_once_);             \
> > +    })
> 
> Likewise.
> 
> > +
> >  const char *error_get_progname(void);
> >  extern bool enable_timestamp_msg;
> 
> With these nits addressed:
> Reviewed-by: Markus Armbruster <armbru@redhat.com>
> 
> I can touch them up when I apply.

Thanks, Markus.

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH v4 2/2] intel-iommu: start to use error_report_once
  2018-06-13  8:05   ` Markus Armbruster
@ 2018-06-13  9:36     ` Auger Eric
  2018-06-14 12:51       ` Markus Armbruster
  0 siblings, 1 reply; 22+ messages in thread
From: Auger Eric @ 2018-06-13  9:36 UTC (permalink / raw)
  To: Markus Armbruster, Peter Xu
  Cc: Jason Wang, Philippe Mathieu-Daudé, qemu-devel, Michael S . Tsirkin

Hi,
On 06/13/2018 10:05 AM, Markus Armbruster wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
>> Replace existing trace_vtd_err() with error_report_once() then stderr
>> will capture something if any of the error happens, meanwhile we don't
>> suffer from any DDOS.  Then remove the trace point.  Since at it,
>> provide more information where proper (now we can pass parameters into
>> the report function).
>>
>> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
>> Signed-off-by: Peter Xu <peterx@redhat.com>
>> ---
>>  hw/i386/intel_iommu.c | 59 ++++++++++++++++++++++++-------------------
>>  hw/i386/trace-events  |  1 -
>>  2 files changed, 33 insertions(+), 27 deletions(-)
> 
> Michael, would you give your Reviewed-by or Acked-by?  I'd take the
> series through my tree then.
> 
> [...]
> 
Sorry to enter this thread at this late stage. Just one question: on the
smmuv3 emulation code, Peter (Maydell) urged me to use
qemu_log_mask(LOG_GUEST_ERROR, ...) whenever the error was triggered by
a guest bad behavior. So what is the final guidance to avoid the DOS you
mention?

Thanks

Eric

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH v4 2/2] intel-iommu: start to use error_report_once
  2018-06-13  9:36     ` Auger Eric
@ 2018-06-14 12:51       ` Markus Armbruster
  2018-06-14 12:56         ` Peter Maydell
  0 siblings, 1 reply; 22+ messages in thread
From: Markus Armbruster @ 2018-06-14 12:51 UTC (permalink / raw)
  To: Auger Eric
  Cc: Markus Armbruster, Peter Xu, Jason Wang, Michael S . Tsirkin,
	Philippe Mathieu-Daudé,
	qemu-devel, Peter Maydell

Auger Eric <eric.auger@redhat.com> writes:

> Hi,
> On 06/13/2018 10:05 AM, Markus Armbruster wrote:
>> Peter Xu <peterx@redhat.com> writes:
>> 
>>> Replace existing trace_vtd_err() with error_report_once() then stderr
>>> will capture something if any of the error happens, meanwhile we don't
>>> suffer from any DDOS.  Then remove the trace point.  Since at it,
>>> provide more information where proper (now we can pass parameters into
>>> the report function).
>>>
>>> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
>>> Signed-off-by: Peter Xu <peterx@redhat.com>
>>> ---
>>>  hw/i386/intel_iommu.c | 59 ++++++++++++++++++++++++-------------------
>>>  hw/i386/trace-events  |  1 -
>>>  2 files changed, 33 insertions(+), 27 deletions(-)
>> 
>> Michael, would you give your Reviewed-by or Acked-by?  I'd take the
>> series through my tree then.
>> 
>> [...]
>> 
> Sorry to enter this thread at this late stage. Just one question: on the
> smmuv3 emulation code, Peter (Maydell) urged me to use
> qemu_log_mask(LOG_GUEST_ERROR, ...) whenever the error was triggered by
> a guest bad behavior. So what is the final guidance to avoid the DOS you
> mention?

Does the user need to know about the error condition?

If yes, we should tell him.  Logging is not telling.

Peter, what do you think?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH v4 2/2] intel-iommu: start to use error_report_once
  2018-06-14 12:51       ` Markus Armbruster
@ 2018-06-14 12:56         ` Peter Maydell
  0 siblings, 0 replies; 22+ messages in thread
From: Peter Maydell @ 2018-06-14 12:56 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Auger Eric, Peter Xu, Jason Wang, Michael S . Tsirkin,
	Philippe Mathieu-Daudé,
	QEMU Developers

On 14 June 2018 at 13:51, Markus Armbruster <armbru@redhat.com> wrote:
> Auger Eric <eric.auger@redhat.com> writes:
>
>> Hi,
>> On 06/13/2018 10:05 AM, Markus Armbruster wrote:
>>> Peter Xu <peterx@redhat.com> writes:
>>>
>>>> Replace existing trace_vtd_err() with error_report_once() then stderr
>>>> will capture something if any of the error happens, meanwhile we don't
>>>> suffer from any DDOS.  Then remove the trace point.  Since at it,
>>>> provide more information where proper (now we can pass parameters into
>>>> the report function).
>>>>
>>>> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
>>>> Signed-off-by: Peter Xu <peterx@redhat.com>
>>>> ---
>>>>  hw/i386/intel_iommu.c | 59 ++++++++++++++++++++++++-------------------
>>>>  hw/i386/trace-events  |  1 -
>>>>  2 files changed, 33 insertions(+), 27 deletions(-)
>>>
>>> Michael, would you give your Reviewed-by or Acked-by?  I'd take the
>>> series through my tree then.
>>>
>>> [...]
>>>
>> Sorry to enter this thread at this late stage. Just one question: on the
>> smmuv3 emulation code, Peter (Maydell) urged me to use
>> qemu_log_mask(LOG_GUEST_ERROR, ...) whenever the error was triggered by
>> a guest bad behavior. So what is the final guidance to avoid the DOS you
>> mention?
>
> Does the user need to know about the error condition?
>
> If yes, we should tell him.  Logging is not telling.
>
> Peter, what do you think?

My view is that "guest does something stupid" should be
LOG_GUEST_ERROR, because (a) maybe you'd like to know about it
but (b) usually you don't unless you happen to be the person
trying to develop or debug the guest OS, in which case you can
turn on the logging to help you in that debugging task.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/2] error-report: introduce {error|warn}_report_once
  2018-05-24  4:44 [Qemu-devel] [PATCH v4 0/2] error-report: introduce {error|warn}_report_once Peter Xu
  2018-05-24  4:44 ` [Qemu-devel] [PATCH v4 1/2] qemu-error: " Peter Xu
  2018-05-24  4:44 ` [Qemu-devel] [PATCH v4 2/2] intel-iommu: start to use error_report_once Peter Xu
@ 2018-08-15  5:58 ` Markus Armbruster
  2018-08-15  6:10   ` Peter Xu
  2 siblings, 1 reply; 22+ messages in thread
From: Markus Armbruster @ 2018-08-15  5:58 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Michael S . Tsirkin, Jason Wang, Philippe Mathieu-Daudé

I'm afraid I let this fall through the cracks.  Sorry about that!

PATCH 2 conflicts semantically with commmit 63b88968f13: the latter adds
a trace_vtd_err() the former doesn't replace.  Please respin.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/2] error-report: introduce {error|warn}_report_once
  2018-08-15  5:58 ` [Qemu-devel] [PATCH v4 0/2] error-report: introduce {error|warn}_report_once Markus Armbruster
@ 2018-08-15  6:10   ` Peter Xu
  0 siblings, 0 replies; 22+ messages in thread
From: Peter Xu @ 2018-08-15  6:10 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: qemu-devel, Michael S . Tsirkin, Jason Wang, Philippe Mathieu-Daudé

On Wed, Aug 15, 2018 at 07:58:52AM +0200, Markus Armbruster wrote:
> I'm afraid I let this fall through the cracks.  Sorry about that!
> 
> PATCH 2 conflicts semantically with commmit 63b88968f13: the latter adds
> a trace_vtd_err() the former doesn't replace.  Please respin.

Will do.

Regards,

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2018-08-15  6:10 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-24  4:44 [Qemu-devel] [PATCH v4 0/2] error-report: introduce {error|warn}_report_once Peter Xu
2018-05-24  4:44 ` [Qemu-devel] [PATCH v4 1/2] qemu-error: " Peter Xu
2018-05-29  9:30   ` Cornelia Huck
2018-05-30  3:30     ` Peter Xu
2018-05-30 13:36       ` Cornelia Huck
2018-06-13  7:56         ` Markus Armbruster
2018-05-30  4:47   ` Michael S. Tsirkin
2018-05-30  6:39     ` Peter Xu
2018-05-30 13:53       ` Cornelia Huck
2018-06-13  7:59         ` Markus Armbruster
2018-05-30 15:15     ` Halil Pasic
2018-05-30 15:19       ` Michael S. Tsirkin
2018-05-30 15:30         ` Halil Pasic
2018-06-13  8:01   ` Markus Armbruster
2018-06-13  9:08     ` Peter Xu
2018-05-24  4:44 ` [Qemu-devel] [PATCH v4 2/2] intel-iommu: start to use error_report_once Peter Xu
2018-06-13  8:05   ` Markus Armbruster
2018-06-13  9:36     ` Auger Eric
2018-06-14 12:51       ` Markus Armbruster
2018-06-14 12:56         ` Peter Maydell
2018-08-15  5:58 ` [Qemu-devel] [PATCH v4 0/2] error-report: introduce {error|warn}_report_once Markus Armbruster
2018-08-15  6:10   ` Peter Xu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.