* [Qemu-devel] [PATCH v4 0/2] error-report: introduce {error|warn}_report_once @ 2018-05-24 4:44 Peter Xu 2018-05-24 4:44 ` [Qemu-devel] [PATCH v4 1/2] qemu-error: " Peter Xu ` (2 more replies) 0 siblings, 3 replies; 22+ messages in thread From: Peter Xu @ 2018-05-24 4:44 UTC (permalink / raw) To: qemu-devel Cc: Philippe Mathieu-Daudé, peterx, Jason Wang, Michael S . Tsirkin, Eric Blake, Markus Armbruster v4: - patch 2: pick r-b from Philippe - patch 1: replace all __* variables into *_ [Eric] - patch 1: enhance the commit message of patch 1, mention return code of macros [Markus] v3: - reindent in patch 2, dump more things [Philippe] v2: - for patch 1: replace tabs, add trivial comment [Markus] (I didn't add much comment otherwise I'll need to duplicate what's there in error_report()) - add patch 2 Patch 1 introduce the helpers. Patch 2 use it to replace VT-d trace_vtd_err(). Please review. Thanks. Peter Xu (2): qemu-error: introduce {error|warn}_report_once intel-iommu: start to use error_report_once include/qemu/error-report.h | 32 ++++++++++++++++++++ hw/i386/intel_iommu.c | 59 +++++++++++++++++++++---------------- hw/i386/trace-events | 1 - 3 files changed, 65 insertions(+), 27 deletions(-) -- 2.17.0 ^ permalink raw reply [flat|nested] 22+ messages in thread
* [Qemu-devel] [PATCH v4 1/2] qemu-error: introduce {error|warn}_report_once 2018-05-24 4:44 [Qemu-devel] [PATCH v4 0/2] error-report: introduce {error|warn}_report_once Peter Xu @ 2018-05-24 4:44 ` Peter Xu 2018-05-29 9:30 ` Cornelia Huck ` (2 more replies) 2018-05-24 4:44 ` [Qemu-devel] [PATCH v4 2/2] intel-iommu: start to use error_report_once Peter Xu 2018-08-15 5:58 ` [Qemu-devel] [PATCH v4 0/2] error-report: introduce {error|warn}_report_once Markus Armbruster 2 siblings, 3 replies; 22+ messages in thread From: Peter Xu @ 2018-05-24 4:44 UTC (permalink / raw) To: qemu-devel Cc: Philippe Mathieu-Daudé, peterx, Jason Wang, Michael S . Tsirkin, Eric Blake, Markus Armbruster There are many error_report()s that can be used in frequently called functions, especially on IO paths. That can be unideal in that malicious guest can try to trigger the error tons of time which might use up the log space on the host (e.g., libvirt can capture the stderr of QEMU and put it persistently onto disk). In VT-d emulation code, we have trace_vtd_error() tracer. AFAIU all those places can be replaced by something like error_report() but trace points are mostly used to avoid the DDOS attack that mentioned above. However using trace points mean that errors are not dumped if trace not enabled. It's not a big deal in most modern server managements since we have things like logrotate to maintain the logs and make sure the quota is expected. However it'll still be nice that we just provide another way to restrict message generations. In most cases, this kind of error_report()s will only provide valid information on the first message sent, and all the rest of similar messages will be mostly talking about the same thing. This patch introduces *_report_once() helpers to allow a message to be dumped only once during one QEMU process's life cycle. It will make sure: (1) it's on by deffault, so we can even get something without turning the trace on and reproducing, and (2) it won't be affected by DDOS attack. To implement it, I stole the printk_once() macro from Linux. CC: Eric Blake <eblake@redhat.com> CC: Markus Armbruster <armbru@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> --- include/qemu/error-report.h | 32 ++++++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+) diff --git a/include/qemu/error-report.h b/include/qemu/error-report.h index e1c8ae1a52..c7ec54cb97 100644 --- a/include/qemu/error-report.h +++ b/include/qemu/error-report.h @@ -44,6 +44,38 @@ void error_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2); void warn_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2); void info_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2); +/* + * Similar to error_report(), but it only prints the message once. It + * returns true when it prints the first time, otherwise false. + */ +#define error_report_once(fmt, ...) \ + ({ \ + static bool print_once_; \ + bool ret_print_once_ = !print_once_; \ + \ + if (!print_once_) { \ + print_once_ = true; \ + error_report(fmt, ##__VA_ARGS__); \ + } \ + unlikely(ret_print_once_); \ + }) + +/* + * Similar to warn_report(), but it only prints the message once. It + * returns true when it prints the first time, otherwise false. + */ +#define warn_report_once(fmt, ...) \ + ({ \ + static bool print_once_; \ + bool ret_print_once_ = !print_once_; \ + \ + if (!print_once_) { \ + print_once_ = true; \ + warn_report(fmt, ##__VA_ARGS__); \ + } \ + unlikely(ret_print_once_); \ + }) + const char *error_get_progname(void); extern bool enable_timestamp_msg; -- 2.17.0 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [Qemu-devel] [PATCH v4 1/2] qemu-error: introduce {error|warn}_report_once 2018-05-24 4:44 ` [Qemu-devel] [PATCH v4 1/2] qemu-error: " Peter Xu @ 2018-05-29 9:30 ` Cornelia Huck 2018-05-30 3:30 ` Peter Xu 2018-05-30 4:47 ` Michael S. Tsirkin 2018-06-13 8:01 ` Markus Armbruster 2 siblings, 1 reply; 22+ messages in thread From: Cornelia Huck @ 2018-05-29 9:30 UTC (permalink / raw) To: Peter Xu Cc: qemu-devel, Michael S . Tsirkin, Jason Wang, Markus Armbruster, Philippe Mathieu-Daudé, Halil Pasic On Thu, 24 May 2018 12:44:53 +0800 Peter Xu <peterx@redhat.com> wrote: > There are many error_report()s that can be used in frequently called > functions, especially on IO paths. That can be unideal in that > malicious guest can try to trigger the error tons of time which might > use up the log space on the host (e.g., libvirt can capture the stderr > of QEMU and put it persistently onto disk). In VT-d emulation code, we > have trace_vtd_error() tracer. AFAIU all those places can be replaced > by something like error_report() but trace points are mostly used to > avoid the DDOS attack that mentioned above. However using trace points > mean that errors are not dumped if trace not enabled. > > It's not a big deal in most modern server managements since we have > things like logrotate to maintain the logs and make sure the quota is > expected. However it'll still be nice that we just provide another way > to restrict message generations. In most cases, this kind of > error_report()s will only provide valid information on the first message > sent, and all the rest of similar messages will be mostly talking about > the same thing. This patch introduces *_report_once() helpers to allow > a message to be dumped only once during one QEMU process's life cycle. > It will make sure: (1) it's on by deffault, so we can even get something > without turning the trace on and reproducing, and (2) it won't be > affected by DDOS attack. This is good for something (sub-)system wide, where it is enough to alert the user once; but we may want to print something e.g. once per the device where it happens (see v3 of "vfio-ccw: add force unlimited prefetch property" for an example). > > To implement it, I stole the printk_once() macro from Linux. Would something akin to printk_ratelimited() also make sense to avoid log flooding? > > CC: Eric Blake <eblake@redhat.com> > CC: Markus Armbruster <armbru@redhat.com> > Signed-off-by: Peter Xu <peterx@redhat.com> > --- > include/qemu/error-report.h | 32 ++++++++++++++++++++++++++++++++ > 1 file changed, 32 insertions(+) > > diff --git a/include/qemu/error-report.h b/include/qemu/error-report.h > index e1c8ae1a52..c7ec54cb97 100644 > --- a/include/qemu/error-report.h > +++ b/include/qemu/error-report.h > @@ -44,6 +44,38 @@ void error_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2); > void warn_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2); > void info_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2); > > +/* > + * Similar to error_report(), but it only prints the message once. It > + * returns true when it prints the first time, otherwise false. > + */ > +#define error_report_once(fmt, ...) \ > + ({ \ > + static bool print_once_; \ > + bool ret_print_once_ = !print_once_; \ > + \ > + if (!print_once_) { \ > + print_once_ = true; \ > + error_report(fmt, ##__VA_ARGS__); \ > + } \ > + unlikely(ret_print_once_); \ > + }) > + > +/* > + * Similar to warn_report(), but it only prints the message once. It > + * returns true when it prints the first time, otherwise false. > + */ > +#define warn_report_once(fmt, ...) \ > + ({ \ > + static bool print_once_; \ > + bool ret_print_once_ = !print_once_; \ > + \ > + if (!print_once_) { \ > + print_once_ = true; \ > + warn_report(fmt, ##__VA_ARGS__); \ > + } \ > + unlikely(ret_print_once_); \ > + }) > + > const char *error_get_progname(void); > extern bool enable_timestamp_msg; > The patch mentioned above implements printing a warning once via a passed-in variable (in that case, a per-device property). That looks like a superset of what this patch provides. Would such a functionality be useful for other callers as well? If so, we should probably implement it in the common error handling functions. [I'm currently planning to queue the vfio-ccw patch as-is; we can switch to a common interface later if it makes sense.] ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Qemu-devel] [PATCH v4 1/2] qemu-error: introduce {error|warn}_report_once 2018-05-29 9:30 ` Cornelia Huck @ 2018-05-30 3:30 ` Peter Xu 2018-05-30 13:36 ` Cornelia Huck 0 siblings, 1 reply; 22+ messages in thread From: Peter Xu @ 2018-05-30 3:30 UTC (permalink / raw) To: Cornelia Huck Cc: qemu-devel, Michael S . Tsirkin, Jason Wang, Markus Armbruster, Philippe Mathieu-Daudé, Halil Pasic On Tue, May 29, 2018 at 11:30:00AM +0200, Cornelia Huck wrote: > On Thu, 24 May 2018 12:44:53 +0800 > Peter Xu <peterx@redhat.com> wrote: > > > There are many error_report()s that can be used in frequently called > > functions, especially on IO paths. That can be unideal in that > > malicious guest can try to trigger the error tons of time which might > > use up the log space on the host (e.g., libvirt can capture the stderr > > of QEMU and put it persistently onto disk). In VT-d emulation code, we > > have trace_vtd_error() tracer. AFAIU all those places can be replaced > > by something like error_report() but trace points are mostly used to > > avoid the DDOS attack that mentioned above. However using trace points > > mean that errors are not dumped if trace not enabled. > > > > It's not a big deal in most modern server managements since we have > > things like logrotate to maintain the logs and make sure the quota is > > expected. However it'll still be nice that we just provide another way > > to restrict message generations. In most cases, this kind of > > error_report()s will only provide valid information on the first message > > sent, and all the rest of similar messages will be mostly talking about > > the same thing. This patch introduces *_report_once() helpers to allow > > a message to be dumped only once during one QEMU process's life cycle. > > It will make sure: (1) it's on by deffault, so we can even get something > > without turning the trace on and reproducing, and (2) it won't be > > affected by DDOS attack. > > This is good for something (sub-)system wide, where it is enough to > alert the user once; but we may want to print something e.g. once per > the device where it happens (see v3 of "vfio-ccw: add force unlimited > prefetch property" for an example). I'm glad that we start to have more users of it, no matter which implementation we'll choose. At least it means it makes some sense to have such a thing. For me this patch works nicely enough. Of course there can be per-device errors, but AFAICT mostly we don't have any error at all... and my debugging experience is that when multiple error happens on different devices simutaneously, they'll very possible that it's caused by the same problem (again, errors are rare after all), or, the rest of the problems are caused by the first error only (so the first error might cause a collapse of the rest). That's why I wanted to always debug with the first error I encounter, because that's mostly always the root cause. In that sense, current patch works nicely for me (note that we can have device ID embeded in the error message with this one). At the same time, I think this patch should be easier to use of course - no extra variables to define, and very self contained. So I would slightly prefer current patch. However I'm also fine with the approach proposed in the vfio-ccw patch too. Though if so I would possibly drop the 2nd patch too since if with the vfio-ccw patch I'll need to introduce one bool for every trace_vtd_err() place... then I'd not bother with it any more but instead live with that trace_*(). ;) Or I can define one per-IOMMU error boolean and pass it in for each of the error_report_once(), but it seems a bit awkward too. > > > > > To implement it, I stole the printk_once() macro from Linux. > > Would something akin to printk_ratelimited() also make sense to avoid > log flooding? Yes it will. IMHO we can have that too as follow up if we want, and it does not conflict with this print_once(). I'd say currently this error_report_once() is good enough for me, especially lightweighted. I suspect we'll need more lines to implement a ratelimited version. > > > > > CC: Eric Blake <eblake@redhat.com> > > CC: Markus Armbruster <armbru@redhat.com> > > Signed-off-by: Peter Xu <peterx@redhat.com> > > --- > > include/qemu/error-report.h | 32 ++++++++++++++++++++++++++++++++ > > 1 file changed, 32 insertions(+) > > > > diff --git a/include/qemu/error-report.h b/include/qemu/error-report.h > > index e1c8ae1a52..c7ec54cb97 100644 > > --- a/include/qemu/error-report.h > > +++ b/include/qemu/error-report.h > > @@ -44,6 +44,38 @@ void error_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2); > > void warn_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2); > > void info_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2); > > > > +/* > > + * Similar to error_report(), but it only prints the message once. It > > + * returns true when it prints the first time, otherwise false. > > + */ > > +#define error_report_once(fmt, ...) \ > > + ({ \ > > + static bool print_once_; \ > > + bool ret_print_once_ = !print_once_; \ > > + \ > > + if (!print_once_) { \ > > + print_once_ = true; \ > > + error_report(fmt, ##__VA_ARGS__); \ > > + } \ > > + unlikely(ret_print_once_); \ > > + }) > > + > > +/* > > + * Similar to warn_report(), but it only prints the message once. It > > + * returns true when it prints the first time, otherwise false. > > + */ > > +#define warn_report_once(fmt, ...) \ > > + ({ \ > > + static bool print_once_; \ > > + bool ret_print_once_ = !print_once_; \ > > + \ > > + if (!print_once_) { \ > > + print_once_ = true; \ > > + warn_report(fmt, ##__VA_ARGS__); \ > > + } \ > > + unlikely(ret_print_once_); \ > > + }) > > + > > const char *error_get_progname(void); > > extern bool enable_timestamp_msg; > > > > The patch mentioned above implements printing a warning once via a > passed-in variable (in that case, a per-device property). That looks > like a superset of what this patch provides. > > Would such a functionality be useful for other callers as well? If so, > we should probably implement it in the common error handling functions. > [I'm currently planning to queue the vfio-ccw patch as-is; we can > switch to a common interface later if it makes sense.] Yeah; I left my thoughts above. I'll be happy with either way. Thanks, -- Peter Xu ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Qemu-devel] [PATCH v4 1/2] qemu-error: introduce {error|warn}_report_once 2018-05-30 3:30 ` Peter Xu @ 2018-05-30 13:36 ` Cornelia Huck 2018-06-13 7:56 ` Markus Armbruster 0 siblings, 1 reply; 22+ messages in thread From: Cornelia Huck @ 2018-05-30 13:36 UTC (permalink / raw) To: Peter Xu Cc: qemu-devel, Michael S . Tsirkin, Jason Wang, Markus Armbruster, Philippe Mathieu-Daudé, Halil Pasic On Wed, 30 May 2018 11:30:45 +0800 Peter Xu <peterx@redhat.com> wrote: > On Tue, May 29, 2018 at 11:30:00AM +0200, Cornelia Huck wrote: > > On Thu, 24 May 2018 12:44:53 +0800 > > Peter Xu <peterx@redhat.com> wrote: > > > > > There are many error_report()s that can be used in frequently called > > > functions, especially on IO paths. That can be unideal in that > > > malicious guest can try to trigger the error tons of time which might > > > use up the log space on the host (e.g., libvirt can capture the stderr > > > of QEMU and put it persistently onto disk). In VT-d emulation code, we > > > have trace_vtd_error() tracer. AFAIU all those places can be replaced > > > by something like error_report() but trace points are mostly used to > > > avoid the DDOS attack that mentioned above. However using trace points > > > mean that errors are not dumped if trace not enabled. > > > > > > It's not a big deal in most modern server managements since we have > > > things like logrotate to maintain the logs and make sure the quota is > > > expected. However it'll still be nice that we just provide another way > > > to restrict message generations. In most cases, this kind of > > > error_report()s will only provide valid information on the first message > > > sent, and all the rest of similar messages will be mostly talking about > > > the same thing. This patch introduces *_report_once() helpers to allow > > > a message to be dumped only once during one QEMU process's life cycle. > > > It will make sure: (1) it's on by deffault, so we can even get something > > > without turning the trace on and reproducing, and (2) it won't be > > > affected by DDOS attack. > > > > This is good for something (sub-)system wide, where it is enough to > > alert the user once; but we may want to print something e.g. once per > > the device where it happens (see v3 of "vfio-ccw: add force unlimited > > prefetch property" for an example). > > I'm glad that we start to have more users of it, no matter which > implementation we'll choose. At least it means it makes some sense to > have such a thing. > > For me this patch works nicely enough. Of course there can be > per-device errors, but AFAICT mostly we don't have any error at > all... and my debugging experience is that when multiple error happens > on different devices simutaneously, they'll very possible that it's > caused by the same problem (again, errors are rare after all), or, the > rest of the problems are caused by the first error only (so the first > error might cause a collapse of the rest). That's why I wanted to > always debug with the first error I encounter, because that's mostly > always the root cause. In that sense, current patch works nicely for > me (note that we can have device ID embeded in the error message with > this one). I think we have slightly different use cases here. In your case, something (an error) happened and you don't really care about any subsequent messages. In the vfio-ccw case, we want to log when a guest tries to do things on a certain device, so we can possibly throw a magic switch for that device. We still want messages after that so we can catch further devices for which we may want to throw the magic switch. (Similar for the other message, we want to give a hint why a certain device may not work as expected.) > > At the same time, I think this patch should be easier to use of course > - no extra variables to define, and very self contained. So I would > slightly prefer current patch. > > However I'm also fine with the approach proposed in the vfio-ccw patch > too. Though if so I would possibly drop the 2nd patch too since if > with the vfio-ccw patch I'll need to introduce one bool for every > trace_vtd_err() place... then I'd not bother with it any more but > instead live with that trace_*(). ;) Or I can define one per-IOMMU > error boolean and pass it in for each of the error_report_once(), but > it seems a bit awkward too. I think we can have both the fine-grained control and convenience macros for those cases where we really just want to print a message once. > > > > > > > > > To implement it, I stole the printk_once() macro from Linux. > > > > Would something akin to printk_ratelimited() also make sense to avoid > > log flooding? > > Yes it will. IMHO we can have that too as follow up if we want, and > it does not conflict with this print_once(). I'd say currently this > error_report_once() is good enough for me, especially lightweighted. > I suspect we'll need more lines to implement a ratelimited version. Yes, and I agree that wants to be a separate patch should we find a use case for it. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Qemu-devel] [PATCH v4 1/2] qemu-error: introduce {error|warn}_report_once 2018-05-30 13:36 ` Cornelia Huck @ 2018-06-13 7:56 ` Markus Armbruster 0 siblings, 0 replies; 22+ messages in thread From: Markus Armbruster @ 2018-06-13 7:56 UTC (permalink / raw) To: Cornelia Huck Cc: Peter Xu, Michael S . Tsirkin, qemu-devel, Jason Wang, Markus Armbruster, Philippe Mathieu-Daudé, Halil Pasic Cornelia Huck <cohuck@redhat.com> writes: > On Wed, 30 May 2018 11:30:45 +0800 > Peter Xu <peterx@redhat.com> wrote: > >> On Tue, May 29, 2018 at 11:30:00AM +0200, Cornelia Huck wrote: >> > On Thu, 24 May 2018 12:44:53 +0800 >> > Peter Xu <peterx@redhat.com> wrote: >> > >> > > There are many error_report()s that can be used in frequently called >> > > functions, especially on IO paths. That can be unideal in that >> > > malicious guest can try to trigger the error tons of time which might >> > > use up the log space on the host (e.g., libvirt can capture the stderr >> > > of QEMU and put it persistently onto disk). In VT-d emulation code, we >> > > have trace_vtd_error() tracer. AFAIU all those places can be replaced >> > > by something like error_report() but trace points are mostly used to >> > > avoid the DDOS attack that mentioned above. However using trace points >> > > mean that errors are not dumped if trace not enabled. >> > > >> > > It's not a big deal in most modern server managements since we have >> > > things like logrotate to maintain the logs and make sure the quota is >> > > expected. However it'll still be nice that we just provide another way >> > > to restrict message generations. In most cases, this kind of >> > > error_report()s will only provide valid information on the first message >> > > sent, and all the rest of similar messages will be mostly talking about >> > > the same thing. This patch introduces *_report_once() helpers to allow >> > > a message to be dumped only once during one QEMU process's life cycle. >> > > It will make sure: (1) it's on by deffault, so we can even get something >> > > without turning the trace on and reproducing, and (2) it won't be >> > > affected by DDOS attack. >> > >> > This is good for something (sub-)system wide, where it is enough to >> > alert the user once; but we may want to print something e.g. once per >> > the device where it happens (see v3 of "vfio-ccw: add force unlimited >> > prefetch property" for an example). >> >> I'm glad that we start to have more users of it, no matter which >> implementation we'll choose. At least it means it makes some sense to >> have such a thing. >> >> For me this patch works nicely enough. Of course there can be >> per-device errors, but AFAICT mostly we don't have any error at >> all... and my debugging experience is that when multiple error happens >> on different devices simutaneously, they'll very possible that it's >> caused by the same problem (again, errors are rare after all), or, the >> rest of the problems are caused by the first error only (so the first >> error might cause a collapse of the rest). That's why I wanted to >> always debug with the first error I encounter, because that's mostly >> always the root cause. In that sense, current patch works nicely for >> me (note that we can have device ID embeded in the error message with >> this one). > > I think we have slightly different use cases here. In your case, > something (an error) happened and you don't really care about any > subsequent messages. In the vfio-ccw case, we want to log when a guest > tries to do things on a certain device, so we can possibly throw a > magic switch for that device. We still want messages after that so we > can catch further devices for which we may want to throw the magic > switch. (Similar for the other message, we want to give a hint why a > certain device may not work as expected.) > >> >> At the same time, I think this patch should be easier to use of course >> - no extra variables to define, and very self contained. So I would >> slightly prefer current patch. >> >> However I'm also fine with the approach proposed in the vfio-ccw patch >> too. Though if so I would possibly drop the 2nd patch too since if >> with the vfio-ccw patch I'll need to introduce one bool for every >> trace_vtd_err() place... then I'd not bother with it any more but >> instead live with that trace_*(). ;) Or I can define one per-IOMMU >> error boolean and pass it in for each of the error_report_once(), but >> it seems a bit awkward too. > > I think we can have both the fine-grained control and convenience > macros for those cases where we really just want to print a message > once. Yes. Cornelia, feel free to post a followup patch that satisfies your needs. >> >> > >> > > >> > > To implement it, I stole the printk_once() macro from Linux. >> > >> > Would something akin to printk_ratelimited() also make sense to avoid >> > log flooding? >> >> Yes it will. IMHO we can have that too as follow up if we want, and >> it does not conflict with this print_once(). I'd say currently this >> error_report_once() is good enough for me, especially lightweighted. >> I suspect we'll need more lines to implement a ratelimited version. > > Yes, and I agree that wants to be a separate patch should we find a use > case for it. Yes. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Qemu-devel] [PATCH v4 1/2] qemu-error: introduce {error|warn}_report_once 2018-05-24 4:44 ` [Qemu-devel] [PATCH v4 1/2] qemu-error: " Peter Xu 2018-05-29 9:30 ` Cornelia Huck @ 2018-05-30 4:47 ` Michael S. Tsirkin 2018-05-30 6:39 ` Peter Xu 2018-05-30 15:15 ` Halil Pasic 2018-06-13 8:01 ` Markus Armbruster 2 siblings, 2 replies; 22+ messages in thread From: Michael S. Tsirkin @ 2018-05-30 4:47 UTC (permalink / raw) To: Peter Xu Cc: qemu-devel, Philippe Mathieu-Daudé, Jason Wang, Eric Blake, Markus Armbruster On Thu, May 24, 2018 at 12:44:53PM +0800, Peter Xu wrote: > There are many error_report()s that can be used in frequently called > functions, especially on IO paths. That can be unideal in that > malicious guest can try to trigger the error tons of time which might > use up the log space on the host (e.g., libvirt can capture the stderr > of QEMU and put it persistently onto disk). I think the problem is real enough but I think the API isn't great as it stresses the mechanism. Which fundamentally does not matter - we can print once or 10 times, or whatever. What happens here is a guest bug as opposed to hypervisor bug. So I think a better name would be guest_error. Internally we can still have something similar to this mechanism. Another idea is to reset these guest error counters on guest reset. Device reset too? I'm not 100% sure as guest can trigger device resets. > In VT-d emulation code, we > have trace_vtd_error() tracer. AFAIU all those places can be replaced > by something like error_report() but trace points are mostly used to > avoid the DDOS attack that mentioned above. However using trace points > mean that errors are not dumped if trace not enabled. > > It's not a big deal in most modern server managements since we have > things like logrotate to maintain the logs and make sure the quota is > expected. However it'll still be nice that we just provide another way > to restrict message generations. In most cases, this kind of > error_report()s will only provide valid information on the first message > sent, and all the rest of similar messages will be mostly talking about > the same thing. This patch introduces *_report_once() helpers to allow > a message to be dumped only once during one QEMU process's life cycle. > It will make sure: (1) it's on by deffault, so we can even get something > without turning the trace on and reproducing, and (2) it won't be > affected by DDOS attack. > > To implement it, I stole the printk_once() macro from Linux. > > CC: Eric Blake <eblake@redhat.com> > CC: Markus Armbruster <armbru@redhat.com> > Signed-off-by: Peter Xu <peterx@redhat.com> > --- > include/qemu/error-report.h | 32 ++++++++++++++++++++++++++++++++ > 1 file changed, 32 insertions(+) > > diff --git a/include/qemu/error-report.h b/include/qemu/error-report.h > index e1c8ae1a52..c7ec54cb97 100644 > --- a/include/qemu/error-report.h > +++ b/include/qemu/error-report.h > @@ -44,6 +44,38 @@ void error_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2); > void warn_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2); > void info_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2); > > +/* > + * Similar to error_report(), but it only prints the message once. It > + * returns true when it prints the first time, otherwise false. > + */ > +#define error_report_once(fmt, ...) \ > + ({ \ > + static bool print_once_; \ > + bool ret_print_once_ = !print_once_; \ > + \ > + if (!print_once_) { \ > + print_once_ = true; \ > + error_report(fmt, ##__VA_ARGS__); \ > + } \ > + unlikely(ret_print_once_); \ > + }) > + > +/* > + * Similar to warn_report(), but it only prints the message once. It > + * returns true when it prints the first time, otherwise false. > + */ > +#define warn_report_once(fmt, ...) \ > + ({ \ > + static bool print_once_; \ > + bool ret_print_once_ = !print_once_; \ > + \ > + if (!print_once_) { \ > + print_once_ = true; \ > + warn_report(fmt, ##__VA_ARGS__); \ > + } \ > + unlikely(ret_print_once_); \ > + }) > + > const char *error_get_progname(void); > extern bool enable_timestamp_msg; > > -- > 2.17.0 ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Qemu-devel] [PATCH v4 1/2] qemu-error: introduce {error|warn}_report_once 2018-05-30 4:47 ` Michael S. Tsirkin @ 2018-05-30 6:39 ` Peter Xu 2018-05-30 13:53 ` Cornelia Huck 2018-05-30 15:15 ` Halil Pasic 1 sibling, 1 reply; 22+ messages in thread From: Peter Xu @ 2018-05-30 6:39 UTC (permalink / raw) To: Michael S. Tsirkin Cc: qemu-devel, Philippe Mathieu-Daudé, Jason Wang, Eric Blake, Markus Armbruster On Wed, May 30, 2018 at 07:47:32AM +0300, Michael S. Tsirkin wrote: > On Thu, May 24, 2018 at 12:44:53PM +0800, Peter Xu wrote: > > There are many error_report()s that can be used in frequently called > > functions, especially on IO paths. That can be unideal in that > > malicious guest can try to trigger the error tons of time which might > > use up the log space on the host (e.g., libvirt can capture the stderr > > of QEMU and put it persistently onto disk). > > I think the problem is real enough but I think the API > isn't great as it stresses the mechanism. Which fundamentally does > not matter - we can print once or 10 times, or whatever. > > What happens here is a guest bug as opposed to hypervisor > bug. So I think a better name would be guest_error. For me error_report_once() is okay since after all it's only a way to dump something for the hypervisor management software (or the person who manages the QEMU instance), and I don't have a strong opinion to introduce a new guest_error() API. > > Internally we can still have something similar to this > mechanism. > > Another idea is to reset these guest error counters on guest reset. > Device reset too? I'm not 100% sure as guest can trigger device resets. Yes maybe we can, but I don't know whether that's necessary. If we consider the possiblility of a malicious guest here, resetting the counter after system reset might be dangerous too, since the guest can still flush the host log by the sequence of system reset, trigger the error, system reset, ... Thanks, -- Peter Xu ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Qemu-devel] [PATCH v4 1/2] qemu-error: introduce {error|warn}_report_once 2018-05-30 6:39 ` Peter Xu @ 2018-05-30 13:53 ` Cornelia Huck 2018-06-13 7:59 ` Markus Armbruster 0 siblings, 1 reply; 22+ messages in thread From: Cornelia Huck @ 2018-05-30 13:53 UTC (permalink / raw) To: Peter Xu Cc: Michael S. Tsirkin, Markus Armbruster, Jason Wang, qemu-devel, Philippe Mathieu-Daudé, Halil Pasic On Wed, 30 May 2018 14:39:55 +0800 Peter Xu <peterx@redhat.com> wrote: > On Wed, May 30, 2018 at 07:47:32AM +0300, Michael S. Tsirkin wrote: > > On Thu, May 24, 2018 at 12:44:53PM +0800, Peter Xu wrote: > > > There are many error_report()s that can be used in frequently called > > > functions, especially on IO paths. That can be unideal in that > > > malicious guest can try to trigger the error tons of time which might > > > use up the log space on the host (e.g., libvirt can capture the stderr > > > of QEMU and put it persistently onto disk). > > > > I think the problem is real enough but I think the API > > isn't great as it stresses the mechanism. Which fundamentally does > > not matter - we can print once or 10 times, or whatever. > > > > What happens here is a guest bug as opposed to hypervisor > > bug. So I think a better name would be guest_error. > > For me error_report_once() is okay since after all it's only a way to > dump something for the hypervisor management software (or the person > who manages the QEMU instance), and I don't have a strong opinion to > introduce a new guest_error() API. If we go with that suggestion, guest_{error,warn} should also prefix the message with "Guest:" or so. Otherwise, it does not offer that much more benefit. [And I think it should be a wrapper around the report_once infrastructure.] > > > > > Internally we can still have something similar to this > > mechanism. > > > > Another idea is to reset these guest error counters on guest reset. > > Device reset too? I'm not 100% sure as guest can trigger device resets. > > Yes maybe we can, but I don't know whether that's necessary. If we > consider the possiblility of a malicious guest here, resetting the > counter after system reset might be dangerous too, since the guest can > still flush the host log by the sequence of system reset, trigger the > error, system reset, ... For device reset, we probably should not reset the counters for that reason. System reset is debatable (we might have another guest kernel or so running after a system reset, don't we?) I think the same applies for the vfio-ccw use case referenced in the other branch of this thread. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Qemu-devel] [PATCH v4 1/2] qemu-error: introduce {error|warn}_report_once 2018-05-30 13:53 ` Cornelia Huck @ 2018-06-13 7:59 ` Markus Armbruster 0 siblings, 0 replies; 22+ messages in thread From: Markus Armbruster @ 2018-06-13 7:59 UTC (permalink / raw) To: Cornelia Huck Cc: Peter Xu, Michael S. Tsirkin, Markus Armbruster, Jason Wang, qemu-devel, Philippe Mathieu-Daudé, Halil Pasic Cornelia Huck <cohuck@redhat.com> writes: > On Wed, 30 May 2018 14:39:55 +0800 > Peter Xu <peterx@redhat.com> wrote: > >> On Wed, May 30, 2018 at 07:47:32AM +0300, Michael S. Tsirkin wrote: >> > On Thu, May 24, 2018 at 12:44:53PM +0800, Peter Xu wrote: >> > > There are many error_report()s that can be used in frequently called >> > > functions, especially on IO paths. That can be unideal in that >> > > malicious guest can try to trigger the error tons of time which might >> > > use up the log space on the host (e.g., libvirt can capture the stderr >> > > of QEMU and put it persistently onto disk). >> > >> > I think the problem is real enough but I think the API >> > isn't great as it stresses the mechanism. Which fundamentally does >> > not matter - we can print once or 10 times, or whatever. >> > >> > What happens here is a guest bug as opposed to hypervisor >> > bug. So I think a better name would be guest_error. >> >> For me error_report_once() is okay since after all it's only a way to >> dump something for the hypervisor management software (or the person >> who manages the QEMU instance), and I don't have a strong opinion to >> introduce a new guest_error() API. > > If we go with that suggestion, guest_{error,warn} should also prefix > the message with "Guest:" or so. Otherwise, it does not offer that much > more benefit. > > [And I think it should be a wrapper around the report_once > infrastructure.] I agree. Keep error_report_once() as low-level function (okay to stress mechanism there), then wrap whatever higher level functions we find useful around them, in followup patches. [...] ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Qemu-devel] [PATCH v4 1/2] qemu-error: introduce {error|warn}_report_once 2018-05-30 4:47 ` Michael S. Tsirkin 2018-05-30 6:39 ` Peter Xu @ 2018-05-30 15:15 ` Halil Pasic 2018-05-30 15:19 ` Michael S. Tsirkin 1 sibling, 1 reply; 22+ messages in thread From: Halil Pasic @ 2018-05-30 15:15 UTC (permalink / raw) To: Michael S. Tsirkin, Peter Xu Cc: Markus Armbruster, Jason Wang, qemu-devel, Philippe Mathieu-Daudé On 05/30/2018 06:47 AM, Michael S. Tsirkin wrote: > On Thu, May 24, 2018 at 12:44:53PM +0800, Peter Xu wrote: >> There are many error_report()s that can be used in frequently called >> functions, especially on IO paths. That can be unideal in that >> malicious guest can try to trigger the error tons of time which might >> use up the log space on the host (e.g., libvirt can capture the stderr >> of QEMU and put it persistently onto disk). > > I think the problem is real enough but I think the API > isn't great as it stresses the mechanism. Which fundamentally does > not matter - we can print once or 10 times, or whatever. > > What happens here is a guest bug as opposed to hypervisor > bug. So I think a better name would be guest_error. I don't agree with your argument against the name report_once Michael. In my reading the commit message describes one of use cases for which the infrastructure introduced by this patch is a supposed to be a good fit. But report_once is not restricted to this example. In my previous life in the userspace I had to debug problems where the original error message got log-rotated away because of an onslaught of error messages that were a consequence of the original one, and not very helpful. IMHO raising the issue of guest_error is a very sane thing to do, but it is a different problem. I think guest_error is about how and to whom the error is to be reported. IMHO report the error to the ones that are affected by it and to the ones that can do something about it (e.g. fix it) is a good rule of thumb. The latter may be different for hypervisor and for guest bugs. In my understanding this is really about spamming the log problem. Of course one can try to solve/mitigate the problem at different levels. It could be declared 1) a problem to be solved in the logging library more or less transparently 2) a problem to be solved by the environment and it's admin (e.g. log aggregation, filtering, and rotation) 3) a problem that the client code of the logging library has to explicitly deal with The once and rate_limited are 3). To sum it up guest error or not and once or not are orthogonal problems in my view. Regards, Halil > > Internally we can still have something similar to this > mechanism. > > Another idea is to reset these guest error counters on guest reset. > Device reset too? I'm not 100% sure as guest can trigger device resets. > [..] ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Qemu-devel] [PATCH v4 1/2] qemu-error: introduce {error|warn}_report_once 2018-05-30 15:15 ` Halil Pasic @ 2018-05-30 15:19 ` Michael S. Tsirkin 2018-05-30 15:30 ` Halil Pasic 0 siblings, 1 reply; 22+ messages in thread From: Michael S. Tsirkin @ 2018-05-30 15:19 UTC (permalink / raw) To: Halil Pasic Cc: Peter Xu, Markus Armbruster, Jason Wang, qemu-devel, Philippe Mathieu-Daudé On Wed, May 30, 2018 at 05:15:19PM +0200, Halil Pasic wrote: > > > On 05/30/2018 06:47 AM, Michael S. Tsirkin wrote: > > On Thu, May 24, 2018 at 12:44:53PM +0800, Peter Xu wrote: > > > There are many error_report()s that can be used in frequently called > > > functions, especially on IO paths. That can be unideal in that > > > malicious guest can try to trigger the error tons of time which might > > > use up the log space on the host (e.g., libvirt can capture the stderr > > > of QEMU and put it persistently onto disk). > > > > I think the problem is real enough but I think the API > > isn't great as it stresses the mechanism. Which fundamentally does > > not matter - we can print once or 10 times, or whatever. > > > > What happens here is a guest bug as opposed to hypervisor > > bug. So I think a better name would be guest_error. > > I don't agree with your argument against the name report_once > Michael. In my reading the commit message describes one of use > cases for which the infrastructure introduced by this patch is > a supposed to be a good fit. But report_once is not restricted > to this example. All I'm saying is that we should distinguish between guest and host errors at code level. > In my previous life in the userspace I had to debug problems > where the original error message got log-rotated away because of an > onslaught of error messages that were a consequence of the original > one, and not very helpful. > > IMHO raising the issue of guest_error is a very sane thing to do, > but it is a different problem. I think guest_error is about how and > to whom the error is to be reported. IMHO report the error to the > ones that are affected by it and to the ones that can do something > about it (e.g. fix it) is a good rule of thumb. The latter may be > different for hypervisor and for guest bugs. > > In my understanding this is really about spamming the log problem. > Of course one can try to solve/mitigate the problem at different > levels. It could be declared > 1) a problem to be solved in the logging library more or less > transparently > 2) a problem to be solved by the environment and it's admin (e.g. > log aggregation, filtering, and rotation) > 3) a problem that the client code of the logging library has to > explicitly deal with > > The once and rate_limited are 3). > > To sum it up guest error or not and once or not are orthogonal > problems in my view. > > Regards, > Halil Right. But as long as we are changing this code, I'd like to see guest errors reported in a way that makes it easy to distinguish them from host errors. > > > > Internally we can still have something similar to this > > mechanism. > > > > Another idea is to reset these guest error counters on guest reset. > > Device reset too? I'm not 100% sure as guest can trigger device resets. > > > > > [..] ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Qemu-devel] [PATCH v4 1/2] qemu-error: introduce {error|warn}_report_once 2018-05-30 15:19 ` Michael S. Tsirkin @ 2018-05-30 15:30 ` Halil Pasic 0 siblings, 0 replies; 22+ messages in thread From: Halil Pasic @ 2018-05-30 15:30 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Peter Xu, Markus Armbruster, Jason Wang, qemu-devel, Philippe Mathieu-Daudé On 05/30/2018 05:19 PM, Michael S. Tsirkin wrote: > On Wed, May 30, 2018 at 05:15:19PM +0200, Halil Pasic wrote: >> >> >> On 05/30/2018 06:47 AM, Michael S. Tsirkin wrote: >>> On Thu, May 24, 2018 at 12:44:53PM +0800, Peter Xu wrote: >>>> There are many error_report()s that can be used in frequently called >>>> functions, especially on IO paths. That can be unideal in that >>>> malicious guest can try to trigger the error tons of time which might >>>> use up the log space on the host (e.g., libvirt can capture the stderr >>>> of QEMU and put it persistently onto disk). >>> >>> I think the problem is real enough but I think the API >>> isn't great as it stresses the mechanism. Which fundamentally does >>> not matter - we can print once or 10 times, or whatever. >>> >>> What happens here is a guest bug as opposed to hypervisor >>> bug. So I think a better name would be guest_error. >> >> I don't agree with your argument against the name report_once >> Michael. In my reading the commit message describes one of use >> cases for which the infrastructure introduced by this patch is >> a supposed to be a good fit. But report_once is not restricted >> to this example. > > All I'm saying is that we should distinguish between > guest and host errors at code level. > > > >> In my previous life in the userspace I had to debug problems >> where the original error message got log-rotated away because of an >> onslaught of error messages that were a consequence of the original >> one, and not very helpful. >> >> IMHO raising the issue of guest_error is a very sane thing to do, >> but it is a different problem. I think guest_error is about how and >> to whom the error is to be reported. IMHO report the error to the >> ones that are affected by it and to the ones that can do something >> about it (e.g. fix it) is a good rule of thumb. The latter may be >> different for hypervisor and for guest bugs. >> >> In my understanding this is really about spamming the log problem. >> Of course one can try to solve/mitigate the problem at different >> levels. It could be declared >> 1) a problem to be solved in the logging library more or less >> transparently >> 2) a problem to be solved by the environment and it's admin (e.g. >> log aggregation, filtering, and rotation) >> 3) a problem that the client code of the logging library has to >> explicitly deal with >> >> The once and rate_limited are 3). >> >> To sum it up guest error or not and once or not are orthogonal >> problems in my view. >> >> Regards, >> Halil > > Right. But as long as we are changing this code, I'd like > to see guest errors reported in a way that makes it > easy to distinguish them from host errors. > I do second these ideas. My goal was to avoid mixing concerns. Regards, Halil [..] ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Qemu-devel] [PATCH v4 1/2] qemu-error: introduce {error|warn}_report_once 2018-05-24 4:44 ` [Qemu-devel] [PATCH v4 1/2] qemu-error: " Peter Xu 2018-05-29 9:30 ` Cornelia Huck 2018-05-30 4:47 ` Michael S. Tsirkin @ 2018-06-13 8:01 ` Markus Armbruster 2018-06-13 9:08 ` Peter Xu 2 siblings, 1 reply; 22+ messages in thread From: Markus Armbruster @ 2018-06-13 8:01 UTC (permalink / raw) To: Peter Xu Cc: qemu-devel, Michael S . Tsirkin, Jason Wang, Philippe Mathieu-Daudé Peter Xu <peterx@redhat.com> writes: > There are many error_report()s that can be used in frequently called > functions, especially on IO paths. That can be unideal in that > malicious guest can try to trigger the error tons of time which might > use up the log space on the host (e.g., libvirt can capture the stderr > of QEMU and put it persistently onto disk). In VT-d emulation code, we > have trace_vtd_error() tracer. AFAIU all those places can be replaced > by something like error_report() but trace points are mostly used to > avoid the DDOS attack that mentioned above. However using trace points > mean that errors are not dumped if trace not enabled. > > It's not a big deal in most modern server managements since we have > things like logrotate to maintain the logs and make sure the quota is > expected. However it'll still be nice that we just provide another way > to restrict message generations. In most cases, this kind of > error_report()s will only provide valid information on the first message > sent, and all the rest of similar messages will be mostly talking about > the same thing. This patch introduces *_report_once() helpers to allow > a message to be dumped only once during one QEMU process's life cycle. > It will make sure: (1) it's on by deffault, so we can even get something default > without turning the trace on and reproducing, and (2) it won't be > affected by DDOS attack. > > To implement it, I stole the printk_once() macro from Linux. > > CC: Eric Blake <eblake@redhat.com> > CC: Markus Armbruster <armbru@redhat.com> > Signed-off-by: Peter Xu <peterx@redhat.com> > --- > include/qemu/error-report.h | 32 ++++++++++++++++++++++++++++++++ > 1 file changed, 32 insertions(+) > > diff --git a/include/qemu/error-report.h b/include/qemu/error-report.h > index e1c8ae1a52..c7ec54cb97 100644 > --- a/include/qemu/error-report.h > +++ b/include/qemu/error-report.h > @@ -44,6 +44,38 @@ void error_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2); > void warn_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2); > void info_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2); > > +/* > + * Similar to error_report(), but it only prints the message once. It > + * returns true when it prints the first time, otherwise false. I like to start function contracts with a single line stating the function's purpose, and I prefer imperative mood, like this: * Similar to error_report(), but it only prints the message once. * Return true when it prints, false otherwise. > + */ > +#define error_report_once(fmt, ...) \ > + ({ \ > + static bool print_once_; \ > + bool ret_print_once_ = !print_once_; \ > + \ > + if (!print_once_) { \ > + print_once_ = true; \ > + error_report(fmt, ##__VA_ARGS__); \ > + } \ > + unlikely(ret_print_once_); \ > + }) Please align the backslashes, say with emacs command c-backslash-region, bound to C-c C-\. > + > +/* > + * Similar to warn_report(), but it only prints the message once. It > + * returns true when it prints the first time, otherwise false. > + */ > +#define warn_report_once(fmt, ...) \ > + ({ \ > + static bool print_once_; \ > + bool ret_print_once_ = !print_once_; \ > + \ > + if (!print_once_) { \ > + print_once_ = true; \ > + warn_report(fmt, ##__VA_ARGS__); \ > + } \ > + unlikely(ret_print_once_); \ > + }) Likewise. > + > const char *error_get_progname(void); > extern bool enable_timestamp_msg; With these nits addressed: Reviewed-by: Markus Armbruster <armbru@redhat.com> I can touch them up when I apply. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Qemu-devel] [PATCH v4 1/2] qemu-error: introduce {error|warn}_report_once 2018-06-13 8:01 ` Markus Armbruster @ 2018-06-13 9:08 ` Peter Xu 0 siblings, 0 replies; 22+ messages in thread From: Peter Xu @ 2018-06-13 9:08 UTC (permalink / raw) To: Markus Armbruster Cc: qemu-devel, Michael S . Tsirkin, Jason Wang, Philippe Mathieu-Daudé On Wed, Jun 13, 2018 at 10:01:22AM +0200, Markus Armbruster wrote: > Peter Xu <peterx@redhat.com> writes: > > > There are many error_report()s that can be used in frequently called > > functions, especially on IO paths. That can be unideal in that > > malicious guest can try to trigger the error tons of time which might > > use up the log space on the host (e.g., libvirt can capture the stderr > > of QEMU and put it persistently onto disk). In VT-d emulation code, we > > have trace_vtd_error() tracer. AFAIU all those places can be replaced > > by something like error_report() but trace points are mostly used to > > avoid the DDOS attack that mentioned above. However using trace points > > mean that errors are not dumped if trace not enabled. > > > > It's not a big deal in most modern server managements since we have > > things like logrotate to maintain the logs and make sure the quota is > > expected. However it'll still be nice that we just provide another way > > to restrict message generations. In most cases, this kind of > > error_report()s will only provide valid information on the first message > > sent, and all the rest of similar messages will be mostly talking about > > the same thing. This patch introduces *_report_once() helpers to allow > > a message to be dumped only once during one QEMU process's life cycle. > > It will make sure: (1) it's on by deffault, so we can even get something > > default > > > without turning the trace on and reproducing, and (2) it won't be > > affected by DDOS attack. > > > > To implement it, I stole the printk_once() macro from Linux. > > > > CC: Eric Blake <eblake@redhat.com> > > CC: Markus Armbruster <armbru@redhat.com> > > Signed-off-by: Peter Xu <peterx@redhat.com> > > --- > > include/qemu/error-report.h | 32 ++++++++++++++++++++++++++++++++ > > 1 file changed, 32 insertions(+) > > > > diff --git a/include/qemu/error-report.h b/include/qemu/error-report.h > > index e1c8ae1a52..c7ec54cb97 100644 > > --- a/include/qemu/error-report.h > > +++ b/include/qemu/error-report.h > > @@ -44,6 +44,38 @@ void error_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2); > > void warn_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2); > > void info_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2); > > > > +/* > > + * Similar to error_report(), but it only prints the message once. It > > + * returns true when it prints the first time, otherwise false. > > I like to start function contracts with a single line stating the > function's purpose, and I prefer imperative mood, like this: > > * Similar to error_report(), but it only prints the message once. > * Return true when it prints, false otherwise. > > > + */ > > +#define error_report_once(fmt, ...) \ > > + ({ \ > > + static bool print_once_; \ > > + bool ret_print_once_ = !print_once_; \ > > + \ > > + if (!print_once_) { \ > > + print_once_ = true; \ > > + error_report(fmt, ##__VA_ARGS__); \ > > + } \ > > + unlikely(ret_print_once_); \ > > + }) > > Please align the backslashes, say with emacs command c-backslash-region, > bound to C-c C-\. I am with evil mode so mostly I'm using evil-indent. It's strange why the patches were not indented correctly. Now indent will be fine locally if I redo the evil-indent. I must have done something wrong before. :( > > > + > > +/* > > + * Similar to warn_report(), but it only prints the message once. It > > + * returns true when it prints the first time, otherwise false. > > + */ > > +#define warn_report_once(fmt, ...) \ > > + ({ \ > > + static bool print_once_; \ > > + bool ret_print_once_ = !print_once_; \ > > + \ > > + if (!print_once_) { \ > > + print_once_ = true; \ > > + warn_report(fmt, ##__VA_ARGS__); \ > > + } \ > > + unlikely(ret_print_once_); \ > > + }) > > Likewise. > > > + > > const char *error_get_progname(void); > > extern bool enable_timestamp_msg; > > With these nits addressed: > Reviewed-by: Markus Armbruster <armbru@redhat.com> > > I can touch them up when I apply. Thanks, Markus. -- Peter Xu ^ permalink raw reply [flat|nested] 22+ messages in thread
* [Qemu-devel] [PATCH v4 2/2] intel-iommu: start to use error_report_once 2018-05-24 4:44 [Qemu-devel] [PATCH v4 0/2] error-report: introduce {error|warn}_report_once Peter Xu 2018-05-24 4:44 ` [Qemu-devel] [PATCH v4 1/2] qemu-error: " Peter Xu @ 2018-05-24 4:44 ` Peter Xu 2018-06-13 8:05 ` Markus Armbruster 2018-08-15 5:58 ` [Qemu-devel] [PATCH v4 0/2] error-report: introduce {error|warn}_report_once Markus Armbruster 2 siblings, 1 reply; 22+ messages in thread From: Peter Xu @ 2018-05-24 4:44 UTC (permalink / raw) To: qemu-devel Cc: Philippe Mathieu-Daudé, peterx, Jason Wang, Michael S . Tsirkin, Eric Blake, Markus Armbruster Replace existing trace_vtd_err() with error_report_once() then stderr will capture something if any of the error happens, meanwhile we don't suffer from any DDOS. Then remove the trace point. Since at it, provide more information where proper (now we can pass parameters into the report function). Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Peter Xu <peterx@redhat.com> --- hw/i386/intel_iommu.c | 59 ++++++++++++++++++++++++------------------- hw/i386/trace-events | 1 - 2 files changed, 33 insertions(+), 27 deletions(-) diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index fb31de9416..fa6df921ee 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -285,14 +285,14 @@ static void vtd_generate_fault_event(IntelIOMMUState *s, uint32_t pre_fsts) { if (pre_fsts & VTD_FSTS_PPF || pre_fsts & VTD_FSTS_PFO || pre_fsts & VTD_FSTS_IQE) { - trace_vtd_err("There are previous interrupt conditions " - "to be serviced by software, fault event " - "is not generated."); + error_report_once("There are previous interrupt conditions " + "to be serviced by software, fault event " + "is not generated"); return; } vtd_set_clear_mask_long(s, DMAR_FECTL_REG, 0, VTD_FECTL_IP); if (vtd_get_long_raw(s, DMAR_FECTL_REG) & VTD_FECTL_IM) { - trace_vtd_err("Interrupt Mask set, irq is not generated."); + error_report_once("Interrupt Mask set, irq is not generated"); } else { vtd_generate_interrupt(s, DMAR_FEADDR_REG, DMAR_FEDATA_REG); vtd_set_clear_mask_long(s, DMAR_FECTL_REG, VTD_FECTL_IP, 0); @@ -400,20 +400,20 @@ static void vtd_report_dmar_fault(IntelIOMMUState *s, uint16_t source_id, trace_vtd_dmar_fault(source_id, fault, addr, is_write); if (fsts_reg & VTD_FSTS_PFO) { - trace_vtd_err("New fault is not recorded due to " - "Primary Fault Overflow."); + error_report_once("New fault is not recorded due to " + "Primary Fault Overflow"); return; } if (vtd_try_collapse_fault(s, source_id)) { - trace_vtd_err("New fault is not recorded due to " - "compression of faults."); + error_report_once("New fault is not recorded due to " + "compression of faults"); return; } if (vtd_is_frcd_set(s, s->next_frcd_reg)) { - trace_vtd_err("Next Fault Recording Reg is used, " - "new fault is not recorded, set PFO field."); + error_report_once("Next Fault Recording Reg is used, " + "new fault is not recorded, set PFO field"); vtd_set_clear_mask_long(s, DMAR_FSTS_REG, 0, VTD_FSTS_PFO); return; } @@ -421,8 +421,8 @@ static void vtd_report_dmar_fault(IntelIOMMUState *s, uint16_t source_id, vtd_record_frcd(s, s->next_frcd_reg, source_id, addr, fault, is_write); if (fsts_reg & VTD_FSTS_PPF) { - trace_vtd_err("There are pending faults already, " - "fault event is not generated."); + error_report_once("There are pending faults already, " + "fault event is not generated"); vtd_set_frcd_and_update_ppf(s, s->next_frcd_reg); s->next_frcd_reg++; if (s->next_frcd_reg == DMAR_FRCD_REG_NR) { @@ -1339,7 +1339,8 @@ static uint64_t vtd_context_cache_invalidate(IntelIOMMUState *s, uint64_t val) break; default: - trace_vtd_err("Context cache invalidate type error."); + error_report_once("%s: invalid context: 0x%"PRIx64, + __func__, val); caig = 0; } return caig; @@ -1445,7 +1446,8 @@ static uint64_t vtd_iotlb_flush(IntelIOMMUState *s, uint64_t val) am = VTD_IVA_AM(addr); addr = VTD_IVA_ADDR(addr); if (am > VTD_MAMV) { - trace_vtd_err("IOTLB PSI flush: address mask overflow."); + error_report_once("%s: address mask overflow: 0x%"PRIx64, + __func__, vtd_get_quad_raw(s, DMAR_IVA_REG)); iaig = 0; break; } @@ -1454,7 +1456,8 @@ static uint64_t vtd_iotlb_flush(IntelIOMMUState *s, uint64_t val) break; default: - trace_vtd_err("IOTLB flush: invalid granularity."); + error_report_once("%s: invalid granularity: 0x%"PRIx64, + __func__, val); iaig = 0; } return iaig; @@ -1604,8 +1607,8 @@ static void vtd_handle_ccmd_write(IntelIOMMUState *s) /* Context-cache invalidation request */ if (val & VTD_CCMD_ICC) { if (s->qi_enabled) { - trace_vtd_err("Queued Invalidation enabled, " - "should not use register-based invalidation"); + error_report_once("Queued Invalidation enabled, " + "should not use register-based invalidation"); return; } ret = vtd_context_cache_invalidate(s, val); @@ -1625,8 +1628,8 @@ static void vtd_handle_iotlb_write(IntelIOMMUState *s) /* IOTLB invalidation request */ if (val & VTD_TLB_IVT) { if (s->qi_enabled) { - trace_vtd_err("Queued Invalidation enabled, " - "should not use register-based invalidation."); + error_report_once("Queued Invalidation enabled, " + "should not use register-based invalidation"); return; } ret = vtd_iotlb_flush(s, val); @@ -1644,7 +1647,7 @@ static bool vtd_get_inv_desc(dma_addr_t base_addr, uint32_t offset, dma_addr_t addr = base_addr + offset * sizeof(*inv_desc); if (dma_memory_read(&address_space_memory, addr, inv_desc, sizeof(*inv_desc))) { - trace_vtd_err("Read INV DESC failed."); + error_report_once("Read INV DESC failed"); inv_desc->lo = 0; inv_desc->hi = 0; return false; @@ -1999,7 +2002,8 @@ static uint64_t vtd_mem_read(void *opaque, hwaddr addr, unsigned size) trace_vtd_reg_read(addr, size); if (addr + size > DMAR_REG_SIZE) { - trace_vtd_err("Read MMIO over range."); + error_report_once("%s: MMIO over range: addr=0x%"PRIx64 + " size=0x%u", __func__, addr, size); return (uint64_t)-1; } @@ -2050,7 +2054,8 @@ static void vtd_mem_write(void *opaque, hwaddr addr, trace_vtd_reg_write(addr, size, val); if (addr + size > DMAR_REG_SIZE) { - trace_vtd_err("Write MMIO over range."); + error_report_once("%s: MMIO over range: addr=0x%"PRIx64 + " size=0x%u", __func__, addr, size); return; } @@ -2432,7 +2437,8 @@ static int vtd_irte_get(IntelIOMMUState *iommu, uint16_t index, addr = iommu->intr_root + index * sizeof(*entry); if (dma_memory_read(&address_space_memory, addr, entry, sizeof(*entry))) { - trace_vtd_err("Memory read failed for IRTE."); + error_report_once("%s: read failed: ind=0x%"PRIu16 + " addr=0x%"PRIx64, __func__, index, addr); return -VTD_FR_IR_ROOT_INVAL; } @@ -2564,14 +2570,15 @@ static int vtd_interrupt_remap_msi(IntelIOMMUState *iommu, } if (origin->address & VTD_MSI_ADDR_HI_MASK) { - trace_vtd_err("MSI address high 32 bits non-zero when " - "Interrupt Remapping enabled."); + error_report_once("%s: MSI address high 32 bits non-zero detected: " + "address=0x%"PRIx64, __func__, origin->address); return -VTD_FR_IR_REQ_RSVD; } addr.data = origin->address & VTD_MSI_ADDR_LO_MASK; if (addr.addr.__head != 0xfee) { - trace_vtd_err("MSI addr low 32 bit invalid."); + error_report_once("%s: MSI address low 32 bit invalid: 0x%"PRIx32, + __func__, addr.data); return -VTD_FR_IR_REQ_RSVD; } diff --git a/hw/i386/trace-events b/hw/i386/trace-events index 22d44648af..e08cf2a9a7 100644 --- a/hw/i386/trace-events +++ b/hw/i386/trace-events @@ -68,7 +68,6 @@ vtd_ir_remap_msi_req(uint64_t addr, uint64_t data) "addr 0x%"PRIx64" data 0x%"PR vtd_fsts_ppf(bool set) "FSTS PPF bit set to %d" vtd_fsts_clear_ip(void) "" vtd_frr_new(int index, uint64_t hi, uint64_t lo) "index %d high 0x%"PRIx64" low 0x%"PRIx64 -vtd_err(const char *str) "%s" vtd_err_dmar_iova_overflow(uint64_t iova) "iova 0x%"PRIx64 vtd_err_dmar_slpte_read_error(uint64_t iova, int level) "iova 0x%"PRIx64" level %d" vtd_err_dmar_slpte_perm_error(uint64_t iova, int level, uint64_t slpte, bool is_write) "iova 0x%"PRIx64" level %d slpte 0x%"PRIx64" write %d" -- 2.17.0 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [Qemu-devel] [PATCH v4 2/2] intel-iommu: start to use error_report_once 2018-05-24 4:44 ` [Qemu-devel] [PATCH v4 2/2] intel-iommu: start to use error_report_once Peter Xu @ 2018-06-13 8:05 ` Markus Armbruster 2018-06-13 9:36 ` Auger Eric 0 siblings, 1 reply; 22+ messages in thread From: Markus Armbruster @ 2018-06-13 8:05 UTC (permalink / raw) To: Peter Xu Cc: qemu-devel, Michael S . Tsirkin, Jason Wang, Markus Armbruster, Philippe Mathieu-Daudé Peter Xu <peterx@redhat.com> writes: > Replace existing trace_vtd_err() with error_report_once() then stderr > will capture something if any of the error happens, meanwhile we don't > suffer from any DDOS. Then remove the trace point. Since at it, > provide more information where proper (now we can pass parameters into > the report function). > > Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> > Signed-off-by: Peter Xu <peterx@redhat.com> > --- > hw/i386/intel_iommu.c | 59 ++++++++++++++++++++++++------------------- > hw/i386/trace-events | 1 - > 2 files changed, 33 insertions(+), 27 deletions(-) Michael, would you give your Reviewed-by or Acked-by? I'd take the series through my tree then. [...] ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Qemu-devel] [PATCH v4 2/2] intel-iommu: start to use error_report_once 2018-06-13 8:05 ` Markus Armbruster @ 2018-06-13 9:36 ` Auger Eric 2018-06-14 12:51 ` Markus Armbruster 0 siblings, 1 reply; 22+ messages in thread From: Auger Eric @ 2018-06-13 9:36 UTC (permalink / raw) To: Markus Armbruster, Peter Xu Cc: Jason Wang, Philippe Mathieu-Daudé, qemu-devel, Michael S . Tsirkin Hi, On 06/13/2018 10:05 AM, Markus Armbruster wrote: > Peter Xu <peterx@redhat.com> writes: > >> Replace existing trace_vtd_err() with error_report_once() then stderr >> will capture something if any of the error happens, meanwhile we don't >> suffer from any DDOS. Then remove the trace point. Since at it, >> provide more information where proper (now we can pass parameters into >> the report function). >> >> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> >> Signed-off-by: Peter Xu <peterx@redhat.com> >> --- >> hw/i386/intel_iommu.c | 59 ++++++++++++++++++++++++------------------- >> hw/i386/trace-events | 1 - >> 2 files changed, 33 insertions(+), 27 deletions(-) > > Michael, would you give your Reviewed-by or Acked-by? I'd take the > series through my tree then. > > [...] > Sorry to enter this thread at this late stage. Just one question: on the smmuv3 emulation code, Peter (Maydell) urged me to use qemu_log_mask(LOG_GUEST_ERROR, ...) whenever the error was triggered by a guest bad behavior. So what is the final guidance to avoid the DOS you mention? Thanks Eric ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Qemu-devel] [PATCH v4 2/2] intel-iommu: start to use error_report_once 2018-06-13 9:36 ` Auger Eric @ 2018-06-14 12:51 ` Markus Armbruster 2018-06-14 12:56 ` Peter Maydell 0 siblings, 1 reply; 22+ messages in thread From: Markus Armbruster @ 2018-06-14 12:51 UTC (permalink / raw) To: Auger Eric Cc: Markus Armbruster, Peter Xu, Jason Wang, Michael S . Tsirkin, Philippe Mathieu-Daudé, qemu-devel, Peter Maydell Auger Eric <eric.auger@redhat.com> writes: > Hi, > On 06/13/2018 10:05 AM, Markus Armbruster wrote: >> Peter Xu <peterx@redhat.com> writes: >> >>> Replace existing trace_vtd_err() with error_report_once() then stderr >>> will capture something if any of the error happens, meanwhile we don't >>> suffer from any DDOS. Then remove the trace point. Since at it, >>> provide more information where proper (now we can pass parameters into >>> the report function). >>> >>> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> >>> Signed-off-by: Peter Xu <peterx@redhat.com> >>> --- >>> hw/i386/intel_iommu.c | 59 ++++++++++++++++++++++++------------------- >>> hw/i386/trace-events | 1 - >>> 2 files changed, 33 insertions(+), 27 deletions(-) >> >> Michael, would you give your Reviewed-by or Acked-by? I'd take the >> series through my tree then. >> >> [...] >> > Sorry to enter this thread at this late stage. Just one question: on the > smmuv3 emulation code, Peter (Maydell) urged me to use > qemu_log_mask(LOG_GUEST_ERROR, ...) whenever the error was triggered by > a guest bad behavior. So what is the final guidance to avoid the DOS you > mention? Does the user need to know about the error condition? If yes, we should tell him. Logging is not telling. Peter, what do you think? ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Qemu-devel] [PATCH v4 2/2] intel-iommu: start to use error_report_once 2018-06-14 12:51 ` Markus Armbruster @ 2018-06-14 12:56 ` Peter Maydell 0 siblings, 0 replies; 22+ messages in thread From: Peter Maydell @ 2018-06-14 12:56 UTC (permalink / raw) To: Markus Armbruster Cc: Auger Eric, Peter Xu, Jason Wang, Michael S . Tsirkin, Philippe Mathieu-Daudé, QEMU Developers On 14 June 2018 at 13:51, Markus Armbruster <armbru@redhat.com> wrote: > Auger Eric <eric.auger@redhat.com> writes: > >> Hi, >> On 06/13/2018 10:05 AM, Markus Armbruster wrote: >>> Peter Xu <peterx@redhat.com> writes: >>> >>>> Replace existing trace_vtd_err() with error_report_once() then stderr >>>> will capture something if any of the error happens, meanwhile we don't >>>> suffer from any DDOS. Then remove the trace point. Since at it, >>>> provide more information where proper (now we can pass parameters into >>>> the report function). >>>> >>>> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> >>>> Signed-off-by: Peter Xu <peterx@redhat.com> >>>> --- >>>> hw/i386/intel_iommu.c | 59 ++++++++++++++++++++++++------------------- >>>> hw/i386/trace-events | 1 - >>>> 2 files changed, 33 insertions(+), 27 deletions(-) >>> >>> Michael, would you give your Reviewed-by or Acked-by? I'd take the >>> series through my tree then. >>> >>> [...] >>> >> Sorry to enter this thread at this late stage. Just one question: on the >> smmuv3 emulation code, Peter (Maydell) urged me to use >> qemu_log_mask(LOG_GUEST_ERROR, ...) whenever the error was triggered by >> a guest bad behavior. So what is the final guidance to avoid the DOS you >> mention? > > Does the user need to know about the error condition? > > If yes, we should tell him. Logging is not telling. > > Peter, what do you think? My view is that "guest does something stupid" should be LOG_GUEST_ERROR, because (a) maybe you'd like to know about it but (b) usually you don't unless you happen to be the person trying to develop or debug the guest OS, in which case you can turn on the logging to help you in that debugging task. thanks -- PMM ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/2] error-report: introduce {error|warn}_report_once 2018-05-24 4:44 [Qemu-devel] [PATCH v4 0/2] error-report: introduce {error|warn}_report_once Peter Xu 2018-05-24 4:44 ` [Qemu-devel] [PATCH v4 1/2] qemu-error: " Peter Xu 2018-05-24 4:44 ` [Qemu-devel] [PATCH v4 2/2] intel-iommu: start to use error_report_once Peter Xu @ 2018-08-15 5:58 ` Markus Armbruster 2018-08-15 6:10 ` Peter Xu 2 siblings, 1 reply; 22+ messages in thread From: Markus Armbruster @ 2018-08-15 5:58 UTC (permalink / raw) To: Peter Xu Cc: qemu-devel, Michael S . Tsirkin, Jason Wang, Philippe Mathieu-Daudé I'm afraid I let this fall through the cracks. Sorry about that! PATCH 2 conflicts semantically with commmit 63b88968f13: the latter adds a trace_vtd_err() the former doesn't replace. Please respin. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/2] error-report: introduce {error|warn}_report_once 2018-08-15 5:58 ` [Qemu-devel] [PATCH v4 0/2] error-report: introduce {error|warn}_report_once Markus Armbruster @ 2018-08-15 6:10 ` Peter Xu 0 siblings, 0 replies; 22+ messages in thread From: Peter Xu @ 2018-08-15 6:10 UTC (permalink / raw) To: Markus Armbruster Cc: qemu-devel, Michael S . Tsirkin, Jason Wang, Philippe Mathieu-Daudé On Wed, Aug 15, 2018 at 07:58:52AM +0200, Markus Armbruster wrote: > I'm afraid I let this fall through the cracks. Sorry about that! > > PATCH 2 conflicts semantically with commmit 63b88968f13: the latter adds > a trace_vtd_err() the former doesn't replace. Please respin. Will do. Regards, -- Peter Xu ^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2018-08-15 6:10 UTC | newest] Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-05-24 4:44 [Qemu-devel] [PATCH v4 0/2] error-report: introduce {error|warn}_report_once Peter Xu 2018-05-24 4:44 ` [Qemu-devel] [PATCH v4 1/2] qemu-error: " Peter Xu 2018-05-29 9:30 ` Cornelia Huck 2018-05-30 3:30 ` Peter Xu 2018-05-30 13:36 ` Cornelia Huck 2018-06-13 7:56 ` Markus Armbruster 2018-05-30 4:47 ` Michael S. Tsirkin 2018-05-30 6:39 ` Peter Xu 2018-05-30 13:53 ` Cornelia Huck 2018-06-13 7:59 ` Markus Armbruster 2018-05-30 15:15 ` Halil Pasic 2018-05-30 15:19 ` Michael S. Tsirkin 2018-05-30 15:30 ` Halil Pasic 2018-06-13 8:01 ` Markus Armbruster 2018-06-13 9:08 ` Peter Xu 2018-05-24 4:44 ` [Qemu-devel] [PATCH v4 2/2] intel-iommu: start to use error_report_once Peter Xu 2018-06-13 8:05 ` Markus Armbruster 2018-06-13 9:36 ` Auger Eric 2018-06-14 12:51 ` Markus Armbruster 2018-06-14 12:56 ` Peter Maydell 2018-08-15 5:58 ` [Qemu-devel] [PATCH v4 0/2] error-report: introduce {error|warn}_report_once Markus Armbruster 2018-08-15 6:10 ` Peter Xu
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.