* The costs of logging and not logging
From: Mark Kampe @ 2011-11-21 16:29 UTC
  To: ceph-devel

The bugs we most dread are situations that only happen rarely,
and are only detected long after the damage has been done.
Given the business we are in, we will face many of them.
We apparently have such bugs open at this very moment.

In most cases, the primary debugging tools one has are
audit and diagnostic logs ... which WE do not have because
they are too expensive (because they are synchronously
written with C++ streams) to leave enabled all the time.

I think it is a mistake to think of audit and diagnostic
logs as a tool to be turned on when we have a problem to
debug.  There should be a basic level of logging that is
always enabled (so we will have data after the first
instance of the bug) ... which can be cranked up from
verbose to bombastic when we find a problem that won't
yield to more moderate interrogation:

  (a) after the problem happens is too late to
      start collecting data.

  (b) these logs are gold mines of information for
      a myriad of purposes we cannot yet even imagine.

This can only be done if the logging mechanism is
sufficiently inexpensive that we are not afraid to
use it:
     low execution overhead from the logging operations
     reasonable memory costs for buffering
     small enough on disk that we can keep them for months

Not having such a mechanism is (if I correctly
understand) already hurting us for internal debugging,
and will quickly cripple us when we have customer
(i.e. people who cannot diagnose problems for
themselves) problems to debug.

There are many tricks to make logging cheap, and the
sizes acceptable.  There are probably a dozen open-source
implementations that already do what we need, and if they
don't, something basic can be built in a two-digit number
of hours.  The real cost is not in the mechanism but in
adapting existing code to use it.  This cost can be
mitigated by making the changes opportunistically ...
one component at a time, as dictated by need/fear.

But we cannot make that change-over until we have a
mechanism.  Because the greatest cost is not the
mechanism, but the change-over, we should give more
than passing thought to what mechanism to choose ...
so that the decision we make remains a good one for
the next few years.

This may be something that we need to do sooner,
rather than later.

regards,
    ---mark---


* Re: The costs of logging and not logging
From: Sage Weil @ 2011-11-21 18:54 UTC
  To: Mark Kampe; +Cc: ceph-devel

On Mon, 21 Nov 2011, Mark Kampe wrote:
> The bugs we most dread are situations that only happen rarely,
> and are only detected long after the damage has been done.
> Given the business we are in, we will face many of them.
> We apparently have such bugs open at this very moment.
> 
> In most cases, the primary debugging tools one has are
> audit and diagnostic logs ... which WE do not have because
> they are too expensive (because they are synchronously
> written with C++ streams) to leave enabled all the time.
> 
> I think it is a mistake to think of audit and diagnostic
> logs as a tool to be turned on when we have a problem to
> debug.  There should be a basic level of logging that is
> always enabled (so we will have data after the first
> instance of the bug) ... which can be cranked up from
> verbose to bombastic when we find a problem that won't
> yield to more moderate interrogation:
> 
>  (a) after the problem happens is too late to
>      start collecting data.
> 
>  (b) these logs are gold mines of information for
>      a myriad of purposes we cannot yet even imagine.
> 
> This can only be done if the logging mechanism is
> sufficiently inexpensive that we are not afraid to
> use it:
>     low execution overhead from the logging operations
>     reasonable memory costs for buffering
>     small enough on disk that we can keep them for months
> 
> Not having such a mechanism is (if I correctly
> understand) already hurting us for internal debugging,
> and will quickly cripple us when we have customer
> (i.e. people who cannot diagnose problems for
> themselves) problems to debug.
> 
> There are many tricks to make logging cheap, and the
> sizes acceptable.  There are probably a dozen open-source
> implementations that already do what we need, and if they
> don't, something basic can be built in a two-digit number
> of hours.  The real cost is not in the mechanism but in
> adapting existing code to use it.  This cost can be
> mitigated by making the changes opportunistically ...
> one component at a time, as dictated by need/fear.
> 
> But we cannot make that change-over until we have a
> mechanism.  Because the greatest cost is not the
> mechanism, but the change-over, we should give more
> than passing thought to what mechanism to choose ...
> so that the decision we make remains a good one for
> the next few years.
> 
> This may be something that we need to do sooner,
> rather than later.

Yep.

I see two main issues with the slowness of the current logs:

 - All of the string rendering in the operator<<()'s is slow.  Things like
prefixing every line with a dump of the pg state are great for debugging,
but they make the overhead very high.  We could scale all of that back, but
it'd be a big project.
 - The logging always goes to a file, synchronously.  We could write to a
ring buffer and either write it out only on crash, or (at the very least)
write it async.
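
A minimal sketch of that ring-buffer approach (the class and the names
below are illustrative only, not existing Ceph code): entries land in a
fixed-size in-memory ring, and the file is only touched by a background
flusher or by a crash handler calling dump().

  // Illustrative only.  Writers append preformatted lines under a lock;
  // dump() writes out the most recent entries, oldest first.
  #include <cstdint>
  #include <cstdio>
  #include <cstring>
  #include <mutex>

  class RingLogger {
    enum { SLOTS = 4096, SLOT_LEN = 256 };   // ~1 MB of retained history
    char buf[SLOTS][SLOT_LEN];
    uint64_t next;
    std::mutex lock;
  public:
    RingLogger() : next(0) { memset(buf, 0, sizeof(buf)); }

    void log(const char *msg) {
      std::lock_guard<std::mutex> g(lock);
      char *slot = buf[next++ % SLOTS];
      strncpy(slot, msg, SLOT_LEN - 1);
      slot[SLOT_LEN - 1] = '\0';
    }

    // called periodically by an async flusher, or (ignoring signal-safety
    // details) from a crash handler
    void dump(FILE *out) {
      std::lock_guard<std::mutex> g(lock);
      uint64_t start = next > SLOTS ? next - SLOTS : 0;
      for (uint64_t i = start; i < next; i++)
        fprintf(out, "%s\n", buf[i % SLOTS]);
    }
  };

The same object covers both variants: drain it from a low-priority thread
for async file logging, or only from the crash path for dump-on-crash.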

I wonder, though, if something different might work.  gcc lets you 
arbitrarily instrument function calls with -finstrument-functions.  
Something that logs function calls and arguments to an in-memory ring 
buffer and dumps that out on crash could potentially have a low overhead 
(if we restrict it to certain code) and would give us lots of insight into 
what happened leading up to the crash.
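
For reference, the gcc hooks are just two functions you provide, roughly
like the sketch below (illustrative only; note the hooks only see the
function and call-site addresses, so names come back via the symbol table
afterwards and arguments would need something extra):

  // Compile the code of interest with:  g++ -finstrument-functions ...
  // gcc then calls these hooks on every function entry and exit; they
  // must not be instrumented themselves, hence the attribute.
  #include <cstdint>

  struct TraceRec { char ev; void *fn; void *site; };
  static TraceRec trace_ring[1 << 16];   // wraps silently; dumped on crash
  static uint32_t trace_pos;             // sketch only: not thread-safe

  extern "C" {

  __attribute__((no_instrument_function))
  void __cyg_profile_func_enter(void *fn, void *site) {
    TraceRec &r = trace_ring[trace_pos++ & 0xffff];
    r.ev = 'E'; r.fn = fn; r.site = site;
  }

  __attribute__((no_instrument_function))
  void __cyg_profile_func_exit(void *fn, void *site) {
    TraceRec &r = trace_ring[trace_pos++ & 0xffff];
    r.ev = 'X'; r.fn = fn; r.site = site;
  }

  }  // extern "C"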

sage


* Re: The costs of logging and not logging
From: Mark Kampe @ 2011-11-21 19:31 UTC
  To: Sage Weil; +Cc: ceph-devel

I'm a big believer in asynchronous flushes of an in-memory
ring-buffer.  For user-mode processes a core-dump-grave
robber can reliably pull out all of the un-flushed entries
... and the same process will also work for the vast majority
of all kernel crashes.

String logging is popular because:
   1. It is trivial to do (in the instrumented code)
   2. It is trivial to read (in the recovered logs)
   3. It is easily usable by grep/perl/etc type tools

But binary logging:
   1. is much (e.g. 4-10x) smaller (especially for standard header info)
   2. is much faster to take (no formatting, less data to copy)
   3. is much faster to process (no re-parsing)
   4. is smaller to store on disk and faster to ship for diagnosis

   and a log dumper can quickly produce output that
   is identical to what the strings would have been

So I also prefer binary logs ... even though they require
the importation of additional classes.  But ...

  (a) the log classes must be kept upwards compatible so
      that old logs can be read by newer tools.

  (b) the binary records should glow-in-the-dark, so that
      they can be recovered even from corrupted ring-buffers
      and blocks whose meta-data has been lost.
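
As an illustration of what such a record might look like (the fields and
the magic value here are invented, not a concrete proposal):

  // The magic lets a recovery tool scan a corrupted ring or an orphaned
  // disk block and resynchronize on record boundaries ("glow in the
  // dark"); the version field is what keeps old logs readable by newer
  // dump tools.
  #include <cstdint>

  struct __attribute__((packed)) LogRecord {
    uint32_t magic;      // fixed marker (e.g. 0x10C5CEFA) for recovery scans
    uint16_t version;    // bumped whenever the layout or event schema changes
    uint16_t event;      // enum identifying the message / format string
    uint64_t stamp_ns;   // monotonic timestamp
    uint32_t tid;        // thread or actor id
    uint32_t nargs;      // how many 64-bit operands follow
    uint64_t args[4];    // raw operands; the dump tool re-renders the text
  };

At 56 bytes for a record like this, versus a couple of hundred bytes of
rendered text, that is roughly where a 4-10x reduction comes from; the
dump tool holds the event-to-format mapping and reproduces the same lines
the stream operators would have written.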

> I see two main issues with the slowness of the current logs:
>
>   - All of the string rendering in the operator<<()'s is slow.  Things like
> prefixing every line with a dump of the pg state are great for debugging,
> but they make the overhead very high.  We could scale all of that back, but
> it'd be a big project.
>   - The logging always goes to a file, synchronously.  We could write to a
> ring buffer and either write it out only on crash, or (at the very least)
> write it async.

> I wonder, though, if something different might work.  gcc lets you
> arbitrarily instrument function calls with -finstrument-functions.
> Something that logs function calls and arguments to an in-memory ring
> buffer and dumps that out on crash could potentially have a low overhead
> (if we restrict it to certain code) and would give us lots of insight into
> what happened leading up to the crash.

This does have the advantage of being automatic ... but it is much more
information, perhaps without much more value.  My experience with
logging is that you don't have to capture very much information,
and that in fact we often go back to weed out no-longer-interesting
information.  Not only does too much information take cycles and
space, but it also makes it hard to find "the good stuff".  I think that 
human architects can make very good decisions about what
information should be captured, and when.


* Re: The costs of logging and not logging
From: Wido den Hollander @ 2011-11-21 19:56 UTC
  To: Mark Kampe; +Cc: Sage Weil, ceph-devel

Hi,

On 11/21/2011 08:31 PM, Mark Kampe wrote:
> I'm a big believer in asynchronous flushes of an in-memory
> ring-buffer. For user-mode processes a core-dump-grave
> robber can reliably pull out all of the un-flushed entries
> ... and the same process will also work for the vast majority
> of all kernel crashes.
>
> String logging is popular because:
> 1. It is trivial to do (in the instrumented code)
> 2. It is trivial to read (in the recovered logs)
> 3. It is easily usable by grep/perl/etc type tools
>
> But binary logging:
> 1. is much (e.g. 4-10x) smaller (especially for standard header info)
> 2. is much faster to take (no formatting, less data to copy)
> 3. is much faster to process (no re-parsing)
> 4. is smaller to store on disk and faster to ship for diagnosis

I'm a fan of this concept as well. In our hosting environments we use 
Varnish a lot to speed up HTTP traffic. This reverse proxy also uses an 
in-memory ring-buffer for logging.

Varnish logs everything into the ring, and with tools like 
"varnishlog" you can grab the data you want and filter out what you 
don't.

You can use varnishncsa to generate an NCSA-formatted logfile out of the 
same buffer, without hurting the performance of the proxy itself.

A ring-buffer can also be used for statistics: if you logged an entry 
into the buffer for every operation the OSD does, you would also be able 
to generate statistics out of it, something Ceph is still lacking.
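
As a small illustration of the statistics angle (the entry layout here is
invented): a consumer walking the same ring the daemon fills can tally
operations offline, much like varnishlog and varnishncsa consume Varnish's
shared-memory log without slowing the proxy down.

  // Hypothetical per-op ring entry and an offline tally over it.
  #include <cstddef>
  #include <cstdint>
  #include <map>

  struct OpEntry { uint16_t op_type; uint64_t latency_us; };

  std::map<uint16_t, uint64_t> count_ops(const OpEntry *ring, size_t n) {
    std::map<uint16_t, uint64_t> counts;
    for (size_t i = 0; i < n; i++)
      counts[ring[i].op_type]++;     // one bump per logged operation
    return counts;
  }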

If you are logging to disk and you run into disk I/O problems (or 
network I/O problems), you'd only lose log entries.

Wido

>
> and a log dumper can quickly produce output that
> is identical to what the strings would have been
>
> So I also prefer binary logs ... even though they require
> the importation of additional classes. But ...
>
> (a) the log classes must be kept upwards compatible so
> that old logs can be read by newer tools.
>
> (b) the binary records should glow-in-the-dark, so that
> they can be recovered even from corrupted ring-buffers
> and blocks whose meta-data has been lost.
>
>> I see two main issues with the slowness of the current logs:
>>
>> - All of the string rendering in the operator<<()'s is slow. Things like
>> prefixing every line with a dump of the pg state are great for debugging,
>> but they make the overhead very high. We could scale all of that back, but
>> it'd be a big project.
>> - The logging always goes to a file, synchronously. We could write to a
>> ring buffer and either write it out only on crash, or (at the very least)
>> write it async.
>
>> I wonder, though, if something different might work. gcc lets you
>> arbitrarily instrument function calls with -finstrument-functions.
>> Something that logs function calls and arguments to an in-memory ring
>> buffer and dumps that out on crash could potentially have a low overhead
>> (if we restrict it to certain code) and would give us lots of insight
>> into
>> what happened leading up to the crash.
>
> This does have the advantage of being automatic ... but it is much more
> information, perhaps without much more value. My experience with
> logging is that you don't have to capture very much information,
> and that in fact we often go back to weed out no-longer-interesting
> information. Not only does too much information take cycles and
> space, but it also makes it hard to find "the good stuff". I think that
> human architects can make very good decisions about what
> information should be captured, and when.


* Re: The costs of logging and not logging
From: Noah Watkins @ 2011-11-21 20:08 UTC
  To: Mark Kampe; +Cc: Sage Weil, ceph-devel

On Mon, Nov 21, 2011 at 11:31, Mark Kampe <mark.kampe@dreamhost.com> wrote:
> But binary logging:
>  1. is much (e.g. 4-10x) smaller (especially for standard header info)
>  2. is much faster to take (no formatting, less data to copy)
>  3. is much faster to process (no re-parsing)
>  4. is smaller to store on disk and faster to ship for diagnosis

Binary logging is great, and certainly doable. Here is what I think
the negatives are:

The way binary logs achieve their reduced size, especially when the
bulk of the data is ASCII context, is by using schemas.  For example,
log the function name as an enum and map it back to the readable form
in post-processing.  This is especially cumbersome for code bases with
ad-hoc logging and lots of churn, as the schema must continually be
maintained.
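
A tiny invented example of the kind of table that has to be kept in sync:

  // Every log site gets an id, and the offline dumper needs a matching
  // format entry; add, change, or remove a message and both sides have
  // to move together, which is exactly the churn problem.
  enum LogEvent {
    EV_OSD_OP_RECEIVED = 1,
    EV_PG_STATE_CHANGE = 2,
    EV_JOURNAL_COMMIT  = 3,
    // ids must never be reused, or old logs decode to the wrong text
  };

  static const char *event_format[] = {
    /* 0 */ "unknown event",
    /* 1 */ "osd op received: client %lu tid %lu",
    /* 2 */ "pg %lu state change %lu -> %lu",
    /* 3 */ "journal commit seq %lu",
  };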

-Noah