All of lore.kernel.org
 help / color / mirror / Atom feed
From: Burn Alting <burn.alting@iinet.net.au>
To: Richard Guy Briggs <rgb@redhat.com>
Cc: Linux Audit <linux-audit@redhat.com>
Subject: Re: Occasional delayed output of events
Date: Sat, 16 Jan 2021 13:42:23 +1100	[thread overview]
Message-ID: <bdcc07550a210ac539863ab182b6cc2dabf473e8.camel@iinet.net.au> (raw)
In-Reply-To: <20210116003558.GK2015948@madcap2.tricolour.ca>


[-- Attachment #1.1: Type: text/plain, Size: 10552 bytes --]

On Fri, 2021-01-15 at 19:35 -0500, Richard Guy Briggs wrote:
> On 2021-01-16 09:18, Burn Alting wrote:
> > On Sun, 2021-01-10 at 15:39 +1100, Burn Alting wrote:
> > > On Tue, 2021-01-05 at 07:12 +1100, Burn Alting wrote:
> > > > On Mon, 2021-01-04 at 09:46 -0500, Steve Grubb wrote:
> > > > > On Monday, January 4, 2021 2:55:25 AM EST Burn Alting wrote:
> > > > > > On Sun, 2021-01-03 at 10:41 -0500, Steve Grubb wrote:
> > > > > > > On Friday, January 1, 2021 4:22:33 PM EST Burn Alting wrote:
> > > > > > > > Sometimes, events recorded in /var/log/audit/audit.log
> > > > > > > > appearsomeseconds past co- located events which
> > > > > > > > resultsinauparse:au_check_events() marking  these events complete
> > > > > > > > beforetheyare. An example of this can be seen below with the
> > > > > > > > offending eventid44609.This has been plaguing me for a year or two
> > > > > > > > and this morning wasthefirst time I still had access to the raw
> > > > > > > > audit.log files (I monitor alotof event types and the log files roll
> > > > > > > > over fairly quickly).Theexample below is from a fully patched Centos
> > > > > > > > 7 but I have also seenthison a patched Fedora 32.Has this been seen
> > > > > > > > before? Do we need to re-evaluate howauparse'completes' an event (ie
> > > > > > > > 2 seconds is too quick).
> > > > > > > 
> > > > > > > I have never seen this. But on the way to disk, auditd only
> > > > > > > doeslightprocessing of the event.  If the format is enriched, it looks
> > > > > > > thingsupon a record by record basis. It does not collect events until
> > > > > > > theyarecomplete - it dumps it to disk as soon as it can tack on
> > > > > > > theextrainformation.So, the question would be, does this delay happen
> > > > > > > on the way to disk? Oristhis an artifact of post processing the logs
> > > > > > > with an auparse basedutility?Can this be observed repeatedly on the
> > > > > > > same raw logs? If so,then maybeauparse does have some issue. But if
> > > > > > > this is a postprocessing issue, thenthe wall clock doesn't matter
> > > > > > > because this eventshould have collected uptogether.I'd say this merits
> > > > > > > some investigation.
> > > > > > 
> > > > > > OK. I think this needs to be addressed on two fronts. There may
> > > > > > bemore.A.  Within post processing ... a 2 second timeout is not
> > > > > > sufficient.Iwould suggest we modify auparse.c:au_check_events() to i)
> > > > > > perform theeventtype checks first, then  ii) increase the timeout of 2
> > > > > > seconds to be alarger value based onempirical tests.
> > > > > 
> > > > > In the post processing, there are 2 use cases. The first is events that
> > > > > are ondisk. In this usage, the 2 second timeout does not come into effect
> > > > > becausethe events are run through probably within nanoseconds or
> > > > > microseconds at theworst. The only time it would come into effect is if
> > > > > the terminating record ismissing.
> > > > 
> > > > In this first case, the 2 second timeout is on the event's time, not
> > > > the'processing time'.  See ausearch-lol.c:check_events()
> > > > andauparse.c:au_get_ready_event().And I use the checkpointing code to avoid
> > > > the incomplete event issue.
> > > > In my case, I have not lost records, it's just that an event has arrived on
> > > > diskwith an event time more than 2 seconds after the previously written
> > > > event.Basically,a. The event was delayed getting to auditd and we look to
> > > > the kernel for asolution.b. The event arrived at a reasonable point in time
> > > > at auditd and for some reasonauditd delayed it's printing (by the way I tend
> > > > to use RAW log format, notenriched.
> > > > In either case, I believe ausearch-lol.c:check_events()
> > > > andauparse.c:au_get_ready_event() do need to be changed as we have complete
> > > > eventswritten by auditdwhich these two routines fail to process properly.
> > > Changing the two second timeout in ausearch-lol.c:check_events() and, one
> > > assumesin  auparse.c:au_get_ready_event() (but I have not tested the auparse
> > > code) fixesthe processing  of the delayed event.Changing the value to say 10
> > > seconds fixes my example use case, but given thekernel or auditd could emit an
> > > event with a larger delay, should this be aconfiguration item in
> > > /etc/audit/auditd.conf?
> > > I have raised both a bugzilla report (
> > > https://bugzilla.redhat.com/show_bug.cgi?id=1914603) and Issue (
> > > https://github.com/linux-audit/audit-userspace/issues/148)
> > > How do you want me to proceed ... a simple change to 10 seconds or a
> > > moreversatile configuration item in auditd.conf?I can perform either and issue
> > > a PR if so required.
> > 
> > What do people think ... I would point out this currently occurs in both audit-
> > 2.8.5-4.el7.x86_64 and audit-3.0-0.17.20191104git1c2f876.el8.x86_64 and a
> > justcompiled checkout of the audit-userspace code audit-3.0-1.fc33.x86_64.I have
> > found examples (under audit-3.0-0.17.20191104git1c2f876.el8.x86_64) that Ineed a
> > 15 second delay.I accept there may be an issue in the kernel but if it is not
> > simple (I only seethis occasionally), then we need a fix in the userspace. Since
> > I have found variedtimes required, do I go down the path of a big value OR a
> > configuration item in/etc/audit/auditd.conf?
> 
> Or we go back to userspace code looking for the EOE record?  Thisdoesn't help if
> they arrive out of order.  Do we number the records inthe kernel?  N of M...

I like the N of M concept but there would be a LOT of change - especially for all
the non-kernel event sources. The EOE would be the most seamless, but at a cost.My
preference is to allow the 2 second 'timer' to be configurable.
> > > > > The other use case is realtime processing as an audispd plugin. In this
> > > > > usecase the wall clock could matter because records could potentially get
> > > > > lostdue to overflows or another plugin taking too long. This is the use
> > > > > case wherethe wall clock matters. And again, it matters when records get
> > > > > lost or delayedin transit. As long as everything is flowing, it should not
> > > > > factor into eventprocessing.
> > > > > > B. I will build a temporary auditd daemon to perform some
> > > > > > empiricaltestingto see how long events can reside within the daemon. I
> > > > > > may needsomeadvice on this. I assume that the code that sets the
> > > > > > timestamp isinsrc/auditd.c:send_audit_event().
> > > > > 
> > > > > This is only for audit daemon's internal events. For all "real" events,
> > > > > it'sset in the kernel.
> > > > 
> > > > If that is the case and the kernel is establishing the timestamp, then
> > > > eitherthe kernel has delayed the eventsarrival at the daemon or the daemon
> > > > has delayedit's writing.
> > > > > > If so, I will see if I can put orchestration debug code in to
> > > > > > monitoranevent's 'time in daemon' until this point. I will then report
> > > > > > on this.I believe given that AUDIT_PROCTITLE and AUDIT_EOE is fairly
> > > > > > widespread,thenthe testing switch in A. will not be a big issue (time
> > > > > > cost wise). Itwillalso mean that if we over compensate the timeout that
> > > > > > would causeadditionalmemory cost in auparse() then this is mittigated.
> > > > > 
> > > > > I'd suggest breaking up the event completion tests so that an exact
> > > > > collectiontermination reason code could be associated to the event.
> > > > > > With respect to 'There may be more' fronts. Are there other points
> > > > > > inthe'audit ecosystem' that makes use of the '2 second timeout'.
> > > > > 
> > > > > Ausearch/report has its own special copy of the event collection logic.
> > > > > Itshould be nearly identical to what auparse does.
> > > > 
> > > > They appear identical  ... ausearch-lol.c:check_events()
> > > > andauparse.c:au_get_ready_event().
> > > > > > I will start work on this, this coming weekend if the above makes sense.
> > > > > 
> > > > > One other thought, the current shipping code is audit-3.0, doing a
> > > > > diffbetween it and audit-2.8.5 for the auparse directory does show
> > > > > somedifferences in event collection/grouping/next_event. A lot of the
> > > > > differencesare cosmetic to fix extra whitespace or indentation. But if you
> > > > > skip all that,there are some real changes that probably were because of
> > > > > bug reports. Forexample,
> > > > 
> > > > I will go through these, although this occurs on my Centos 7's (audit-2.8.5-
> > > > 4.el7.x86_64) as well as my 8's (audit-3.0-
> > > > 0.17.20191104git1c2f876.el8.x86_64).
> > > > > @@ -259,15 +260,6 @@ static event_list_t
> > > > > *au_get_ready_event(        if(lowest && lowest->status == EBS_COMPLETE)
> > > > > {                lowest->status =EBS_EMPTY;                au->au_ready
> > > > > --;-               // Try to consolidatethe array so that we
> > > > > iterate-               // over a smaller portion
> > > > > nexttime-               if (lowest == &lol->array[lol
> > > > > ->maxi]){-                       au_lolnode *ptr
> > > > > =lowest;-                       while (ptr->status == EBS_EMPTY && lol-
> > > > > >maxi >0) {-                               lol->maxi
> > > > > --;-                               ptr = &lol->array[lol
> > > > > ->maxi];-                       }-               }                returnlo
> > > > > west->l;        }and@@ -1536,6 +1550,13 @@ static
> > > > > intau_auparse_next_event(auparse                aup_list_create(l);       
> > > > >          aup_list_set_event(l, &e);                aup_list_append(l, au-
> > > > > >cur_buf,au->list_idx, au-
> > > > > > line_number);
> > > > > +               // Eat standalone EOE - main event was already
> > > > > markedcomplete+               if (l->head->type ==
> > > > > AUDIT_EOE){+                       au->cur_buf
> > > > > =NULL;+                       aup_list_clear(l);+                       fr
> > > > > ee(l);+                       continue;+               }                if
> > > > > (au_lol_append(au->au_lo, l) == NULL)
> > > > > {                        free((char*)e.host); #ifdef LOL_EVENTS_DEBUG01I
> > > > > don't know if those have an effect on what you are seeing. But that is
> > > > > theonly substantial changes that I can see.-Steve
> > > 
> > > Burn
> 
> - RGB
> --Richard Guy Briggs <rgb@redhat.com>Sr. S/W Engineer, Kernel Security, Base
> Operating SystemsRemote, Ottawa, Red Hat CanadaIRC: rgb, SunRaycerVoice:
> +1.647.777.2635, Internal: (81) 32635

[-- Attachment #1.2: Type: text/html, Size: 13816 bytes --]

[-- Attachment #2: Type: text/plain, Size: 102 bytes --]

--
Linux-audit mailing list
Linux-audit@redhat.com
https://www.redhat.com/mailman/listinfo/linux-audit

  reply	other threads:[~2021-01-16  2:43 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-01 21:22 Occasional delayed output of events Burn Alting
2021-01-03 15:41 ` Steve Grubb
2021-01-04  7:55   ` Burn Alting
2021-01-04 14:46     ` Steve Grubb
2021-01-04 20:12       ` Burn Alting
2021-01-10  4:39         ` Burn Alting
2021-01-15 22:18           ` Burn Alting
2021-01-16  0:35             ` Richard Guy Briggs
2021-01-16  2:42               ` Burn Alting [this message]
2021-01-17 14:07                 ` Paul Moore
2021-01-17 21:12                   ` Steve Grubb
2021-01-18 13:54                     ` Paul Moore
2021-01-18 14:31                       ` Steve Grubb
2021-01-18 20:34                         ` Burn Alting
2021-01-18 20:36                         ` Paul Moore
2021-01-19  8:18                           ` Burn Alting
2021-01-19 15:31                             ` Lenny Bruzenak
2021-01-19 19:11                             ` Paul Moore
2021-01-19 19:38                               ` Burn Alting
2021-01-19 20:26                                 ` Paul Moore
2021-01-19 21:51                                   ` Steve Grubb
2021-01-20  6:38                                     ` Burn Alting
2021-01-20 22:50                                       ` Paul Moore
2021-01-23 22:55                                         ` Burn Alting
2021-01-25 23:53                                           ` Steve Grubb
2021-01-26  0:11                                             ` Burn Alting
2021-01-26  0:20                                               ` Steve Grubb
2021-01-26  0:29                                                 ` Burn Alting
2021-01-26 11:53                                                   ` Burn Alting
2021-01-26 20:42                                                     ` Steve Grubb
2021-01-27 12:12                                                       ` Burn Alting
2021-01-19 20:42                               ` Richard Guy Briggs

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bdcc07550a210ac539863ab182b6cc2dabf473e8.camel@iinet.net.au \
    --to=burn.alting@iinet.net.au \
    --cc=burn@swtf.dyndns.org \
    --cc=linux-audit@redhat.com \
    --cc=rgb@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.